Using nabu from python to reconstruct a dataset with GPU¶
This notebook shows how to use the Nabu software for performing a basic reconstruction of a tomography dataset.
The computations are done on a local machine with a GPU and Cuda available.
This tutorial goes a bit further than nabu_basic_reconstruction.ipynb
:
- GPU implementation of each component is used
- We see how to start from a configuration file and devise a simple processing chain accordingly
The same dataset is used (binned scan of a bamboo stick, thanks Ludovic Broche, ESRF ID19).
1 - Load the dataset informations¶
We must provide nabu
with the the configuration file (nabu.conf
), describing the path to the dataset and the processing steps. This is the equivalent of the .par
file in PyHST2. In this file, no information is given on the detector size, energy, distance, etc: these informations are extracted from the dataset metadata.
import os
from nabu.testutils import utilstest, get_file
from nabu.pipeline.fullfield.processconfig import ProcessConfig
print("Getting dataset (downloading if necessary) ...")
data_path = get_file("bamboo_reduced.nx")
print("... OK")
# Get the configuration file of this dataset
conf_fname = get_file("bamboo_reduced.conf")
# Change directory to the path where the data is located (only useful for this tutorial)
os.chdir(utilstest.data_home)
# Parse this configuration file
conf = ProcessConfig(conf_fname)
Getting dataset (downloading if necessary) ...
... OK Option 'double_flatfield_enabled' has been renamed 'double_flatfield' in [preproc] This is deprecated since version 2025.1.0 and will result in an error in futures versions Browsing dataset Updating dataset information with user configuration Loaded darks from /tmp/nabu_testdata_pierre/bamboo_reduced_darks.hdf5 Loaded flats from /tmp/nabu_testdata_pierre/bamboo_reduced_flats.hdf5 Doing dataset estimations Could not get an initial estimate for center of rotation in data file Estimating center of rotation CenterOfRotationSlidingWindow.find_shift({'side': 'center', 'window_width': None, 'roi_yxhw': None, 'median_filt_shape': None, 'peak_fit_radius': 1, 'high_pass': None, 'low_pass': None, 'return_validity': False, 'return_relative_to_middle': False}) Estimated center of rotation: 338.986 Doing coupled validation
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[2], line 12 9 os.chdir(utilstest.data_home) 11 # Parse this configuration file ---> 12 conf = ProcessConfig(conf_fname) File ~/.venv/py311/lib/python3.11/site-packages/nabu/pipeline/processconfig.py:67, in ProcessConfigBase.__init__(self, conf_fname, conf_dict, dataset_info, create_logger) 64 self._dataset_estimations() 66 # Step (4) ---> 67 self._coupled_validation() 69 # Step (5) 70 self._build_processing_steps() File ~/.venv/py311/lib/python3.11/site-packages/nabu/pipeline/fullfield/processconfig.py:255, in ProcessConfig._coupled_validation(self) 253 def _coupled_validation(self): 254 self.logger.debug("Doing coupled validation") --> 255 self._dataset_validator = FullFieldDatasetValidator(self.nabu_config, self.dataset_info) 256 # Not so ideal to propagate fields like this 257 for what in ["rec_params", "rec_region", "binning"]: File ~/.venv/py311/lib/python3.11/site-packages/nabu/pipeline/dataset_validator.py:30, in DatasetValidatorBase.__init__(self, nabu_config, dataset_info, logger) 28 self.logger = LoggerOrPrint(logger) 29 self.rec_params = copy_dict_items(self.nabu_config["reconstruction"], self.nabu_config["reconstruction"].keys()) ---> 30 self._validate() File ~/.venv/py311/lib/python3.11/site-packages/nabu/pipeline/fullfield/dataset_validator.py:10, in FullFieldDatasetValidator._validate(self) 8 self._get_output_filename() 9 self._check_can_do_flatfield() ---> 10 self._check_can_do_phase() 11 self._check_can_do_reconstruction() 12 self._check_slice_indices() File ~/.venv/py311/lib/python3.11/site-packages/nabu/pipeline/fullfield/dataset_validator.py:38, in FullFieldDatasetValidator._check_can_do_phase(self) 36 if self.nabu_config["phase"]["method"] is None: 37 return ---> 38 self.dataset_info.check_defined_attribute("distance") 39 self.dataset_info.check_defined_attribute("pixel_size") File ~/.venv/py311/lib/python3.11/site-packages/nabu/resources/dataset_analyzer.py:280, in DatasetAnalyzer.check_defined_attribute(self, name, error_msg) 276 """ 277 Utility function to check that a given attribute is defined. 278 """ 279 if getattr(self, name, None) is None: --> 280 raise ValueError(error_msg or str("No information on %s was found in the dataset" % name)) ValueError: No information on distance was found in the dataset
Note that ProcessConfig
will do quite a few things under the hood:
- Parse the configuration file and check parameters correctness
- Browse the dataset
- Get or compute the reduced flats/darks
- Estimate the center of rotation
The resulting object contains all necessary information to process the dataset.
# We can easily get information on the processing steps.
nabu_config = conf.nabu_config
from pprint import pprint
pprint(nabu_config)
# The same can be done with the dataset structure
dataset_info = conf.dataset_info
# print([getattr(dataset_info, attr) for attr in ["energy", "distance", "n_angles", "radio_dims"]])
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[3], line 2 1 # We can easily get information on the processing steps. ----> 2 nabu_config = conf.nabu_config 3 from pprint import pprint 4 pprint(nabu_config) NameError: name 'conf' is not defined
2 - Chunk processing¶
Nabu processes data by chunks of radios (see the documentation for more explanations).
In a first step, we define how to read chunks of radios.
from nabu.io.reader import NXTomoReader
What is the largest chunk size we can process ?
The answer is given by inspecting the current GPU memory, and the processing steps.
from nabu.cuda.utils import get_gpu_memory
from nabu.pipeline.fullfield.computations import estimate_max_chunk_size
/home/pierre/.venv/py311/lib/python3.11/site-packages/pytools/persistent_dict.py:52: RecommendedHashNotFoundWarning: Unable to import recommended hash 'siphash24.siphash13', falling back to 'hashlib.sha256'. Run 'python3 -m pip install siphash24' to install the recommended hash. warn("Unable to import recommended hash 'siphash24.siphash13', "
chunk_size = estimate_max_chunk_size(
get_gpu_memory(0),
conf
)
print("Chunk_size = %d" % chunk_size)
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[6], line 3 1 chunk_size = estimate_max_chunk_size( 2 get_gpu_memory(0), ----> 3 conf 4 ) 5 print("Chunk_size = %d" % chunk_size) NameError: name 'conf' is not defined
# Load the first 'chunk_size' lines of all the radios
# i.e do projections_data[:, 0:chunk_size, :]
sub_region = (
slice(None),
slice(0, chunk_size),
slice(None)
)
projections_reader = NXTomoReader(
data_path,
sub_region=sub_region,
)
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[7], line 5 1 # Load the first 'chunk_size' lines of all the radios 2 # i.e do projections_data[:, 0:chunk_size, :] 3 sub_region = ( 4 slice(None), ----> 5 slice(0, chunk_size), 6 slice(None) 7 ) 8 projections_reader = NXTomoReader( 9 data_path, 10 sub_region=sub_region, 11 ) NameError: name 'chunk_size' is not defined
# Load the current chunk
print("Loading data", end="")
projections = projections_reader.load_data() # takes some time
print("... OK")
Loading data
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[8], line 3 1 # Load the current chunk 2 print("Loading data", end="") ----> 3 projections = projections_reader.load_data() # takes some time 4 print("... OK") NameError: name 'projections_reader' is not defined
3 - Initialize the GPU¶
Most of the processing can be done on GPU (or many-core CPU if using OpenCL).
With pycuda.gpuarray
(or its OpenCL counterpart pyopencl.array
), we manipulate array objects with memory residing on device. This allows to avoid extraneous host <-> device copies.
import pycuda.gpuarray as garray
from nabu.cuda.utils import get_cuda_context
import numpy as np
# Create a Cuda context on device ID 0
# By default, all following GPU processings will be bound on this context
ctx = get_cuda_context(device_id=0)
n_angles, n_z, n_x = projections.shape
# transfer the chunk on GPU
d_radios = garray.to_gpu(projections)
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[11], line 1 ----> 1 n_angles, n_z, n_x = projections.shape 2 # transfer the chunk on GPU 3 d_radios = garray.to_gpu(projections) NameError: name 'projections' is not defined
4 - Pre-processing¶
Pre-processing utilities are available in the nabu.preproc
module.
Utilities available with the cuda backend are implemented in a module with a _cuda
suffix.
4.1 - Flat-field¶
from nabu.preproc.flatfield_cuda import CudaFlatField
radios_indices = sorted(conf.dataset_info.projections.keys())
# Configure the `FlatField` processor
cuda_flatfield = CudaFlatField(
d_radios.shape,
dataset_info.get_reduced_flats(sub_region=sub_region),
dataset_info.get_reduced_darks(sub_region=sub_region),
radios_indices=radios_indices,
)
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[13], line 1 ----> 1 radios_indices = sorted(conf.dataset_info.projections.keys()) 2 # Configure the `FlatField` processor 3 cuda_flatfield = CudaFlatField( 4 d_radios.shape, 5 dataset_info.get_reduced_flats(sub_region=sub_region), 6 dataset_info.get_reduced_darks(sub_region=sub_region), 7 radios_indices=radios_indices, 8 ) NameError: name 'conf' is not defined
# Perform the normalization on GPU
if nabu_config["preproc"]["flatfield"]:
print("Doing flat-field", end="")
cuda_flatfield.normalize_radios(d_radios)
print("... OK")
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[14], line 2 1 # Perform the normalization on GPU ----> 2 if nabu_config["preproc"]["flatfield"]: 3 print("Doing flat-field", end="") 4 cuda_flatfield.normalize_radios(d_radios) NameError: name 'nabu_config' is not defined
4.2 - Phase retrieval¶
from nabu.preproc.phase_cuda import CudaPaganinPhaseRetrieval
energy = dataset_info.energy
# Phase retrieval is done on each radio individually, with the sub-region specified above
if (nabu_config["phase"]["method"] or "").lower() == "paganin":
print("Doing phase retrieval", end="")
cudapaganin = CudaPaganinPhaseRetrieval(
(n_z, n_x),
distance=dataset_info.distance,
energy=energy,
delta_beta=nabu_config["phase"]["delta_beta"],
pixel_size=dataset_info.pixel_size * 1e6,
)
for i in range(n_angles):
cudapaganin.apply_filter(d_radios[i], output=d_radios[i])
print("... OK")
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[16], line 1 ----> 1 energy = dataset_info.energy 2 # Phase retrieval is done on each radio individually, with the sub-region specified above 3 if (nabu_config["phase"]["method"] or "").lower() == "paganin": NameError: name 'dataset_info' is not defined
4.3 - Logarithm¶
from nabu.preproc.ccd_cuda import CudaLog
if nabu_config["preproc"]["take_logarithm"]:
print("Taking logarithm", end="")
cuda_log = CudaLog(d_radios.shape, clip_min=0.01)
cuda_log.take_logarithm(d_radios)
print("... OK")
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[18], line 1 ----> 1 if nabu_config["preproc"]["take_logarithm"]: 2 print("Taking logarithm", end="") 3 cuda_log = CudaLog(d_radios.shape, clip_min=0.01) NameError: name 'nabu_config' is not defined
5 - Reconstruction¶
We use the filtered backprojection with nabu.reconstruction.fbp
from nabu.reconstruction.fbp import Backprojector
rec_options = conf.processing_options["reconstruction"]
B = Backprojector(
(n_angles, n_x),
angles=rec_options["angles"],
rot_center=rec_options["rotation_axis_position"],
padding_mode="edges",
# extra_options={"use_textures": False}
)
d_recs = garray.zeros((n_z, n_x, n_x), "f")
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[20], line 1 ----> 1 rec_options = conf.processing_options["reconstruction"] 2 B = Backprojector( 3 (n_angles, n_x), 4 angles=rec_options["angles"], (...) 7 # extra_options={"use_textures": False} 8 ) 9 d_recs = garray.zeros((n_z, n_x, n_x), "f") NameError: name 'conf' is not defined
print("Reconstructing...", end="")
for i in range(n_z):
B.fbp(d_radios[:, i, :], output=d_recs[i])
recs = d_recs.get()
print(" ... OK")
Reconstructing...
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[21], line 2 1 print("Reconstructing...", end="") ----> 2 for i in range(n_z): 3 B.fbp(d_radios[:, i, :], output=d_recs[i]) 4 recs = d_recs.get() NameError: name 'n_z' is not defined
6 - Visualize¶
import matplotlib.pyplot as plt
plt.figure()
plt.imshow(recs[0], cmap="gray")
plt.show()
--------------------------------------------------------------------------- NameError Traceback (most recent call last) Cell In[23], line 2 1 plt.figure() ----> 2 plt.imshow(recs[0], cmap="gray") 3 plt.show() NameError: name 'recs' is not defined
<Figure size 640x480 with 0 Axes>