{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Modeling of the thickness of the sensor\n", "\n", "In this notebook we re-use the experiment performed at ID28, which was calibrated previously, and model the detector in 3D.\n", "\n", "This detector is a Pilatus 1M with a 450µm thick silicon sensor. Let's first have a look at the absorption coefficients of this sensor material: \n", "\n", "Reference absorption coefficients are available from:\n", "https://physics.nist.gov/PhysRefData/XrayMassCoef/ElemTab/z14.html\n", "\n", "First we retrieve the results of the previous step, then calculate the absorption efficiency:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "# use `widget` instead of `inline` for a better user experience. `inline` allows plots to be stored in the notebook." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using pyFAI vesrion: 2025.10.0-dev0 on a AMD Ryzen Threadripper PRO 3975WX 32-Cores with 64 threads\n", "wavelength: 6.968e-11m,\t dist: 2.845e-01m,\t poni1: 8.865e-02m,\t poni2: 1.779e+01m,\t energy: 17.793keV\n" ] } ], "source": [ "import time\n", "import os\n", "from matplotlib.pyplot import subplots\n", "import numpy\n", "from scipy.sparse import save_npz\n", "import pyFAI\n", "import pyFAI.units\n", "import pyFAI.detectors\n", "from pyFAI.detectors.sensors import Si_MATERIAL\n", "import json\n", "from scipy.sparse import csr_matrix, linalg\n", "try:\n", "    import cpuinfo\n", "except ImportError:\n", "    cpu = \"\"\n", "else:\n", "    cpu = cpuinfo.cpuinfo.get_cpu_info().get(\"brand_raw\")\n", "\n", "start_time = time.perf_counter()\n", "print(f\"Using pyFAI version: {pyFAI.version} on a {cpu} with {os.cpu_count()} threads\")\n", "with open(\"id28.json\") as f:\n", "    calib = json.load(f)\n", "\n", "thickness = 450e-6\n", "wavelength = calib[\"wavelength\"]\n", "dist = 
calib[\"param\"][calib[\"param_names\"].index(\"dist\")]\n", "poni1 = calib[\"param\"][calib[\"param_names\"].index(\"poni1\")]\n", "poni2 = calib[\"param\"][calib[\"param_names\"].index(\"poni2\")]\n", "energy = pyFAI.units.hc / (wavelength * 1e10)\n", "print(f\"wavelength: {wavelength:.3e}m,\\t \"\n", "      f\"dist: {dist:.3e}m,\\t \"\n", "      f\"poni1: {poni1:.3e}m,\\t \"\n", "      f\"poni2: {poni2:.3e}m,\\t \"\n", "      f\"energy: {energy:.3f}keV\")\n", "mask = numpy.load(\"mask.npy\").astype(numpy.int8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Absorption coefficient at 17.8 keV" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "µ = 16.4 cm^-1 hence absorption efficiency for 450µm: 52.3 %\n" ] } ], "source": [ "print(\n", "    f\"µ = {Si_MATERIAL.mu(energy, unit='cm'):.1f} cm^-1 \"\n", "    f\"hence absorption efficiency for 450µm: {100 * Si_MATERIAL.absorbance(energy, thickness):.1f} %\"\n", ")\n", "mu = Si_MATERIAL.mu(energy, unit=\"m\")" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from pyFAI.detectors.sensors import Si_MATERIAL\n", "\n", "depth = numpy.linspace(0, 1000, 100)\n", "res = [1 - Si_MATERIAL.absorbance(energy, d, \"µm\") for d in depth]\n", "fig, ax = subplots()\n", "ax.plot(depth, res, \"-\")\n", "ax.set_xlabel(\"Depth (µm)\")\n", "ax.set_ylabel(\"Residual signal\")\n", "ax.set_title(f\"Silicon @ {energy:.1} keV\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is consistent with:\n", "[henke.lbl.gov](http://henke.lbl.gov/optical_constants/filter2.html) or \n", "[web-docs.gsi.de](https://web-docs.gsi.de/~stoe_exp/web_programs/x_ray_absorption/index.php)\n", "\n", "Now we can model the detector\n", "\n", "## Modeling of the detector:\n", "\n", "The detector is represented as a 2D array of voxels. Let vox, voy, and voz denote the dimensions of the detector along the three axes.\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Detector Pilatus 1M\tPixelSize= 172µm, 172µm\t BottomRight(3)\n", "Voxel size: (x:0.000172, y:0.000172, z:0.00045)\n" ] } ], "source": [ "detector = pyFAI.detector_factory(calib[\"detector\"])\n", "print(detector)\n", "\n", "vox = detector.pixel2 # this is not a typo\n", "voy = detector.pixel1 # x <--> axis 2\n", "voz = thickness\n", "\n", "print(f\"Voxel size: (x:{vox}, y:{voy}, z:{voz})\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The intensity grabbed in this voxel is the triple integral of the absorbed signal coming from this pixel or from the neighboring ones.\n", "\n", "There are 3 ways to perform this intergral:\n", "* Volumetric analytic integral. 
Looks feasible with a change of variable in the depth\n", "* Slice per slice, the remaining intensity depends on the incidence angle + pixel splitting between neighboring pixels\n", "* Raytracing: the decay can be solved analytically for each ray, one has to throw many rays to average out the signal.\n", "\n", "For the sake of simplicity, this integral will be calculated numerically using this raytracing algorithm:\n", "http://www.cse.yorku.ca/~amana/research/grid.pdf\n", "\n", "Knowing the entry position of an X-ray on the detector and its propagation vector, this algorithm allows us to calculate the length of the path in every voxel it crosses in a fairly efficient way.\n", "\n", "To speed up the calculation, we will use a few tricks:\n", "* We assume that one ray never crosses more than 16 pixels, which is reasonable considering the incidence angle \n", "* We use numba to speed up the loops written in Python\n", "* We allocate the needed memory in chunks of about 1 million elements\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: numba in /users/kieffer/.venv/py313/lib/python3.13/site-packages (0.62.1)\n", "Requirement already satisfied: cython in /users/kieffer/.venv/py313/lib/python3.13/site-packages (3.0.12)\n", "Requirement already satisfied: llvmlite<0.46,>=0.45.0dev0 in /users/kieffer/.venv/py313/lib/python3.13/site-packages (from numba) (0.45.1)\n", "Requirement already satisfied: numpy<2.4,>=1.22 in /users/kieffer/.venv/py313/lib/python3.13/site-packages (from numba) (2.3.1)\n" ] } ], "source": [ "! 
pip install numba cython" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Raytracing using numba" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "BLOCK_SIZE = 1 << 20  # 2**20, about 1 million\n", "BUFFER_SIZE = 16\n", "BIG = numpy.finfo(numpy.float32).max" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With raytracing available for any ray entering the detector, we can calculate the contribution to the neighboring pixels using the absorption law (the length traveled is already known). \n", "To average out the signal, we will sample a few dozen rays per pixel to get an approximation of the volume integral. \n", "\n", "Now we need to store the results so that this transformation can be represented as a sparse matrix multiplication:\n", "\n", "b = M.a\n", "\n", "where b is the recorded image (blurred) and a is the \"perfect\" signal. \n", "M is the sparse matrix in which every pixel of a contributes to a limited number of pixels of b.\n", "\n", "Each pixel in *b* is represented by one line in *M* and we store the indices of *a* of interest with the coefficients of the matrix.\n", "So if a pixel i,j contributes to (i,j), (i+1,j), (i+1,j+1), there are only 3 elements in the line. 
\n", "This is advantageous for storage.\n", "\n", "We will use the CSR sparse matrix representation:\n", "https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_row_.28CSR.2C_CRS_or_Yale_format.29\n", "where there are 3 arrays:\n", "* data: containing the actual non zero values\n", "* indices: for a given line, it contains the column number of the associated data (at the same indice)\n", "* idptr: this array contains the index of the start of every line.\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "from numba.experimental import jitclass\n", "from numba import int8, int32, int64, float32, float64\n", "\n", "spec = [\n", " (\"vox\", float64),\n", " (\"voy\", float64),\n", " (\"voz\", float64),\n", " (\"mu\", float64),\n", " (\"dist\", float64),\n", " (\"poni1\", float64),\n", " (\"poni2\", float64),\n", " (\"width\", int64),\n", " (\"height\", int64),\n", " (\"mask\", int8[:, :]),\n", " (\"sampled\", int64),\n", " (\"data\", float32[:]),\n", " (\"indices\", int32[:]),\n", " (\"idptr\", int32[:]),\n", "]\n", "\n", "\n", "@jitclass(spec)\n", "class ThickDetector(object):\n", " \"Calculate the point spread function as function of the geometry of the experiment\"\n", "\n", " def __init__(self, vox, voy, thickness, mask, mu, dist, poni1, poni2):\n", " \"\"\"Constructor of the class:\n", "\n", " :param vox, voy: detector pixel size in the plane\n", " :param thickness: thickness of the sensor in meters\n", " :param mask:\n", " :param mu: absorption coefficient of the sensor material\n", " :param dist: sample detector distance as defined in the geometry-file\n", " :param poni1, poni2: coordinates of the PONI as defined in the geometry\n", " \"\"\"\n", " self.vox = vox\n", " self.voy = voy\n", " self.voz = thickness\n", " self.mu = mu\n", " self.dist = dist\n", " self.poni1 = poni1\n", " self.poni2 = poni2\n", " self.width = mask.shape[-1]\n", " self.height = mask.shape[0]\n", " self.mask = mask\n", " self.sampled = 
0\n", "        self.data = numpy.zeros(BLOCK_SIZE, dtype=numpy.float32)\n", "        self.indices = numpy.zeros(BLOCK_SIZE, dtype=numpy.int32)\n", "        self.idptr = numpy.zeros(self.width * self.height + 1, dtype=numpy.int32)\n", "\n", "    def calc_one_ray(self, entx, enty):\n", "        \"\"\"For a ray entering at position (entx, enty), with a propagation vector (kx, ky, kz),\n", "        calculate the length spent in every voxel where energy is deposited from a bunch of photons entering the detector\n", "        at a given position and how much energy they deposit in each voxel.\n", "\n", "        Direct implementation of http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.42.3443&rep=rep1&type=pdf\n", "\n", "        :param entx, enty: coordinate of the entry point in meter (2 components, x,y)\n", "        :return: coordinates voxels in x, y and length crossed when leaving the associated voxel\n", "        \"\"\"\n", "        array_x = numpy.empty(BUFFER_SIZE, dtype=numpy.int32)\n", "        array_x[:] = -1\n", "        array_y = numpy.empty(BUFFER_SIZE, dtype=numpy.int32)\n", "        array_y[:] = -1\n", "        array_len = numpy.empty(BUFFER_SIZE, dtype=numpy.float32)\n", "\n", "        # normalize the input propagation vector\n", "        kx = entx - self.poni2\n", "        ky = enty - self.poni1\n", "        kz = self.dist\n", "        n = numpy.sqrt(kx * kx + ky * ky + kz * kz)\n", "        kx /= n\n", "        ky /= n\n", "        kz /= n\n", "\n", "        step_X = -1 if kx < 0.0 else 1\n", "        step_Y = -1 if ky < 0.0 else 1\n", "\n", "        X = int(entx / self.vox)\n", "        Y = int(enty / self.voy)\n", "\n", "        if kx > 0.0:\n", "            t_max_x = ((entx // self.vox + 1) * (self.vox) - entx) / kx\n", "        elif kx < 0.0:\n", "            t_max_x = ((entx // self.vox) * (self.vox) - entx) / kx\n", "        else:\n", "            t_max_x = BIG\n", "\n", "        if ky > 0.0:\n", "            t_max_y = ((enty // self.voy + 1) * (self.voy) - enty) / ky\n", "        elif ky < 0.0:\n", "            t_max_y = ((enty // self.voy) * (self.voy) - enty) / ky\n", "        else:\n", "            t_max_y = BIG\n", "\n", "        # Only one case for z as the ray is travelling in one direction only\n", "        t_max_z = self.voz / kz\n", 
"\n", " t_delta_x = abs(self.vox / kx) if kx != 0 else BIG\n", " t_delta_y = abs(self.voy / ky) if ky != 0 else BIG\n", " t_delta_z = self.voz / kz\n", "\n", " finished = False\n", " last_id = 0\n", " array_x[last_id] = X\n", " array_y[last_id] = Y\n", "\n", " while not finished:\n", " if t_max_x < t_max_y:\n", " if t_max_x < t_max_z:\n", " array_len[last_id] = t_max_x\n", " last_id += 1\n", " X += step_X\n", " array_x[last_id] = X\n", " array_y[last_id] = Y\n", " t_max_x += t_delta_x\n", " else:\n", " array_len[last_id] = t_max_z\n", " last_id += 1\n", " finished = True\n", " else:\n", " if t_max_y < t_max_z:\n", " array_len[last_id] = t_max_y\n", " last_id += 1\n", " Y += step_Y\n", " array_x[last_id] = X\n", " array_y[last_id] = Y\n", " t_max_y += t_delta_y\n", " else:\n", " array_len[last_id] = t_max_z\n", " last_id += 1\n", " finished = True\n", " if last_id >= array_len.size - 1:\n", " print(\"resize arrays\")\n", " old_size = len(array_len)\n", " new_size = (old_size // BUFFER_SIZE + 1) * BUFFER_SIZE\n", " new_array_x = numpy.empty(new_size, dtype=numpy.int32)\n", " new_array_x[:] = -1\n", " new_array_y = numpy.empty(new_size, dtype=numpy.int32)\n", " new_array_y[:] = -1\n", " new_array_len = numpy.empty(new_size, dtype=numpy.float32)\n", " new_array_x[:old_size] = array_x\n", " new_array_y[:old_size] = array_y\n", " new_array_len[:old_size] = array_len\n", " array_x = new_array_x\n", " array_y = new_array_y\n", " array_len = new_array_len\n", " return array_x[:last_id], array_y[:last_id], array_len[:last_id]\n", "\n", " def one_pixel(self, row, col, sample):\n", " \"\"\"calculate the contribution of one pixel to the sparse matrix and populate it.\n", "\n", " :param row: row index of the pixel of interest\n", " :param col: column index of the pixel of interest\n", " :param sample: Oversampling rate, 10 will cast 10x10 ray per pixel\n", "\n", " :return: the extra number of pixel allocated\n", " \"\"\"\n", " if self.mask[row, col]:\n", " return (\n", " 
                numpy.empty(0, dtype=numpy.int32),\n", "                numpy.empty(0, dtype=numpy.float32),\n", "            )\n", "\n", "        counter = 0\n", "        tmp_size = 0\n", "        last_buffer_size = BUFFER_SIZE\n", "        tmp_idx = numpy.empty(last_buffer_size, dtype=numpy.int32)\n", "        tmp_idx[:] = -1\n", "        tmp_coef = numpy.zeros(last_buffer_size, dtype=numpy.float32)\n", "\n", "        pos = row * self.width + col\n", "        start = self.idptr[pos]\n", "        for i in range(sample):\n", "            # use the voxel size stored in the instance, not the notebook globals\n", "            posx = (col + 1.0 * i / sample) * self.vox\n", "            for j in range(sample):\n", "                posy = (row + 1.0 * j / sample) * self.voy\n", "                array_x, array_y, array_len = self.calc_one_ray(posx, posy)\n", "\n", "                rem = 1.0\n", "                for i in range(array_x.size):\n", "                    x = array_x[i]\n", "                    y = array_y[i]\n", "                    l = array_len[i]\n", "                    if (x < 0) or (y < 0) or (y >= self.height) or (x >= self.width):\n", "                        break\n", "                    elif self.mask[y, x]:\n", "                        continue\n", "                    idx = x + y * self.width\n", "                    dos = numpy.exp(-self.mu * l)\n", "                    value = rem - dos\n", "                    rem = dos\n", "                    for j in range(last_buffer_size):\n", "                        if tmp_size >= last_buffer_size:\n", "                            # Increase buffer size\n", "                            new_buffer_size = last_buffer_size + BUFFER_SIZE\n", "                            new_idx = numpy.empty(new_buffer_size, dtype=numpy.int32)\n", "                            new_coef = numpy.zeros(new_buffer_size, dtype=numpy.float32)\n", "                            new_idx[:last_buffer_size] = tmp_idx\n", "                            new_idx[last_buffer_size:] = -1\n", "                            new_coef[:last_buffer_size] = tmp_coef\n", "                            last_buffer_size = new_buffer_size\n", "                            tmp_idx = new_idx\n", "                            tmp_coef = new_coef\n", "\n", "                        if tmp_idx[j] == idx:\n", "                            tmp_coef[j] += value\n", "                            break\n", "                        elif tmp_idx[j] < 0:\n", "                            tmp_idx[j] = idx\n", "                            tmp_coef[j] = value\n", "                            tmp_size += 1\n", "                            break\n", "        return tmp_idx[:tmp_size], tmp_coef[:tmp_size]\n", "\n", "    def calc_csr(self, sample):\n", "        \"\"\"Calculate the CSR matrix for the whole image\n", "        :param sample: Oversampling factor\n", "        :return: 3-tuple (data, indices, idptr) describing the CSR matrix\n", "        \"\"\"\n", "        size = self.width * self.height\n", "        allocated_size = BLOCK_SIZE\n", "        idptr = numpy.zeros(size + 1, 
dtype=numpy.int32)\n", "        indices = numpy.zeros(allocated_size, dtype=numpy.int32)\n", "        data = numpy.zeros(allocated_size, dtype=numpy.float32)\n", "        self.sampled = sample * sample\n", "        pos = 0\n", "        start = 0\n", "        for row in range(self.height):\n", "            for col in range(self.width):\n", "                line_idx, line_coef = self.one_pixel(row, col, sample)\n", "                line_size = line_idx.size\n", "                if line_size == 0:\n", "                    pos += 1\n", "                    idptr[pos] = start\n", "                    continue\n", "\n", "                stop = start + line_size\n", "\n", "                if stop >= allocated_size:\n", "                    new_buffer_size = allocated_size + BLOCK_SIZE\n", "                    new_idx = numpy.zeros(new_buffer_size, dtype=numpy.int32)\n", "                    new_coef = numpy.zeros(new_buffer_size, dtype=numpy.float32)\n", "                    new_idx[:allocated_size] = indices\n", "                    new_coef[:allocated_size] = data\n", "                    allocated_size = new_buffer_size\n", "                    indices = new_idx\n", "                    data = new_coef\n", "\n", "                indices[start:stop] = line_idx\n", "                data[start:stop] = line_coef\n", "                pos += 1\n", "                idptr[pos] = stop\n", "                start = stop\n", "\n", "        last = idptr[-1]\n", "        self.data = data\n", "        self.indices = indices\n", "        self.idptr = idptr\n", "        return (self.data[:last] / self.sampled, indices[:last], idptr)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 3.95 s, sys: 236 ms, total: 4.19 s\n", "Wall time: 4.2 s\n" ] }, { "data": { "text/plain": [ "(array([0., 0., 0., ..., 0., 0., 0.], shape=(1902583,), dtype=float32),\n", " array([ 2, 2, 4, ..., 1023180, 1023181, 1023182],\n", " shape=(1902583,), dtype=int32),\n", " array([ 0, 0, 0, ..., 1902581, 1902582, 1902583],\n", " shape=(1023184,), dtype=int32))" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbthick = ThickDetector(\n", "    vox, voy, thickness=thickness, mu=mu, dist=dist, poni1=poni1, poni2=poni2, mask=mask\n", ")\n", "%time nbthick.calc_csr(1)" ] }, { "cell_type": "code", 
"execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 3.68 s, sys: 7.78 ms, total: 3.69 s\n", "Wall time: 3.68 s\n" ] }, { "data": { "text/plain": [ "(array([0.17449115, 0.10933631, 0.17465518, ..., 0.24611568, 0.03161056,\n", " 0.24604346], shape=(3410359,), dtype=float32),\n", " array([ 2, 2, 3, ..., 1023181, 1023182, 1023182],\n", " shape=(3410359,), dtype=int32),\n", " array([ 0, 0, 0, ..., 3410356, 3410358, 3410359],\n", " shape=(1023184,), dtype=int32))" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time nbthick.calc_csr(4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Same implementation using Cython\n", "\n", "Cython is an ahead of time compiler for Python code. Thus it requires a compiler (gcc, msvc, ...) needs to be installed on your computer and available. This is usually already the case under linux but requires some work under windows or macos." 
 ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "%load_ext Cython" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "%%cython --compile-args=-fopenmp --link-args=-fopenmp \n", "#cython: embedsignature=True, language_level=3, binding=True\n", "#cython: boundscheck=False, wraparound=False, cdivision=True, initializedcheck=False,\n", "## This is for developing:\n", "## cython: profile=True, warn.undeclared=True, warn.unused=True, warn.unused_result=False, warn.unused_arg=True\n", "##\n", "\n", "import cython\n", "import numpy\n", "from libc.math cimport sqrt, exp\n", "from cython.parallel import prange\n", "from libc.stdint cimport int8_t, uint8_t, int16_t, uint16_t, \\\n", "                         int32_t, uint32_t, int64_t, uint64_t\n", "\n", "ctypedef double float64_t\n", "ctypedef float float32_t\n", "\n", "cdef int32_t BUFFER_SIZE = 16\n", "cdef float64_t BIG = numpy.finfo(numpy.float32).max\n", "\n", "\n", "cdef class CythonThickDetector:\n", "    \"Calculate the point spread function as function of the geometry of the experiment\"\n", "    cdef:\n", "        public float64_t vox\n", "        public float64_t voy\n", "        public float64_t voz\n", "        public float64_t mu\n", "        public float64_t dist\n", "        public float64_t poni1\n", "        public float64_t poni2\n", "        public int oversampling\n", "        public int buffer_size\n", "        public int width\n", "        public int height\n", "        public int size\n", "        public int8_t[:, ::1] mask\n", "\n", "    def __init__(self,\n", "                 float64_t vox,\n", "                 float64_t voy,\n", "                 float64_t thickness,\n", "                 mask,\n", "                 float64_t mu,\n", "                 float64_t dist,\n", "                 float64_t poni1,\n", "                 float64_t poni2,\n", "                 int buffer_size=BUFFER_SIZE):\n", "        \"\"\"Constructor of the class:\n", "\n", "        :param vox, voy: detector pixel size in the plane\n", "        :param thickness: thickness of the sensor in meters\n", "        :param mask: 2D int8 array, non-zero values flag the masked pixels\n", "        :param mu: absorption coefficient of the sensor material\n", "        :param dist: sample 
detector distance as defined in the geometry-file\n", "        :param poni1, poni2: coordinates of the PONI as defined in the geometry\n", "        \"\"\"\n", "        self.vox = float(vox)\n", "        self.voy = float(voy)\n", "        self.voz = float(thickness)\n", "        self.mu = float(mu)\n", "        self.dist = float(dist)\n", "        self.poni1 = float(poni1)\n", "        self.poni2 = float(poni2)\n", "        self.width = int(mask.shape[1])\n", "        self.height = int(mask.shape[0])\n", "        self.size = self.width*self.height\n", "        self.mask = numpy.ascontiguousarray(mask, numpy.int8)\n", "        self.oversampling = 1\n", "        self.buffer_size = int(buffer_size)\n", "\n", "    def calc_one_ray(self, entx, enty):\n", "        \"\"\"For a ray entering at position (entx, enty), with a propagation vector (kx, ky, kz),\n", "        calculate the length spent in every voxel where energy is deposited from a bunch of photons entering the detector\n", "        at a given position and how much energy they deposit in each voxel.\n", "\n", "        Direct implementation of http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.42.3443&rep=rep1&type=pdf\n", "\n", "        :param entx, enty: coordinate of the entry point in meter (2 components, x,y)\n", "        :return: coordinates voxels in x, y and length crossed when leaving the associated voxel\n", "        \"\"\"\n", "\n", "        cdef:\n", "            int last_id\n", "            float64_t _entx = float(entx)\n", "            float64_t _enty = float(enty)\n", "            int32_t[::1] array_x = numpy.empty(self.buffer_size, dtype=numpy.int32)\n", "            int32_t[::1] array_y = numpy.empty(self.buffer_size, dtype=numpy.int32)\n", "            float32_t[::1] array_len = numpy.empty(self.buffer_size, dtype=numpy.float32)\n", "        with nogil:\n", "            last_id = self._calc_one_ray(_entx, _enty,\n", "                                         array_x, array_y, array_len)\n", "        if last_id>self.buffer_size:\n", "            raise RuntimeError(f\"Temporary buffer size ({last_id}) larger than expected ({self.buffer_size})\")\n", "        return (numpy.asarray(array_x[:last_id]),\n", "                numpy.asarray(array_y[:last_id]),\n", "                numpy.asarray(array_len[:last_id]))\n", "
\n", " cdef int _calc_one_ray(self, \n", " float64_t entx, \n", " float64_t enty,\n", " int32_t[::1] array_x,\n", " int32_t[::1] array_y,\n", " float32_t[::1] array_len\n", " )noexcept nogil:\n", " \"\"\"Return number of entries in the array_x[:last_id], array_y[:last_id], array_len[:last_id]\"\"\"\n", " cdef:\n", " float64_t kx, ky, kz, n, t_max_x, t_max_y, t_max_z, t_delta_x, t_delta_y#, t_delta_z\n", " int step_X, step_Y, X, Y, last_id\n", " bint finished\n", "\n", " # reset arrays\n", " array_x[:] = -1\n", " array_y[:] = -1\n", " array_len[:] = 0.0\n", "\n", " # normalize the input propagation vector\n", " kx = entx - self.poni2\n", " ky = enty - self.poni1\n", " kz = self.dist\n", " n = sqrt(kx*kx + ky*ky + kz*kz)\n", " kx /= n\n", " ky /= n\n", " kz /= n\n", "\n", " step_X = -1 if kx<0.0 else 1\n", " step_Y = -1 if ky<0.0 else 1\n", "\n", " X = int(entx/self.vox)\n", " Y = int(enty/self.voy)\n", "\n", " if kx>0.0:\n", " t_max_x = ((entx//self.vox+1)*(self.vox)-entx)/ kx\n", " elif kx<0.0:\n", " t_max_x = ((entx//self.vox)*(self.vox)-entx)/ kx\n", " else:\n", " t_max_x = BIG\n", "\n", " if ky>0.0:\n", " t_max_y = ((enty//self.voy+1)*(self.voy)-enty)/ ky\n", " elif ky<0.0:\n", " t_max_y = ((enty//self.voy)*(self.voy)-enty)/ ky\n", " else:\n", " t_max_y = BIG\n", "\n", " #Only one case for z as the ray is travelling in one direction only\n", " t_max_z = self.voz / kz\n", "\n", " t_delta_x = abs(self.vox/kx) if kx!=0 else BIG\n", " t_delta_y = abs(self.voy/ky) if ky!=0 else BIG\n", " # t_delta_z = self.voz/kz\n", "\n", " finished = False\n", " last_id = 0\n", " array_x[last_id] = X\n", " array_y[last_id] = Y\n", "\n", " while not finished:\n", " if t_max_x < t_max_y:\n", " if t_max_x < t_max_z:\n", " array_len[last_id] = t_max_x\n", " last_id = last_id + 1\n", " X = X + step_X\n", " array_x[last_id] = X\n", " array_y[last_id] = Y\n", " t_max_x = t_max_x + t_delta_x\n", " else:\n", " array_len[last_id] = t_max_z\n", " last_id = last_id + 1\n", " finished = 
True\n", " else:\n", " if t_max_y < t_max_z:\n", " array_len[last_id] = t_max_y\n", " last_id = last_id +1\n", " Y = Y + step_Y\n", " array_x[last_id] = X\n", " array_y[last_id] = Y \n", " t_max_y = t_max_y + t_delta_y\n", " else:\n", " array_len[last_id] = t_max_z\n", " last_id = last_id +1\n", " finished = True\n", " if last_id>=self.buffer_size:\n", " return self.buffer_size\n", " return last_id\n", "\n", "\n", " def one_pixel(self, row, col, sample=0):\n", " \"\"\"calculate the contribution of one pixel to the sparse matrix and populate it.\n", "\n", " :param row: row index of the pixel of interest\n", " :param col: column index of the pixel of interest\n", " :param sample: Oversampling rate, 10 will cast 10x10 ray per pixel\n", " \"\"\"\n", " cdef:\n", " int entries = 0\n", " int _row = int(row)\n", " int _col = int(col) \n", " int32_t[::1] tmp_idx = numpy.empty(self.buffer_size, dtype=numpy.int32)\n", " float32_t[::1] tmp_coef = numpy.empty(self.buffer_size, dtype=numpy.float32)\n", " int32_t[::1] array_x = numpy.empty(self.buffer_size, dtype=numpy.int32)\n", " int32_t[::1] array_y = numpy.empty(self.buffer_size, dtype=numpy.int32)\n", " float32_t[::1] array_len = numpy.empty(self.buffer_size, dtype=numpy.float32)\n", "\n", " if sample:\n", " self.oversampling = sample\n", " with nogil:\n", " entries = self._one_pixel(_row, _col, tmp_idx, tmp_coef, array_x, array_y, array_len)\n", " if entries=self.height) or (x>=self.width):\n", " break\n", " elif (self.mask[y, x]):\n", " continue\n", " idx = x + y*self.width\n", " dos = exp(-self.mu*l)\n", " value = rem - dos\n", " rem = dos\n", " for j in range(self.buffer_size): \n", " if tmp_idx[j] == idx:\n", " tmp_coef[j] = tmp_coef[j] + value\n", " break\n", " elif tmp_idx[j] < 0:\n", " tmp_idx[j] = idx\n", " tmp_coef[j] = value\n", " tmp_size = tmp_size + 1\n", " break\n", " if tmp_size >= self.buffer_size:\n", " break\n", " return tmp_size\n", "\n", " def calc_csr(self, sample=0, int threads=0):\n", " 
\"\"\"Calculate the content of the sparse matrix for the whole image\n", " :param sample: Oversampling factor\n", " :param threads: number of threads to be used\n", " :return: spase matrix?\n", " \"\"\"\n", " cdef:\n", " int pos, i, current, next, size\n", " int32_t[::1] sizes, indptr = numpy.zeros(self.size+1, dtype=numpy.int32)\n", " int32_t[:, ::1] indices\n", " float32_t[:, ::1] data\n", " int32_t[::1] csr_indices\n", " float32_t[::1] csr_data\n", " int32_t[:, ::1] array_x\n", " int32_t[:, ::1] array_y\n", " float32_t[:, ::1] array_len\n", "\n", " if sample:\n", " self.oversampling = sample\n", " self.oversampling = sample\n", " data = numpy.zeros((self.size, self.buffer_size), dtype=numpy.float32)\n", " indices = numpy.zeros((self.size, self.buffer_size),dtype=numpy.int32)\n", " sizes = numpy.zeros(self.size, dtype=numpy.int32)\n", "\n", " #single threaded version:\n", " array_x = numpy.empty((self.size, self.buffer_size), dtype=numpy.int32)\n", " array_y = numpy.empty((self.size, self.buffer_size), dtype=numpy.int32)\n", " array_len = numpy.empty((self.size, self.buffer_size), dtype=numpy.float32)\n", "\n", " for pos in prange(self.size, num_threads=threads, nogil=True):\n", " size = sizes[pos] = self._one_pixel(pos//self.width, pos%self.width, \n", " indices[pos], data[pos], \n", " array_x[pos], array_y[pos], array_len[pos])\n", " size = numpy.sum(sizes)\n", " csr_indices = numpy.empty(size, numpy.int32)\n", " csr_data = numpy.empty(size, numpy.float32)\n", " current = 0\n", " for i in range(self.size):\n", " size = sizes[i]\n", " next = current + size\n", " indptr[i+1] = next\n", " csr_indices[current:next] = indices[i,:size]\n", " csr_data[current:next] = data[i,:size]\n", " current = next\n", " return (numpy.asarray(csr_data)/(self.oversampling*self.oversampling), \n", " numpy.asarray(csr_indices), \n", " numpy.asarray(indptr))" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", 
"text": [ "Performance of Cython implementation:\n", "CPU times: user 1.01 s, sys: 47.9 ms, total: 1.05 s\n", "Wall time: 1.05 s\n" ] }, { "data": { "text/plain": [ "(array([0.17449115, 0.10933631, 0.17465518, ..., 0.35667214, 0.05905784,\n", " 0.35653958], shape=(3328700,), dtype=float32),\n", " array([ 2, 2, 3, ..., 1023181, 1023182, 1023182],\n", " shape=(3328700,), dtype=int32),\n", " array([ 0, 0, 0, ..., 3328697, 3328699, 3328700],\n", " shape=(1023184,), dtype=int32))" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cythick = CythonThickDetector(\n", " 172e-6,\n", " 172e-6,\n", " thickness=thickness,\n", " mu=mu,\n", " dist=dist,\n", " poni1=poni1,\n", " poni2=poni2,\n", " mask=mask,\n", ")\n", "print(\"Performance of Cython implementation:\")\n", "%time cythick.calc_csr(4, threads=1)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Performance of Numba implementation:\n", "CPU times: user 3.65 s, sys: 28 ms, total: 3.68 s\n", "Wall time: 3.67 s\n" ] }, { "data": { "text/plain": [ "(array([0.17449115, 0.10933631, 0.17465518, ..., 0.24611568, 0.03161056,\n", " 0.24604346], shape=(3410359,), dtype=float32),\n", " array([ 2, 2, 3, ..., 1023181, 1023182, 1023182],\n", " shape=(3410359,), dtype=int32),\n", " array([ 0, 0, 0, ..., 3410356, 3410358, 3410359],\n", " shape=(1023184,), dtype=int32))" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Comparison with Numba, Cython is usually slightly faster\n", "print(\"Performance of Numba implementation:\")\n", "%time nbthick.calc_csr(4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Validation of the CSR matrix obtained:\n", "\n", "For this we will build a simple 2D image with one pixel in a regular grid and calculate the effect of the transformation calculated previously on it. 
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 11.1 s, sys: 403 ms, total: 11.5 s\n", "Wall time: 358 ms\n" ] } ], "source": [ "%time csr = csr_matrix(cythick.calc_csr(10))" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "dummy_image = numpy.ones(mask.shape, dtype=\"float32\")\n", "dummy_image[::5, ::5] = 10\n", "\n", "dummy_blurred = csr.T.dot(dummy_image.ravel()).reshape(mask.shape)\n", "fix, ax = subplots(2, 2, figsize=(8, 8))\n", "ax[0, 0].imshow(dummy_image)\n", "ax[0, 0].set_title(\"Original image\")\n", "ax[0, 1].imshow(dummy_blurred)\n", "ax[0, 1].set_title(\"Convolved image (i.e. blurred)\")\n", "ax[1, 1].imshow(csr.dot(dummy_blurred.ravel()).reshape(mask.shape))\n", "ax[1, 1].set_title(\"Retro-projected of the blurred\")\n", "ax[0, 0].set_xlim(964, 981)\n", "ax[0, 0].set_ylim(0, 16)\n", "ax[0, 1].set_xlim(964, 981)\n", "ax[0, 1].set_ylim(0, 16)\n", "ax[1, 1].set_xlim(964, 981)\n", "ax[1, 1].set_ylim(0, 16);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Least squares refinement of the pseudo-inverse" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/users/kieffer/.venv/py313/lib/python3.13/site-packages/scipy/sparse/linalg/_isolve/lsmr.py:407: RuntimeWarning: overflow encountered in cast\n", " condA = max(maxrbar, rhotemp) / min(minrbar, rhotemp)\n", "/users/kieffer/.venv/py313/lib/python3.13/site-packages/scipy/sparse/linalg/_isolve/lsmr.py:406: RuntimeWarning: overflow encountered in cast\n", " minrbar = min(minrbar, rhobarold)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 26.8 s, sys: 4.04 ms, total: 26.8 s\n", "Wall time: 455 ms\n" ] }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "blured = dummy_blurred.ravel()\n", "\n", "# Invert this matrix: see https://arxiv.org/abs/1006.0758\n", "\n", "%time res = linalg.lsmr(csr.T, blured)\n", "restored = res[0].reshape(mask.shape)\n", "\n", "fix, ax = subplots(2, 2, figsize=(8, 8))\n", "ax[0, 0].imshow(dummy_image)\n", "ax[0, 0].set_title(\"Original image\")\n", "ax[0, 1].imshow(dummy_blurred)\n", "ax[0, 1].set_title(\"Convolved image (i.e. blurred)\")\n", "ax[1, 1].imshow(csr.dot(dummy_blurred.ravel()).reshape(mask.shape))\n", "ax[1, 1].set_title(\"Retro-projected of the blurred\")\n", "ax[0, 0].set_xlim(964, 981)\n", "ax[0, 0].set_ylim(0, 16)\n", "ax[0, 1].set_xlim(964, 981)\n", "ax[0, 1].set_ylim(0, 16)\n", "ax[1, 1].set_xlim(964, 981)\n", "ax[1, 1].set_ylim(0, 16)\n", "ax[1, 0].imshow(restored)\n", "ax[1, 0].set_title(\"Restored LSMR\")\n", "ax[1, 0].set_xlim(964, 981)\n", "ax[1, 0].set_ylim(0, 16);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Pseudo inverse with positivitiy constrain and poissonian noise (MLEM)\n", "\n", "The MLEM algorithm was initially developed within the framework of reconstruction of\n", "images in emission tomography [Shepp and Vardi, 1982], [Vardi et al., 1985], [Lange and\n", "Carson, 1984]. Nowadays, this algorithm is employed in numerous tomographic reconstruction\n", "problems and often associated to regularization techniques. It is based on the iterative\n", "maximization of the log-likelihood function." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_3870469/3638533218.py:4: RuntimeWarning: divide by zero encountered in divide\n", " norm = 1 / R.T.dot(numpy.ones_like(F))\n", "/tmp/ipykernel_3870469/3638533218.py:5: RuntimeWarning: invalid value encountered in divide\n", " cor = R.T.dot(M / R.dot(F))\n", "/tmp/ipykernel_3870469/3638533218.py:6: RuntimeWarning: invalid value encountered in multiply\n", " res = norm * F * cor\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "0 1.7501588\n", "100 0.0014371872\n", "200 0.00016927719\n", "228 9.930134e-05\n", "CPU times: user 2.66 s, sys: 19.9 ms, total: 2.67 s\n", "Wall time: 2.67 s\n" ] } ], "source": [ "def iterMLEM_scipy(F, M, R):\n", " \"Implement one step of MLEM\"\n", " # res = F * (R.T.dot(M))/R.dot(F)# / M.sum(axis=-1)\n", " norm = 1 / R.T.dot(numpy.ones_like(F))\n", " cor = R.T.dot(M / R.dot(F))\n", " res = norm * F * cor\n", " res[numpy.isnan(res)] = 1.0\n", " return res\n", "\n", "\n", "def deconv_MLEM(csr, data, thres=0.2, maxiter=1000):\n", " R = csr.T\n", " msk = data < 0\n", " img = data.astype(\"float32\")\n", " img[msk] = 0.0 # set masked values to 0, negative values could induce errors\n", " M = img.ravel()\n", " # F0 = numpy.random.random(data.size)#M#\n", " F0 = R.T.dot(M)\n", " F1 = iterMLEM_scipy(F0, M, R)\n", " delta = abs(F1 - F0).max()\n", " for i in range(maxiter):\n", " if delta < thres:\n", " break\n", " F2 = iterMLEM_scipy(F1, M, R)\n", " delta = abs(F1 - F2).max()\n", " if i % 100 == 0:\n", " print(i, delta)\n", " F1 = F2\n", " i += 1\n", " print(i, delta)\n", " return F2.reshape(img.shape)\n", "\n", "\n", "%time res = deconv_MLEM(csr, dummy_blurred, 1e-4)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fix, ax = subplots(2, 2, figsize=(8, 8))\n", "ax[0, 0].imshow(dummy_image)\n", "ax[0, 1].imshow(dummy_blurred)\n", "ax[1, 1].imshow(csr.dot(dummy_blurred.ravel()).reshape(mask.shape))\n", "ax[0, 0].set_xlim(964, 981)\n", "ax[0, 0].set_ylim(0, 16)\n", "ax[0, 0].set_title(\"Original image\")\n", "ax[0, 1].set_xlim(964, 981)\n", "ax[0, 1].set_ylim(0, 16)\n", "ax[0, 1].set_title(\"Convolved image (i.e. blurred)\")\n", "ax[1, 1].set_xlim(964, 981)\n", "ax[1, 1].set_ylim(0, 16)\n", "ax[1, 1].set_title(\"Retro-projected of the blurred\")\n", "# ax[1,0].set_title(\"Corrected image\");\n", "ax[1, 0].imshow(res)\n", "ax[1, 0].set_xlim(964, 981)\n", "ax[1, 0].set_ylim(0, 16)\n", "ax[1, 0].set_title(\"Corrected image: MLEM\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Performance measurements ... multi-threaded" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1 threads: 3.47 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 2 threads: 1.87 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 4 threads: 1.07 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 8 threads: 679 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 16 threads: 516 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 32 threads: 392 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 64 threads: 325 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", "128 threads: 312 ms ± 0 ns per loop (mean ± std. dev. 
of 1 run, 1 loop each)\n" ] } ], "source": [ "perfs = {}\n", "for i in range(8):\n", " j = 1 << i\n", " print(f\"{j:3d} threads: \", end=\"\")\n", " perfs[(j, cythick.buffer_size)] = %timeit -o -n1 -r1 cythick.calc_csr(8, threads=j)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1 threads: 3.55 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 2 threads: 1.93 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 4 threads: 1.1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 8 threads: 685 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 16 threads: 532 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 32 threads: 402 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", " 64 threads: 347 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n", "128 threads: 337 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], "source": [ "buf = 32\n", "cythick = CythonThickDetector(\n", " 172e-6,\n", " 172e-6,\n", " thickness=thickness,\n", " mu=mu,\n", " dist=dist,\n", " poni1=poni1,\n", " poni2=poni2,\n", " mask=mask,\n", " buffer_size=buf,\n", ")\n", "cythick.calc_csr(1)\n", "for i in range(8):\n", " j = 1 << i\n", " print(f\"{j:3d} threads: \", end=\"\")\n", " perfs[(j, cythick.buffer_size)] = %timeit -o -n1 -r1 cythick.calc_csr(8, threads=j)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "\n", " \n", "fig, ax = subplots(1,2, figsize=(12,6))\n", "t = [i[0] for i in perfs if i[1] == 16]\n", "y16 = [v.best for k,v in perfs.items() if k[1] == 16]\n", "y32 = [v.best for k,v in perfs.items() if k[1] == 32]\n", "p16 = [8e-6*cythick.size/v.best for k,v in perfs.items() if k[1] == 16]\n", "p32 = [8e-6*cythick.size/v.best for k,v in perfs.items() if k[1] == 32]\n", "ax[0].plot(t, y16, label=\"16 voxel/ray buffer\")\n", "ax[0].plot(t, y32, label=\"32 voxel/ray buffer\")\n", "ax[0].set_title(\"Execution time\")\n", "ax[0].set_xlabel(\"Number of threads\")\n", "ax[0].set_ylabel(\"Runtime (s)\")\n", "ax[1].plot(t, p16, label=\"16 voxel/ray\")\n", "ax[1].plot(t, p32, label=\"32 voxel/ray\")\n", "ax[1].set_xlabel(\"Number of threads\")\n", "ax[1].set_ylabel(\"Millions of rays per second\")\n", "ax[1].set_title(cpu)\n", "ax[0].legend();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion of the raytracing part:\n", "\n", "We are able to simulate the path and the absorption of the photon in the thickness of the detector. \n", "Numba/Cython helped substentially to make the raytracing calculation much faster. \n", "The signal of each pixel is indeed spread on the neighboors, depending on the position of the PONI and this effect can be inverted using sparse-matrix pseudo-inversion. \n", "The MLEM can garanteee that the total signal is conserved and that no pixel gets negative value.\n", "\n", "We will now save this sparse matrix to file in order to be able to re-use it in next notebook. But before saving it, it makes sense to spend some time in generating a high quality sparse matrix in throwing thousand rays per pixel in a grid of 64x64 (4 billions rays launched)." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 20min 57s, sys: 961 ms, total: 20min 58s\n", "Wall time: 22.7 s\n" ] } ], "source": [ "%time pre_csr = cythick.calc_csr(128, threads=os.cpu_count())\n", "hq_csr = csr_matrix(pre_csr)\n", "save_npz(\"csr.npz\", hq_csr)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total execution time: 61.854 s\n" ] } ], "source": [ "print(f\"Total execution time: {time.perf_counter() - start_time:.3f} s\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 4 }