# Performance of 2D integration vs 1D integration

This depends on:
* Number of azimuthal bins
* Pixel splitting
* Algorithm
* Implementation (i.e. programming language)
* Hardware used

Thus there is no general answer, but here is a quick benchmark to evaluate the performance penalty:

In [1]:
import sys
import os
import time
import numpy
import fabio
import pyFAI
from pyFAI.test.utilstest import UtilsTest
import pyFAI.method_registry
import pyFAI.integrator.azimuthal
print(f"Python version: {sys.version}")
print(f"PyFAI version: {pyFAI.version}")
start_time = time.perf_counter()

Python version: 3.13.1 | packaged by conda-forge | (main, Jan 13 2025, 09:53:10) [GCC 13.3.0]
PyFAI version: 2025.4.0-dev0


In [3]:
print(len(pyFAI.method_registry.IntegrationMethod.list_available()))

81


In [4]:
ai = pyFAI.load(UtilsTest.getimage("Pilatus1M.poni"))
img = fabio.open(UtilsTest.getimage("Pilatus1M.edf")).data
ai

Detector Pilatus 1M	 PixelSize= 172µm, 172µm	 BottomRight (3)
Wavelength= 1.000000e-10 m
SampleDetDist= 1.583231e+00 m	PONI= 3.341702e-02, 4.122778e-02 m	rot1=0.006487 rot2=0.007558 rot3=0.000000 rad
DirectBeamDist= 1583.310 mm	Center: x=179.981, y=263.859 pix	Tilt= 0.571° tiltPlanRotation= 130.640° 𝛌= 1.000Å

In [5]:
%%time
# Tune these parameters to match your needs:
kw1 = {"data": img, "npt": 1000}
kw2 = {"data": img, "npt_rad": 1000}
# Actual benchmark: time every registered method, in 1D or 2D according to its dimension
res = {}
for k, v in pyFAI.method_registry.IntegrationMethod._registry.items():
    print(k)
    if k.dim == 1:
        res[k] = %timeit -o ai.integrate1d(method=v, **kw1)
    else:
        res[k] = %timeit -o ai.integrate2d(method=v, **kw2)

Method(dim=1, split='no', algo='histogram', impl='python', target=None)
32.2 ms ± 153 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method(dim=2, split='no', algo='histogram', impl='python', target=None)
99.7 ms ± 314 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method(dim=1, split='no', algo='histogram', impl='cython', target=None)
12.4 ms ± 17.9 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Method(dim=2, split='no', algo='histogram', impl='cython', target=None)
21.1 ms ± 159 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method(dim=1, split='bbox', algo='histogram', impl='cython', target=None)
26.9 ms ± 41.4 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method(dim=2, split='bbox', algo='histogram', impl='cython', target=None)
35.7 ms ± 208 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Method(dim=1, split='full', algo='histogram', impl='cython', target=None)
152 ms ± 639 μs per loop (mean ± std. dev. of 7 runs, 10 loops each

11.1 ms ± 497 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=2, split='no', algo='histogram', impl='opencl', target=(1, 0))
7.08 ms ± 698 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=1, split='bbox', algo='csr', impl='opencl', target=(0, 0))
661 μs ± 837 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Method(dim=2, split='bbox', algo='csr', impl='opencl', target=(0, 0))
2.59 ms ± 59.1 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=1, split='no', algo='csr', impl='opencl', target=(0, 0))
621 μs ± 3.02 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Method(dim=2, split='no', algo='csr', impl='opencl', target=(0, 0))
2.55 ms ± 1.57 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Method(dim=1, split='bbox', algo='csr', impl='opencl', target=(0, 1))
1.2 ms ± 29.7 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=2, split='bbox', algo='csr', impl='opencl', target=(0, 1))
6.09 ms ± 39.1 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=1, split='no', algo='csr', impl='opencl', target=(0, 1))
1.04 ms ± 295 ns per loop

2.47 ms ± 366 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=2, split='no', algo='csr', impl='opencl', target=(1, 0))
81.4 ms ± 1.45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=1, split='full', algo='csr', impl='opencl', target=(0, 0))
663 μs ± 1.44 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Method(dim=2, split='full', algo='csr', impl='opencl', target=(0, 0))
2.59 ms ± 78 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=1, split='full', algo='csr', impl='opencl', target=(0, 1))
1.18 ms ± 986 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Method(dim=2, split='full', algo='csr', impl='opencl', target=(0, 1))
6.1 ms ± 35.6 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=1, split='full', algo='csr', impl='opencl', target=(1, 0))
2.68 ms ± 122 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Method(dim=2, split='full', algo='csr', impl='opencl', target=(1, 0))
82.5 ms ± 80 μs per loop (
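
The `%timeit -o` magic used in the benchmark cell only works inside IPython or Jupyter. As a rough sketch, the same loop could be written for a plain Python script with the standard `timeit` module. The `Method` tuple and the two lambda workloads below are stand-ins invented for illustration (they are not pyFAI objects), and `bench` is a hypothetical helper; in a real session each lambda would wrap a call such as `ai.integrate1d(...)` or `ai.integrate2d(...)`.

```python
import timeit
from collections import namedtuple

# Hypothetical stand-in for the method keys printed above; not pyFAI's own class.
Method = namedtuple("Method", "dim split algo impl target")


def bench(func, repeat=7, number=10):
    """Return (best, mean) time in seconds per call over `repeat` runs."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    return min(per_call), sum(per_call) / len(per_call)


# Stand-in workloads (assumption): replace each lambda with the real
# integration call for the corresponding method.
workloads = {
    Method(1, "no", "histogram", "python", None): lambda: sum(range(1_000)),
    Method(2, "no", "histogram", "python", None): lambda: sum(range(3_000)),
}

res = {}
for key, func in workloads.items():
    best, mean = bench(func)
    res[key] = {"best": best, "mean": mean}
    print(f"{key}: best {best * 1e3:.3f} ms, mean {mean * 1e3:.3f} ms per call")
```

Storing both the best and the mean time keeps the later comparison flexible; the summary table in this notebook is built from the best time of each run.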

In [10]:
print("-"*80)
print(f"{'Split':5s} | {'Algo':9s} | {'Impl':6s}| {'1d (ms)':8s} | {'2d (ms)':8s} | {'ratio':6s} | Device")
print("-"*80)
for k in res:
    if k.dim == 1:
        k1 = k
        # Look up the 2D variant of the same method
        k2 = k._replace(dim=2)
        if k2 in res:
            print(f"{k1.split:5s} | {k1.algo:9s} | {k1.impl:6s}| {res[k1].best*1000:8.3f} | {res[k2].best*1000:8.3f} | {res[k2].best/res[k1].best:6.1f} | ",
                  end="")
            if k.target:
                print(pyFAI.method_registry.IntegrationMethod._registry.get(k).target_name)
            else:
                print()
print("-"*80)

--------------------------------------------------------------------------------
Split | Algo      | Impl  | 1d (ms)  | 2d (ms)  | ratio  | Device
--------------------------------------------------------------------------------
no    | histogram | python|   31.905 |   99.215 |    3.1 |
no    | histogram | cython|   12.397 |   20.965 |    1.7 |
bbox  | histogram | cython|   26.785 |   35.607 |    1.3 |
full  | histogram | cython|  150.790 |  214.578 |    1.4 |
no    | csr       | cython|    7.120 |    7.573 |    1.1 |
bbox  | csr       | cython|    7.213 |    8.136 |    1.1 |
no    | csr       | python|   10.184 |   15.043 |    1.5 |
bbox  | csr       | python|   13.421 |   18.183 |    1.4 |
no    | csc       | cython|    7.973 |   10.450 |    1.3 |
bbox  | csc       | cython|   10.585 |   14.801 |    1.4 |
no    | csc       | python|   11.299 |   14.427 |    1.3 |
bbox  | csc       | python|   14.845 |   21.879 |    1.5 |
bbox  | lut       | cython|    6.978 |   11.554 |    1.7 |
no    | lut       | cython|    6.939 |    7.410 |    1.1 |
full  | lut       | cython|    7.070 |   11.674 |    1.7 |
full  | csr       | cython|    7.035 |    8.100 |    1.2 |
full  | csr       | python|   12.507 |   16.947 |    1.4 |
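
The ratio column is simply the best 2D time divided by the best 1D time of the same method. The sketch below reproduces that arithmetic on four rows copied from the table, pairing each 1D key with its 2D counterpart through `_replace(dim=2)` as the cell above does. The `Method` namedtuple here is a stand-in mirroring the printed keys, not pyFAI's own class, and the median is an extra summary that the notebook itself does not compute.

```python
from collections import namedtuple
from statistics import median

# Hypothetical stand-in mirroring the printed Method(...) keys; not pyFAI's class.
Method = namedtuple("Method", "dim split algo impl target")

# Best times in seconds, copied from four rows of the table above.
best = {
    Method(1, "no", "histogram", "python", None): 31.905e-3,
    Method(2, "no", "histogram", "python", None): 99.215e-3,
    Method(1, "no", "histogram", "cython", None): 12.397e-3,
    Method(2, "no", "histogram", "cython", None): 20.965e-3,
    Method(1, "no", "csr", "cython", None): 7.120e-3,
    Method(2, "no", "csr", "cython", None): 7.573e-3,
    Method(1, "full", "lut", "cython", None): 7.070e-3,
    Method(2, "full", "lut", "cython", None): 11.674e-3,
}

ratios = {}
for k1 in best:
    if k1.dim != 1:
        continue
    k2 = k1._replace(dim=2)  # same split/algo/impl/target, 2D variant
    if k2 in best:
        ratios[k1] = best[k2] / best[k1]

for k, r in ratios.items():
    print(f"{k.split:5s} | {k.algo:9s} | {k.impl:6s} | {r:4.1f}")
print(f"median 2D/1D ratio over these rows: {median(ratios.values()):.2f}")
```

Keying the timings on the full method tuple means a 1D entry without a matching 2D entry is skipped silently, which is also how the table above handles methods that were only benchmarked in one dimension.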

In [9]:
print(f"Total runtime: {time.perf_counter()-start_time:.3f}s")

Total runtime: 618.791s
