GPU-Accelerated Computed Tomography
Fang Xu and Klaus Mueller

Center for Visual Computing, Computer Science Department, Stony Brook University, Stony Brook, NY 11794


Abstract


The task of reconstructing an object from its projections via tomographic methods is a time-consuming process due to the vast complexity of the data. For this reason, manufacturers of equipment for medical computed tomography (CT) rely mostly on special ASICs to obtain the fast reconstruction times required in clinical settings. Although modern CPUs have gained sufficient power in recent years to be competitive for 2D reconstruction, this is not the case for 3D reconstruction, especially when iterative algorithms must be applied. This has prevented some very effective algorithms from being applied in clinical practice, as well as in general research. However, the recent evolution of commodity PC graphics boards (GPUs) has the potential to change this picture dramatically. We have shown that the new GPUs can be exploited to perform both analytical and iterative reconstruction from X-ray and functional imaging data at clinical rates and high quality. We have decomposed three popular 3D reconstruction algorithms into a common set of base modules, all of which can be executed on the GPU with their outputs linked internally. The data never leave the GPU, which eliminates the costly GPU-CPU transfer bottlenecks of earlier approaches. Visualization of the reconstructed object is easily achieved since the object already resides in the graphics hardware, allowing one to run a visualization module at any time to view the reconstruction results. Our implementation allows speedups of 1-2 orders of magnitude over software implementations, at comparable image quality.
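To make the idea of shared base modules concrete, here is a minimal CPU sketch, assuming the modules are a forward projector and a backprojector (an assumption suggested by the timing columns on this page, not a description of the actual GPU code). The names `project`, `backproject` and `sart_step`, the tiny 3-ray system matrix, and the choice to treat all rays as a single projection are all hypothetical and chosen only for illustration.

```python
# Toy system: A[i][j] is the (hypothetical) weight of voxel j along ray i.

def project(A, x):
    """Forward projection: weighted sum of voxel values along each ray."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def backproject(A, r):
    """Backprojection: distribute per-ray values r back onto the voxels."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def sart_step(A, x, b, relax=1.0):
    """One SART-style update composed only of project/backproject calls."""
    row_sums = project(A, [1.0] * len(x))       # sum of weights along each ray
    col_sums = backproject(A, [1.0] * len(A))   # sum of weights touching each voxel
    estimate = project(A, x)
    residual = [(b_i - e_i) / s if s > 0 else 0.0
                for b_i, e_i, s in zip(b, estimate, row_sums)]
    correction = backproject(A, residual)
    return [x_j + relax * c / s if s > 0 else x_j
            for x_j, c, s in zip(x, correction, col_sums)]

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
truth = [1.0, 2.0, 3.0]
b = project(A, truth)        # simulated noise-free measurements

x = [0.0, 0.0, 0.0]
for _ in range(200):
    x = sart_step(A, x, b)
print([round(v, 6) for v in x])
```

Because the update is written purely in terms of the two modules, swapping in a different correction rule reuses the same projector and backprojector, which is the kind of reuse the decomposition is meant to allow.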


Downloads


[papers]


[slides]

  • Poster at the IEEE Medical Imaging Conference, 2003 (pdf)
  • Poster at the IEEE International Symposium on Biomedical Imaging, 2004 (pdf)

Configurations

  • Platform: Pentium 4 2.66GHz PC, 512MB RAM, GeForce FX 5900 (256MB RAM)
  • Experiment Data: 3D Shepp Logan Phantom
    • contrast levels: 0.5%, 1% and 2%
    • resolutions: 128³ and 256³
    • cone angle: 15 degrees
  • Algorithms:
    • SART (Simultaneous Algebraic Reconstruction Technique)
    • ML-EM/OS-EM (Maximum Likelihood - Expectation Maximization / Ordered-Subset Expectation Maximization)
    • FDK-FBP (Feldkamp-Davis-Kress Filtered Backprojection)
  • Supported Geometries:
    • Cone-beam
    • Fan-beam
    • Parallel-beam
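As a rough illustration of the expectation-maximization entries in the algorithm list, the sketch below applies the standard multiplicative ML-EM update to a toy system, and forms an OS-EM pass by applying that same update to subsets of rays in turn. The function names, the 3-ray system matrix and the subset split are hypothetical, and the code says nothing about how the GPU version organizes its data.

```python
def mlem_step(A, x, b, rays=None):
    """One ML-EM update; `rays` restricts it to a subset of rays (for OS-EM)."""
    rays = list(range(len(A))) if rays is None else list(rays)
    # Forward-project the current estimate along the selected rays.
    estimate = {i: sum(a * xj for a, xj in zip(A[i], x)) for i in rays}
    # Ratio of measured to estimated projection values.
    ratio = {i: (b[i] / estimate[i] if estimate[i] > 0 else 0.0) for i in rays}
    new_x = []
    for j, xj in enumerate(x):
        sensitivity = sum(A[i][j] for i in rays)          # backprojection of ones
        back = sum(A[i][j] * ratio[i] for i in rays)      # backprojection of ratios
        new_x.append(xj * back / sensitivity if sensitivity > 0 else xj)
    return new_x

def osem_iteration(A, x, b, subsets):
    """One OS-EM pass: an ML-EM update per ordered subset of rays."""
    for rays in subsets:
        x = mlem_step(A, x, b, rays)
    return x

A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0],
     [1.0, 0.0, 1.0]]
truth = [1.0, 2.0, 3.0]
b = [sum(a * t for a, t in zip(row, truth)) for row in A]

x = [1.0, 1.0, 1.0]              # EM needs a strictly positive start
for _ in range(500):
    x = mlem_step(A, x, b)

# With consistent data, the true object is a fixed point of an OS-EM pass.
x_os = osem_iteration(A, truth, b, subsets=[[0, 1], [2]])
print([round(v, 6) for v in x], x_os)
```

The multiplicative form keeps every voxel non-negative as long as the starting estimate is positive, which is one reason EM-type methods are common for emission (functional) data.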

Timings

GPU FDK

Projections       Volume Size   Precision          Time
---------------   -----------   ----------------   ----
160 x 128²        128³          float (32-bit)     10s
160 x 128²        128³          byte (8-bit)       1s
160 x 128²        128³          2-byte (16-bit)    1.7s
160 x 256²        256³          2-byte (16-bit)    12s
160 x 1024x768    256³          2-byte (16-bit)    12s
360 x 1024x768    256³          2-byte (16-bit)    44s
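The GPU FDK timings can be put in context with some simple arithmetic. The sketch below computes the storage needed for a volume at each precision and the backprojection throughput implied by the 128³ timings. It assumes one scalar per voxel with no padding and that every projection updates every voxel once; the page does not state the actual texture layout, and the helper names are hypothetical.

```python
def volume_bytes(n, bytes_per_voxel):
    """Storage for an n x n x n volume, assuming one scalar per voxel."""
    return n ** 3 * bytes_per_voxel

def voxel_updates_per_second(n, num_projections, seconds):
    """Throughput, assuming each projection updates every voxel once."""
    return n ** 3 * num_projections / seconds

MIB = 1024 * 1024
for n in (128, 256):
    for label, size in (("byte", 1), ("2-byte", 2), ("float", 4)):
        print(f"{n}^3 {label}: {volume_bytes(n, size) / MIB:.0f} MiB")

# Timings copied from the GPU FDK table (160 projections, 128^3 volume).
for label, seconds in (("float", 10.0), ("2-byte", 1.7), ("byte", 1.0)):
    rate = voxel_updates_per_second(128, 160, seconds)
    print(f"{label}: {rate / 1e6:.1f} million voxel updates per second")
```

Under these assumptions even a 256³ floating-point volume (64 MiB) fits in the 256MB of board memory listed in the configuration.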

SART / EM / FDK

Platform       Algorithm   Projections   Volume Size   Precision           Time for Projections   Time for Backprojections   Iteration   Total
------------   ---------   -----------   -----------   -----------------   --------------------   ------------------------   ---------   -------
SGI-hardware   SART        80 x 128²     128³          12-bit (extended)   N/A                    N/A                        1.1 min     3.1 min
PC - CPU       SART        80 x 128²     128³          floating point      75s                    75s                        2.5 min     7.5 min
PC - GPU       SART        80 x 128²     128³          floating point      0.4s                   9s                         12s         36s
PC - GPU       OS-EM       80 x 128²     128³          floating point      0.9s                   17s                        21s         63s
PC - GPU       Feldkamp    80 x 128²     128³          floating point      N/A                    5s                         N/A         5s
PC - GPU       Feldkamp    160 x 128²    128³          floating point      N/A                    9s                         N/A         9s
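The speedup of the GPU over the CPU for SART follows directly from the timings listed above. The short calculation below converts the listed times to seconds and takes the ratio for each stage; the dictionary layout and the `speedup` helper are hypothetical, and only the numbers come from this page.

```python
# Times in seconds, copied from the SART rows (80 projections, 128^3 volume).
cpu_sart = {"projection": 75.0, "backprojection": 75.0,
            "iteration": 2.5 * 60, "total": 7.5 * 60}
gpu_sart = {"projection": 0.4, "backprojection": 9.0,
            "iteration": 12.0, "total": 36.0}

def speedup(baseline, accelerated):
    """Ratio of baseline time to accelerated time for each stage."""
    return {stage: baseline[stage] / accelerated[stage] for stage in baseline}

sart_speedup = speedup(cpu_sart, gpu_sart)
for stage, factor in sart_speedup.items():
    print(f"{stage}: {factor:.1f}x")
```

For these rows the total speedup is 12.5x, with the projection stage gaining far more (about 188x) than the backprojection stage (about 8x).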

Results
  • A slice across the reconstructed 3D Shepp Logan Phantom is presented to examine the reconstruction quality
  • A plot on the right shows intensity profiles across the center of three small ellipsoids near the bottom

GPU FDK

128³ grid

  • floating point, 0.5% contrast, 80 projections
  • floating point, 0.5% contrast, 160 projections

GPU FDK

256³ grid

  • byte, 0.5% contrast, 160 proj. (plot)
  • byte, 1% contrast, 160 proj.
  • byte, 2% contrast, 160 proj.
  • 2-byte, 0.5% contrast, 160 proj. (plot)
  • 2-byte, 1% contrast, 160 proj.
  • 2-byte, 1% contrast, 160 proj.
 

GPU FDK

Axial slices of the Shepp Logan Phantom, 256³ grid, 160 projections

Columns: y-z plane, x-z plane, x-y plane

Rows:
  • byte precision, 0.5% contrast
  • 2-byte precision, 0.5% contrast
  • 2-byte precision, 1% contrast
  • 2-byte precision, 2% contrast

 

GPU SART

128³ grid, 3 iterations

  • floating point, 0.5% contrast, 80 projections
  • floating point, 0.5% contrast, 160 projections


GPU OS-EM

128³ grid, 3 iterations

  • Original
  • EM (clean), floating point, 80 projections
  • Software (Poisson noise added), floating point, 80 projections
  • EM (Poisson noise added), floating point, 80 projections
 

Evaluations
  • Overall CC: Correlation Coefficient of the phantom within the skull
  • Tumor CC: Correlation Coefficient of three small tumors at the bottom
  • CV: Coefficient of Variation over four ellipsoidal regions with uniform contents
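The page does not give formulas for these metrics, so the sketch below assumes the standard definitions: the Pearson correlation coefficient between reconstructed and reference voxel values over a region, and the coefficient of variation as standard deviation divided by mean over a nominally uniform region. The function names and sample values are hypothetical.

```python
import math

def correlation_coefficient(recon, reference):
    """Pearson correlation between two equally long lists of voxel values."""
    n = len(recon)
    mean_r = sum(recon) / n
    mean_p = sum(reference) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(recon, reference))
    var_r = sum((r - mean_r) ** 2 for r in recon)
    var_p = sum((p - mean_p) ** 2 for p in reference)
    return cov / math.sqrt(var_r * var_p)

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean of a region."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return std / mean

# Made-up voxel samples: a reference region and a slightly noisy reconstruction.
reference = [1.00, 1.02, 1.04, 1.02, 1.00]
recon = [1.01, 1.02, 1.05, 1.01, 0.99]
print(correlation_coefficient(recon, reference))
print(coefficient_of_variation(recon))
```

A CC close to 1 indicates that the reconstruction tracks the phantom's structure, while a small CV indicates low noise in regions that should be uniform.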


Real Datasets

Original and reconstructed volumes, rendered with the same transfer functions:

  • CT Head
  • Engine