Recent developments in 3D low-light level CCD (L3CCD) image capture have resulted in large volumes of data being produced in real time, all of which require image registration. The amount of data involved makes acceleration of the processing essential. One of the key steps in one iterative registration algorithm is the application of an affine transform to every plane of a 3D image. This paper presents details and performance results for a number of parallelized implementations of the affine transform on the NVIDIA 8800 GPU series. It shows that the transform runs 128 times faster on the GPU than a C++ version on a PC, or 54 times faster when the time for data transfer between the GPU and the host PC is included.
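To make the key step concrete, the following is a minimal CPU-side sketch in C++ of applying one affine transform to every plane of a 3D image. It is not the paper's implementation. The `Volume` container, the `Affine2D` type, and the function `applyAffineToPlanes` are hypothetical names introduced here for illustration. The sketch also assumes that the same 2D transform is applied to each plane, that the matrix maps output pixels back to source pixels, and that nearest-neighbour sampling is acceptable; the abstract does not state any of these details.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container: `depth` planes of `height` x `width` pixels,
// stored contiguously, plane by plane.
struct Volume {
    std::size_t width, height, depth;
    std::vector<float> data;

    Volume(std::size_t w, std::size_t h, std::size_t d)
        : width(w), height(h), depth(d), data(w * h * d, 0.0f) {}

    float& at(std::size_t x, std::size_t y, std::size_t z) {
        return data[(z * height + y) * width + x];
    }
    float at(std::size_t x, std::size_t y, std::size_t z) const {
        return data[(z * height + y) * width + x];
    }
};

// Assumed convention: a 2x3 matrix in row-major order that maps an OUTPUT
// pixel (x, y) to the SOURCE location it should be read from:
//   xs = m[0]*x + m[1]*y + m[2]
//   ys = m[3]*x + m[4]*y + m[5]
using Affine2D = std::array<float, 6>;

// Apply the same 2D affine transform to every plane of the volume.
// Output pixels whose source location falls outside the plane stay at zero.
Volume applyAffineToPlanes(const Volume& src, const Affine2D& m) {
    Volume dst(src.width, src.height, src.depth);
    const long w = static_cast<long>(src.width);
    const long h = static_cast<long>(src.height);

    for (std::size_t z = 0; z < src.depth; ++z) {
        for (std::size_t y = 0; y < src.height; ++y) {
            for (std::size_t x = 0; x < src.width; ++x) {
                const float fx = static_cast<float>(x);
                const float fy = static_cast<float>(y);
                const float xs = m[0] * fx + m[1] * fy + m[2];
                const float ys = m[3] * fx + m[4] * fy + m[5];

                // Nearest-neighbour sampling (an assumption of this sketch).
                const long xi = std::lround(xs);
                const long yi = std::lround(ys);
                if (xi >= 0 && xi < w && yi >= 0 && yi < h) {
                    dst.at(x, y, z) = src.at(static_cast<std::size_t>(xi),
                                             static_cast<std::size_t>(yi), z);
                }
            }
        }
    }
    return dst;
}
```

Each output pixel in this sketch is computed independently of every other pixel, which is the property that lets such a loop nest be spread across many GPU threads. For example, passing the matrix `{1, 0, -1, 0, 1, 0}` makes each output pixel read from the source pixel one column to its left, which shifts every plane one pixel to the right.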
|Title of host publication|Proceedings of 2009 13th International Machine Vision and Image Processing Conference|
|Place of publication|Piscataway NJ 08855-1331|
|Publisher|Institute of Electrical and Electronics Engineers Inc.|
|Number of pages|5|
|Publication status|Published - 04 Sep 2009|