With their parallel multi-core architecture, programmable Graphics Processing Units (GPUs) are well suited for implementing biologically inspired visual processing algorithms, such as Gabor filtering. We compare several GPU implementations of Gabor filtering. On the same graphics card (an NVIDIA GeForce 9800 GTX+) and for convolution kernel radii from 8 to 48 pixels, an algorithm that decomposes Gabor filtering into a number of simpler steps is 2.2 to 33 times faster than direct 2D convolution and 2.8 to 6.6 times faster than an FFT-based approach. Surprisingly, this decomposed GPU algorithm is only 4 to 10 times faster than an optimized algorithm for Gabor filtering running on a PC (Core2 Duo, 3.16 GHz). The PC can efficiently implement a recursive 1D filter, which requires far fewer arithmetic operations than convolution. However, due to data dependencies, this recursive filter typically runs slower than 1D convolution on the GPU. This highlights the importance of considering both arithmetic and memory operations together when porting algorithms to GPUs.
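The abstract does not spell out the individual steps of the decomposition, so the following is only a minimal 1D sketch of one well-known way to split Gabor filtering into simpler stages (shift the band of interest to zero frequency, smooth with a real Gaussian, shift back). It may differ from the authors' actual scheme. All function names and parameters (`gabor_direct`, `gabor_decomposed`, `recursive_smooth`, `sigma`, `omega`, `alpha`) are hypothetical and chosen for illustration. The last function is a generic first-order recursive low-pass, not the filter used in the paper; it is included only to show the kind of data dependency that makes recursive filters hard to parallelize.

```python
import cmath
import math


def gaussian_taps(sigma, radius):
    """Normalized, truncated Gaussian kernel with 2*radius + 1 taps."""
    taps = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def convolve_same(signal, taps):
    """Zero-padded convolution. Every output sample is computed
    independently of the others, so the outer loop parallelizes trivially."""
    radius = len(taps) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0
        for k in range(-radius, radius + 1):
            j = i - k
            if 0 <= j < n:
                acc += taps[k + radius] * signal[j]
        out.append(acc)
    return out


def gabor_direct(signal, sigma, omega, radius):
    """Direct convolution with a complex Gabor kernel (Gaussian times carrier)."""
    g = gaussian_taps(sigma, radius)
    taps = [g[k + radius] * cmath.exp(1j * omega * k)
            for k in range(-radius, radius + 1)]
    return convolve_same(signal, taps)


def gabor_decomposed(signal, sigma, omega, radius):
    """Same result in three simpler steps (assumed decomposition)."""
    # Step 1: multiply by exp(-j*omega*n) to move the pass band to zero frequency.
    shifted = [x * cmath.exp(-1j * omega * n) for n, x in enumerate(signal)]
    # Step 2: low-pass filter with a real Gaussian kernel.
    smoothed = convolve_same(shifted, gaussian_taps(sigma, radius))
    # Step 3: multiply by exp(+j*omega*n) to move the band back.
    return [y * cmath.exp(1j * omega * n) for n, y in enumerate(smoothed)]


def recursive_smooth(signal, alpha):
    """Generic first-order recursive low-pass: y[n] = alpha*x[n] + (1-alpha)*y[n-1].
    The cost per sample is constant regardless of the effective kernel width,
    but each output needs the previous output, so the loop is inherently sequential."""
    out = []
    prev = 0.0
    for x in signal:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


# Small usage example on a synthetic signal (values chosen arbitrarily).
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
direct = gabor_direct(signal, sigma=3.0, omega=0.3, radius=8)
decomposed = gabor_decomposed(signal, sigma=3.0, omega=0.3, radius=8)
max_error = max(abs(a - b) for a, b in zip(direct, decomposed))
```

In this sketch `max_error` should be at floating-point rounding level, because the two routes are algebraically identical. The contrast between `convolve_same` and `recursive_smooth` mirrors the trade-off described above: the recursive form does less arithmetic per sample, while the convolution has no dependency between output samples.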