Skip to Main Content
Facial recognition techniques are of interest for tracking and identification in densely populated areas where security is an important concern. Traditional recognition techniques have yielded acceptable results with high repeatability but require special conditions such as a voluntary and stationary subject, close proximity, and appropriate lighting. Because no single algorithm yields robust results under uncontrolled conditions, more than one algorithm must be considered. Three popular template-based algorithms involved in facial recognition include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Independent Component Analysis (ICA). Since facial recognition algorithms require processor-intensive and complicated matrix calculations, these three algorithms could be improved with hardware that can accelerate these calculations. The PCA algorithm studied in this research is a common mathematical method that has been parallelized for other applications, but not for the purposes of facial recognition using General Purpose Graphics Processing Units (GPGPUs). NVIDIA's CUDA parallel computing architecture is employed to implement the PCA algorithm for the GPGPU device. A C implementation of the PCA algorithm has been optimized specifically for use with CUDA kernels and future parallelization in mind. The results provided in this paper show that the parallel GPGPU implementation outperforms the multithreaded C implementation on a general purpose CPU.