Skip to Main Content
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler that automatically generates C code enhanced with short vector instructions for digital signal processing (DSP) transforms, such as the fast Fourier transform (FFT). The input to our compiler is a concise mathematical description of a DSP algorithm in the language SPL. SPL is used in the SPIRAL system (http://wwwece.cmu.edu/~spira/) to generate highly optimized architecture adapted implementations of DSP transforms. Interfacing our compiler with SPIRAL yields speed-ups of more than a factor of 2 in several important cases including the FFT and the discrete cosine transform (DCT) used in the JPEG compression standard. For the FFT our automatically generated code is competitive with the hand-coded Intel Math Kernel Library.