The discrete wavelet transform (DWT) is a powerful signal processing technique used in the JPEG 2000 image compression standard. The multi-resolution sub-band encoding provided by the DWT allows for higher compression ratios, avoids blocking artifacts, and enables progressive transmission of images. However, these advantages come at the expense of additional computational complexity. Achieving real-time or interactive compression and decompression speeds therefore requires a fast DWT implementation that leverages emerging parallel hardware systems. In this paper, we develop an optimized parallel implementation of the lifting-based DWT algorithm using the recently proposed Open Computing Language (OpenCL). OpenCL is a standard for cross-platform parallel programming of heterogeneous systems comprising multi-core CPUs, GPUs, and other accelerators. We explore the potential of OpenCL for accelerating the DWT computation and analyze the programmability, portability, and performance of the language. Our experimental analysis uses the OpenCL-capable drivers from NVIDIA and AMD.
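For readers unfamiliar with the lifting scheme, the following is a minimal sequential sketch of one level of a 1D lifting-based DWT. It assumes the reversible 5/3 integer filter that JPEG 2000 uses for lossless coding, with symmetric boundary extension; the abstract does not say which filter the paper implements, and the function names are hypothetical. It is plain Python rather than an OpenCL kernel and is not the paper's implementation.

```python
def lifting_53_forward(x):
    """One level of 1D reversible 5/3 lifting on an even-length list of ints.

    Assumption: 5/3 integer filter with symmetric extension at both ends.
    Returns (s, d): low-pass (approximation) and high-pass (detail) halves.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length signal with at least 2 samples")
    even, odd = x[0::2], x[1::2]
    half = n // 2

    # Predict step: each odd sample minus the floor-average of its even
    # neighbours. The missing right neighbour at the end is mirrored.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        d.append(odd[i] - ((even[i] + right) >> 1))

    # Update step: each even sample plus a rounded quarter of the two
    # adjacent details. The missing left detail at the start is mirrored.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def lifting_53_inverse(s, d):
    """Undo lifting_53_forward by running the two steps in reverse order."""
    half = len(s)

    # Undo the update step to recover the even samples.
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))

    # Undo the predict step to recover the odd samples, then interleave.
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))
    return x


if __name__ == "__main__":
    signal = [3, 7, 1, 8, 2, 9, 4, 6]
    low, high = lifting_53_forward(signal)
    assert lifting_53_inverse(low, high) == signal  # perfect reconstruction
```

Within each step, the loop iterations do not depend on one another, so in a data-parallel setting each output index could in principle be assigned to its own work-item, with a synchronization point between the predict and update steps. A 2D transform would apply the same routine along rows and then columns.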