The JPEG2000 image coding standard offers many features superior to those of JPEG and other compression standards. However, its relatively slow performance, especially in software implementations, is a critical drawback. Moreover, as image sizes grow rapidly, the demands on image coding and processing performance increase, making the slowness of JPEG2000 even more pronounced. While much effort over the past decade has been devoted to accelerating the JPEG2000 encoder, very few studies have focused on improving the performance of the JPEG2000 decoder, even though decoder performance is just as critical as encoder performance. This paper proposes a high-performance JPEG2000 decoder that exploits recent improvements in parallel programming models and hardware architectures. Specifically, a parallel streaming decoder running on a GPGPU-CPU heterogeneous system is developed to exploit both the flexibility of high-performance multi-core CPUs and the massively parallel capability of GPGPUs. In addition, a new task scheduling strategy is developed that exploits the soft-heterogeneity in OpenCL and C/C++ at runtime to gain a significant performance boost. Running on a heterogeneous configuration of one Nvidia GTX 480 GPU and one Intel Core i7 CPU, the parallel streaming decoder achieves more than an 8x speedup in runtime over the JasPer JPEG2000 software implementation.