In this paper we present a load-adaptive parallel algorithm, and its implementation, for computing the 2D Discrete Wavelet Transform (DWT) on multithreaded machines. In a 2D DWT computation, the problem size shrinks at every decomposition level, and the lengths of the emerging computation paths vary as well. The proposed parallel algorithm dynamically scales itself to this varying problem size. We report experimental results from implementations of the proposed algorithm on a 2D node multithreading emulation platform, EARTH-MANNA. We show that the multithreaded implementations are at least twice as fast as the MPI-based message-passing implementations reported in the literature. We further show that the proposed algorithm and its implementations scale linearly with respect to both problem size and machine size.
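To make the shrinking problem size concrete, the following minimal sequential sketch performs a multi-level 2D decomposition and records the side length of the block processed at each level. It is an illustration only, not the parallel algorithm proposed in the paper: the averaging Haar filter, the restriction to square images whose side is a power of two, and the names `haar_1d`, `haar_2d_level`, and `decompose` are all assumptions made for this example.

```python
def haar_1d(values):
    """One level of an averaging Haar transform on an even-length list.

    Returns the approximation coefficients followed by the detail coefficients.
    The Haar filter is an assumption; the abstract does not name a filter.
    """
    half = len(values) // 2
    approx = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    detail = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return approx + detail


def haar_2d_level(block):
    """Apply one 2D level: transform every row, then every column."""
    rows = [haar_1d(row) for row in block]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    # cols holds transformed columns; transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


def decompose(image, levels):
    """Multi-level decomposition of a square image (side a power of two).

    Each level transforms only the top-left approximation block, whose side
    halves every time. Returns the coefficients and the side length per level.
    """
    size = len(image)
    out = [row[:] for row in image]
    sizes = []
    for _ in range(levels):
        sizes.append(size)
        block = [row[:size] for row in out[:size]]
        transformed = haar_2d_level(block)
        for r in range(size):
            out[r][:size] = transformed[r]
        size //= 2
    return out, sizes


if __name__ == "__main__":
    image = [[1.0] * 8 for _ in range(8)]
    coeffs, sizes = decompose(image, 3)
    # Elements touched per level: the work drops by a factor of four each time.
    print(sizes, [s * s for s in sizes])
```

Because the block side halves at each level, the number of elements to process falls by a factor of four per level. A fixed assignment of work to threads therefore leaves progressively less work per thread at deeper levels, which is the situation a load-adaptive scheme is meant to address.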
Date of Conference: 12-16 Apr 1999