The authors consider time-frequency multiresolution analysis based on wavelets as it applies to speech/audio and image/video signal compression. They compare wavelet analysis with the traditional short-window techniques used in signal compression. In terms of bit rate and signal quality, the performance of the discrete wavelet transform is comparable to that of other techniques, such as the discrete cosine transform (DCT) for images and code-excited linear predictive coding (CELP) for speech, but with much less computational burden. Experiments with an image and Daubechies's four-coefficient wavelet show that discarding as many as 90% of the wavelet coefficients still yields a peak signal-to-noise ratio (PSNR) of 30 dB, which is better than the result obtained with the DCT. In an experiment on a sentence spoken by a male speaker, the scheme reaches a segmental signal-to-noise ratio (SEGSNR) of 12.82 dB at a rate of less than 4.8 kbit/s. In comparison, state-of-the-art CELP coding at 4.8 kbit/s attains a SEGSNR of 10-13 dB. Other experiments with images and the two-coefficient Haar wavelet are also highlighted.
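As a rough illustration of the coefficient-truncation experiment summarized above, the following sketch applies an orthonormal two-coefficient Haar transform to a one-dimensional signal, zeroes the 90% of coefficients with the smallest magnitude, reconstructs the signal, and reports the PSNR. It is not the authors' implementation: the synthetic test row, the choice of a full multi-level decomposition, the 8-bit peak value of 255, and all function names (`haar_forward`, `haar_inverse`, `truncate`, `psnr`) are assumptions made for illustration only. The abstract's headline result uses an image and Daubechies's four-coefficient wavelet, which this sketch does not reproduce.

```python
import math

def haar_forward(signal):
    """Full multi-level orthonormal Haar transform of a length-2^k list."""
    coeffs = list(signal)
    n = len(coeffs)
    s = 1 / math.sqrt(2)
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (approximation) and differences (detail).
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) * s for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) * s for i in range(half)]
        coeffs[:n] = approx + detail
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    s = 1 / math.sqrt(2)
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) * s, (a - d) * s]
        out[:2 * n] = merged
        n *= 2
    return out

def truncate(coeffs, fraction):
    """Zero the given fraction of coefficients with the smallest magnitude."""
    n_zero = int(len(coeffs) * fraction)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    kept = list(coeffs)
    for i in order[:n_zero]:
        kept[i] = 0.0
    return kept

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak=255 assumes 8-bit samples)."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical test data: one smooth 256-sample "row of pixels".
row = [128 + 100 * math.sin(2 * math.pi * i / 256) for i in range(256)]
coeffs = haar_forward(row)
truncated = truncate(coeffs, 0.90)
approx_row = haar_inverse(truncated)
print(f"PSNR after discarding 90% of coefficients: {psnr(row, approx_row):.2f} dB")
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the discarded coefficients, which is why dropping the smallest ones first is the natural choice. The PSNR printed here depends entirely on the synthetic input and should not be compared with the figures reported in the abstract.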