In this paper, we propose two scalable architectures (say, ArcJ and Arc2*) that perform the discrete wavelet transform (DWT) of an N_0-sample sequence in only N_0/2 clock cycles. Therefore, they are at least twice as fast as the other known architectures. Also, their AT^2 parameter is approximately 1/2 that of existing devices. This result has been achieved by means of carefully balanced pipelining, and it has two “faces.” First, ArcJ and Arc2* can process data twice as fast as other architectures working at the same clock frequency (high-speed utilization). Second, they can run at half the clock frequency while reaching the same performance as other architectures. This second possibility allows the supply voltage and the power dissipation to be reduced by factors of two and four, respectively, with respect to other architectures (low-power utilization). As a final result, we show that a parallel architecture implementing an L-tap filter-based DWT with J decomposition levels [say, ArcOPT(J, L)] can be defined that aims at excellent efficiency (say, eff[ArcOPT(J, L)]) for any value of J and L. For instance, the average value of eff[ArcOPT(J, L)] [computed over a very wide set Σ' of “points” (J, L)] is 99.1%. The minimum value of eff[ArcOPT(J, L)] in Σ' is 93.84%, and, except for five “points,” in all the others eff[ArcOPT(J, L)] is not lower than 96.99%.
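The speed and AT^2 figures quoted above can be checked with a little arithmetic. The sketch below is only illustrative: the function names are hypothetical, the baseline of one input sample per clock cycle (N_0 cycles in total) is an assumption read from the "at least twice as fast" claim, and the two-samples-per-cycle input rate is an assumed way of reaching N_0/2 cycles. It shows what a halved processing time together with a halved AT^2 implies for the area.

```python
def dwt_cycles(n0: int, samples_per_cycle: int) -> int:
    """Clock cycles needed to consume n0 input samples at a fixed input rate."""
    return -(-n0 // samples_per_cycle)  # ceiling division


def at2_ratio(area_ratio: float, time_ratio: float) -> float:
    """AT^2 of one design relative to another, given area and time ratios."""
    return area_ratio * time_ratio ** 2


def implied_area_ratio(at2: float, time_ratio: float) -> float:
    """Area ratio implied by a given AT^2 ratio and time ratio."""
    return at2 / time_ratio ** 2


n0 = 1024                          # example sequence length (arbitrary)
baseline = dwt_cycles(n0, 1)       # assumption: baseline takes one sample per cycle
proposed = dwt_cycles(n0, 2)       # assumption: two samples per cycle gives N_0/2 cycles
speedup = baseline / proposed      # ratio of cycle counts

# Half the time with about half the AT^2 implies about twice the area.
area = implied_area_ratio(at2=0.5, time_ratio=0.5)

print(proposed, speedup, area)
```

Under these assumptions, the halved AT^2 corresponds to roughly twice the area of the reference design; the abstract itself gives no area figure, so this is an inference from the quoted ratios rather than a reported result.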
Published in: Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on (Volume: 47, Issue: 12)
Date of Publication: Dec 2000