In this paper, we propose a parallel hardware architecture for the 2-D discrete wavelet transform (DWT) based on the lifting scheme. The architecture consists of two row processors and one column processor. It uses 4-way data scheduling, which decreases the initial latency of the column-direction transform. The outputs of the row processors are fed directly to the column processor, which reduces the internal memory otherwise needed to store the row processors' results. The column processor, which has a buffer memory to store the filter's intermediate values, can compute filter values every clock cycle. The internal memory size is reduced by around 30% compared with previous works, and the total computation time is improved to N^2/4 + a. Finally, the architecture has been implemented in behavioral VHDL. The estimated maximum clock frequency is 100 MHz on an Altera Stratix device.
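As a reference point for the row-then-column dataflow described above, the following is a minimal software sketch of one level of a lifting-based 2-D DWT. It is not the proposed hardware: the abstract does not name the wavelet filter, so the integer LeGall 5/3 filter with symmetric boundary extension is assumed here purely for illustration, and the function names (`lift_53_forward`, `lift_53_inverse`, `dwt2_53`) are hypothetical. The sketch processes rows and columns sequentially and does not model the two parallel row processors, the 4-way scheduling, or the column buffer memory.

```python
def lift_53_forward(x):
    """One level of a 1-D 5/3 lifting transform (assumed filter) on an even-length list of ints."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length inputs only"
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: detail = odd sample minus the floor-average of its even neighbours.
    # The right boundary uses symmetric extension (the last even sample is reused).
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update step: approximation = even sample plus a rounded quarter of neighbouring details.
    # The left boundary uses symmetric extension (the first detail is reused).
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d


def lift_53_inverse(s, d):
    """Undo lift_53_forward by reversing the update step, then the predict step."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x


def dwt2_53(image):
    """One level of a 2-D transform: row pass first, then column pass on the row results."""
    # Row pass: each row becomes [low-pass half | high-pass half].
    row_out = []
    for row in image:
        s, d = lift_53_forward(list(row))
        row_out.append(s + d)
    # Column pass: transform each column of the row-pass output.
    col_out = []
    for col in zip(*row_out):
        s, d = lift_53_forward(list(col))
        col_out.append(s + d)
    # Transpose back so the result is indexed [row][column].
    return [list(r) for r in zip(*col_out)]
```

In this software model the whole row-pass result is held in `row_out` before the column pass starts. The architecture described above avoids that intermediate storage by feeding row outputs directly to the column processor.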