In this paper, distributed arithmetic (DA) has been used to implement a fully parallel LUT-based DA wavelet filterbank with interlaced input registers. In our scheme, decimation has been seamlessly integrated into the filter structure to achieve the same throughput performance as polyphase-based filterbanks. However, because partitioning of the filters is avoided, our scheme gives more flexibility to implement the LUT-DA structure, and consequently, lets designers maximize area utilization on LUT-based FPGAs. Our architecture has been designed for orthonormal and biorthogonal wavelets, and implemented on an Altera Stratix II FPGA. Significant reduction in terms of area requirements and increased throughput performance are achieved when compared to other DWT filterbanks based on DA, convolution or the lifting scheme.