Skip to Main Content
In this paper, a fast 16-point DCT is implemented using a multiplier-less architecture. The 16-point DCT matrix is decomposed into sparse sub-matrices in order to reduce the multiplications and finally the multiplications are completely eliminated using the lifting scheme. Therefore, the computational complexity of the architecture is much lower than the direct implementation of 16-point DCT. In software implementation, 45 dB of PSNR is achieved for the “Lena” image. The VLSI implementation has been carried out for a 90-nm standard cell technology at a clock frequency of 150 MHz.