This paper presents a new design approach for the VLSI implementation of a prime-length discrete cosine transform (DCT), based on a new hardware algorithm that can be implemented using a multi-port ROM-based systolic array. The proposed algorithm reformulates the prime-length DCT into several cyclic convolutions of the same length and similar structure. This approach efficiently exploits the inherent parallelism, doubling the throughput while only slightly increasing the hardware and I/O cost rather than doubling them. Moreover, the proposed VLSI implementation preserves all the other advantages of VLSI algorithms based on circular correlations or cyclic convolutions, such as modular and regular structures with a local interconnection topology.
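The abstract does not give the decomposition itself, so the following is only a minimal sketch of the general idea behind such reformulations, not the paper's algorithm. It assumes a simplified cosine kernel, cos(2πnk/N), in place of the actual DCT kernel, and uses the standard primitive-root index permutation for a prime length N to turn the sums over n, k = 1..N-1 into a single cyclic convolution of length N-1. The function names are hypothetical and chosen for illustration.

```python
import math

def primitive_root(p):
    """Return the smallest primitive root modulo the prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, e, p) for e in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")

def cosine_sums_direct(x):
    """Reference: y[k] = sum_{n=1}^{N-1} x[n] * cos(2*pi*n*k/N), k = 1..N-1."""
    N = len(x)
    return {
        k: sum(x[n] * math.cos(2 * math.pi * n * k / N) for n in range(1, N))
        for k in range(1, N)
    }

def cosine_sums_via_cyclic_convolution(x):
    """Same sums, computed as one cyclic convolution of length N-1.

    Assumption: N = len(x) is an odd prime. With g a primitive root mod N,
    substitute n = g^(-a) and k = g^b, so n*k = g^(b-a) (mod N) and the
    kernel depends only on (b - a) mod (N-1).
    """
    N = len(x)
    L = N - 1
    g = primitive_root(N)
    g_inv = pow(g, N - 2, N)  # modular inverse of g via Fermat's little theorem
    u = [x[pow(g_inv, a, N)] for a in range(L)]                     # permuted input
    c = [math.cos(2 * math.pi * pow(g, m, N) / N) for m in range(L)]  # fixed kernel
    v = [sum(u[a] * c[(b - a) % L] for a in range(L)) for b in range(L)]
    return {pow(g, b, N): v[b] for b in range(L)}  # undo the output permutation

# Example with N = 7; x[0] is not used by either function.
x = [0.5, 1.0, -2.0, 3.0, 0.25, -1.5, 4.0]
direct = cosine_sums_direct(x)
conv = cosine_sums_via_cyclic_convolution(x)
```

In this sketch the kernel sequence `c` depends only on N, which is the kind of fixed coefficient set that a ROM-based design could precompute. How the paper splits the DCT into several convolutions of equal length is not stated in the abstract and is not reproduced here.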