Skip to Main Content
In this paper, a reconfigurable macro-pipelined (RMP) accelerator is proposed to speed up the Discrete Cosine Transform (DCT) and Inverse Discrete Cosine Transform (IDCT). The accelerator can be reconfigured to compute fixed-point or floating-point, one-dimensional or multi-dimensional DCT/IDCT according to different system requirements. The prototype is implemented on Xilinx ML605 experiment board with 64 PEs. It takes 64 cycles at 200MHz to complete an 8×8 2D DCT and gets a peak performance of 25.6 GFLOPS for the floating-point DCT. The excellent scalability of this architecture enables the accelerator to scale up to an extremely high performance.