In high-performance DSP systems, the memory bandwidth can be improved using high-density interconnect technology and appropriate memory mapping. High-density MCM and flip-chip solder bump technology is used to achieve a system with an I/O bandwidth of 100 Gb/s/cm2 die. The use of DRAMs in these systems usually make the performance of these systems poor, and some algorithms make it difficult to fully utilize the available memory bandwidth. This paper presents the design of a fast Fourier transform (FFT) engine that gives SRAM-like performance in a DRAM-based system. It uses almost 100% of the available burst-mode memory bandwidth. This FFT engine can compute a million-point FFT in 1.31 ms at a sustained computation rate of 8.64 × 1010 floating-point operations per second (FLOPS). This is at least an order of magnitude better than conventional systems.
Published in:
Advanced Packaging, IEEE Transactions on
(Volume:28
,
Issue:
2
)
Date of Publication: May 2005