Skip to Main Content
This paper presents improved architectures for a fused floating-point add-subtract unit. The fused floating-point add-subtract unit is useful for digital signal processing (DSP) applications such as fast Fourier transform (FFT) and discrete cosine transform (DCT) butterfly operations. To improve the performance of the fused floating-point add-subtract unit, a dual-path algorithm and pipelining are employed. The proposed designs are implemented for both single and double precision and synthesized with a 45-nm standard-cell library. The fused floating-point add-subtract unit saves 40% of the area and power consumption compared to a discrete floating-point add-subtract unit. The proposed dual-path design reduces the latency by 30% compared to the discrete design with area and power consumption between that of the discrete and fused designs. Based on a data flow analysis, the proposed fused dual-path floating-point add-subtract unit can be split into two pipeline stages. Since the latencies of two pipeline stages are fairly well balanced, the throughput is increased by 80% compared to the nonpipelined dual-path design.