Abstract:
This work presents the design and implementation of a single-precision 32-bit floating-point unit (FPU) in 28-nm bulk CMOS technology. The FPU implements arithmetic operations compliant with the RISC-V instruction set architecture, with a selectable rounding method and the exceptions defined by the standard. The floating-point (FP) operations included are addition, multiplication, division, and square root, which require 7, 14, 15, and 44 clock cycles, respectively. A denormalized FP input adds 2 clock cycles of overhead to each operation except multiplication, where the overhead is 5 cycles. Each operation is built on a corresponding 24-bit integer operator that works on the fractional part of the number. To compute the FP result, the integer operators share the common stages of input preparation and of output rounding and normalization. The multiplier is based on the Dadda multiplier architecture, the divider implements the radix-16 digit-recurrence division algorithm, and the square root uses Mr. Woo's abacus algorithm. The FPU has been synthesized both on a Xilinx Spartan-7 XC7S25 FPGA and in TSMC 28-nm HPC+ CMOS technology, the latter using 55k gates in an area of 40 µm², with 5.32 µW of dynamic and 0.03 µW of static power consumption. The behavioral code is publicly available.
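As an illustration of the square-root method named in the abstract, the following is a minimal software sketch of the digit-by-digit integer square root commonly known as Mr. Woo's abacus algorithm. It is not the authors' hardware description: the function name `isqrt_abacus` and the `width` parameter are assumptions made for this example, and the FPU itself applies the integer operator to the 24-bit fractional part rather than to a general 32-bit value.

```python
def isqrt_abacus(x: int, width: int = 32) -> int:
    """Return floor(sqrt(x)) using only shifts, additions and comparisons.

    Hypothetical helper for illustration; `width` is the operand size in
    bits and is assumed to be even.
    """
    if width % 2 != 0:
        raise ValueError("width must be even")
    if x < 0 or x >= (1 << width):
        raise ValueError("x does not fit in the given width")

    res = 0
    bit = 1 << (width - 2)  # largest power of four that fits in `width` bits

    # Skip leading powers of four that are larger than the operand.
    while bit > x:
        bit >>= 2

    # Resolve one result bit per iteration, most significant first.
    while bit:
        if x >= res + bit:
            x -= res + bit          # trial subtraction succeeds: result bit is 1
            res = (res >> 1) + bit
        else:
            res >>= 1               # trial subtraction fails: result bit is 0
        bit >>= 2
    return res


if __name__ == "__main__":
    for n in (0, 1, 16, 17, 24, 25, (1 << 24) - 1):
        print(n, isqrt_abacus(n))
```

Each loop iteration resolves one bit of the root with a compare, a subtract, and shifts, which is why this family of algorithms maps naturally onto a small iterative hardware datapath.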
Date of Conference: 09-12 June 2024
Date Added to IEEE Xplore: 25 June 2024