By Topic

Low-Power Multiple-Precision Iterative Floating-Point Multiplier with SIMD Support

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Dimitri Tan ; Advanced Micro Devices Inc., Austin ; Carl E. Lemonds ; Michael J. Schulte

The demand for improved SIMD floating-point performance on general-purpose x86-compatible microprocessors is rising. At the same time, there is a conflicting demand in the low-power computing market for a reduction in power consumption. Along with this, there is the absolute necessity of backward compatibility for x86-compatible microprocessors, which includes the support of x87 scientific floating-point instructions. The combined effect is that there is a need for low-power, low-cost floating-point units that are still capable of delivering good SIMD performance while maintaining full x86 functionality. This paper presents the design of an x86-compatible floating-point multiplier (FPM) that is compliant with the IEEE-754 Standard for Binary Floating-Point Arithmetic and is specifically tailored to provide good SIMD performance in a low-cost, low-power solution while maintaining full x87 backward compatibility. The FPM efficiently supports multiple precisions using an iterative rectangular multiplier. The FPM can perform two parallel single-precision multiplies every cycle with a latency of two cycles, one double-precision multiply every two cycles with a latency of four cycles, or one extended-double-precision multiply every three cycles with a latency of five cycles. The iterative FPM also supports division, square-root, and transcendental functions. Compared to a previous design with similar functionality, the proposed iterative FPM has 60 percent less area and 59 percent less dynamic power dissipation.

Published in:

IEEE Transactions on Computers  (Volume:58 ,  Issue: 2 )