High-speed double-precision computation of reciprocal, division, square root, and inverse square root

Abstract:

A new method for the high-speed computation of double-precision floating-point reciprocal, division, square root, and inverse square root operations is presented in this paper. This method employs a second-degree minimax polynomial approximation to obtain an accurate initial estimate of the reciprocal and the inverse square root values, and then performs a modified Goldschmidt iteration. The high accuracy of the initial approximation allows us to obtain double-precision results by computing a single Goldschmidt iteration, significantly reducing the latency of the algorithm. Two unfolded architectures are proposed: the first one computing only reciprocal and division operations, and the second one also including the computation of square root and inverse square root. The execution times and area costs for both architectures are estimated, and a comparison with other multiplicative-based methods is presented. The results of this comparison show the achievement of a lower latency than these methods, with similar hardware requirements.
Published in: IEEE Transactions on Computers ( Volume: 51, Issue: 12, December 2002)
Page(s): 1377 - 1388
Date of Publication: 06 January 2003
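
The scheme outlined in the abstract (a quadratic initial estimate followed by a single Goldschmidt iteration) can be illustrated with a small software sketch. This is not the paper's architecture: it uses the textbook Goldschmidt step rather than the modified iteration, truncated Taylor coefficients rather than minimax coefficients, and ordinary double-precision arithmetic rather than dedicated hardware. The segment count `SEG_BITS` and the function names `initial_reciprocal` and `goldschmidt_divide` are assumptions made for illustration only.

```python
import math

# Assumed for illustration: 2**9 segments over [1, 2). The paper's actual
# table size and coefficient word lengths are not stated in the abstract.
SEG_BITS = 9


def initial_reciprocal(x):
    """Piecewise-quadratic estimate of 1/x for x in [1, 2).

    Stand-in for the second-degree minimax approximation: each segment uses
    the quadratic Taylor expansion of 1/x about the segment midpoint.
    """
    n = 1 << SEG_BITS
    idx = int((x - 1.0) * n)        # segment index from the leading fraction bits
    xm = 1.0 + (idx + 0.5) / n      # segment midpoint
    d = x - xm                      # offset inside the segment, |d| <= 2**-(SEG_BITS+1)
    c0 = 1.0 / xm                   # hardware would read c0, c1, c2 from a lookup table
    c1 = -c0 * c0
    c2 = c0 * c0 * c0
    return c0 + d * (c1 + d * c2)   # Horner evaluation of the quadratic


def goldschmidt_divide(a, b):
    """Approximate a / b for finite b > 0 with one textbook Goldschmidt step."""
    m, e = math.frexp(b)            # b = m * 2**e with 0.5 <= m < 1
    m *= 2.0                        # rescale the significand to [1, 2)
    e -= 1
    r0 = initial_reciprocal(m)      # accurate initial estimate of 1/m
    n0 = a * r0                     # numerator scaled by the estimate
    d0 = m * r0                     # denominator driven towards 1
    f = 2.0 - d0                    # correction factor
    q = n0 * f                      # one iteration squares the relative error
    return math.ldexp(q, -e)        # undo the exponent scaling
```

With the Taylor stand-in, `m * r0` differs from 1 by at most about 2^-30, so a single step leaves an algorithmic error near 2^-60; what remains in practice is the rounding error of the few floating-point operations involved. A call such as `goldschmidt_divide(355.0, 113.0)` should therefore agree with `355.0 / 113.0` to within a few units in the last place.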

1 Introduction

Reciprocal, division, square root, and inverse square root are important operations for several applications such as digital signal processing, multimedia, computer graphics, and scientific computing [7], [12], [18], [22]. Although these operations occur less frequently than addition and multiplication, many processors compute them so slowly that their overall execution time can be comparable to the time spent performing addition and multiplication [16].
