Computer Arithmetic, 1993. Proceedings., 11th Symposium on

Date: June 29 - July 2, 1993

  • Proceedings of IEEE 11th Symposium on Computer Arithmetic

    Publication Year: 1993
    Freely Available from IEEE
  • Fast implementations of RSA cryptography

    Publication Year: 1993, Page(s):252 - 259
    Cited by:  Papers (70)  |  Patents (20)

    The authors detail and analyze the critical techniques that may be combined in the design of fast hardware for RSA cryptography: Chinese remainders, star chains, Hensel's odd division (also known as Montgomery modular reduction), carry-save representation, quotient pipelining, and asynchronous carry completion adders. A fully operational PAM (programmable active memory) implementation of RSA that ...
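The Montgomery modular reduction named in the abstract above can be illustrated in software. The following is a minimal sketch, assuming an odd modulus `n` and `R = 2**k > n`; the function names (`montgomery_setup`, `redc`, `mont_mul`) are hypothetical, and this is not the hardware design described in the paper.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n with R = 2**k > n (assumed)."""
    assert n % 2 == 1 and n < (1 << k)
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # n * n_prime is congruent to -1 mod R
    return r, n_prime


def redc(t, n, k, n_prime):
    """Return t * R^(-1) mod n for 0 <= t < n * R, using only masks and shifts by R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # makes t + m*n divisible by R
    u = (t + m * n) >> k
    return u - n if u >= n else u


def mont_mul(a, b, n, k):
    """Multiply a*b mod n via Montgomery form (setup repeated here for clarity)."""
    r, n_prime = montgomery_setup(n, k)
    a_bar = (a * r) % n  # convert operands to Montgomery form
    b_bar = (b * r) % n
    prod_bar = redc(a_bar * b_bar, n, k, n_prime)  # a*b*R mod n
    return redc(prod_bar, n, k, n_prime)  # convert back: a*b mod n
```

In a real exponentiation loop the operands would stay in Montgomery form across many multiplications, so the conversion cost is paid only once.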
  • Exact rounding of certain elementary functions

    Publication Year: 1993, Page(s):138 - 145
    Cited by:  Papers (18)  |  Patents (3)

    An algorithm is described which produces exactly rounded results for the functions of reciprocal, square root, 2^x, and log_2 x. Hardware designs based on this algorithm are presented for floating point numbers with 16- and 24-b significands. These designs use a polynomial approximation in which coefficients are originally selected based on the Chebyshev series approximation a...
  • Floating point Cordic

    Publication Year: 1993, Page(s):130 - 137
    Cited by:  Papers (16)  |  Patents (1)

    A full-precision floating-point Cordic algorithm, suitable for the implementation of a word-serial Cordic architecture, is presented. The extension to existing block floating-point Cordic algorithms is in a floating-point representation for the angle. The angle is represented as a combination of exponent, microrotation bits, and two bits to indicate prerotations over π/2 and π radians. Repres...
  • Hardware starting approximation for the square root operation

    Publication Year: 1993, Page(s):103 - 111
    Cited by:  Papers (9)

    A method for obtaining high-precision approximations of high-order arithmetic operations is presented. These approximations provide an accurate starting approximation for high-precision iterative algorithms, which translates into few iterations and a short overall latency. The method uses a partial product array to describe an approximation and sums the array on an existing multiplier. By reusing ...
  • Efficient multiprecision floating point multiplication with optimal directional rounding

    Publication Year: 1993, Page(s):228 - 233
    Cited by:  Papers (3)  |  Patents (2)

    An algorithm is described for multiplying multiprecision floating-point numbers. The algorithm can produce either the smallest floating-point number greater than or equal to the true product, or the greatest floating-point number smaller than or equal to the true product. Software implementations of multiprecision floating-point multiplication can reduce the computation time by a factor of two if ...
  • Measuring the accuracy of ROM reciprocal tables

    Publication Year: 1993, Page(s):95 - 102
    Cited by:  Papers (2)  |  Patents (4)

    It is proved that a conventional ROM reciprocal table construction algorithm generates tables that minimize the relative error. The worst case relative errors realized for such optimally computed k-bits-in, m-bits-out ROM reciprocal tables are then determined for all table sizes 3 ⩽ k, m ⩽ 12. It is then proved that the table construction algorithm always generates a k-bits-in, k-bits-out ...
  • A lazy exact arithmetic

    Publication Year: 1993, Page(s):242 - 249
    Cited by:  Papers (2)

    Systems based on exact arithmetic are very slow. In practical situations, very few computations need be performed exactly, as approximating the results is very often sufficient. Unfortunately, it is impossible to know at the time when the computation is called for whether an exact evaluation will be necessary or not. The arithmetic library presented here achieves laziness by postponing any exact co...
  • The design of a 64-bit integer multiplier/divider unit

    Publication Year: 1993, Page(s):171 - 178
    Cited by:  Papers (3)  |  Patents (27)

    The highlights of the design of an integer multiplier/divider unit for a 64-b processor are presented. The final design is the result of a compromise between performance, complexity, and transistor count. It is optimized for two specific operations with the same hardware being shared by the remaining operations. Thus, for example, the multiplier can be configured for the execution of several diffe...
  • A modular multiplication algorithm with triangle additions

    Publication Year: 1993, Page(s):272 - 276
    Cited by:  Papers (6)  |  Patents (1)

    An algorithm for multiple-precision modular multiplication is proposed. In the algorithm, the upper half triangle of the whole partial products is first added up, and then the residue of the sum is calculated. Next, the sum of the lower half triangle of the whole partial products is added to the residue, and then the residue of the total amount is calculated. An efficient procedure for residue cal...
  • Division with speculation of quotient digits

    Publication Year: 1993, Page(s):87 - 94
    Cited by:  Papers (5)

    The speed of SRT-type dividers is mainly determined by the complexity of the quotient-digit selection, so that implementations are limited to low-radix stages. A scheme is presented in which the quotient-digit is speculated and, when this speculation is incorrect, a rollback or a partial advance is performed. This results in a division operation with a shorter cycle time and a variable number of c...
  • An underflow-induced graphics failure solved by SLI arithmetic

    Publication Year: 1993, Page(s):10 - 17
    Cited by:  Papers (1)

    Floating-point underflow is often regarded as either harmless or as an indication that the computational algorithm is in need of scaling. A counterexample to this view is given of a function for which contour plotting is difficult due to floating-point underflow. The function arose as an asymptotic solution to a model problem in turbulent combustion in which two chemical species (fuel and oxidizer...
  • BKM: A new hardware algorithm for complex elementary functions

    Publication Year: 1993, Page(s):146 - 153
    Cited by:  Papers (5)

    An algorithm for computing complex logarithms and exponentials is proposed. The algorithm is based on shift-and-add elementary steps, and it generalizes the Cordic algorithm. It can compute the usual real elementary functions. This algorithm is more suitable for computations in a redundant number system than Cordic, since there is no scaling factor for computation of trigonometric functions.
  • The Gauss machine: A Galois-enhanced quadratic residue number system systolic array

    Publication Year: 1993, Page(s):156 - 162
    Cited by:  Papers (6)  |  Patents (3)

    The Gauss machine is a SIMD systolic array architecture that takes advantage of the Galois-enhanced quadratic residue number system (GEQRNS) to form reduced-complexity arithmetic elements. The Gauss machine is targeted at front-end signal and image processing applications. A discrete prototype that achieves a peak rating of 320 million complex arithmetic operations per second while operating at 10 MHz has b...
  • Algorithms and multi-valued circuits for the multioperand addition in the binary stored-carry number system

    Publication Year: 1993, Page(s):194 - 201
    Cited by:  Papers (4)

    Algorithms for the sum of two (three and four) digits in the binary stored-carry number system, using the smallest set of values for the positional sum, are presented. The corresponding adders, which use multivalued current-mode circuits, are also presented. The implementation of multioperand additions using these adders is compared with the usual binary implementation.
  • On squaring and multiplying large integers

    Publication Year: 1993, Page(s):260 - 271
    Cited by:  Papers (4)  |  Patents (3)

    Methods of squaring large integers are discussed. The obvious O(n^2) method turns out to be best for small numbers. The existing ≈ O(n^1.585) method becomes better as the numbers get bigger. New methods that are ≈ O(n^1.465) and ≈ O(n^1.404) are presented. All of these methods can be generalized to multiplication and turn out to be faster than a f...
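The exponent 1.585 is approximately log2(3), the cost exponent of a Karatsuba-style split that replaces one full-size squaring with three half-size ones. The abstract above does not name the method, so the following is only a minimal sketch under that assumption; `karatsuba_square` and the `cutoff_bits` threshold are hypothetical names chosen for illustration.

```python
def karatsuba_square(x, cutoff_bits=64):
    """Square an integer using three recursive half-size squarings."""
    x = abs(x)
    n = x.bit_length()
    if n <= cutoff_bits:
        return x * x  # small operands: the direct method is used
    h = n // 2
    a, b = x >> h, x & ((1 << h) - 1)  # x = a * 2**h + b
    a2 = karatsuba_square(a, cutoff_bits)
    b2 = karatsuba_square(b, cutoff_bits)
    s2 = karatsuba_square(a + b, cutoff_bits)
    # 2ab is recovered as (a + b)^2 - a^2 - b^2, so no general multiply is needed
    return (a2 << (2 * h)) + ((s2 - a2 - b2) << h) + b2
```

The cutoff reflects the abstract's observation that the quadratic method is best for small numbers; where the crossover actually falls depends on the machine and is not taken from the paper.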
  • Very high radix division with selection by rounding and prescaling

    Publication Year: 1993, Page(s):112 - 119
    Cited by:  Papers (8)  |  Patents (2)

    A division algorithm in which the quotient-digit selection is performed by rounding the shifted residual in carry-save form is presented. To allow the use of this simple function, the divisor (and dividend) is prescaled to a range close to one. The implementation presented results in a fast iteration because of the use of carry-save forms and suitable recodings. The execution time is calculated, a...
  • Faster numerical algorithms via exception handling

    Publication Year: 1993, Page(s):234 - 241
    Cited by:  Papers (2)

    An attractive paradigm for building fast numerical algorithms is the following: (1) try a fast but occasionally unstable algorithm, (2) test the accuracy of the computed answer, and (3) recompute the answer slowly and accurately in the unlikely event it is necessary. This is especially attractive on parallel machines where the fastest algorithms may be less stable than the best serial algorithms. ...
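The three-step paradigm in the abstract above (try fast, test, recompute if needed) can be sketched on a small problem. The example below applies it to the Euclidean norm of a vector, where the unscaled sum of squares can overflow or underflow. The problem choice, the function names, and the `1e-150` threshold are assumptions made for illustration and are not taken from the paper.

```python
import math


def norm2_fast(xs):
    """Step 1: fast path, no scaling; squares may overflow or underflow."""
    return math.sqrt(sum(x * x for x in xs))


def norm2_safe(xs):
    """Step 3: slower path that scales by the largest magnitude first."""
    scale = max((abs(x) for x in xs), default=0.0)
    if scale == 0.0 or math.isinf(scale):
        return scale
    return scale * math.sqrt(sum((x / scale) * (x / scale) for x in xs))


def norm2(xs):
    r = norm2_fast(xs)
    # Step 2: cheap test of the fast answer (heuristic threshold, assumed here)
    overflowed = math.isinf(r) or math.isnan(r)
    underflowed = r < 1e-150 and any(x != 0.0 for x in xs)
    return norm2_safe(xs) if (overflowed or underflowed) else r
```

For ordinary inputs only the fast path runs; the scaled recomputation is paid for only when the check fails.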
  • Comparing several GCD algorithms

    Publication Year: 1993, Page(s):180 - 185
    Cited by:  Papers (3)

    The execution times of several algorithms for computing the GCD of arbitrary precision integers are compared. These algorithms are the known ones (Euclidean, binary, plus-minus), and the improved variants of these for multidigit computation (Lehmer and similar), as well as new algorithms introduced by the author: an improved Lehmer algorithm using two digits in partial cosequence computation, and...
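Of the known algorithms listed in the abstract above, the binary GCD is the simplest to show in a few lines. The following is a minimal sketch of the textbook binary algorithm, not of the paper's improved variants; `binary_gcd` is a name chosen here for illustration.

```python
def binary_gcd(a, b):
    """GCD using only shifts, comparisons, and subtraction."""
    a, b = abs(a), abs(b)
    if a == 0:
        return b
    if b == 0:
        return a
    # Number of factors of two shared by a and b
    shift = ((a | b) & -(a | b)).bit_length() - 1
    a >>= (a & -a).bit_length() - 1  # make a odd
    while b:
        b >>= (b & -b).bit_length() - 1  # make b odd
        if a > b:
            a, b = b, a
        b -= a  # difference of two odd numbers is even
    return a << shift
```

The expression `(v & -v).bit_length() - 1` counts trailing zero bits, which lets each step strip all factors of two at once.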
  • Adaptive beamforming using RNS arithmetic

    Publication Year: 1993, Page(s):36 - 43
    Cited by:  Papers (2)  |  Patents (2)

    The adaptive beamforming problem is solved using an algorithm-architecture-arithmetic combination that can be used on small platforms such as those found on aircraft or sonobuoys. The arithmetic used is the RNS system implemented on an array of processors that can be reassigned as the algorithm proceeds. The underlying algorithm is a modified Gaussian elimination. The (non-RNS) division operations...
  • An accurate LNS arithmetic unit using interleaved memory function interpolator

    Publication Year: 1993, Page(s):2 - 9
    Cited by:  Papers (8)  |  Patents (17)

    A logarithmic number system (LNS) arithmetic unit using a new method for polynomial interpolation in hardware is described. The use of an interleaved memory reduces storage requirements by allowing each stored function value to be used in interpolation across several segments. This strategy always uses fewer words of memory than an optimized polynomial with stored polynomial coefficients. Many acc...
  • Estimating the power consumption of CMOS adders

    Publication Year: 1993, Page(s):210 - 216
    Cited by:  Papers (28)

    Six types of adders are examined in an attempt to model their power dissipation. It is shown that the use of a relatively simple model provides results that are qualitatively accurate, when compared to more sophisticated models and to physical implementations of the circuits. The main discrepancy between the simple model and the physical measurements seems to be the assumption that all gates will ...
  • Complex SLI arithmetic: Representation, algorithms and analysis

    Publication Year: 1993, Page(s):18 - 25
    Cited by:  Papers (5)

    The extension of the SLI (symmetric level index) system to complex numbers and arithmetic is discussed. The natural form for representation of complex quantities in SLI is in the modulus-argument form, and this can be sensibly packed into a single 64-b word for the equivalent of the 32-b real SLI representation. The arithmetic algorithms prove to be very slightly more complicated than for real SLI...
  • Multi-parallel convolvers

    Publication Year: 1993, Page(s):70 - 77

    A scheme for a convolver design, called a multiparallel convolver, that is based on concurrent processing of p adjacent samples that are input simultaneously to the p-parallel convolver is presented. The scheme uses p units, each of which receives the input samples and produces one convolution every p samples; these are called p-phase subconvolvers. The detailed design of the p-phase subconvolvers...
  • Efficient complex matrix transformations with CORDIC

    Publication Year: 1993, Page(s):122 - 129
    Cited by:  Papers (5)  |  Patents (23)

    A two-sided unitary transformation (Q transformation) structured to permit integrated evaluation and application using CORDIC primitives is introduced. The Q transformation is shown to be useful as an atomic operation in parallel arrays for computing the eigenvalue/singular value decomposition of Hermitian/arbitrary matrices, and three specific Q transformations that are needed in such arrays are ...