Proceedings of IEEE 11th Symposium on Computer Arithmetic

June 29 - July 2, 1993

  • Proceedings of IEEE 11th Symposium on Computer Arithmetic

    Publication Year: 1993
  • Comparing several GCD algorithms

    Publication Year: 1993, Page(s):180 - 185
    Cited by:  Papers (4)

    The execution times of several algorithms for computing the GCD of arbitrary precision integers are compared. These algorithms are the known ones (Euclidean, binary, plus-minus), and the improved variants of these for multidigit computation (Lehmer and similar), as well as new algorithms introduced by the author: an improved Lehmer algorithm using two digits in partial cosequence computation, and...

  • Design of a fast validated dot product operation

    Publication Year: 1993, Page(s):62 - 69
    Cited by:  Papers (2)

    A double precision dot product operation is designed in which the final rounded result is validated by raising exception flags if either the result incurs catastrophic cancellation or the result is not accurate to one unit in the last place (ulp). The design guarantees one ulp accuracy in the absence of catastrophic cancellation. The user can thus obtain validated results at marginal extra cost wi...

  • High-radix modular multiplication for cryptosystems

    Publication Year: 1993, Page(s):277 - 283
    Cited by:  Papers (27)  |  Patents (13)

    Two algorithms for modular multiplication with very large moduli are analyzed specifically for their applicability when a high radix is used for the multiplier. Both algorithms perform modulo reductions interleaved with the addition of partial products; one algorithm uses the standard residue system, whereas the other utilizes a nonstandard system using reductions modulo a power of the base. T...

  • Fast evaluation of polynomials and inverses of polynomials

    Publication Year: 1993, Page(s):186 - 192
    Cited by:  Papers (3)

    The parallel and online (i.e., digit serial, most significant digit first) evaluation of polynomials and inverses of polynomials is dealt with. New algorithms and architectures are proposed for such evaluations. A 3-D implementation model is presented.

  • Multi-parallel convolvers

    Publication Year: 1993, Page(s):70 - 77
    Cited by:  Papers (1)

    A scheme for a convolver design, called a multiparallel convolver, that is based on concurrent processing of p adjacent samples that are input simultaneously to the p-parallel convolver is presented. The scheme uses p units, each of which receives the input samples and produces one convolution every p samples; these are called p-phase subconvolvers. The detailed design of the p-phase subconvolvers...

  • Integer mapping architectures for the polynomial ring engine

    Publication Year: 1993, Page(s):44 - 51

    A finite polynomial ring structure for mapping inner product computations to parallel independent ring computations over 3-b moduli has been introduced by N.M. Wigley et al. (1992). The main algorithmic computation architecture can be implemented using well-established systolic array mapping principles, and a project to construct a Polynomial Ring Engine (PRE) is underway to exploit the VLSI imple...

  • Faster numerical algorithms via exception handling

    Publication Year: 1993, Page(s):234 - 241
    Cited by:  Papers (3)

    An attractive paradigm for building fast numerical algorithms is the following: (1) try a fast but occasionally unstable algorithm, (2) test the accuracy of the computed answer, and (3) recompute the answer slowly and accurately in the unlikely event it is necessary. This is especially attractive on parallel machines where the fastest algorithms may be less stable than the best serial algorithms. ...

  • An accurate LNS arithmetic unit using interleaved memory function interpolator

    Publication Year: 1993, Page(s):2 - 9
    Cited by:  Papers (8)  |  Patents (18)

    A logarithmic number system (LNS) arithmetic unit using a new method for polynomial interpolation in hardware is described. The use of an interleaved memory reduces storage requirements by allowing each stored function value to be used in interpolation across several segments. This strategy always uses fewer words of memory than an optimized polynomial with stored polynomial coefficients. Many acc...

  • Algorithms and multi-valued circuits for the multioperand addition in the binary stored-carry number system

    Publication Year: 1993, Page(s):194 - 201
    Cited by:  Papers (8)

    Algorithms for the sum of two (three and four) digits in the binary stored-carry number system, using the smallest set of values for the positional sum, are presented. The corresponding adders, which use multivalued current-mode circuits, are also presented. The implementation of multioperand additions using these adders is compared with the usual binary implementation.

  • New algorithms and VLSI architectures for SRT division and square root

    Publication Year: 1993, Page(s):80 - 86
    Cited by:  Papers (24)  |  Patents (2)

    Radix two algorithms for SRT division and square-rooting are developed. For these schemes, the result digits and the residuals are computed concurrently and the computations in adjacent rows are overlapped. Consequently, their performance should exceed that of the radix 2 SRT methods. VLSI array architectures for implementing the new division and square-rooting methods are also presented.

  • Fast implementations of RSA cryptography

    Publication Year: 1993, Page(s):252 - 259
    Cited by:  Papers (80)  |  Patents (20)

    The authors detail and analyze the critical techniques that may be combined in the design of fast hardware for RSA cryptography: Chinese remainders, star chains, Hensel's odd division (also known as Montgomery modular reduction), carry-save representation, quotient pipelining, and asynchronous carry completion adders. A fully operational PAM (programmable active memory) implementation of RSA that ...

  • Complex SLI arithmetic: Representation, algorithms and analysis

    Publication Year: 1993, Page(s):18 - 25
    Cited by:  Papers (5)

    The extension of the SLI (symmetric level index) system to complex numbers and arithmetic is discussed. The natural form for representation of complex quantities in SLI is in the modulus-argument form, and this can be sensibly packed into a single 64-b word for the equivalent of the 32-b real SLI representation. The arithmetic algorithms prove to be very slightly more complicated than for real SLI...

  • Estimating the power consumption of CMOS adders

    Publication Year: 1993, Page(s):210 - 216
    Cited by:  Papers (28)

    Six types of adders are examined in an attempt to model their power dissipation. It is shown that the use of a relatively simple model provides results that are qualitatively accurate, when compared to more sophisticated models and to physical implementations of the circuits. The main discrepancy between the simple model and the physical measurements seems to be the assumption that all gates will ...

  • Measuring the accuracy of ROM reciprocal tables

    Publication Year: 1993, Page(s):95 - 102
    Cited by:  Papers (3)  |  Patents (4)

    It is proved that a conventional ROM reciprocal table construction algorithm generates tables that minimize the relative error. The worst case relative errors realized for such optimally computed k-bits-in, m-bits-out ROM reciprocal tables are then determined for all table sizes 3 ⩽ k, m ⩽ 12. It is then proved that the table construction algorithm always generates a k-bits-in, k-bits-out ...

  • Floating point Cordic

    Publication Year: 1993, Page(s):130 - 137
    Cited by:  Papers (16)  |  Patents (1)

    A full-precision floating-point Cordic algorithm, suitable for the implementation of a word-serial Cordic architecture, is presented. The extension to existing block floating-point Cordic algorithms is in a floating-point representation for the angle. The angle is represented as a combination of exponent, microrotation bits, and two bits to indicate prerotations over π/2 and π radians. Repres...

  • The design of a 64-bit integer multiplier/divider unit

    Publication Year: 1993, Page(s):171 - 178
    Cited by:  Papers (4)  |  Patents (27)

    The highlights of the design of an integer multiplier/divider unit for a 64-b processor are presented. The final design is the result of a compromise between performance, complexity, and transistor count. It is optimized for two specific operations with the same hardware being shared by the remaining operations. Thus, for example, the multiplier can be configured for the execution of several diffe...

  • n × n carry-save multipliers without final addition

    Publication Year: 1993, Page(s):54 - 61
    Cited by:  Papers (4)  |  Patents (4)

    Carry-save multipliers require an adder at the last step to convert the carry-sum representation of the most significant half of the result into an irredundant form. A multiplication scheme whereby this conversion is performed with a circuit operating in parallel with the carry-save array is presented. The resulting implementation, when a radix-2 adder array is used, produces a result on 2n bits ...

  • A modular multiplication algorithm with triangle additions

    Publication Year: 1993, Page(s):272 - 276
    Cited by:  Papers (6)  |  Patents (1)

    An algorithm for multiple-precision modular multiplication is proposed. In the algorithm, the upper half triangle of the whole partial products is first added up, and then the residue of the sum is calculated. Next, the sum of the lower half triangle of the whole partial products is added to the residue, and then the residue of the total amount is calculated. An efficient procedure for residue cal...

  • Adaptive beamforming using RNS arithmetic

    Publication Year: 1993, Page(s):36 - 43
    Cited by:  Papers (3)  |  Patents (2)

    The adaptive beamforming problem is solved using an algorithm-architecture-arithmetic combination that can be used for a small platform such as are found on aircraft or sonobuoys. The arithmetic used is the RNS system implemented on an array of processors that can be reassigned as the algorithm proceeds. The underlying algorithm is a modified Gaussian elimination. The (non-RNS) division operations...

  • Efficient multiprecision floating point multiplication with optimal directional rounding

    Publication Year: 1993, Page(s):228 - 233
    Cited by:  Papers (4)  |  Patents (2)

    An algorithm is described for multiplying multiprecision floating-point numbers. The algorithm can produce either the smallest floating-point number greater than or equal to the true product, or the greatest floating-point number smaller than or equal to the true product. Software implementations of multiprecision floating-point multiplication can reduce the computation time by a factor of two if ...

  • A lazy exact arithmetic

    Publication Year: 1993, Page(s):242 - 249
    Cited by:  Papers (2)

    Systems based on exact arithmetic are very slow. In practical situations, very few computations need be performed exactly as approximating the results is very often sufficient. Unfortunately, it is impossible to know at the time when the computation is called for whether an exact evaluation will be necessary or not. The arithmetic library presented here achieves laziness by postponing any exact co...

  • An underflow-induced graphics failure solved by SLI arithmetic

    Publication Year: 1993, Page(s):10 - 17
    Cited by:  Papers (1)

    Floating-point underflow is often regarded as either harmless or as an indication that the computational algorithm is in need of scaling. A counterexample to this view is given of a function for which contour plotting is difficult due to floating-point underflow. The function arose as an asymptotic solution to a model problem in turbulent combustion in which two chemical species (fuel and oxidizer...

  • On digit-recurrence division implementations for field programmable gate arrays

    Publication Year: 1993, Page(s):202 - 209
    Cited by:  Papers (10)

    The flexibility of field programmable gate arrays (FPGAs) can provide arithmetic-intensive programs with the benefits of custom hardware but without the high cost of custom silicon implementations. Efficient mappings are key to fast arithmetic implementations on FPGAs. A process for developing such mappings with lookup table based FPGAs is explored. The development process is illustrated with SRT ...

  • Division with speculation of quotient digits

    Publication Year: 1993, Page(s):87 - 94
    Cited by:  Papers (5)

    The speed of SRT-type dividers is mainly determined by the complexity of the quotient-digit selection, so that implementations are limited to low-radix stages. A scheme is presented in which the quotient-digit is speculated and, when this speculation is incorrect, a rollback or a partial advance is performed. This results in a division operation with a shorter cycle time and a variable number of c...
