2011 IEEE 20th Symposium on Computer Arithmetic

25-27 July 2011

  • [Front cover]

    Publication Year: 2011, Page(s): C1
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2011, Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2011, Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2011, Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2011, Page(s): v - viii
    Freely Available from IEEE
  • Foreword

    Publication Year: 2011, Page(s): ix
    Freely Available from IEEE
  • Dedication

    Publication Year: 2011, Page(s): x - xiv
    Freely Available from IEEE
  • Steering Committee

    Publication Year: 2011, Page(s): xv
    Freely Available from IEEE
  • Symposium Committee

    Publication Year: 2011, Page(s): xvi
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2011, Page(s): xvii
    Freely Available from IEEE
  • Additional Reviewers

    Publication Year: 2011, Page(s): xviii
    Freely Available from IEEE
  • Corporate Sponsors

    Publication Year: 2011, Page(s): xix
    Freely Available from IEEE
  • High Intelligence Computing: The New Era of High Performance Computing

    Publication Year: 2011, Page(s): 3

    This paper discusses High Performance Computing, including the introduction of the fused multiply-add dataflow and innovations in vector computing and multiprocessing. This has led to a new era in high performance that has created human intelligence in computers.
  • Short Division of Long Integers

    Publication Year: 2011, Page(s): 7 - 14

    We consider the problem of short division - i.e., approximate quotient - of multiple-precision integers. We present ready-to-implement algorithms that yield an approximation of the quotient, with tight and rigorous error bounds. We exhibit speedups of up to 30% with respect to GMP division with remainder, and up to 10% with respect to GMP short division, with room for further improvements. This wo...
  • High Degree Toom'n'Half for Balanced and Unbalanced Multiplication

    Publication Year: 2011, Page(s): 15 - 22
    Cited by: Papers (1)

    Some hints and tricks to automatically obtain high degree Toom-Cook implementations, i.e. functions for integer or polynomial multiplication with a reduced complexity. The described method generates quite an efficient sequence of operations and the memory footprint is kept low by using a new strategy: mixing evaluation, interpolation and recomposition phases. It is possible to automatise the whole...
  • Augmented Precision Square Roots and 2-D Norms, and Discussion on Correctly Rounding sqrt(x^2+y^2)

    Publication Year: 2011, Page(s): 23 - 30

    Define an "augmented precision" algorithm as an algorithm that returns, in precision-p floating-point arithmetic, its result as the unevaluated sum of two floating-point numbers, with a relative error of the order of 2^(-2p). Assuming an FMA instruction is available, we perform a tight error analysis of an augmented precision algorithm for the square root, and introduce two slightly differ...
  • Towards a Quaternion Complex Logarithmic Number System

    Publication Year: 2011, Page(s): 33 - 42
    Cited by: Papers (3)

    The well-known generalization of real to complex arithmetic (two reals) extends further to more obscure quaternion arithmetic (four reals), which has applications in signal processing, aerospace, graphics and virtual reality. Quaternion multiplication implements 3D rotation, but is expensive (usually 16 floating-point multiplications and 12 additions). This paper proposes an alternative quaternion...
  • ROM-less LNS

    Publication Year: 2011, Page(s): 43 - 51
    Cited by: Papers (13)

    The logarithmic number system has been proposed as an alternative to floating-point arithmetic. Multiplication, division and square-root operations are accomplished with fixed-point methods, but addition and subtraction are considerably more challenging. Recent work has demonstrated that these operations too can be done with similar speed and accuracy to their FP equivalents, but the necessary cir...
  • Composite Iterative Algorithm and Architecture for q-th Root Calculation

    Publication Year: 2011, Page(s): 52 - 61
    Cited by: Papers (2)

    An algorithm for the q-th root extraction, q being any integer, is presented in this paper. The algorithm is based on an optimized implementation of X^(1/q) = 2^((1/q) log2(X)) by a sequence of parallel and/or overlapped operations: (1) reciprocal, (2) digit-recurrence logarithm, (3) left-to-right carry-free multiplication and (4) on-line exponential. A detailed error analysis and t...
  • On the Fixed-Point Accuracy Analysis and Optimization of FFT Units with CORDIC Multipliers

    Publication Year: 2011, Page(s): 62 - 69
    Cited by: Papers (7)

    Fixed-point Fast Fourier Transform (FFT) units are widely used in digital communication systems. The twiddle multipliers required for realizing large FFTs are typically implemented with the Coordinate Rotation Digital Computer (CORDIC) algorithm to restrict memory requirements. Recent approaches aiming to optimize the bit-widths of FFT units while satisfying a given maximum bound on Mean-Square-Er...
  • Self Checking in Current Floating-Point Units

    Publication Year: 2011, Page(s): 73 - 76
    Cited by: Papers (8)

    High performance microprocessors are protected against transient and early end of life failures using a variety of error detection and fault isolation technologies. Execution units can be protected with duplication, parity prediction, or residue checking. Residue checking has an advantage due to its small size. A modulus is selected based on the radix of the numbers being checked. In a decimal flo...
  • How to Square Floats Accurately and Efficiently on the ST231 Integer Processor

    Publication Year: 2011, Page(s): 77 - 81

    We consider the problem of computing IEEE floating-point squares by means of integer arithmetic. We show how to exploit the specific properties of squaring in order to design and implement algorithms that have much lower latency than those for general multiplication, while still guaranteeing correct rounding. Our algorithms are parameterized by the floating-point format, aim at high instruction-le...
  • A 1.5 Ghz VLIW DSP CPU with Integrated Floating Point and Fixed Point Instructions in 40 nm CMOS

    Publication Year: 2011, Page(s): 82 - 86
    Cited by: Papers (3)

    A next generation VLIW DSP Central Processing Unit (CPU) which has an integrated fixed point and floating point Instruction Set Architecture (ISA) is presented. It is designed to meet a 1.5 GHz core clock frequency in a 40 nm process with aggressive area and power goals. In this paper, the benchmarking process and benefits of newly defined instructions such as complex matrix multiply are explained. ...
  • The POWER7 Binary Floating-Point Unit

    Publication Year: 2011, Page(s): 87 - 91
    Cited by: Papers (2) | Patents (3)

    The binary Floating-Point Unit (FPU) of the POWER7 processor is a 5.5 cycle Fused Multiply-Add (FMA) design, fully compliant with the IEEE 754-2008 standard. Unlike previous PowerPC designs, the POWER7 FPU merges the scalar and vector FPUs into a single unit executing three floating-point instruction sets: the single and double precision scalar set, the single precision VMX vector set, and the new...
  • Accelerating Computations on FPGA Carry Chains by Operand Compaction

    Publication Year: 2011, Page(s): 95 - 102
    Cited by: Papers (1)

    This work describes the carry-compact addition (CCA), a novel addition scheme that allows the acceleration of carry-chain computations on contemporary FPGA devices. While based on concepts known from the carry-look ahead addition and from parallel prefix adders, their adaptation by the CCA takes the context of an FPGA as implementation environment into account. These typically provide carry-chain ...
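
Illustrative note on the quaternion entry above: its abstract cites a cost of 16 floating-point multiplications and 12 additions for quaternion multiplication. The sketch below shows where those counts come from in the conventional Hamilton product and how the product implements a 3D rotation. It is not the alternative number system the paper proposes; the function names `hamilton_product`, `conjugate` and `rotate` and the (w, x, y, z) component order are assumptions chosen for illustration.

```python
import math
from typing import Tuple

# Assumed layout for this sketch: (w, x, y, z) with w the scalar part.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Conventional quaternion product: 16 multiplications, 12 additions/subtractions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    # Each output component uses 4 multiplications and 3 additions/subtractions.
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def conjugate(q: Quaternion) -> Quaternion:
    """Negate the vector part; for a unit quaternion this is also its inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate(q: Quaternion, v: Tuple[float, float, float]) -> Tuple[float, float, float]:
    """Rotate vector v by unit quaternion q via q * (0, v) * conjugate(q)."""
    p = (0.0, v[0], v[1], v[2])
    r = hamilton_product(hamilton_product(q, p), conjugate(q))
    return (r[1], r[2], r[3])


if __name__ == "__main__":
    # Quarter turn about the z axis: half-angle is pi/4.
    half = math.pi / 4
    q = (math.cos(half), 0.0, 0.0, math.sin(half))
    print(rotate(q, (1.0, 0.0, 0.0)))  # approximately (0, 1, 0), up to rounding
```

A rotation computed this way takes two full products, which is the cost that motivates looking for cheaper representations.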