Proceedings The International Conference on Application Specific Array Processors

24-26 July 1995

  • Proceedings The International Conference on Application Specific Array Processors

    Publication Year: 1995
    Freely Available from IEEE
  • Index of Authors

    Publication Year: 1995

    Presents an index of the authors whose papers are published in the conference.
  • Implementation of parallel arithmetic in a cellular automaton

    Publication Year: 1995, Page(s):238 - 245
    Cited by:  Papers (1)

    We describe an approach to parallel computation using particle propagation and collisions in a one-dimensional cellular automaton, using a particle model, a Particle Machine (PM). Such a machine has the parallelism, structural regularity, and local connectivity of systolic arrays, but is general and programmable. It contains no explicit multipliers, adders, or other fixed arithmetic operations; thes...
  • Interfacing FPGA/VLSI processor arrays

    Publication Year: 1995, Page(s):230 - 237
    Cited by:  Papers (1)

    Mapping DSP algorithms to FPGA/VLSI circuits is an important issue in Application-Specific Array Processor design. Since a DSP algorithm can be abstracted as a graph where each node is a shift-invariant DG (Dependence Graph) and the edges denote the data flow, it is possible to map a DSP algorithm to a set of processor arrays with some interface circuits. The interface design depends on the projec...
  • A design tool for the specification and the simulation of array processors architectures application to image processing: the extraction of regions of interests

    Publication Year: 1995, Page(s):322 - 329

    This paper deals with a CAD tool dedicated to the design and simulation of specific array processor architectures. These architectures are described in a specific notation which includes major characteristics of the VHDL syntax. This language provides a very concise and legible means to specify array processors. A preprocessor generates full standard VHDL code describing the behavior of the ...
  • Techniques for yield enhancement of VLSI adders

    Publication Year: 1995, Page(s):222 - 229
    Cited by:  Papers (7)

    For VLSI application-specific arrays and other regular VLSI circuits, two techniques are available for yield enhancement, namely defect-tolerance and layout modifications. In this paper, we compare these two yield enhancement approaches by using adders as an example. Our yield projections indicate that the layout modification technique is more efficient when the defect density is low, while reconf...
  • Minimizing synchronization overhead in statically scheduled multiprocessor systems

    Publication Year: 1995, Page(s):298 - 309
    Cited by:  Papers (6)

    Synchronization overhead can significantly degrade performance in embedded multiprocessor systems. This paper develops techniques to determine a minimal set of processor synchronizations that are essential for correct execution in an embedded multiprocessor implementation. Our study is based in the context of self-timed execution of iterative dataflow programs; dataflow programming in this form ha...
  • A solid translation engine using ray representation

    Publication Year: 1995, Page(s):157 - 165
    Cited by:  Papers (2)

    We describe an extension to the geometric domain of solid modeling to include solids defined by spatial sweeping and Minkowski sums. We develop an efficient, parallel algorithm for the translation of such solid models. An architecture and design of an array processor that implements this algorithm are presented. We discuss some applications of the new computer to solid modeling and CAD/CAM and mode...
  • Bit level block matching systolic arrays

    Publication Year: 1995, Page(s):214 - 221
    Cited by:  Papers (1)  |  Patents (3)

    We present two bit-level systolic arrays for block matching which are designed by using a well-known methodology. Hardware complexities and speeds of both bit-level designs and conventional word-level arrays are compared by using synthesis tools. We pay special attention to a class of issues which were somewhat overlooked by previous publications, including power consumption due to high frequency,...
  • Horizontal microcode compaction for programmable systolic accelerators

    Publication Year: 1995, Page(s):85 - 92

    This paper addresses the problem of compacting microcode for complex systolic systems used as accelerators for traditional computers. For this sort of system, the purpose is to have a low-level programming paradigm that is simple enough for those users that are not completely aware of hardware details. The microcode should be issued from a high-level language application developed on the host proc...
  • Multilayer cellular algorithm for complex number multiplication

    Publication Year: 1995, Page(s):290 - 297

    A new multilayer cellular algorithm for complex number multiplication is presented. An upper estimate of the time complexity is obtained. The design is based on an original model of distributed computation called the Parallel Substitution Algorithm.
  • Systolic filter for fast DNA similarity search

    Publication Year: 1995, Page(s):145 - 156
    Cited by:  Papers (1)  |  Patents (30)

    This paper presents a systolic filter for speeding up the scan of DNA databases. The filter acts as a co-processor which performs the more intensive computations occurring during the process. Our validation, based on an FPGA prototype board tightly connected to a workstation, has shown that the filter may boost the performance of the machine by a factor ranging from 50 to 400 over current workstati...
  • Motion estimation algorithms on fine grain array processors

    Publication Year: 1995, Page(s):204 - 213
    Cited by:  Papers (1)

    Motion estimation plays a key role in video coding (e.g., video telephone, MPEG, HDTV). Among the previous motion estimation algorithms, full-search block matching algorithms (BMA) are preferred because of their simplicity and lower control overhead when those algorithms are implemented in VLSI array processors. Previous full-search BMAs have considered one block matching at a time. There exist, ...
  • A scalable halftoning coprocessor architecture

    Publication Year: 1995, Page(s):76 - 84

    Exact-angle superscreen dithering requires large dither tiles. Since storing precomputed screen elements for each intensity level would require too much memory, dithering must be executed on the fly at halftoning time. For this purpose a dithering coprocessor is presented which generates halftoned images at high speed. The proposed hardware architecture is based on a pipelined and scalable design ...
  • CORDIC architectures with parallel compensation of the scale factor

    Publication Year: 1995, Page(s):258 - 269
    Cited by:  Papers (24)

    Compensation of the scale factor imposes significant computation overhead on the CORDIC algorithm. In this paper we propose two algorithms and architectures that perform the compensation of the scale factor in parallel with the computation of the CORDIC iterations. This way it is not necessary to carry out the final multiplication or add scaling iterations in order to achieve the compe...
  • A processor for staggered interval arithmetic

    Publication Year: 1995, Page(s):104 - 112
    Cited by:  Papers (2)

    The paper presents the design of a high-speed processor which performs staggered interval arithmetic. Each staggered interval is represented as the sum of a set of floating point numbers plus an interval, which consists of two floating point endpoints. Staggered interval arithmetic allows the precision of the computation to be specified and the accuracy of the result to be determined. Efficient ar...
  • Design of a systolic coprocessor for rational addition

    Publication Year: 1995, Page(s):282 - 289

    We design a systolic coprocessor for the addition of signed normalized rational numbers. This is the most complicated rational operation: it involves GCD, exact division, multiplication and addition/subtraction. In particular, the implementations of GCD and exact division improve significantly (2 to 4 times) on previously known solutions. In contrast to the traditional approach, all operations are perf...
  • The systolic design of a block regularised parameter estimator using hierarchical signal flow graphs

    Publication Year: 1995, Page(s):141 - 144

    Hierarchical Signal Flow Graphs (HSFGs) are used to illustrate the computations and the data flow required for the block regularised parameter estimation algorithm. This algorithm protects the parameter estimation from numerical difficulties associated with insufficiently exciting data or where the behaviour of the underlying model is unknown. Hierarchical signal flow graphs (HSFGs) aid the user's ...
  • VLSI algorithms for compressed pattern search using tree based codes

    Publication Year: 1995, Page(s):133 - 136

    Data compression methods are used to reduce the redundancy in data representation in order to decrease the data storage requirements and communication costs. In order to exploit the benefits of data compression to conserve internal processor storage and computation resources, it is desirable to perform operations on compressed data without decompressing it. We present hardware algorithms and VLSI ...
  • The VLSI design and implementation of the array processors of a multilayer vision system architecture

    Publication Year: 1995, Page(s):125 - 128

    This paper describes the VLSI design and simulation of the lower layer processors of the KYDON vision system. KYDON is a completely autonomous, hierarchical, multilayered image understanding system. The VLSI design of the individual components as well as the timing simulation results of the processor array are presented. The system runs at 50 MHz and promises a high processing rate of 300 im...
  • MOVIE: a building block for the design of real time simulator of moving pictures compression algorithms

    Publication Year: 1995, Page(s):193 - 203
    Cited by:  Papers (1)

    This paper shows how a real-time simulator of moving pictures compression algorithms can be rapidly assembled using a basic building block, here called MOVIE (MOdule for Video Experimentation). The internal architecture of the MOVIE VLSI chip can be compared to a small systolic machine made of a 32-bit I/O processor, a reduced linear array of 16-bit computation processors and data video input/outp...
  • Design and implementation of a parallel image processor chip for a SIMD array processor

    Publication Year: 1995, Page(s):66 - 75
    Cited by:  Papers (5)  |  Patents (1)

    This paper presents the design and implementation of a sliding memory plane (SliM) image processor chip to build a mesh-connected SIMD architecture called a SliM array processor. The SliM image processor chip consists of 5×5 processing elements (PEs) connected by a mesh topology. A set of SliM image processor chips can form the SliM array processor. Due to the idea of sliding, that is, overl...
  • Digit on-line large radix CORDIC rotator

    Publication Year: 1995, Page(s):246 - 257
    Cited by:  Papers (8)

    Many applications require the evaluation of rotations at high speeds. However, there is a trade-off between the chip area and the latency. In this paper we develop a digit on-line pipelined array architecture based on the radix-4 CORDIC algorithm in rotation mode. The radix-4 CORDIC algorithm halves the number of microrotations with respect to the traditional radix-2 algorithm with the drawback of a ...
  • Precise tiling for uniform loop nests

    Publication Year: 1995, Page(s):330 - 337
    Cited by:  Papers (11)

    The subject of this article is a hyperplane partitioning problem applied to perfect loop nests. This work is aimed at increasing the computation granularity to reduce the overhead due to communication. This study is different from previous work as it takes redundant communication into account. We propose an algorithm giving the optimal solution and various examples to show the validity of this rep...
  • Input buffering requirements of a systolic array for the inverse discrete wavelet transform

    Publication Year: 1995, Page(s):166 - 173
    Cited by:  Papers (2)

    The Discrete Wavelet Transform (DWT) is a signal processing technique popularised by its results in data compression. Considerable work has been done in designing novel architectures to perform the DWT, including a systolic architecture designed by the authors, but little attention has been given to the inverse DWT which is needed in applications such as data compression for signal reconstruction....