Proceedings The International Conference on Application Specific Array Processors

24-26 July 1995

  • Proceedings The International Conference on Application Specific Array Processors

    Publication Year: 1995
    Freely Available from IEEE
  • Index of Authors

    Publication Year: 1995

    Presents an index of the authors whose papers are published in the conference.
  • Revisiting the decomposition of Karp, Miller and Winograd

    Publication Year: 1995, Page(s):13 - 25
    Cited by:  Papers (1)

    This paper is devoted to the construction of multi-dimensional schedules for a system of uniform recurrence equations. We show that this problem is dual to the problem of computability of a system of uniform recurrence equations. We propose a new study of the decomposition algorithm first proposed by Karp, Miller and Winograd: we base our implementation on linear programming resolutions whose dual...
  • Implementation of parallel arithmetic in a cellular automaton

    Publication Year: 1995, Page(s):238 - 245
    Cited by:  Papers (1)

    We describe an approach to parallel computation using particle propagation and collisions in a one-dimensional cellular automaton using a particle model, a Particle Machine (PM). Such a machine has the parallelism, structural regularity, and local connectivity of systolic arrays, but is general and programmable. It contains no explicit multipliers, adders, or other fixed arithmetic operations; thes...
  • The naive execution of affine recurrence equations

    Publication Year: 1995, Page(s):1 - 12
    Cited by:  Papers (1)

    In recognition of the fundamental relation between regular arrays and systems of affine recurrence equations, the ALPHA language was developed as the basis of a computer aided design methodology for regular array architectures. ALPHA is used to initially specify algorithms at a very high algorithmic level. Regular array architectures can then be derived from the algorithmic specification using a t...
  • A solid translation engine using ray representation

    Publication Year: 1995, Page(s):157 - 165
    Cited by:  Papers (2)

    We describe an extension to the geometric domain of solid modeling to include solids defined by spatial sweeping and Minkowski sums. We develop an efficient, parallel algorithm for the translation of such solid models. An architecture and design of an array processor that implements this algorithm are presented. We discuss some applications of the new computer to solid modeling and CAD/CAM and mode...
  • Interfacing FPGA/VLSI processor arrays

    Publication Year: 1995, Page(s):230 - 237
    Cited by:  Papers (1)

    Mapping DSP algorithms to FPGA/VLSI circuits is an important issue in Application-Specific Array Processor design. Since a DSP algorithm can be abstracted as a graph where each node is a shift-invariant DG (Dependence Graph) and the edges denote the data flow, it is possible to map a DSP algorithm to a set of processor arrays with some interface circuits. The interface design depends on the projec...
  • A design tool for the specification and the simulation of array processors architectures application to image processing: the extraction of regions of interests

    Publication Year: 1995, Page(s):322 - 329

    This paper deals with a CAD tool dedicated to the design and the simulation of specific array processor architectures. These architectures are described in a specific notation which includes major characteristics of the VHDL syntax. This language provides a very concise and legible means to specify array processors. A preprocessor generates full standard VHDL code describing the behavior of the ...
  • The VLSI design and implementation of the array processors of a multilayer vision system architecture

    Publication Year: 1995, Page(s):125 - 128

    This paper describes the VLSI design and simulation of the lower layer processors of the KYDON vision system. KYDON is a completely autonomous, hierarchical, multilayered image understanding system. The VLSI design of the individual components as well as the timing simulation results of the processor array are presented. The system runs at 50 MHz and promises a high processing rate of 300 im...
  • Data alignments for modular time-space mappings of BLAS-like algorithms

    Publication Year: 1995, Page(s):34 - 41
    Cited by:  Papers (1)

    Modular time-space transformations have been recently proposed for algorithm mappings that cannot be described by affine functions. This paper extends affine data alignments to a new class of data alignments, called expanded modular data alignments (EMDAs), for algorithms that are mapped by modular time-space transformations. An EMDA is a set of modular data alignments (MDAs) which are described b...
  • Synthesis of multirate VLSI arrays

    Publication Year: 1995, Page(s):310 - 321
    Cited by:  Papers (3)

    Many applications in signal and image processing can be implemented on regular VLSI architectures such as systolic arrays. Multirate arrays, or MRAs, are an extension of systolic arrays where different data streams propagate with different clocks. It is known that they can be modelled as systems of uniform recurrence equations over sparse polyhedral domains. Using well-known linear index transforma...
  • Synthesis of VLSI architectures for two-dimensional discrete wavelet transforms

    Publication Year: 1995, Page(s):174 - 181
    Cited by:  Papers (15)

    We propose VLSI architectures with parallel I/O capability to compute the Two-Dimensional Discrete Wavelet Transform. Our design can handle large images arriving at high frame rates. A video codec based on our architecture can support multiple channels in parallel and can provide the needed performance for network based video applications. Our architecture with parallel I/O offers a solution for t...
  • Recomputing by operand exchanging: a time-redundancy approach for fault-tolerant neural networks

    Publication Year: 1995, Page(s):54 - 64
    Cited by:  Papers (1)

    The use of neural networks in mission-critical applications requires concurrent error detection and correction at the architectural level to provide high consistency and reliability of the system's outputs. Time redundancy allows for fault tolerance in digital realizations with low circuit complexity increase. In this paper, we propose the use of REcomputation with eXchanged Operands, an approach based on ...
  • Systolic filter for fast DNA similarity search

    Publication Year: 1995, Page(s):145 - 156
    Cited by:  Papers (1)  |  Patents (26)

    This paper presents a systolic filter for speeding up the scan of DNA databases. The filter acts as a co-processor which performs the more intensive computations occurring during the process. Our validation, based on an FPGA prototype board tightly connected to a workstation, has shown that the filter may boost the performance of the machine by a factor ranging from 50 to 400 over current workstati...
  • Parallel sequence comparison and alignment

    Publication Year: 1995, Page(s):137 - 140
    Cited by:  Papers (12)

    Sequence comparison, a vital research tool in computational biology, is based on a simple O(n^2) algorithm that easily maps to a linear array of processors. This paper reviews and compares high-performance sequence analysis on general-purpose supercomputers and single-purpose, reconfigurable, and programmable co-processors. The difficulty of comparing hardware from published performance...
  • Techniques for yield enhancement of VLSI adders

    Publication Year: 1995, Page(s):222 - 229
    Cited by:  Papers (7)

    For VLSI application-specific arrays and other regular VLSI circuits, two techniques are available for yield enhancement, namely defect-tolerance and layout modifications. In this paper, we compare these two yield enhancement approaches by using adders as an example. Our yield projections indicate that the layout modification technique is more efficient when the defect density is low, while reconf...
  • Minimizing synchronization overhead in statically scheduled multiprocessor systems

    Publication Year: 1995, Page(s):298 - 309
    Cited by:  Papers (6)

    Synchronization overhead can significantly degrade performance in embedded multiprocessor systems. This paper develops techniques to determine a minimal set of processor synchronizations that are essential for correct execution in an embedded multiprocessor implementation. Our study is based in the context of self-timed execution of iterative dataflow programs; dataflow programming in this form ha...
  • The MGAP's programming environment and the *C++ language

    Publication Year: 1995, Page(s):121 - 124
    Cited by:  Papers (2)

    The MGAP is a special-purpose, workstation co-processor board in which the computing elements are fine grain processors implemented as custom ASICs. In this paper we present the language *C++, used for programming on the MGAP. Using the class concept of C++ we create special parallel data-types like bit, digit, word and array and overload operators to manipulate the parallel data required by the ...
  • A processor-time-minimal schedule for 3D rectilinear mesh algorithms

    Publication Year: 1995, Page(s):26 - 33
    Cited by:  Papers (5)

    The paper, using a directed acyclic graph (dag) model of algorithms, investigates precedence constrained multiprocessor schedules for the n_x × n_y × n_z directed rectilinear mesh. Its completion requires at least n_x + n_y + n_z - 2 multiprocessor steps. Time-minimal multiprocessor schedules that use as few processors as possible are ...
  • Digit on-line large radix CORDIC rotator

    Publication Year: 1995, Page(s):246 - 257
    Cited by:  Papers (8)

    Many applications require the evaluation of rotations at high speeds. However, there is a trade-off between the chip area and the latency. In this paper we develop a digit on-line pipelined array architecture based on the radix-4 CORDIC algorithm in rotation mode. The radix-4 CORDIC algorithm halves the number of microrotations with respect to the traditional radix-2 algorithm with the drawback of a ...
  • Input buffering requirements of a systolic array for the inverse discrete wavelet transform

    Publication Year: 1995, Page(s):166 - 173
    Cited by:  Papers (2)

    The Discrete Wavelet Transform (DWT) is a signal processing technique popularised by its results in data compression. Considerable work has been done in designing novel architectures to perform the DWT, including a systolic architecture designed by the authors, but little attention has been given to the inverse DWT which is needed in applications such as data compression for signal reconstruction....
  • Precise tiling for uniform loop nests

    Publication Year: 1995, Page(s):330 - 337
    Cited by:  Papers (11)

    The subject of this article is a hyperplane partitioning problem applied to perfect loop nests. This work is aimed at increasing the computation granularity to reduce the overhead due to communication. This study is different from previous work as it takes redundant communication into account. We propose an algorithm giving the optimal solution and various examples to show the validity of this rep...
  • A parallelizing compilation method for the map-oriented machine

    Publication Year: 1995, Page(s):129 - 132

    The paper introduces a novel parallelizing compilation method for the MoM. The MoM (Map-oriented Machine) is an Xputer architecture featuring multiple data sequencers and “soft ALUs”. The compiler accepts C source, which is restructured and partitioned into structural and sequential code providing parallelism at expression and statement level.
  • Time-optimal ranking algorithms on sorted matrices

    Publication Year: 1995, Page(s):42 - 53

    Answering rank queries is a recurring operation in various application domains including geographic data processing, information retrieval, database design, information management, and medical image processing. Many of these applications involve data stored in a matrix satisfying a number of properties. One property that occurs time and again in applications specifies that the rows and the columns...
  • Parallel implementation of the full search block matching algorithm for motion estimation

    Publication Year: 1995, Page(s):182 - 192
    Cited by:  Papers (16)

    Motion estimation is a key technique in most algorithms for video compression and particularly in the MPEG and H.261 standards. The most frequently used technique is based on a Full Search Block Matching Algorithm which is highly computing intensive and requires the use of special purpose architectures to obtain real-time performance. We propose an approach to the parallel implementation of the Fu...