[1990] Proceedings of the International Conference on Application Specific Array Processors

5-7 Sept. 1990

  • Calculus of space-optimal mappings of systolic algorithms on processor arrays

    Publication Year: 1990, Page(s):4 - 18
    Cited by:  Papers (13)

    The authors present a method for the mapping of systolic algorithms that use the minimal number of processors. This method is based on geometrical interpretations on convex polyhedra in $\mathbb{Z}^n$. The authors present a recurrence equation model defining the target problems for systolic program derivation. Some geometrical tools on convex polyhedra in $\mathbb{Z}^n$ are given. They are first used to model...
  • A processor-time minimal systolic array for transitive closure

    Publication Year: 1990, Page(s):19 - 30
    Cited by:  Papers (7)

    A directed acyclic graph (DAG) model of algorithms is used. For a given DAG the authors focus on processor-time minimal multiprocessor schedules: time minimal multiprocessor schedules that use as few processors as possible. The Kung, Lo and Lewis (KLL) algorithm (S.-Y. Kung et al., 1987) for computing the transitive closure of a relation over a set of n elements requires at least 5n-4 steps. Their...
  • Systolic array implementation of nested loop programs

    Publication Year: 1990, Page(s):31 - 42
    Cited by:  Papers (19)

    The authors consider a formal and systematic method to convert a class of nested loop programs to single assignment codes and, when possible, to regular algorithms (RAs) for systolic array implementation. The authors concentrate on the analysis of certain imperative nested loop programs in view of the ultimate objective, which is the (semi)-automatic design of systolic arrays from such initial beh...
  • The bit-serial systolic back-projection engine (BSSBPE)

    Publication Year: 1990, Page(s):43 - 54
    Cited by:  Papers (1)

    The author presents a machine designed with a two-phase approach. First, the selection of an efficient algorithm, based on the quality of the final image and on the computational efficiency, is undertaken. Second, the algorithm is realized in hardware which incorporates efficient array processing structures, with the aim of creating regular repeated structures. The design is based on the S.Y. Kung...
  • A database machine based on surrogate files

    Publication Year: 1990, Page(s):55 - 66
    Cited by:  Papers (1)

    Concatenated code word (CCW) surrogate files are useful as indexes for very large knowledge bases to support logic programming inference mechanisms because of their small size and simple maintenance requirements. A parallel back-end database machine is proposed to speed up relation operations based on the CCW surrogate files. The basic idea of the machine is to reduce the amount of fact data to be...
  • Systolic architectures for decoding Reed-Solomon codes

    Publication Year: 1990, Page(s):67 - 77
    Cited by:  Papers (1)  |  Patents (3)

    A systolic implementation of a Reed-Solomon decoder is presented which with minor modification is suitable for BCH and Goppa codes. The various operations involved in decoding such codes were analyzed and the results are described. Systolic array architectures are derived for the various steps including the syndrome calculation, key equation solution and error evaluation. Since the throughput of t...
  • Mapping high-dimension wavefront computations to silicon

    Publication Year: 1990, Page(s):78 - 89
    Cited by:  Papers (1)

    The authors present a new template-matching algorithm with good recognition performance. However, this new algorithm exhibits a complex, four-dimensional, wavefront architecture. Thus, for VLSI implementation, reduced architectures with fewer connections and processors need to be derived. For this purpose, the authors develop a systematic reduction methodology to manually map wavefront computation...
  • Systolic architecture for 2-D rank order filtering

    Publication Year: 1990, Page(s):90 - 99
    Cited by:  Papers (14)

    The proposed systolic design for 2-D rank order filtering has a wide variety of applications in image processing. It derives its architecture mainly from a systolic design for 1-D rank order filtering proposed previously. The adopted systolic design, called the sample oriented rank order filter design, takes advantage of the evaluated rank values in the current window for the evaluation of the ran...
  • Scheduling affine parameterized recurrences by means of

    Publication Year: 1990, Page(s):100 - 110
    Cited by:  Papers (11)

    The authors present new scheduling techniques for systems of affine recurrence equations. They show that it is possible to extend earlier results on affine scheduling to the case when each variable of the system is scheduled independently of the others by an affine timing-function. This new technique makes it possible to analyze systems of recurrence equations with variables in different index spa...
  • The Logic Description Generator

    Publication Year: 1990, Page(s):111 - 120
    Cited by:  Papers (5)  |  Patents (5)

    The authors describe the Logic Description Generator (LDG), a design tool specifically geared to aid in the implementation of systolic algorithms on reconfigurable logic arrays. It is used to specify designs for Splash, a linear array of Xilinx chips. LDG supports the notion of a logical systolic cell, which may be repetitively laid out across a chip, and whose instances may be interconnected as ...
  • Recursive algorithms for AR spectral estimation and their array realizations

    Publication Year: 1990, Page(s):121 - 132

    Autoregressive (AR) spectral estimation is widely used in various fields. However, a trade-off between performance and computational complexity is sometimes faced. Two recursive computing algorithms individually applied to the wide-sense stationary and highly nonstationary environments are presented. These algorithms have good numerical properties, high computing parallelism, and data locality. VL...
  • Analysing parametrised designs by non-standard interpretation

    Publication Year: 1990, Page(s):133 - 144
    Cited by:  Papers (4)

    The authors consider the use of a nonstandard interpretation to analyze parametrized circuit descriptions, in particular for array based architectures. Various metrics are employed to characterize the performance tradeoffs for generic designs. The objective is to facilitate the comparison of feasible design alternatives at an early stage of development. The research centers on techniques for extra...
  • Systolic VLSI compiler (SVC) for high performance vector quantisation chips

    Publication Year: 1990, Page(s):145 - 155
    Cited by:  Papers (2)

    An overview is given of a systolic VLSI compiler (SVC) tool currently under development for the automated design of high performance digital signal processing (DSP) chips. Attention is focused on the design of systolic vector quantization chips for use in both speech and image coding systems. The software in question consists of a cell library, silicon assemblers, simulators, test pattern generato...
  • Extensions to linear mapping for regular arrays with complex processing elements

    Publication Year: 1990, Page(s):156 - 167
    Cited by:  Papers (12)

    The optimal architectural design of the processing elements (PEs) for an application specific regular array (RA) is nontrivial if the application has a complex operation set. The authors present an approach that extends the conventional, linear time-space transformation for such cases. In application-specific-integrated-circuit (ASIC) architectures, one has the freedom to fine-tune all aspects of ...
  • Design of run-time fault-tolerant arrays of self-checking processing elements

    Publication Year: 1990, Page(s):168 - 179
    Cited by:  Papers (3)

    A design method for array architectures from regular dependence graphs (DGs) is extended for the design of reconfigurable arrays. The original design method is combined to a single step mapping of the DG with arbitrary dimension n onto the final signal flow graph (SFG) with dimension k. This eliminates the need for recursive application of a mapping which reduces the dimension of the DG by one, an...
  • GRAPE: a special-purpose computer for N-body problems

    Publication Year: 1990, Page(s):180 - 189
    Cited by:  Papers (2)

    GRAPE (GRAvity PipE) is a special-purpose computer designed to accelerate the numerical integration of the astrophysical N-body problem. The prototype hardware, GRAPE-1, is designed as the backend processor that calculates the gravitational interaction between particles. All other calculations are performed on the host computer connected to GRAPE-1. For large-N calculations ($N \gtrsim$...
  • Building blocks for a new generation of application specific computing systems

    Publication Year: 1990, Page(s):190 - 201
    Cited by:  Papers (5)  |  Patents (3)

    The iWarp processor, which integrates both communication and computation functions on a single VLSI component, is described. The iWarp component and subsystems including it are powerful building blocks for constructing a new generation of application-specific computing systems. These special-purpose systems can achieve very high performance, while maintaining a high degree of flexibility to addres...
  • Reconfigurable vector register windows for fast matrix computation on the orthogonal multiprocessor

    Publication Year: 1990, Page(s):202 - 213
    Cited by:  Papers (7)

    The authors present the concept of vector register windows (VRWs) geared towards large scale matrix computation and image processing applications. The VRWs consist of multiple windows for vector registers providing parallel access and manipulation of large matrix data in the orthogonal multiprocessor (OMP). The number of windows and the number of registers in a window are dynamically reconfigurabl...
  • Massively parallel architecture: application to neural net emulation and image reconstruction

    Publication Year: 1990, Page(s):214 - 225
    Cited by:  Papers (1)

    The authors present two applications of a specific cellular architecture: emulation of the recall and learning for feedforward neural networks and parallel image reconstruction. This architecture is based on a bidimensional array of asynchronous processing elements, the cells, which can communicate between themselves by message transfers. Each cell includes a rotating routing part ensuring the mes...
  • A real-time software programmable processor for HDTV and stereo scope signals

    Publication Year: 1990, Page(s):226 - 234
    Cited by:  Papers (1)  |  Patents (4)

    The architecture is an expanded version of a previously reported video signal processor in which a number of parallel processor clusters can be combined in a tandem connection form or in a parallel connection form. The new video signal processor introduces programmable time-expansion and time-compression circuits to A-to-D and D-to-A converters, respectively, for coping with high speed HDTV signal...
  • Mapping algorithms onto the TUT cellular array processor

    Publication Year: 1990, Page(s):235 - 246
    Cited by:  Papers (3)  |  Patents (3)

    The Tampere University of Technology Cellular Array (TUTCA) processor array is based on a dynamically configurable logic cell array. It is intended for efficient implementation of the direct mapping dataflow principle with a self-timed, distributed control structure. The architecture of the processor, principles of mapping algorithms on it, and the compiler of the dataflow language are described. ...
  • A 3-D wafer scale architecture for early vision processing

    Publication Year: 1990, Page(s):247 - 258
    Cited by:  Papers (6)

    A massively parallel SIMD cellular computer is designed for processing early vision algorithms based on regularization theory and Markov random field (MRF) models. Algorithmic requirements and implementation issues are reviewed in detail for edge detection/surface reconstruction. The development of 3-D wafer scale integration (WSI) technologies that offer an ideal medium for implementing many earl...
  • Algorithmic mapping of neural network models onto parallel SIMD machines

    Publication Year: 1990, Page(s):259 - 271
    Cited by:  Papers (4)

    The authors consider parallel implementation of neural network computations on fine grain SIMD machines. The authors show a mapping of a neural network having n nodes and e connections onto a parallel machine having (n+e) PEs arranged in an array of $\sqrt{n+e} \times \sqrt{n+e}$ PEs such that routing for each update iteration of the recall phase can be performed in $24(\sqrt{n+e}-1)$ elemen...
  • Implementation of systolic algorithms using pipelined functional units

    Publication Year: 1990, Page(s):272 - 283
    Cited by:  Papers (1)  |  Patents (15)

    The authors present a method to implement systolic algorithms (SAs) using pipelined functional units (PFUs). This kind of unit makes it possible to improve the throughput of a processor because of the possibility of initiating a new operation before the previous one has been completed. The method permits transformation of an SA so that it can be efficiently executed using PFUs. The method is based ...
  • Array processing on finite polynomial rings

    Publication Year: 1990, Page(s):284 - 295
    Cited by:  Papers (2)

    The disadvantage of computations using finite rings is the need to compute over many different rings in order to produce useful dynamic ranges of computation. By mapping integers into polynomial rings, one can replace the different rings by the replication of the same ring with considerable computational advantages. The authors present the methodology of such a mapping strategy, and discuss the ap...
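
For readers unfamiliar with the problem named in the transitive-closure entry (pages 19 - 30), the following is a minimal sequential sketch of computing the transitive closure of a relation over n elements. It is a plain Warshall-style triple loop, not the KLL systolic algorithm and not the processor-time minimal schedule that the abstract refers to. The function name `transitive_closure` and the edge-list input format are assumptions made for illustration only.

```python
def transitive_closure(n, edges):
    """Return an n x n boolean matrix reach, where reach[i][j] is True
    when j can be reached from i through one or more edges.

    Sequential Warshall-style sketch; the name and the input format
    are hypothetical and do not come from the listed paper.
    """
    # Start from the relation itself.
    reach = [[False] * n for _ in range(n)]
    for i, j in edges:
        reach[i][j] = True

    # Allow each element k in turn as an intermediate step.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


if __name__ == "__main__":
    closure = transitive_closure(3, [(0, 1), (1, 2)])
    print(closure[0][2])
```

The three nested loops over n elements are the regular, data-independent structure that makes this computation a common target for array mappings of the kind the proceedings discuss.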