Application Specific Array Processors, 1992. Proceedings of the International Conference on

Date 4-7 Aug. 1992

Displaying Results 1 - 25 of 51
  • Proceedings of the International Conference on Application Specific Array Processors (Cat. No.92TH0453-1)

    Publication Year: 1992
    Freely Available from IEEE
  • Pipelining: just another transformation

    Publication Year: 1992, Page(s):163 - 175
    Cited by:  Papers (7)

    A simple formulation of pipelining: "Pipelining with N stages is equivalent to retiming where the number of delays on all inputs or all outputs, but not both, is increased by N" is used as the basis for a convenient and efficient treatment of pipelining in design of application specific computers. Classification of pipelining according to the optimization goal (throughput and res...

  • Transformation techniques for serial array design

    Publication Year: 1992, Page(s):574 - 588
    Cited by:  Papers (1)

    This paper describes a design framework for developing application-specific serial array circuits. Starting from a description of the state-transition logic or a fully-parallel architecture, correctness-preserving transformations are employed to derive a wide range of implementations with different space-time trade-offs. The approach has been used in synthesising designs based on field-programmabl...

  • SPERT: a VLIW/SIMD microprocessor for artificial neural network computations

    Publication Year: 1992, Page(s):178 - 190
    Cited by:  Patents (19)

    SPERT (synthetic perceptron testbed) is a fully programmable single chip microprocessor designed for efficient execution of artificial neural network algorithms. The first implementation is in a 1.2 μm CMOS technology with a 50 MHz clock rate, and a prototype system is being designed to occupy a double SBus slot within a Sun Sparcstation. SPERT sustains over 300×10^6 connections...

  • Advanced technology for improved signal processor efficiency

    Publication Year: 1992, Page(s):257 - 268

    Wafer scale integration technology offers the promise of implementing application specific processors with significantly higher data rates, lower power, and smaller size than conventional VLSI implementations. Wafer scale integration implementations replace most of the signal lines between chips with intra-wafer lines that exhibit one to two orders of magnitude less stray capacitance so they may b...

  • Compilation of narrowband spectral detection systems for linear MIMD machines

    Publication Year: 1992, Page(s):589 - 603
    Cited by:  Papers (1)

    The author discusses the design of a program that maps a class of digital signal processing systems, called narrowband spectral detection systems, to linear MIMD machines. Such systems contain a mixture of data-parallel, systolic and purely serial computations. He describes a new technique, called geometric scheduling, that exploits the special features of the first two styles of computation, and ...

  • Implementing a family of high performance, micrograined architectures

    Publication Year: 1992, Page(s):191 - 205
    Cited by:  Papers (13)  |  Patents (1)

    This paper describes the design and implementation of high performance micrograined architectures. These architectures are capable of teraops performance. Each architecture is organized as a systolic array of processors. A prototyping system for the architectures is proposed. The prototyping system provides control, I/O, and an interface to a host system for each of the micro-grained architectures...

  • Some low power implementations of DSP algorithms

    Publication Year: 1992, Page(s):269 - 276

    The implementation of digital signal processing algorithms often requires that a variety of conflicting criteria be satisfied. A signal processing system must provide the necessary processing gains, while various measures of the feasibility and efficiency of implementation, such as power and cost, are met. This paper reviews the motivation behind the development of low power signal processing algo...

  • High level software synthesis for signal processing systems

    Publication Year: 1992, Page(s):679 - 693
    Cited by:  Papers (41)

    For the design of complex digital signal processing systems, block diagram oriented simulation has become a widely accepted standard. Current research is concerned with the coupling of heterogeneous simulation engines and the transition from simulation to the implementation of digital signal processing systems. Due to the difficulty in mastering complex design spaces, high level hardware and softwar...

  • Programming systolic arrays

    Publication Year: 1992, Page(s):604 - 618
    Cited by:  Papers (3)

    This paper presents the New Systolic Language as a general solution to the problem of systolic programming. The language provides a simple programming interface for systolic algorithms suitable for different hardware platforms and software simulators. The New Systolic Language hides the details and potential hazards of inter-processor communication, allowing data flow only via abstract systolic da...

  • Deterministic Boltzmann machine VLSI can be scaled using multi-chip modules

    Publication Year: 1992, Page(s):206 - 217
    Cited by:  Papers (1)

    Describes a special purpose, very high speed, digital deterministic Boltzmann neural network VLSI chip. Each chip has 32 physical neural processors, which can be apportioned into an arbitrary topology (input, multiple hidden and output layers) of up to 160 virtual neurons total. Under typical conditions, the chip learns at approximately 5×10^8 connection updates/second (CUPS). Thro...

  • A systolic rank revealing QR algorithm

    Publication Year: 1992, Page(s):430 - 444

    In many fields of signal and image processing, control, and telecommunication there is much interest today in the numerical techniques offered by linear algebra. The singular value decomposition (SVD) is one of the techniques which have proven useful in many engineering applications, but unfortunately its computation is a costly procedure. The QR factorization (QRF) requires much less computational...

  • MUSE-a systolic array for adaptive nulling with 64 degrees of freedom, using Givens transformations and wafer scale integration

    Publication Year: 1992, Page(s):277 - 291
    Cited by:  Papers (6)  |  Patents (1)

    This paper describes a highly parallel system of computational processors specialized for real-time adaptive antenna nulling computations with many degrees of freedom, which the author calls MUSE, and a specific realization of MUSE for 64 degrees of freedom. Each processor uses the CORDIC algorithm and has been designed as a single integrated circuit. Ninety-six such processors working together ca...

  • An integrated system for rapid prototyping of high performance algorithm specific data paths

    Publication Year: 1992, Page(s):134 - 148
    Cited by:  Papers (10)  |  Patents (2)

    A system has been developed which targets the rapid prototyping of high performance data computation units which are typical of real-time digital signal processing applications. The hardware platform of the system is a family of multiprocessor integrated circuits. The prototype chip of this family contains 8 processors connected via a dynamically controlled crossbar switch. With a maximum clock ra...

  • Efficient scheduling methods for partitioned systolic algorithms

    Publication Year: 1992, Page(s):649 - 663
    Cited by:  Papers (3)

    Various methods for mapping signal processing algorithms into systolic arrays have been developed in the past few years. In this paper, efficient scheduling techniques are developed for the partitioning problem, i.e. problems whose size does not match the array size. In particular, scheduling for the locally parallel-globally sequential (LPGS) technique and the locally sequential-globally parall...

  • A systolic array chip for robot inverse dynamics computation

    Publication Year: 1992, Page(s):400 - 414

    To ensure smooth and accurate movement of a robot arm, the robot inverse dynamics problem must be solved at each servo sampling. The computation of this problem, however, is a mathematically intense task which degrades the sampling period of present-day robot control systems. In addition to the repetitive requirement for its evaluation, the linearly recursive and computer-bound properties of the ro...

  • A reconfigurable processor array with routing LSIs and general purpose DSPs

    Publication Year: 1992, Page(s):102 - 116
    Cited by:  Papers (4)  |  Patents (2)

    A building block for a scalable signal processor array is developed with a general-purpose DSP and a message routing LSI. Each DSP can be connected by multiple routing LSIs forming a point-to-point message-passing network with data packet communication. Low network latency is obtained by a cut-through routing technique with sufficient communication bandwidth. The employment of an on-chip routing tab...

  • Scheduling partitions in systolic algorithms

    Publication Year: 1992, Page(s):619 - 633
    Cited by:  Papers (3)

    The authors present a technique for scheduling partitions in systolic algorithms (SA). This technique can be used in combination with any possible projection used for the problem dependent size SA design and with any possible spatial mapping used for the partitions. They also present the necessary code transformations to transform the sequential code into the code that is executed in a processing ...

  • On cycle borrowing analyses for interconnected chips driven by clocks having different but commensurable speeds

    Publication Year: 1992, Page(s):81 - 88
    Cited by:  Papers (3)  |  Patents (2)

    The author considers the construction of synchronous systems having components driven at different rates by different, but commensurable, clocks. Furthermore these systems are to be constructed using level-sensitive latches with the intent of exploiting cycle borrowing over the entire system. The author presents a framework in which the entire system is managed as a single clocked entity, and inve...

  • On systolic mapping of multi-stage algorithms

    Publication Year: 1992, Page(s):47 - 61
    Cited by:  Papers (5)

    The authors present a more general mapping problem called multi-stage systolic mapping, which focuses on computing algorithms containing more than one nested loop construct to be executed sequentially. Since the resulting interface problem becomes the dominant factor in performing the mapping, the authors argue that adjacent stages should have matched interfaces to reduce the overhead. Fo...

  • Discrete wavelet transforms in VLSI

    Publication Year: 1992, Page(s):218 - 229
    Cited by:  Papers (21)  |  Patents (1)

    Three architectures, based on linear systolic arrays, for computing the discrete wavelet transform, are described. The AT^2 lower bound for computing the DWT in a systolic model is derived and shown to be AT^2 = Ω(N^2 N_w k). Two of the architectures are within a factor of log N from optimal, but they...

  • A transformative approach to the partitioning of processor arrays

    Publication Year: 1992, Page(s):4 - 20
    Cited by:  Papers (7)  |  Patents (1)

    The paper describes the systematic design of processor arrays with a given dimension and a given number of processing elements. The unified approach to the solution of this problem called partitioning is based on the following concepts: (1) Algorithms and processor arrays are represented by (piecewise regular) programs. (2) The concept of stepwise refinement of programs is used to solve the partit...

  • Interval-related problems on reconfigurable meshes

    Publication Year: 1992, Page(s):445 - 455
    Cited by:  Papers (1)

    Interval graphs provide a natural model for a vast number of scheduling and VLSI problems. A variety of interval graph problems have been solved on the PRAM family. Recently, a powerful architecture called the reconfigurable mesh has been proposed: in essence, a reconfigurable mesh consists of a mesh-connected architecture augmented by a dynamically reconfigurable bus system. It has been argued th...

  • Systolic architectures for finite-state vector quantization

    Publication Year: 1992, Page(s):481 - 495
    Cited by:  Papers (3)

    The authors present a new systolic architecture for implementing finite state vector quantization in real-time for both speech and image data. This architecture is modular and has a very simple control flow. Only one processor is needed for speech compression. A linear array of processors is used for image compression; the number of processors needed is independent of the size of the image. Image ...

  • Determining longest common subsequences of two sequences on a linear array of processors

    Publication Year: 1992, Page(s):526 - 537
    Cited by:  Papers (1)

    This paper presents a special-purpose linear array processor architecture for determining longest common subsequences (LCS) of two sequences. The algorithm uses systolic and pipelined architectures suitable for VLSI implementation. The algorithms are also suitable for implementation on parallel machines. The author first develops a "greedy" algorithm to determine some of the LCS and then proposes a ...
