Proceedings of the International Conference on Application Specific Array Processors

2-4 Sept. 1991

Displaying Results 1 - 25 of 37
  • On the use of most significant bit first arithmetic on the design of high performance DSP chips

    Publication Year: 1991
    Cited by:  Papers (1)
  • Proceedings of the International Conference on Application Specific Array Processors (Cat. No.91TH0382-2)

    Publication Year: 1991
    Freely Available from IEEE
  • Introduction to system design: algorithms and parallel architectures

    Publication Year: 1991
  • A modular systolic 2-D torus for the general knapsack problem

    Publication Year: 1991, Page(s):458 - 472
    Cited by:  Papers (1)

    The authors propose a modular 2-D torus of pipelined processing elements for solving the general knapsack problem of arbitrary size. Each cell has a fixed storage capacity α0 independent of the particular knapsack problem to be solved. They study the vertical speed-up, defined as the speed-up achieved over the 1-D torus when the capacity of the knapsack goes to infinity, and its asso...
  • A systolic algorithm for the triangular Stein equation

    Publication Year: 1991, Page(s):473 - 484
    Cited by:  Papers (2)

    The authors solve the Stein equation X+AXB=C, with A and B upper triangular matrices, by means of a bidimensional systolic array processor, independent of problem size. The problem is decomposed into two basic subproblems: the solution of an upper triangular system and a GAXPY operation. They obtain a size-dependent systolic algorithm by means of an appropriate chaining of the solutions of these s...
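
As a purely sequential illustration of the triangular Stein equation X+AXB=C named in the abstract above, the following sketch solves for X one entry at a time by substitution. It is not the authors' systolic algorithm; the function name `solve_triangular_stein` and the list-of-lists matrix representation are assumptions made for this example, and it assumes 1 + A[i][i]*B[j][j] is nonzero for every i, j.

```python
def solve_triangular_stein(A, B, C):
    """Solve X + A X B = C for X, where A and B are upper triangular.

    Matrices are square lists of lists. Entry (i, j) of A X B only involves
    X[k][l] with k >= i and l <= j, so rows are processed bottom-up and
    columns left-to-right; every other entry needed is then already known.
    """
    n = len(A)
    X = [[0.0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(n):
            # Contribution of the already computed entries of X.
            known = 0.0
            for k in range(i, n):
                for l in range(j + 1):
                    if (k, l) != (i, j):
                        known += A[i][k] * X[k][l] * B[l][j]
            pivot = 1.0 + A[i][i] * B[j][j]
            if pivot == 0.0:
                raise ZeroDivisionError("1 + A[i][i]*B[j][j] is zero; no unique solution")
            X[i][j] = (C[i][j] - known) / pivot
    return X


if __name__ == "__main__":
    A = [[1.0, 2.0], [0.0, 3.0]]
    B = [[1.0, 1.0], [0.0, 2.0]]
    C = [[8.0, 29.0], [12.0, 37.0]]
    print(solve_triangular_stein(A, B, C))
```

The quadruple loop is chosen for readability rather than speed; it makes the dependence of each entry on the entries below and to the left of it explicit.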
  • Uniform but non-local DAGS: a trade-off between pure systolic and SIMD solutions

    Publication Year: 1991, Page(s):296 - 308
    Cited by:  Papers (2)

    The authors derive processor arrays which are synthesized from uniform but non-local DAGs. They introduce a scope-b broadcast transformation that amounts to working with dependence vectors of `length' b. The parameter b can be adjusted to cope with current integration constraints. They explain the transformation with the Gaussian elimination algorithm. For instance with b=3, they derive an array whic...
  • Synthesizing systolic arrays: some recent developments

    Publication Year: 1991, Page(s):372 - 386
    Cited by:  Papers (3)

    Methods for synthesizing systolic arrays from uniform DAGs are well understood. The idea is to extract from the original sequential algorithm a dependence graph where all incoming arcs to a given node come from a fixed-size neighborhood, so that dependencies are local. Space-time transformations are then used for scheduling the DAG (timing function) and mapping nodes onto physical processors (allo...
  • Systolic architecture for adaptive eigenstructure decomposition based on simultaneous iteration method

    Publication Year: 1991, Page(s):485 - 495
    Cited by:  Papers (1)

    Eigenstructure decomposition of correlation matrices is an important pre-processing stage in many modern signal processing applications. In an unknown and possibly changing environment, adaptive algorithms that are efficient and numerically stable as well as readily implementable in hardware for eigen decomposition are highly desirable. Most modern real-time signal processing applications involve ...
  • A TSP engine for performing tabu search

    Publication Year: 1991, Page(s):309 - 321
    Cited by:  Papers (1)

    Tabu search is a promising new optimization heuristic used for obtaining near-optimum solutions of combinatorial optimization problems. This paper looks into an implementation of tabu search on dedicated hardware and shows a potential for improvements of two orders of magnitude in the time taken to perform a fixed number of iterations for the traveling salesman problem (TSP).
  • A design method for on-line reconfigurable array processors

    Publication Year: 1991, Page(s):387 - 401
    Cited by:  Papers (1)

    A design methodology for reconfigurable array processors is described which extends a known design method for non-redundant array architectures. Using self-checking processing elements, the systematic design of on-line reconfigurable arrays, which perform reconfiguration concurrently with data processing, becomes feasible. Reconfiguration schemes suitable for one- and two-dimensional array processors ar...
  • Parallel array architectures for motion estimation

    Publication Year: 1991, Page(s):214 - 235
    Cited by:  Papers (5)  |  Patents (2)

    Motion estimation is one of the most computationally intensive tasks required in digital video compression. The authors propose parallelizable motion estimation algorithms with low computational cost for both sub-optimal and optimal motion estimation. For efficient optimal motion estimation, they develop theoretical bounds based on convexity to reduce the required operations. All algorithms are te...
  • Synthesis of systolic arrays by equation transformations

    Publication Year: 1991, Page(s):324 - 337
    Cited by:  Papers (3)

    Synthesis of systolic arrays, from formal specifications down to a chip, can be done using recurrence equations. The Alpha du Centaur environment that the authors present implements such a design trajectory. Programs, written in the Alpha language, are rewritten using formal transformations (space-time reindexing, pipelining, control signal generation, etc.), and finally translated into a form suited ...
  • Processor clustering for the design of optimal fixed-size systolic arrays

    Publication Year: 1991, Page(s):402 - 413
    Cited by:  Papers (3)

    The authors have shown in their previous work that processor-clustering is a key operation in the design of problem-size independent systolic/wavefront arrays. Indeed, the processor clustering techniques (called passive-clustering and active-clustering) can not only be used to reduce the size of an array to a constant, but also to achieve the design objectives such as transforming inefficient arra...
  • High speed implementation of 1-D and 2-D morphological operations

    Publication Year: 1991, Page(s):249 - 262

    The design of a morphological processing system is presented, to be used in medical image enhancement and compression. The system consists of a gray scale dilation/erosion systolic array capable of video data rates. The architecture can be implemented with either one dimensional or two dimensional building blocks that accept raster scanned data and exhibits low latency and an optimal pipeline rate...
  • A defect tolerant systolic array implementation for real time image processing

    Publication Year: 1991, Page(s):25 - 39

    An advanced defect tolerant systolic array implementation of the 2D convolution algorithm for real-time image processing applications is presented. The chip contrasts with available convolution chips by the maximum kernel size of two hundred and fifty-six taps, the ability to convolve one video signal with up to four independent coefficient masks, support of adaptive filtering, on-chip delay lines...
  • Mapping FIR filtering on systolic rings

    Publication Year: 1991, Page(s):87 - 101
    Cited by:  Papers (2)  |  Patents (1)

    During the past decade, systolic arrays have been designed for a wide variety of scientific applications, which are based on highly parallel linear system manipulations. Partitioning and mapping of systolic algorithms has been a key issue for real implementations, in terms of both cost and manageability. The authors demonstrate the mapping of triangular systolic array algorithms onto a one-dimensi...
  • Fast generation of long sorted runs for sorting a large file

    Publication Year: 1991, Page(s):445 - 456
    Cited by:  Papers (1)  |  Patents (3)

    On sequential machines, most internal sorting algorithms can sort no more than m items using a memory of size m. However, sorting with a heap can produce sorted sequences, called runs, of length about twice the heap size. A second advantage of sorting with a heap is that data I/O and the heap restructuring can be performed concurrently to reduce the sorting time. The third advantage is that it can...
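
The observation above, that a heap can produce sorted runs longer than the heap itself, can be illustrated with the classic replacement-selection technique. The sketch below is a sequential illustration only, not the architecture of the paper; the function name `generate_runs` and its parameters are assumptions made for this example.

```python
import heapq
from itertools import islice

_END = object()  # marks an exhausted input stream


def generate_runs(items, heap_size):
    """Split `items` into sorted runs using a min-heap of at most `heap_size` entries.

    Each heap entry is (run_number, value). A newly read value joins the
    current run if it is not smaller than the value just written out;
    otherwise it is held back for the next run.
    """
    stream = iter(items)
    heap = [(0, value) for value in islice(stream, heap_size)]
    heapq.heapify(heap)
    runs = []
    while heap:
        run_number, value = heapq.heappop(heap)
        if run_number == len(runs):
            runs.append([])  # first value of a new run
        runs[run_number].append(value)
        incoming = next(stream, _END)
        if incoming is not _END:
            target = run_number if incoming >= value else run_number + 1
            heapq.heappush(heap, (target, incoming))
    return runs


if __name__ == "__main__":
    print(generate_runs([5, 1, 9, 2, 8, 3, 7], heap_size=3))
    # [[1, 2, 5, 8, 9], [3, 7]]
```

In this small input the first run holds five items although the heap never holds more than three, which is the effect the abstract describes.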
  • A decoupled access/execute processor for matrix algorithms: architecture and programming

    Publication Year: 1991, Page(s):281 - 295
    Cited by:  Papers (2)

    The authors describe a processor for the execution of a class of matrix algorithms according to the multimesh graph (MMG) mapping method, which is suitable as the processing cell in an application-specific array. The processor uses the decoupled access-execute model of computation, so that it consists of two programmable units: a processing unit (PU) and an access unit (AU). The two programs synch...
  • Consistency in dataflow graphs

    Publication Year: 1991, Page(s):355 - 369
    Cited by:  Papers (7)

    This paper describes an analytical model for the behavior of dataflow graphs with data-dependent control flow. The number of tokens produced or consumed by each actor is given as a symbolic function of the Booleans in the system. Long term averages can be analyzed to determine consistency of token flow rates, which in turn determines whether memory requirements are bounded. Short-term behavior can...
  • Parallel strong orientation on a mesh connected computer

    Publication Year: 1991, Page(s):199 - 211

    The author presents a solution for the following problem: given an undirected bridgeless graph G=(V, E), find an orientation of each edge such that the resulting directed graph is strongly connected. He assumes the input graph to be given as an adjacency matrix stored in the processors of a mesh connected processor array such that each processor contains one entry. The a...
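
The strong orientation problem stated above has a well-known sequential solution based on depth-first search: orient tree edges from parent to child and every other edge from descendant to ancestor. The sketch below shows that textbook approach, not the mesh algorithm of the paper. The function name `strong_orientation` and the edge-list input are assumptions made for this example, and the graph is assumed to be connected and bridgeless with vertices numbered 0 to n-1.

```python
def strong_orientation(n, edges):
    """Orient each undirected edge so that the directed graph is strongly connected.

    `edges` is a list of (u, v) pairs. Returns a list of (tail, head) pairs in
    the same order. Assumes a connected, bridgeless graph on vertices 0..n-1.
    """
    adjacency = [[] for _ in range(n)]
    for index, (u, v) in enumerate(edges):
        adjacency[u].append((v, index))
        adjacency[v].append((u, index))

    oriented = [None] * len(edges)
    visited = [False] * n
    visited[0] = True
    stack = [(0, iter(adjacency[0]))]  # iterative depth-first search from vertex 0
    while stack:
        u, neighbours = stack[-1]
        for v, index in neighbours:
            if oriented[index] is not None:
                continue  # edge already handled from its other endpoint
            # Tree edge: parent -> child. Otherwise v is an ancestor of u,
            # so the edge points from the descendant back up to the ancestor.
            oriented[index] = (u, v)
            if not visited[v]:
                visited[v] = True
                stack.append((v, iter(adjacency[v])))
                break
        else:
            stack.pop()  # all edges at u examined
    return oriented


if __name__ == "__main__":
    graph = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
    print(strong_orientation(5, graph))
```

If the input contains a bridge, the function still returns an orientation, but the result cannot be strongly connected; checking for bridges is left out to keep the sketch short.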
  • The case for application specific computing

    Publication Year: 1991, Page(s):2 - 9
    Cited by:  Papers (1)

    Application specific computing is the only way to solve many computationally intensive problems. In contrast to general purpose computing, application specific computing can achieve high throughput, small size, and (for CMOS realizations) low power. The improvement in the area time product is often in excess of two orders of magnitude. This paper reviews past endeavors in special purpose processin...
  • CAPMA: a content-addressable pattern match architecture for production systems

    Publication Year: 1991, Page(s):236 - 248
    Cited by:  Papers (1)  |  Patents (1)

    CAPMA is an efficient partially parallel pattern match architecture used to reduce the execution time of the match process of a production system. The algorithm fully exploits the advantages of content-addressable memories (CAMs) not only to buffer the working memory elements, but also to support the functions for evaluating interconditions among patterns in the left-hand sides (LHSs) of productions...
  • Implementation of a VLSI polynomial evaluator for real-time applications

    Publication Year: 1991, Page(s):13 - 24
    Cited by:  Papers (7)

    Fast evaluation of polynomials is a major goal of computer science, since any continuous function may be approximated as accurately as desired by a polynomial. For instance, most current computers evaluate elementary functions using polynomial or rational approximations. J. Duprat and J.M. Muller (1988) presented a new operator, a polynomier, suitable for VLSI implementation, and specifica...
  • Mapping different node types of dependence graphs into the same processing element

    Publication Year: 1991, Page(s):72 - 86
    Cited by:  Papers (2)

    This paper presents a method for mapping different computation nodes into the same complex processing element. The processing elements which use the minimal number of building blocks are derived automatically from the computations of the different nodes. Known design procedures for mapping algorithms onto array processors can be extended by this method to allow the mapping of dependence graphs wit...
  • GFLOPS: a general flexible linearly organized parallel structure for images

    Publication Year: 1991, Page(s):431 - 444
    Cited by:  Papers (3)  |  Patents (4)

    This paper describes a design of a linear array for use in MIMD, SIMD, pipeline and SPMD modes of programming for image processing applications. It gives some results for an evaluation. A great variety of algorithms, from low level algorithms used for pixel operations to high level algorithms used for performing region treatments or eventually for symbolic processing, may be implemented on this un...