Proceedings of the International Conference on Application Specific Array Processors

2-4 Sept. 1991

  • On the use of most significant bit first arithmetic on the design of high performance DSP chips

    Publication Year: 1991
    Cited by:  Papers (1)
  • Proceedings of the International Conference on Application Specific Array Processors (Cat. No.91TH0382-2)

    Publication Year: 1991
    Freely Available from IEEE
  • Introduction to system design: algorithms and parallel architectures

    Publication Year: 1991
  • GFLOPS: a general flexible linearly organized parallel structure for images

    Publication Year: 1991, Page(s):431 - 444
    Cited by:  Papers (3)  |  Patents (4)

    This paper describes a design of a linear array for use in MIMD, SIMD, pipeline and SPMD modes of programming for image processing applications. It gives some results for an evaluation. A great variety of algorithms, from low level algorithms used for pixel operations to high level algorithms used for performing region treatments or eventually for symbolic processing, may be implemented on this un...
  • High speed implementation of 1-D and 2-D morphological operations

    Publication Year: 1991, Page(s):249 - 262

    The design of a morphological processing system is presented, to be used in medical image enhancement and compression. The system consists of a gray scale dilation/erosion systolic array capable of video data rates. The architecture can be implemented with either one dimensional or two dimensional building blocks that accept raster scanned data and exhibits low latency and an optimal pipeline rate...
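
    As a rough illustration of the gray scale dilation and erosion named in this abstract, the sketch below applies both to a 1-D signal in plain Python. It assumes a flat structuring element of odd width and a window that is clipped at the signal borders. The function names, the `width` parameter and the border rule are assumptions made for illustration; the paper's systolic array design is not reproduced here.

```python
def dilate_1d(signal, width):
    """Gray scale dilation with a flat window: each output is the window maximum."""
    half = width // 2
    n = len(signal)
    # Assumption: the window is clipped at the borders instead of padding the signal.
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def erode_1d(signal, width):
    """Gray scale erosion with a flat window: each output is the window minimum."""
    half = width // 2
    n = len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


row = [1, 3, 2, 5, 4]          # hypothetical scan line of pixel values
print(dilate_1d(row, 3))       # [3, 3, 5, 5, 5]
print(erode_1d(row, 3))        # [1, 1, 2, 2, 4]
```

    A 2-D operation with a rectangular flat element can be obtained by applying the 1-D pass to rows and then to columns.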
  • Partitioning schemes for circuit simulation on a multiprocessor array

    Publication Year: 1991, Page(s):177 - 183
    Cited by:  Papers (1)  |  Patents (2)

    The factorization of sparse matrices is used in the inner loop of many engineering algorithms, including circuit simulation. This time-consuming operation can be sped up by utilizing multiprocessor architectures. Distributed memory architectures can overcome the memory bottleneck normally associated with shared memory machines but require a careful distribution of matrix data to the processors. ...
  • Implementation of a VLSI polynomial evaluator for real-time applications

    Publication Year: 1991, Page(s):13 - 24
    Cited by:  Papers (7)

    Fast evaluation of polynomials is a major goal of computer science, since any continuous function may be approximated as accurately as desired by a polynomial. For instance, most current computers evaluate elementary functions using polynomial or rational approximations. J. Duprat and J.M. Muller (1988) presented a new operator, a polynomier, suitable for VLSI implementation, and specifica...
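
    To make the idea of evaluating an elementary function through a polynomial approximation concrete, here is a minimal sketch using Horner's rule, a standard sequential scheme. It is not the polynomier operator of Duprat and Muller, whose details are not given in this listing. The function name `horner` and the choice of a degree-4 Taylor polynomial for the exponential are assumptions made for illustration.

```python
import math


def horner(coeffs, x):
    """Evaluate a polynomial at x; coeffs are ordered from highest degree to constant term."""
    result = 0.0
    for c in coeffs:
        # One multiply and one add per coefficient.
        result = result * x + c
    return result


# Assumed example: degree-4 Taylor polynomial of exp around 0,
# 1 + x + x^2/2 + x^3/6 + x^4/24, listed highest degree first.
exp_taylor4 = [1 / 24, 1 / 6, 1 / 2, 1.0, 1.0]

approx = horner(exp_taylor4, 0.5)
print(approx, math.exp(0.5))   # the two values agree to about three decimal places
```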
  • A decoupled access/execute processor for matrix algorithms: architecture and programming

    Publication Year: 1991, Page(s):281 - 295
    Cited by:  Papers (2)

    The authors describe a processor for the execution of a class of matrix algorithms according to the multimesh graph (MMG) mapping method, which is suitable as the processing cell in an application-specific array. The processor uses the decoupled access-execute model of computation, so that it consists of two programmable units: a processing unit (PU) and an access unit (AU). The two programs synch...
  • Fast generation of long sorted runs for sorting a large file

    Publication Year: 1991, Page(s):445 - 456
    Cited by:  Papers (1)  |  Patents (3)

    On sequential machines, most internal sorting algorithms can sort no more than m items using a memory of size m. However, sorting with a heap can produce sorted sequences, called runs, of length about twice the heap size. A second advantage of sorting with a heap is that data I/O and the heap restructuring can be performed concurrently to reduce the sorting time. The third advantage is that it can...
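
    The claim that a heap can emit runs longer than the heap itself can be seen in the classic replacement selection scheme, sketched below in sequential Python. This is offered only as an illustration of run generation with a heap; it is not the parallel structure proposed in the paper. The function name `replacement_selection`, the `heap_size` parameter (assumed to be at least 1) and the sample data are assumptions.

```python
import heapq

_DONE = object()  # marks the end of the input stream


def replacement_selection(items, heap_size):
    """Split items into sorted runs using a min-heap of at most heap_size entries."""
    stream = iter(items)
    heap = []
    # Fill the heap with the first heap_size items, all tagged for run 0.
    for value in stream:
        heap.append((0, value))
        if len(heap) == heap_size:
            break
    heapq.heapify(heap)

    runs = []
    while heap:
        run_no, value = heapq.heappop(heap)
        if run_no == len(runs):
            runs.append([])          # first output of a new run
        runs[run_no].append(value)
        incoming = next(stream, _DONE)
        if incoming is not _DONE:
            # An item smaller than the value just written cannot join the
            # current run, so it is tagged for the next one.
            target = run_no if incoming >= value else run_no + 1
            heapq.heappush(heap, (target, incoming))
    return runs


print(replacement_selection([5, 1, 9, 2, 8, 3, 7, 4, 6], 3))
# [[1, 2, 5, 8, 9], [3, 4, 6, 7]]
```

    With a heap of three entries, the first run in this example already holds five items.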
  • A TSP engine for performing tabu search

    Publication Year: 1991, Page(s):309 - 321
    Cited by:  Papers (1)

    Tabu search is a promising new optimization heuristic used for obtaining near-optimum solutions of combinatorial optimization problems. This paper looks into an implementation of tabu search on dedicated hardware and shows a potential for improvements of two orders of magnitude in the time taken to perform a fixed number of iterations for the traveling salesman problem (TSP).
  • Pipelining and transposing heterogeneous array circuits

    Publication Year: 1991, Page(s):263 - 277

    This paper describes a scheme for representing heterogeneous array circuits, in particular those which have been optimised by pipelining or by transposition. Equations for correctness-preserving transformations of these parametric representations are presented. The method is illustrated on developing novel pipelined designs for parallel division. It is found that, for a field-programmable gate arr...
  • A systolic algorithm for the triangular Stein equation

    Publication Year: 1991, Page(s):473 - 484
    Cited by:  Papers (2)

    The authors solve the Stein equation X+AXB=C, with A and B upper triangular matrices, by means of a bidimensional systolic array processor, independent of problem size. The problem is decomposed into two basic subproblems: the solution of an upper triangular system and a GAXPY operation. They obtain a size-dependent systolic algorithm by means of an appropriate chaining of the solutions of these s...
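
    For readers unfamiliar with the equation, the sketch below solves X+AXB=C for upper triangular A and B by plain sequential substitution: because of the triangular structure, entry (i, j) of X depends only on entries in rows at or below i and columns at or left of j. This is not the systolic algorithm of the paper, only an assumed reference computation. The function name is hypothetical, matrices are lists of rows, and every 1 + A[i][i]*B[j][j] is assumed to be nonzero.

```python
def solve_triangular_stein(A, B, C):
    """Solve X + A X B = C for upper triangular A (n x n) and B (m x m)."""
    n, m = len(A), len(B)
    X = [[0.0] * m for _ in range(n)]
    # Rows from the bottom up, columns from left to right, so every entry
    # of X needed on the right-hand side is already known.
    for i in range(n - 1, -1, -1):
        for j in range(m):
            known = 0.0
            for k in range(i, n):          # A[i][k] is zero for k < i
                for l in range(j + 1):     # B[l][j] is zero for l > j
                    if (k, l) != (i, j):
                        known += A[i][k] * X[k][l] * B[l][j]
            # Assumption: the pivot 1 + A[i][i] * B[j][j] is nonzero.
            X[i][j] = (C[i][j] - known) / (1.0 + A[i][i] * B[j][j])
    return X


A = [[1, 2], [0, 3]]
B = [[1, 1], [0, 2]]
C = [[8, 29], [12, 37]]   # built from X = [[1, 2], [3, 4]]
print(solve_triangular_stein(A, B, C))
```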
  • Automatic formal verification of systolic array designs

    Publication Year: 1991, Page(s):338 - 354
    Cited by:  Papers (3)

    The authors have previously (1990) developed a new formalism, called systolic temporal arithmetic (STA), for formal specification and verification of systolic arrays at the array level. The formalism exploits systolic array attributes to produce elegant specification and effective formal design verification and is suitable to be combined with interval temporal logic for multilevel reasoning for se...
  • Synthesizing systolic arrays: some recent developments

    Publication Year: 1991, Page(s):372 - 386
    Cited by:  Papers (3)

    Methods for synthesizing systolic arrays from uniform DAGs are well understood. The idea is to extract from the original sequential algorithm a dependence graph where all incoming arcs to a given node come from a fixed-size neighborhood, so that dependencies are local. Space-time transformations are then used for scheduling the DAG (timing function) and mapping nodes onto physical processors (allo...
  • Parallel implementations of discrete relaxation technique on fixed size processor arrays

    Publication Year: 1991, Page(s):184 - 198
    Cited by:  Papers (1)

    The discrete relaxation technique has been widely used in pattern recognition, artificial intelligence and computer vision. For the consistent labeling problem of labeling n objects with m labels, a parallel implementation based on a new sequential algorithm is shown. This non-partitioned parallel implementation runs in O(nm) time using nm PEs. ...
  • A wave digital filter three-port adaptor with fine grained pipelining

    Publication Year: 1991, Page(s):116 - 128
    Cited by:  Papers (4)

    A VLSI architecture for implementing wave digital filter three-port adaptors is described. The design presented is a general one and can be used to construct RLC ladder filters. High sampling rates are obtained through a combination of fine grained pipelining and most significant bit first arithmetic. The resulting circuit is highly regular and for the most part consists of simple carry save adders.
  • A defect tolerant systolic array implementation for real time image processing

    Publication Year: 1991, Page(s):25 - 39

    An advanced defect tolerant systolic array implementation of the 2D convolution algorithm for real-time image processing applications is presented. The chip contrasts with available convolution chips by the maximum kernel size of two hundred and fifty-six taps, the ability to convolve one video signal with up to four independent coefficient masks, support of adaptive filtering, on-chip delay lines...
  • A 40 megasample IIR filter chip

    Publication Year: 1991, Page(s):416 - 430
    Cited by:  Papers (6)

    The design of a high performance bit parallel second order IIR filter chip is described. The chip in question is highly pipelined, uses most significant bit first arithmetic and consists mainly of arrays of simple carry save adders. It has been fabricated in 1.5 µm double level metal CMOS technology, accepts 12 bit input data and coefficient values and can operate at up to 40 megasamples per secon...
  • CAPMA: a content-addressable pattern match architecture for production systems

    Publication Year: 1991, Page(s):236 - 248
    Cited by:  Papers (1)  |  Patents (1)

    CAPMA is an efficient partially parallel pattern match architecture used to reduce the execution time of the match process of a production system. The algorithm fully exploits the advantages of content-addressable memories (CAMs) not only to buffer the working memory elements, but also to support the functions for evaluating interconditions among patterns in the left-hand sides (LHSs) of productions...
  • Parallel digital implementations of neural networks

    Publication Year: 1991, Page(s):162 - 176
    Cited by:  Papers (2)

    The paper reviews implementations of neural networks on parallel digital machines. The connectionist neural network models are discussed from the point of view of their computational characteristics. The levels of parallelism available in the models and the factors affecting the performance of the models on parallel machines are presented. Several mapping methodologies applicable to neural ...
  • Uniform but non-local DAGS: a trade-off between pure systolic and SIMD solutions

    Publication Year: 1991, Page(s):296 - 308
    Cited by:  Papers (2)

    The authors derive processor arrays which are synthesized from uniform but non-local DAGs. They introduce a scope-b broadcast transformation that amounts to working with dependence vectors of `length' b. The parameter b can be adjusted to cope with current integration constraints. They explain the transformation with the Gaussian elimination algorithm. For instance, with b=3, they derive an array whic...
  • A modular systolic 2-D torus for the general knapsack problem

    Publication Year: 1991, Page(s):458 - 472
    Cited by:  Papers (1)

    The authors propose a modular 2-D torus of pipelined processing elements for solving the general knapsack problem of arbitrary size. Each cell has a fixed storage capacity α0 independent of the particular knapsack problem to be solved. They study the vertical speed up, defined as the speed up achieved upon the 1-D torus when the capacity of the knapsack goes to infinity, and its asso...
  • Synthesis of systolic arrays by equation transformations

    Publication Year: 1991, Page(s):324 - 337
    Cited by:  Papers (3)

    Synthesis of systolic arrays, from formal specifications down to a chip, can be done using recurrence equations. The Alpha du Centaur environment that the authors present implements such a design trajectory. Programs, written in the Alpha language, are rewritten using formal transformations (space-time reindexing, pipelining, control signal generation, etc.), and finally translated into a form suited ...
  • Systolic architecture for adaptive eigenstructure decomposition based on simultaneous iteration method

    Publication Year: 1991, Page(s):485 - 495
    Cited by:  Papers (1)

    Eigenstructure decomposition of correlation matrices is an important pre-processing stage in many modern signal processing applications. In an unknown and possibly changing environment, adaptive algorithms that are efficient and numerically stable as well as readily implementable in hardware for eigen decomposition are highly desirable. Most modern real-time signal processing applications involve ...
  • Consistency in dataflow graphs

    Publication Year: 1991, Page(s):355 - 369
    Cited by:  Papers (7)

    This paper describes an analytical model for the behavior of dataflow graphs with data-dependent control flow. The number of tokens produced or consumed by each actor is given as a symbolic function of the Booleans in the system. Long term averages can be analyzed to determine consistency of token flow rates, which in turn determines whether memory requirements are bounded. Short-term behavior can...
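
    The link between consistent token flow rates and bounded memory can be illustrated in the simpler static case, where every actor produces and consumes a fixed number of tokens per firing. The sketch below checks the balance equations (firings of the producer times tokens produced equals firings of the consumer times tokens consumed on every edge) and returns an integer firing count per actor, or `None` if the rates are inconsistent. It does not model the Boolean-dependent rates treated in the paper. The function name, the edge tuple layout `(src, dst, produced, consumed)` and the assumption of a connected graph are all illustrative.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Return firings per actor that balance every edge, or None if inconsistent."""
    neighbours = {a: [] for a in actors}
    for src, dst, produced, consumed in edges:
        # Balance on this edge: q[src] * produced == q[dst] * consumed.
        neighbours[src].append((dst, Fraction(produced, consumed)))
        neighbours[dst].append((src, Fraction(consumed, produced)))

    rate = {actors[0]: Fraction(1)}   # relative firing rate, first actor as reference
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for b, ratio in neighbours[a]:
            expected = rate[a] * ratio
            if b not in rate:
                rate[b] = expected
                pending.append(b)
            elif rate[b] != expected:
                return None           # two paths demand different rates
    # Scale the rational rates to integers with the lcm of the denominators.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()), 1)
    return {a: int(r * scale) for a, r in rate.items()}


chain = [("A", "B", 2, 3), ("B", "C", 1, 2)]
print(repetition_vector(["A", "B", "C"], chain))                      # {'A': 3, 'B': 2, 'C': 1}
print(repetition_vector(["A", "B", "C"], chain + [("A", "C", 1, 1)])) # None
```

    In the inconsistent case, any schedule that fires every actor indefinitely lets tokens pile up on some edge, which is why inconsistency implies unbounded memory.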