Application-Specific Array Processors, 1993. Proceedings., International Conference on

25-27 Oct. 1993

  • Proceedings of International Conference on Application Specific Array Processors (ASAP '93)

    Publication Year: 1993
    Freely Available from IEEE
  • The Xor embedding: An embedding of hypercubes onto rings and toruses

    Publication Year: 1993, Page(s):15 - 28
    Cited by:  Papers (3)

    Many parallel algorithms use hypercubes as the communication topology among processes, which makes them suitable for execution on a hypercube multicomputer. In this way the communication cost is kept to a minimum, since processes can be allocated to processors in such a way that only communication between neighbor processors is required. However, the scalability of hypercube multicomputers is constr...
  • A wavefront array processor for on the fly processing of digital video streams

    Publication Year: 1993, Page(s):101 - 108

    The authors present a wavefront array processor architecture developed at ETCA and dedicated to real-time processing of digital video streams. The core of the architecture is a mesh-connected three-dimensional network of 1024 custom processing elements. Each processing element can perform up to 50 million 8- or 16-bit operations per second, working with a 25 MHz clock frequency. Thus algorithms a...
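    The figures quoted in this abstract imply a per-element rate and an aggregate peak rate. The arithmetic below is only a back-of-the-envelope sketch: it assumes all 1024 processing elements sustain their stated peak simultaneously, which the abstract does not claim.

```latex
\[
\frac{50 \times 10^{6}\ \text{ops/s}}{25 \times 10^{6}\ \text{cycles/s}} = 2\ \text{ops per cycle per PE},
\qquad
1024 \times 50 \times 10^{6}\ \text{ops/s} = 5.12 \times 10^{10}\ \text{ops/s}.
\]
```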
  • A real-time systolic algorithm for on-the-fly hidden surface removal

    Publication Year: 1993, Page(s):238 - 249

    Hidden surface removal for real-time realistic display of complex scenes requires intensive computation and justifies the use of parallelism to provide the needed response time. The authors present a systolic algorithm that identifies visible segments on a scanline with the "real-time" characteristic: visible segments are output on-the-fly as soon as segments are input to the systolic array. The pro...
  • Synthesis of dedicated SIMD processors

    Publication Year: 1993, Page(s):416 - 427
    Cited by:  Papers (1)

    In this paper, a synthesis method for dedicated architectures is introduced. Its aim is to produce optimized systems derived from the algorithmic expression of a numerical application. The approach addresses the design of dedicated systems for applications that require intensive numerical computation. An efficient utilization of hardware resources is achieved through the use of vector processing with a...
  • An algorithm for accurate data dependence test

    Publication Year: 1993, Page(s):404 - 415
    Cited by:  Papers (1)

    Testing whether there is a dependence between different iterations of a loop can be reduced to checking whether integer points exist in a polyhedron described by a set of linear equations and inequalities. In this paper, a method for accurate data dependence testing is proposed. In this method, first, data dependence test problems with any number of linear equations are transformed equivalently to tes...
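    To make the "integer points" formulation concrete, the sketch below shows the classical GCD test for a single linear equation. This is a well-known conservative check, not the method proposed in the paper: it ignores loop bounds and inequalities, so a True result only means a dependence cannot be ruled out. The function name `gcd_test` and the array example are illustrative assumptions.

```python
from functools import reduce
from math import gcd


def gcd_test(coeffs, rhs):
    """Return True if sum(coeffs[i] * x[i]) == rhs has an integer solution.

    Loop bounds are ignored, so True means "dependence possible",
    while False proves that no dependence exists.
    """
    g = reduce(gcd, coeffs, 0)  # gcd of all coefficients (0 if all are zero)
    if g == 0:
        return rhs == 0
    return rhs % g == 0


# Hypothetical loop: A[2*i] is written and A[2*j + 1] is read.
# A dependence needs 2*i == 2*j + 1, i.e. 2*i - 2*j == 1.
may_depend = gcd_test([2, -2], 1)
```

    Here `may_depend` is False, because the gcd of the coefficients (2) does not divide 1: even and odd indices never coincide.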
  • Multi-rate transformation of directional affine recurrence equations

    Publication Year: 1993, Page(s):392 - 403
    Cited by:  Papers (4)

    There has been increased attention to the synthesis of algorithm-specific pipeline arrays such as systolic arrays. Most of the existing synthesis techniques are based on a transformation of the algorithm from a class of Recurrence Equations such as Uniform Recurrence Equations (UREs). However, many algorithms cannot be transformed to a URE and the temporal locality of systolic arrays results ...
  • A practical constant time sorting network

    Publication Year: 1993, Page(s):380 - 391
    Cited by:  Papers (5)

    The authors propose a novel VLSI sorting network implementing Leighton's column sort. The network is mesh-based and modular; it consists of comparison-exchange processing elements (PEs), routing paths, and short broadcast buses. Each bus contains a small number of simple switches that the authors call shift switches. They enhance and simplify the previously proposed shift switching mechanism to ob...
  • Mapping Monte Carlo-Metropolis algorithm onto a double ring architecture

    Publication Year: 1993, Page(s):192 - 195

    Mathematical models are frequently used to evaluate the physical properties of matter; they involve intensive computation: processing times and loan costs soar, using general purpose computers. Remarkable improvements are promised by special purpose architectures, in particular for the parallel execution of the most time consuming routines; the Monte Carlo-Metropolis method has been examined, and ...
  • A massively parallel diagonal-fold array processor

    Publication Year: 1993, Page(s):140 - 143
    Cited by:  Papers (3)  |  Patents (4)

    Image processing for multimedia workstations is a computationally intensive task typically requiring special purpose hardware, for example a nearest neighbor mesh parallel machine organization. One type of nearest neighbor mesh computer consists of a K × K square array of Processor Elements (PEs) where each PE is connected to the North, South, East, and West PEs only. In a torus configuratio...
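    The nearest neighbor connectivity described in this abstract can be sketched as an index calculation. The abstract is cut off before it defines the torus configuration, so the wraparound below follows the standard definition of a torus and is an assumption, as are the function name `torus_neighbors`, the (row, col) indexing, and the convention that row 0 is the northern edge.

```python
def torus_neighbors(row, col, k):
    """Return the North, South, East, and West neighbors of PE (row, col)
    in a k x k mesh whose edges wrap around (torus)."""
    return {
        "north": ((row - 1) % k, col),  # row 0 wraps to row k - 1
        "south": ((row + 1) % k, col),  # row k - 1 wraps to row 0
        "east": (row, (col + 1) % k),   # column k - 1 wraps to column 0
        "west": (row, (col - 1) % k),   # column 0 wraps to column k - 1
    }


corner = torus_neighbors(0, 0, 4)
```

    For the corner PE of a 4 × 4 array, `corner["north"]` is `(3, 0)` and `corner["west"]` is `(0, 3)`, so every PE has exactly four neighbors.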
  • Implementation of large neural associative memories by massively parallel array processors

    Publication Year: 1993, Page(s):357 - 368

    The authors discuss the use of massively parallel array processors for simulating large neural associative memories. Although based on standard matrix operations, the simulation of neural associative memories requires special parallel algorithms because a sparse coding of the input and output information is needed. Four different implementations with different mapping strategies and different array...
  • An efficient algorithm for image-template product on SIMD mesh connected computers

    Publication Year: 1993, Page(s):250 - 260
    Cited by:  Papers (3)

    Convolutions and correlations are fundamental operations in computer vision and image processing. The image-template product expresses convolutions and correlations. In this paper, the authors present an efficient algorithm for the image-template product on SIMD mesh connected computers. The image-template product is computed along disjoint convolution paths of the template. For an M × N ima...
  • Systolic design of a new finite field division/inverse algorithm

    Publication Year: 1993, Page(s):188 - 191
    Cited by:  Papers (1)

    A systolic architecture of a newly developed algorithm for performing division and inversion over GF(2^m) has been successfully realized. It is novel in that the normal inverse/multiplication steps are integrated and the generator polynomial is selectable. The new design with its inherent regularity offers an expandable, fully pipelined high performance circuit, that is very suitable to ...
  • Scheduling partitioned algorithms on processor arrays with limited communication supports

    Publication Year: 1993, Page(s):53 - 64
    Cited by:  Papers (6)

    Array designs, especially the scheduling of partitioned arrays, must cope with various kinds of communication constraints such as interconnection topology, channel bandwidth, and inhomogeneous communication delay. The interprocessor communication requirements can be dictated by the dependence vectors and size of the partitioned tiles. A folded constraint graph is created to de...
  • Digit systolic algorithms for fine-grain architectures

    Publication Year: 1993, Page(s):466 - 477
    Cited by:  Papers (7)

    In this paper, the authors present a novel scheme for performing arithmetic efficiently on fine-grain programmable architectures and FPGA-based systems. They achieve an O(n) speedup over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possibl...
  • Efficient architecture of a programmable block matching processor

    Publication Year: 1993, Page(s):560 - 571
    Cited by:  Papers (3)

    An efficient VLSI architecture of a programmable block matching processor for the emulation of a wide spectrum of full search and reduced complexity search block matching algorithms is presented. Optimized efficiency is obtained by using a quadratic systolic array architecture with global accumulation, combined with a programmable meander-like data flow. Flexibility is further increased by cascada...
  • The PAPRICA SIMD array: Critical reviews and perspectives

    Publication Year: 1993, Page(s):309 - 320
    Cited by:  Papers (9)  |  Patents (3)

    The PAPRICA project started in 1988 as an experimental VLSI architecture devoted to the efficient computation of data with two-dimensional structure. The main goal of the project is to develop a subsystem that could operate as an attached processing unit to a standard workstation and in perspective as a specialized processing module in dedicated systems devoted to low level image analysis, cellula...
  • A simple expert system for the reasoning of systolic designs

    Publication Year: 1993, Page(s):128 - 131

    The author presents a simple expert system developed for reasoning about systolic designs. It is based on the STA formalism, the spatial inductive techniques developed earlier, and a temporal induction technique (briefly introduced in this paper) to perform formal verification of systolic array designs. Induction techniques exploit the regularity and locality attributes of systolic arrays. The sy...
  • M3: A high performance signal processor for RADAR applications

    Publication Year: 1993, Page(s):164 - 167

    Real time radar computing requires high processing performance and fast and efficient I/O capabilities. These goals have been achieved by means of a new multiprocessor architecture based on the Motorola DSP 96002. This system was developed entirely in the FIAR laboratories, and is now a state-of-the-art unit in their avionic radar family. The authors describe a computer system developed specifica...
  • A novel framework for multi-rate scheduling in DSP applications

    Publication Year: 1993, Page(s):77 - 88
    Cited by:  Papers (6)  |  Patents (7)

    The authors present a novel framework for multi-rate scheduling of signal processing programs represented by regular stream flow graphs (RSFGs). The nodes of an RSFG may execute at different rates to avoid unbounded storage requirements under repetitive computation. A distinct feature of the scheduling framework, called multi-rate software pipelining, is to allow maximum overlapping of operatio...
  • A novel architecture for a decision-feedback equalizer using extended signed-digit feedback

    Publication Year: 1993, Page(s):490 - 501
    Cited by:  Papers (1)

    A novel bit-level systolic array architecture for implementing a bit parallel decision-feedback equalizer (DFE) is presented. The core of the architecture is an array multiplier using redundant arithmetic in combination with bit-level feedback. The use of signed-digit (SD) circuitry allows one to feed back each digit as soon as it is available. So the recursive computation can be executed with the mos...
  • An application-specific array architecture for feedforward with backpropagation ANNs

    Publication Year: 1993, Page(s):333 - 344
    Cited by:  Papers (1)

    An application-specific array architecture for Artificial Neural Network (ANN) computation is proposed. This array is configured as a mesh-of-appendixed-trees (MAT). Algorithms to implement both the recall and the training phases of the multilayer feedforward with backpropagation ANN model are developed on MAT. The proposed MAT architecture requires only O(log N) time, while other reported techn...
  • Data flow graphs granularity for overhead reduction within a PE in multiprocessor systems

    Publication Year: 1993, Page(s):136 - 139
    Cited by:  Patents (1)

    The authors propose a method to implement Acyclic Data Flow Graphs (ADFG) in any general purpose multiprocessor system supporting a CSP type language. The granularity of ADFG nodes is discussed. During ADFG analysis the authors use fine granularity to exploit all the parallelism inherent in the problem. When the graph G has been allocated, it is divided into P subgraphs G_k (P is the numb...
  • Systolic evaluation of functions: Digit-level algorithm and realization

    Publication Year: 1993, Page(s):514 - 525
    Cited by:  Papers (1)

    The author presents a novel algorithm for the evaluation of functions. The algorithm is systolic and may be realized as a fully scalable and very regular design consisting of merely full-adders and registers. The algorithm evaluates a polynomial according to the Horner scheme, i.e., it performs a cascade of multiply-and-add operations. Data are represented as two's complement fixed-point numbers t...
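    The Horner scheme mentioned in this abstract can be sketched in a few lines. This is a plain sequential version using ordinary Python numbers, not the paper's digit-level systolic realization or its fixed-point format; the function name `horner` and the coefficient ordering (highest degree first) are assumptions made for illustration.

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x using the Horner scheme.

    coeffs are ordered from the highest-degree term down to the constant.
    """
    acc = 0
    for c in coeffs:
        acc = acc * x + c  # one multiply-and-add step per coefficient
    return acc


# 2*x**2 + 3*x + 4 evaluated at x = 2
value = horner([2, 3, 4], 2)
```

    The loop body is the "cascade of multiply-and-add operations": a polynomial of degree n needs n multiplications and n additions, and here `value` is 18.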
  • A fast, storage-efficient parallel sorting algorithm

    Publication Year: 1993, Page(s):369 - 379

    A parallel sorting algorithm is presented for storage-efficient internal sorting on MIMD machines. The algorithm first sorts the elements within each node using a serial algorithm, then performs a two-phase parallel merge. It requires additional storage on the order of the square root of the number of elements in each node. Performance of the algorithm on two general-purpose MIMD machines, the Fujitsu AP...