
Application-Specific Array Processors, 1993. Proceedings., International Conference on

25-27 Oct. 1993

  • Proceedings of International Conference on Application Specific Array Processors (ASAP '93)

    Publication Year: 1993
    Freely Available from IEEE
  • The Xor embedding: An embedding of hypercubes onto rings and toruses

    Publication Year: 1993, Page(s):15 - 28
    Cited by:  Papers (3)

    Many parallel algorithms use hypercubes as the communication topology among processes, which makes them suitable for execution on a hypercube multicomputer. In this way the communication cost is kept to a minimum, since processes can be allocated to processors in such a way that only communication between neighbor processors is required. However, the scalability of hypercube multicomputer is constr...

  • A wavefront array processor for on the fly processing of digital video streams

    Publication Year: 1993, Page(s):101 - 108

    The authors present a wavefront array processor architecture developed at ETCA and dedicated to real-time processing of digital video streams. The core of the architecture is a mesh-connected three-dimensional network of 1024 custom processing elements. Each processing element can perform up to 50 million 8- or 16-bit operations per second, working with a 25 MHz clock frequency. Thus algorithms a...

  • A real-time systolic algorithm for on-the-fly hidden surface removal

    Publication Year: 1993, Page(s):238 - 249

    Hidden surface removal for real-time realistic display of complex scenes requires intensive computation and justifies usage of parallelism to provide the needed response time. The authors present a systolic algorithm that identifies visible segments on a scanline with the "real-time" characteristic: visible segments are output on-the-fly as soon as segments are input to the systolic array. The pro...

  • A massively parallel diagonal-fold array processor

    Publication Year: 1993, Page(s):140 - 143
    Cited by:  Papers (3)  |  Patents (4)

    Image processing for multimedia workstations is a computationally intensive task typically requiring special purpose hardware, for example a nearest neighbor mesh parallel machine organization. One type of nearest neighbor mesh computer consists of a K × K square array of Processor Elements (PEs) where each PE is connected to the North, South, East, and West PEs only. In a torus configuratio...

  • Data flow graphs granularity for overhead reduction within a PE in multiprocessor systems

    Publication Year: 1993, Page(s):136 - 139
    Cited by:  Patents (1)

    The authors propose a method to implement Acyclic Data Flow Graphs (ADFG) in any general purpose multiprocessor system supporting a CSP type language. The granularity of ADFG nodes is discussed. During ADFG analysis the authors use fine granularity to exploit all the parallelism inherent in the problem. When the graph G has been allocated, it is divided into P subgraphs G_k (P is the numb...

  • RELACS for systolic programming

    Publication Year: 1993, Page(s):132 - 135
    Cited by:  Papers (7)

    The RELACS language is a systolic programming language that simplifies the programmer's task by making explicit the data flow of systolic algorithms and by exposing the data delivery mechanism. The underlying architecture model differs from other SIMD architectures in that it physically separates computation and data management. The authors introduce the RELACS language as a syntactic and a...

  • A highly-parallel match architecture for AI production systems using application-specific associative matching processors

    Publication Year: 1993, Page(s):180 - 183

    Here, a highly-parallel two-layer match architecture using specific associative matching processors (AMPs) is proposed to reduce the execution time of the match process of AI production systems. Each AMP comprises a 2D array of content-addressable memories, called CAM blocks. The architecture first compiles the left-hand side (LHS) of each production into a symbolic form, and then assigns a number of con...

  • Efficient architecture of a programmable block matching processor

    Publication Year: 1993, Page(s):560 - 571
    Cited by:  Papers (3)

    An efficient VLSI architecture of a programmable block matching processor for the emulation of a wide spectrum of full search and reduced complexity search block matching algorithms is presented. Optimized efficiency is obtained by using a quadratic systolic array architecture with global accumulation, combined with a programmable meander-like data flow. Flexibility is further increased by cascada...

  • The PAPRICA SIMD array: Critical reviews and perspectives

    Publication Year: 1993, Page(s):309 - 320
    Cited by:  Papers (9)  |  Patents (3)

    The PAPRICA project started in 1988 as an experimental VLSI architecture devoted to the efficient computation of data with two-dimensional structure. The main goal of the project is to develop a subsystem that could operate as an attached processing unit to a standard workstation and in perspective as a specialized processing module in dedicated systems devoted to low level image analysis, cellula...

  • A simple expert system for the reasoning of systolic designs

    Publication Year: 1993, Page(s):128 - 131

    The author presents a simple expert system developed for the reasoning of systolic designs. It is based on the STA formalism, the spatial inductive techniques developed earlier, and a temporal induction technique (briefly introduced in this paper) to perform formal verification of systolic array designs. Induction techniques exploit the regularity and locality attributes of systolic arrays. The sy...

  • Matrix-matrix multiplications and fault tolerance on hypercube multiprocessors

    Publication Year: 1993, Page(s):176 - 180

    Several new algorithms for matrix-matrix multiplications on hypercube multiprocessors are presented and evaluated based on the number of multiplications, additions, and transfers. The matrices to be multiplied are uniformly distributed to all processors of a hypercube system. Each processor owns some submatrices which are derived by dividing the source matrices. Each submatrix multiplication can n...

  • An optimal algo-tech-cuit for the knapsack problem

    Publication Year: 1993, Page(s):548 - 559
    Cited by:  Papers (3)

    The authors first present a formal derivation and proof of correctness of a systolic array for the knapsack problem, an NP-complete problem whose dependency graph is not completely known statically. With q PEs, each with a fixed size memory, the array runs in Γ(mc/q), which gives optimal speedup of the algorithm. However, it has an intricate tag-based control mechanism which is diffic...

  • Low-power polygon renderer for computer graphics

    Publication Year: 1993, Page(s):200 - 213
    Cited by:  Papers (3)  |  Patents (14)

    Polygon rasterization is the most computationally and memory-intensive stage in rendering synthesized computer images. The authors present a low-power, real-time hardware implementation for this task. Rasterization of two-dimensional Gouraud-shaded polygons at 90,000 polygons/sec is achievable with computational power consumption of about 12 mW at 1.5 V operation, using an array configuration of 16 re...

  • A new formulation of the mapping conditions for the synthesis of linear systolic arrays

    Publication Year: 1993, Page(s):297 - 308
    Cited by:  Papers (1)

    The authors present a new formulation for mapping algorithms into linear systolic arrays. The closed-form necessary and sufficient mapping conditions are derived to identify mappings without computation conflicts and data link collisions. These mapping conditions are easy to check because their constituent variables are the space-time mapping matrix and the problem size parameters. The design of o...

  • Digit systolic algorithms for fine-grain architectures

    Publication Year: 1993, Page(s):466 - 477
    Cited by:  Papers (7)

    In this paper, the authors present a novel scheme for performing arithmetic efficiently on fine-grain programmable architectures and FPGA-based systems. They achieve an O(n) speedup over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possibl...

  • Systolic evaluation of functions: Digit-level algorithm and realization

    Publication Year: 1993, Page(s):514 - 525
    Cited by:  Papers (1)

    The author presents a novel algorithm for the evaluation of functions. The algorithm is systolic and may be realized as a fully scalable and very regular design consisting of merely full-adders and registers. The algorithm evaluates a polynomial according to the Horner scheme, i.e., it performs a cascade of multiply-and-add operations. Data are represented as two's complement fixed-point numbers t...

  • On synthesizing application-specific array architectures from behavioral specifications

    Publication Year: 1993, Page(s):124 - 127
    Cited by:  Patents (1)

    The authors describe a design framework, Architect, being developed for synthesizing application-specific array architectures from behavioral specifications to Register-Transfer (RT) descriptions, which can be identified as a number of cooperating tasks: signal transformations, hardware mapping expressed as, in general, nonlinear mapping and scheduling function with hardware constraints, memory m...

  • Multi-rate transformation of directional affine recurrence equations

    Publication Year: 1993, Page(s):392 - 403
    Cited by:  Papers (4)

    There has been increased attention to the synthesis of algorithm-specific pipeline arrays such as systolic arrays. Most of the existing synthesis techniques are based on a transformation of the algorithm from a class of Recurrence Equations such as Uniform Recurrence Equations (UREs). However, many algorithms cannot be transformed to a URE and the temporal locality of systolic arrays results ...

  • Processing of variable size images on a cellular array: Performance analysis with the Abingdon Cross Benchmark

    Publication Year: 1993, Page(s):172 - 175

    Handling a continuous flow of variable size images is a requirement for real time computer vision machines. A modular system based on a small size SIMD cellular array of 1-bit processing elements has been developed with this goal in mind and it is now evaluated against the Abingdon Cross Benchmark specifications. The benchmark tests the combination of algorithms and architecture and generates a qu...

  • Systolic design of a new finite field division/inverse algorithm

    Publication Year: 1993, Page(s):188 - 191
    Cited by:  Papers (1)

    A systolic architecture of a newly developed algorithm for performing division and inversion over GF(2^m) has been successfully realized. It is novel in that the normal inverse/multiplication steps are integrated and the generator polynomial is selectable. The new design with its inherent regularity offers an expandable, fully pipelined high performance circuit, that is very suitable to ...

  • A CAD tool for electromagnetic simulation on the associative string processor

    Publication Year: 1993, Page(s):583 - 592
    Cited by:  Papers (2)

    The associative string processor (ASP) is a scalable fine-grained SIMD architecture, optimized for image processing tasks. In this paper the authors describe a new application for this available architecture. The finite difference time domain (FDTD) method is a technique used to solve electromagnetic boundary value problems. The technique produces results in the time domain which can then be post-p...

  • A period-processor-time-minimal schedule for cubical mesh algorithms

    Publication Year: 1993, Page(s):261 - 272
    Cited by:  Papers (1)

    The paper, using a directed acyclic graph (dag) model of algorithms, investigates precedence constrained multiprocessor schedules for the n × n × n directed mesh. This cubical mesh is fundamental, representing the standard algorithm for square matrix product, as well as many other algorithms. Its completion requires at least 3n - 2 multiprocessor steps. Time-minimal multiprocessor schedu...

  • Time-optimal visibility-related algorithms on meshes with multiple broadcasting

    Publication Year: 1993, Page(s):226 - 237
    Cited by:  Papers (1)

    The compaction step of integrated circuit design motivates the study of various visibility problems among vertical segments in the plane. One popular variant is referred to as the Vertical Segment Visibility problem (VSV, for short) and is stated as follows. Given a collection S of n disjoint vertical line segments in the plane, for every endpoint of a segment in S determine the first line segment...

  • VLSI array synthesis for polynomial GCD computation

    Publication Year: 1993, Page(s):536 - 547
    Cited by:  Papers (2)  |  Patents (1)

    Polynomial GCD (greatest common divisor) finding is an important problem in algebraic computation, especially in decoding error correcting codes. The authors show a new systolic array structure for the polynomial GCD problem using a systematic array synthesis technique. The VLSI implementation of the array structure is area-efficient and achieves maximum throughput with pipelining. The dependency ...

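
The abstract of "Systolic evaluation of functions: Digit-level algorithm and realization" in the list above describes evaluating a polynomial with the Horner scheme as a cascade of multiply-and-add operations. As a rough illustration of that scheme only, here is a minimal software sketch. It is not the paper's digit-level systolic design, and the function name `horner` and the coefficient ordering (highest degree first) are assumptions made for this example.

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x using the Horner scheme.

    coeffs is assumed to be ordered from the highest-degree
    coefficient down to the constant term (an illustrative
    convention, not taken from the paper).
    """
    acc = 0
    for c in coeffs:
        # One multiply-and-add step of the cascade.
        acc = acc * x + c
    return acc


# 2x^3 - 6x^2 + 2x - 1 evaluated at x = 3
print(horner([2, -6, 2, -1], 3))  # prints 5
```

Each loop iteration corresponds to one multiply-and-add stage, so a degree-d polynomial needs d multiplications and d additions. A hardware realization such as the one the abstract mentions would presumably pipeline these stages, but that mapping is not shown here.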