Application Specific Array Processors, 1994. Proceedings. International Conference on

Date 22-24 Aug. 1994

  • On the injectivity of modular mappings

    Publication Year: 1994, Page(s):236 - 247
    Cited by:  Papers (6)

    Affine space-time mappings have been extensively studied for systolic array design and parallelizing compilation. However, there are practical important cases that require other types of transformations. This paper considers so-called modular mappings described by linear transformations modulo a constant vector. Sufficient conditions for these mappings to be one-to-one are investigated for rectang...
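
To make the notion of a modular mapping concrete, here is a minimal brute-force sketch. It assumes the mapping has the form x -> (T x) mod m with the modulus applied componentwise, and that the index domain is a rectangular box starting at the origin. The function names and the exhaustive test are illustrative assumptions; they are not the sufficient conditions studied in the paper.

```python
from itertools import product


def modular_map(T, m, x):
    """Apply x -> (T x) mod m, taking the modulus componentwise."""
    return tuple(
        sum(t * xi for t, xi in zip(row, x)) % mi
        for row, mi in zip(T, m)
    )


def is_injective(T, m, bounds):
    """Exhaustively test injectivity on the box 0 <= x[i] < bounds[i]."""
    seen = set()
    for x in product(*(range(b) for b in bounds)):
        y = modular_map(T, m, x)
        if y in seen:
            return False  # two distinct index points share an image
        seen.add(y)
    return True
```

For example, T = [[1, 1], [0, 1]] with m = (4, 4) is one-to-one on a 4×4 box, while T = [[1, 1], [1, 1]] is not, because the points (0, 1) and (1, 0) collide.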
  • Proceedings of IEEE International Conference on Application Specific Array Processors (ASSAP'94)

    Publication Year: 1994
    Freely Available from IEEE
  • A high performance IIR filter chip and its evaluation system

    Publication Year: 1994, Page(s):22 - 32

    A highly flexible programmable IIR filter chip has been designed and fabricated to commercial requirements within a collaborative project involving several industrial partners. The device uses 8 highly regular 16 bit array multiplier-accumulators which have been pipelined to achieve an overall computational rate of 30 MHz using a 1 micron gate array process. Most significant bit first arithmetic h...
  • Automated design of DSP array processor chips

    Publication Year: 1994, Page(s):33 - 44

    Details are presented of the DAC (DSP ASIC Compiler) silicon compiler framework. DAC allows a non-specialist to automatically design DSP ASICs and DSP ASIC cores directly from a high level specification. Typical designs take only several minutes and the resulting layouts are comparable in area and performance to handcrafted designs.
  • Behavioral synthesis of high performance, low cost, and low power application specific processors for linear computations

    Publication Year: 1994, Page(s):45 - 56
    Cited by:  Papers (2)

    Throughput has traditionally been recognized as the most popular performance metric for implementation of application specific computations. However, increasingly applications such as embedded controllers impose constraints on both throughput and latency as important metrics of speed. Although throughput alone can be arbitrarily improved for several classes of systems using previously publi...
  • A processor-time-minimal schedule for the standard tensor product algorithm

    Publication Year: 1994, Page(s):176 - 187
    Cited by:  Papers (1)

    The paper, using a directed acyclic graph (dag) model of algorithms, investigates precedence constrained multiprocessor schedules for the n×n×n×n directed mesh. Its completion requires at least 4n-3 multiprocessor steps. Time-minimal multiprocessor schedules that use as few processors as possible are called processor-time-minimal. For the 4D mesh, such a schedule requires at leas...
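
The 4n-3 step bound quoted in this abstract can be reproduced for small n. The sketch below assumes that "directed mesh" means each node (i, j, k, l) has an arc to its successor along each of the four axes; that reading and the function names are assumptions, not taken from the paper. It computes the earliest step at which each node can run when unlimited processors are available.

```python
from itertools import product


def earliest_steps(n, dims=4):
    """Earliest step (1-based) for every node of an n^dims directed mesh."""
    step = {}
    # Lexicographic order is a topological order: every predecessor differs
    # by -1 in one coordinate, so it is generated before its successor.
    for node in product(range(n), repeat=dims):
        preds = [
            node[:d] + (node[d] - 1,) + node[d + 1:]
            for d in range(dims)
            if node[d] > 0
        ]
        step[node] = 1 + max((step[p] for p in preds), default=0)
    return step


def completion_time(n, dims=4):
    """Length, in steps, of the longest dependence chain in the mesh."""
    return max(earliest_steps(n, dims).values())
```

Under this model the longest chain runs from corner (0, 0, 0, 0) to corner (n-1, n-1, n-1, n-1) and contains 4(n-1) + 1 = 4n-3 nodes, so no schedule can finish sooner.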
  • Distributed control synthesis for data-dependent iterative algorithms

    Publication Year: 1994, Page(s):57 - 68
    Cited by:  Papers (2)

    Data-dependent control flow changes are typically implemented in complex general-purpose controllers. However, in medium to fine-grained iterative algorithms found in DSP and arithmetic, it is desirable for both cost and performance reasons to develop simplified and distributed control structures throughout the array architectures. We present a transformation technique to systematically convert an...
  • A processor for calorimetry at the Large Hadron Collider in the FERMI project

    Publication Year: 1994, Page(s):188 - 199
    Cited by:  Papers (2)

    A dedicated digital signal processor has been designed as part of a fully digital front-end for calorimetric detectors developed for experiments in high-energy particle physics to be carried out at CERN with the Large Hadron Collider. Its function is to evaluate the collision energy by analyzing the 16-bit samples (at 67 Msamples per second) contained in a time window through convolutions with 10...
  • Fast linear Hough transform

    Publication Year: 1994, Page(s):1 - 9
    Cited by:  Papers (6)

    The Hough transform is the technique of choice for identifying straight lines in digital images, with applications to high energy physics and computer vision. Classical methods for implementing the Hough transform of an N×N binary image require computing N^3 additions over integers of n = log2(N) bits, hence nN^3 bit operations per transform. We introduce a new ...
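
As a worked instance of the classical cost quoted in this abstract, the following sketch evaluates N^3 additions on n = log2(N)-bit integers. It assumes N is a power of two so that log2(N) is an integer; the function name is hypothetical, and the code only restates the arithmetic in the abstract, not the paper's new algorithm.

```python
def classical_hough_cost(N):
    """Return (n, additions, bit_operations) for an N x N binary image."""
    if N < 2 or N & (N - 1):
        raise ValueError("this sketch assumes N is a power of two")
    n = N.bit_length() - 1   # exact log2(N) for a power of two
    additions = N ** 3       # N^3 additions per transform
    return n, additions, n * additions
```

For a 256×256 image this gives n = 8, 16,777,216 additions, and 134,217,728 bit operations per transform.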
  • Parallel architectures for computing the Hough transform and CT image reconstruction

    Publication Year: 1994, Page(s):152 - 163
    Cited by:  Papers (3)

    This paper discusses high-speed array implementations of two image processing algorithms, namely the `Hough transform for detection of line segments', and `backprojection in CT image reconstruction'. A multi-chip-module (MCM) construction is proposed consisting of three types of chips, a high speed multi-function nonlinear chip, a flexible multiply-accumulate chip, and an image kernel chip. Called...
  • Analog VLSI arrays for morphological image processing

    Publication Year: 1994, Page(s):132 - 142
    Cited by:  Papers (10)

    A two-dimensional analog VLSI array that performs basic morphological image processing operations is presented. The system uses a smart pixel approach that facilitates the parallel computation of continuous real-time outputs. Photodetectors within the array of smart pixels also allow for parallel optical inputs. The processing is performed by current-mode circuitry implemented with CMOS technology...
  • Register transfer modeling and simulation for array processors

    Publication Year: 1994, Page(s):111 - 122
    Cited by:  Patents (3)

    This paper presents a register transfer modeling scheme for array processor simulation. Its main goals are to verify the application specific design by real data computation, and to help fine tune the array architecture by precise timing analysis. The data flow graph of the design is translated into a register transfer language which is further combined with a hardware description module. An inter...
  • Regular array synthesis using ALPHA

    Publication Year: 1994, Page(s):200 - 211
    Cited by:  Papers (8)

    We report our current research in a computer assisted methodology for synthesizing regular array processors using the ALPHA language and design environment. The design process starts from an algorithmic level description of the function and finishes with a netlist of an array processor which performs the specified function. To illustrate the proposed approach, we present the design of an array pro...
  • Optimal mapping of systolic algorithms by regular instruction shifts

    Publication Year: 1994, Page(s):224 - 235
    Cited by:  Papers (2)

    This paper addresses the problem of determining efficient mappings of systems of affine recurrence equations into regular arrays, in a nearly space-optimal fashion. A new nonlinear allocation technique is presented: the Instruction Shift. It allows planar regular arrays to be synthesized without increasing the initial linear schedule. This technique is illustrated with the LL^t Cholesky fact...
  • A parallel DSP-based neural network emulator with CMOS VLSI packet switching hardware

    Publication Year: 1994, Page(s):381 - 391
    Cited by:  Papers (2)

    This work describes a parallel neural network emulator which uses standard DSPs and application-specific VLSI communication processors with an integrated hardware routing algorithm. The use of DSPs as programmable processing elements enables the emulation of different types of neurons including biologically inspired models with learnable synaptic weights and delays, variable neuron gain, and stati...
  • An optimisation methodology for array mapping of affine recurrence equations in video and image processing

    Publication Year: 1994, Page(s):415 - 426
    Cited by:  Papers (2)  |  Patents (4)

    This paper addresses the problem of deriving optimised array architectures for real-time multi-dimensional signal processing systems, as occurring in image, speech and video applications. The starting point is a set of Weak Single Assignment Codes. For this abstract specification, we solve the difficult task of finding a globally optimised architecture with matched throughput while avoiding an exp...
  • Rapid prototyping with programmable control paths

    Publication Year: 1994, Page(s):69 - 74

    The provision of a programmable control path allows a designer to experimentally build and evaluate many different instruction sets and data paths in a short period of time. For this approach to be practical, the designer needs a way to quickly modify the control path hardware to reflect the changes in the instruction set. To this end, we describe a flexible and efficient method for generating con...
  • A parallel system for photo realistic artificial scene rendering

    Publication Year: 1994, Page(s):314 - 323
    Cited by:  Papers (1)

    We present a parallel system for fast rendering of artificial scenes with photo realism. The underlying parallel algorithm is based on ray-tracing and radiosity shading. The system consists of a standard workstation, a medium-size mesh of cluster processors and a high-bandwidth interconnection between them. Each cluster processor consists of a programmable TMS320C40 core and three dedicated VLSI s...
  • A variable-precision interval arithmetic processor

    Publication Year: 1994, Page(s):248 - 258
    Cited by:  Papers (3)  |  Patents (1)

    This paper presents a special-purpose processor which implements variable-precision, interval arithmetic. Variable-precision arithmetic allows the precision of the computation to be specified, based on the problem to be solved and the required accuracy of the computation. Interval arithmetic produces two values for each result, such that the true result is guaranteed to be between the two values. ...
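
The two ideas in this abstract, a caller-chosen precision and a pair of bounds enclosing the true result, can be illustrated in software with Python's standard `decimal` module. This is only a sketch under stated assumptions: it uses decimal digits and directed rounding modes as a stand-in, the function names are hypothetical, and it says nothing about the number format or datapath of the processor described in the paper.

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR


def interval_divide(a, b, precision):
    """Return (lower, upper) bounds on a / b at `precision` significant digits."""
    down = Context(prec=precision, rounding=ROUND_FLOOR)    # round toward -inf
    up = Context(prec=precision, rounding=ROUND_CEILING)    # round toward +inf
    x, y = Decimal(a), Decimal(b)
    return down.divide(x, y), up.divide(x, y)


def interval_add(p, q, precision):
    """Add intervals p and q, rounding the bounds outward."""
    down = Context(prec=precision, rounding=ROUND_FLOOR)
    up = Context(prec=precision, rounding=ROUND_CEILING)
    return down.add(p[0], q[0]), up.add(p[1], q[1])
```

With four digits, `interval_divide(1, 3, 4)` encloses 1/3 between 0.3333 and 0.3334; raising the precision to eight digits narrows the enclosure to 0.33333333 and 0.33333334.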
  • Algorithms and architectures for hierarchical compression of video

    Publication Year: 1994, Page(s):10 - 21
    Cited by:  Patents (6)

    The paper addresses the problem of collaborative video over “heterogeneous” networks. Current standards for video compression are not designed to deal with this problem. We define an additional set of metrics (i.e., in addition to the standard rate versus distortion measure) to evaluate compression algorithms for this application. We also present an efficient algorithm and corresponding...
  • Verification of regular architectures using ALPHA: a case study

    Publication Year: 1994, Page(s):164 - 175

    We present a formal method for the verification of regular VLSI architectures. In our method, the behavioral specification of the chip and its implementation are first expressed in ALPHA, a language for the design of regular synchronous architectures. The behavioral specification is refined down to an abstract architecture description, while the implementation is simplified by induction techniques...
  • A fast pipelined FFT unit

    Publication Year: 1994, Page(s):143 - 151
    Cited by:  Papers (3)

    This paper is dedicated to the presentation of the architecture of a VLSI butterfly processing element, for computing FFT in serial arithmetic. This butterfly PE uses complex samples and weights, with real and imaginary parts represented separately in full fractional two's complement form. The PE is based on a compact serial/parallel to serial complex multiplier, which optimises complex multiplica...
  • A systolic array for 2-D DFT and 2-D DCT

    Publication Year: 1994, Page(s):123 - 131
    Cited by:  Papers (18)

    A new approach for computing the 2-D DFT (discrete Fourier transform) and 2-D DCT (discrete cosine transform) is presented. A new design of a systolic array for transposed matrix multiplication is also shown in this paper. The new 2-D DFT/DCT avoids the need for the array transposer that was required by earlier implementations, and all processing can be pipelined easily. This approach employs a si...
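
One way to see why a transposed matrix multiplication can remove the explicit transpose step is the identity Z = F_M X F_N^T for the 2-D DFT, which can be evaluated as two products of the form A·B^T. The sketch below is a plain software illustration of that identity under this assumption; the function names are hypothetical, and it is not a model of the systolic array described in the paper.

```python
import cmath


def dft_matrix(N):
    """N x N DFT matrix with entries exp(-2*pi*i*j*k/N)."""
    return [[cmath.exp(-2j * cmath.pi * j * k / N) for k in range(N)]
            for j in range(N)]


def matmul_bt(A, B):
    """Compute A * B^T by pairing rows of A with rows of B (no transpose stored)."""
    return [[sum(a * b for a, b in zip(row_a, row_b)) for row_b in B]
            for row_a in A]


def dft2(X):
    """2-D DFT of an M x N matrix using two transposed multiplications."""
    F_rows = dft_matrix(len(X))      # F_M, transforms along columns
    F_cols = dft_matrix(len(X[0]))   # F_N, transforms along rows
    T = matmul_bt(F_cols, X)         # T = F_N * X^T      (N x M)
    return matmul_bt(F_rows, T)      # Z = F_M * T^T = F_M * X * F_N^T  (M x N)
```

The intermediate result T comes out already transposed, so the second product consumes it directly and no separate transposition pass is needed.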
  • Optimal synthesis of application specific heterogeneous pipelined multiprocessors

    Publication Year: 1994, Page(s):99 - 110
    Cited by:  Papers (3)  |  Patents (2)

    We present a technique and formal model for optimal synthesis of specialized heterogeneous multiprocessors, given task flow graphs to be executed in a pipelined (periodic) fashion. SOS is a formal approach to system synthesis using mixed integer-linear programming, ensuring optimality of the final solutions. SOS was extended to cover the pipelined design style. The extensions were made while trying...
  • Data compiling for systems of affine recurrence equations

    Publication Year: 1994, Page(s):212 - 223
    Cited by:  Papers (3)

    In order to get a parallel solution from a system of affine recurrence equations, a space-time transformation must first be determined. Such a transformation is characterized by a schedule and an allocation. In the context of data parallelism, efficient compilers require among other criteria appropriate data compiling techniques. These techniques should take into account the communication primitiv...