Computer Design: VLSI in Computers and Processors, 1989. ICCD '89. Proceedings., 1989 IEEE International Conference on

Date 2-4 Oct. 1989

Displaying Results 1 - 25 of 116
  • Proceedings. 1989 IEEE International Conference on Computer Design: VLSI in Computers and Processors (Cat. No.89CH2794-6)

    Publication Year: 1989 , Page(s): 0_1
    Freely Available from IEEE
  • Optical interconnects for interprocessor communications in the Connection Machine

    Publication Year: 1989 , Page(s): 58 - 61
    Cited by:  Papers (10)

    The Connection Machine computer has 65,536 processors interconnected through a hypercube network. In commercial systems, current costs make short-distance optical fibers attractive only when connector density or wire density is critical. The prototype utilized a 400-Mb/s optical interconnect consisting of two fibers to provide the interprocessor communication between two 8K-processor subcubes. A 1100-Mb/s link will be completed by the end of 1989. In this case, optical interconnects provided over two orders of magnitude improvement in connection density. The architecture, design, and physical characteristics of a prototype optical interconnect for the Connection Machine are described. The future use of optical interconnections in computer systems is explored.

  • FPC: a floating-point processor controller chip for systolic signal processing

    Publication Year: 1989 , Page(s): 14 - 17
    Cited by:  Papers (1)

    The FPC (floating-point processor controller) chip design and the AMD Am29325 32-b floating-point processor mathematics chip form a two-chip cell designed for one- or two-dimensional systolic arrays, which can be used to implement a wide variety of signal processing applications. The FPC controls the Am29325, routes data to and from it, and routes data and control to other cells in the array. Unique features include two interchangeable data memories, an input port which can be used as either a local or global port, and a 32-b instruction word that provides concurrent use of all cell resources. Additional features include a program memory, two data streams, and three control streams. The novel architectural features of the cell are described, and a matrix multiplication example is used to demonstrate their usefulness.

  • A general-purpose video signal processor: architecture and programming

    Publication Year: 1989 , Page(s): 74 - 77
    Cited by:  Papers (6)

    Programming aspects of a new digital, flexible processor especially designed for the effective processing of real-time video signals are addressed. The modular architecture contains a number of programmable, pipelined processing elements. A programmable crossbar switch provides a flexible interconnection between the processing elements. The programs can be constructed with the aid of graphical, interactive tools that abstract from hardware details. Efficient mapping algorithms that automate parts of the programming trajectory have been designed. Data are presented on the hardware utilization that was achieved by applying the tools to a number of video algorithms.

  • Macrocell-level compaction with automatic jog introduction

    Publication Year: 1989 , Page(s): 536 - 539

    A novel algorithm for compacting a VLSI chip on the macrocell level is presented. Compared to previous algorithms, the technique can handle larger designs, produces higher-quality output, and reduces designer intervention as much as possible. Jogs are automatically introduced in the connecting wires to achieve the needed flexibility for placing cells into optimal positions.

  • An automatic test pattern generation program for large ASICs

    Publication Year: 1989 , Page(s): 244 - 248

    An automatic test pattern generation (ATPG) program is described for large application-specific integrated circuits (ASICs) designed with a scan path technique. This program was implemented with an improved deterministic test generation algorithm that makes use of a split model for circuit representation and multiple testability heuristics for efficient search. The program can also interface to a hardware accelerator to speed up the test compaction process.

  • A fast algorithm for mixed-radix conversion in residue arithmetic

    Publication Year: 1989 , Page(s): 18 - 21
    Cited by:  Papers (4)

    An algorithm is presented that is based on a partitioning of the coefficient matrix obtained when the mixed-radix conversion problem is cast as a set of linear congruences. The algorithm partitions the moduli set into disjoint subsets such that the product of the moduli in each subset is less than the largest integer representable by the computer. It is shown that, with this partitioning strategy, the mixed-radix representation of a residue number can be computed using fewer than O(n^2) arithmetic steps, where n is the cardinality of the moduli set. It is also shown that if a good partitioning exists, then the algorithm requires only O(n^1.5) arithmetic steps. The algorithm is particularly suitable for single-processor implementation of algorithms from residue number system applications.

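For orientation, the conversion that the mixed-radix abstract above sets out to speed up can be sketched in its classical form. This is a minimal sketch of the textbook O(n^2) procedure, not the partitioned algorithm of the paper, whose details are not given in the abstract. The function names are hypothetical, the moduli are assumed to be pairwise coprime, and `pow(m, -1, n)` requires Python 3.8 or later.

```python
def mixed_radix_digits(residues, moduli):
    """Textbook mixed-radix conversion (baseline, not the paper's method).

    Returns digits a[0..n-1] such that
    X = a[0] + a[1]*m[0] + a[2]*m[0]*m[1] + ...
    Assumes the moduli are pairwise coprime.
    """
    r = list(residues)
    digits = []
    for i, m_i in enumerate(moduli):
        a_i = r[i] % m_i
        digits.append(a_i)
        # Remove the digit just found from every remaining residue channel.
        for j in range(i + 1, len(moduli)):
            m_j = moduli[j]
            r[j] = ((r[j] - a_i) * pow(m_i, -1, m_j)) % m_j
    return digits


def from_mixed_radix(digits, moduli):
    """Rebuild the integer from its mixed-radix digits."""
    value, weight = 0, 1
    for a, m in zip(digits, moduli):
        value += a * weight
        weight *= m
    return value


moduli = [3, 5, 7]
x = 52
digits = mixed_radix_digits([x % m for m in moduli], moduli)
print(digits, from_mixed_radix(digits, moduli))
```

The nested loop is the source of the quadratic step count that the abstract contrasts with its own bound.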
  • A high performance BiCMOS 32-bit microprocessor

    Publication Year: 1989 , Page(s): 358 - 361
    Cited by:  Papers (1)

    The authors have developed the world's first BiCMOS 32-b single-chip microprocessor. It integrates 529K transistors into a 12.98-mm² chip and typically realizes a 70-MHz frequency. The frequency is 1.5-2 times higher than that of current CMOS microprocessors. The microprocessor is designed to reduce the number of interchip communication signals in the critical path and to use basic cells optimally so as to allow fabrication into a single chip. The microprogram is divided into two parts, and frequently used microinstructions are stored in the ROM on the chip to reduce interchip communication. The translation lookaside buffer is also integrated in the microprocessor to reduce the interchip communication signals for memory access. Because of chip size and logic complexity constraints, less than 20% of the basic cells can be BiCMOS cells.

  • Generic ASIC architecture for digital signal processing

    Publication Year: 1989 , Page(s): 82 - 85
    Cited by:  Papers (7)

    The architectural methodology behind a novel high-level VLSI cell compiler currently under development is described. The tool is aimed specifically at digital signal processing applications, synthesizing powerful arithmetic kernel processors from high-level parameterized schematics. Underlying the tool is a generic pipelined numerical processing architecture, flexible enough in its use of innate parallelism to meet a wide range of throughput requirements with minimal waste of resources. Machines are synthesized using this architecture as a blueprint. To this end, the tool encapsulates the essential knowledge required to assemble a useful set of arithmetic operators over all local and global parametric combinations. Some multiplier instances are illustrated.

  • Issues in the test of artificial neural networks

    Publication Year: 1989 , Page(s): 487 - 490
    Cited by:  Papers (2)

    Test concepts for artificial neural networks are discussed. It is shown that the traditional design-for-test techniques such as (boundary) scan are of limited use owing to the high connectivity and redundancy of neural networks. An information-theoretical approach that allows for wafer as well as chip test is outlined. This approach involves testing directly on the macro properties of the neural network. The influence of device faults and built-in fault tolerance of the network is captured in so-called fault tolerant curves. These curves, obtained by simulation, link the various macro and micro properties together and allow an information-driven test of the neural network.

  • A global floorplanning technique for VLSI layout

    Publication Year: 1989 , Page(s): 92 - 95
    Cited by:  Papers (2)

    The floorplanning of rectangular cells is discussed. A new global approach that simultaneously accounts for different design goals is presented. A key aspect of this approach is a more general slicing structure representation of the floorplan that is not restricted to a special case of rectangular dissection and a two-dimensional partitioning procedure. A new model for the prediction of the associated shape functions is presented, and an analytic optimization technique for the pin allocation is described.

  • Performance and microarchitecture of the i486 processor

    Publication Year: 1989 , Page(s): 182 - 187
    Cited by:  Papers (2)  |  Patents (3)

    The i486 microprocessor includes a carefully tuned, five-stage pipeline with an integrated 8-kB cache. A variety of techniques previously associated only with RISC (reduced-instruction-set computer) processors are used to execute the average instruction in 1.8 clocks. This represents a 2.5× reduction from its predecessor, the 386 microprocessor. The pipeline and clock count comparisons are described in detail. In addition, an on-chip floating-point unit is included, which yields a 4× clock count reduction from the 387 numeric coprocessor. The microarchitecture enhancements and optimizations used to achieve this goal, most of which are non-silicon-intensive, are discussed. All instructions of the 386 microprocessor and the 387 numeric coprocessor are implemented in a completely compatible fashion.

  • Magnitude classes in switch-level modeling

    Publication Year: 1989 , Page(s): 284 - 288
    Cited by:  Papers (5)

    The relationship between switch-level circuit models and the linear electric circuits from which they are abstracted is investigated. This is important in determining the accuracy and consistency of switch-level simulation programs. A precise new definition of magnitude or strength classes is presented, which leads to exact bounds on the accuracy of resistance and voltage calculations with magnitude classes relative to the corresponding linear calculations. The applicability to switch-level networks of standard solution methods for linear networks, including Gaussian elimination and Jacobi iteration, is also examined. The results indicate that the potential of switch-level simulators to provide accurate results is far less than previously thought.

  • Computer aided design and built in self test on the i486 CPU

    Publication Year: 1989 , Page(s): 199 - 202
    Cited by:  Papers (15)  |  Patents (1)

    The computer-aided design tools created to accelerate the design of the i486 CPU are described. Emphasis is on the logic synthesis, layout synthesis, and timing verification programs. The impact of the state-of-the-art tools on the project schedule is described, and the built-in self-test features on the processor are detailed. The type of advanced tools that will be required to design and test future processors, which will be more complex, is considered.

  • SLAM: a smart analog module layout generator for mixed analog-digital VLSI design

    Publication Year: 1989 , Page(s): 24 - 27
    Cited by:  Papers (7)

    A tool for the smart layout of analog modules, which aims to provide a flexible analog layout solution for the mixed analog-digital VLSI design environment, is described. New algorithms have been developed for novel primitive cell recognition, intelligent layout and detail routing, and performance-driven optimization. Software implementation and experimental results on several analog VLSI modules such as operational amplifiers and voltage-controlled oscillators are presented.

  • Counter-based residue arithmetic circuit for easily testable VLSI digital signal processing systems

    Publication Year: 1989 , Page(s): 362 - 365

    A counter-based residue arithmetic circuit, composed of ring counters and performing residue arithmetic operations by pulse counting, is proposed for easily testable VLSI digital signal processing systems. A master-slice LSI on which counter-based residue arithmetic circuits are regularly arranged is also presented. It is demonstrated that the counter-based residue arithmetic circuit has a self-testable structure, and that a highly regular and easily testable system implementation can be realized using the circuits.

  • The matrix transform chip

    Publication Year: 1989 , Page(s): 86 - 89
    Cited by:  Papers (1)  |  Patents (3)

    The matrix transform chip (MTC) is designed to perform matrix computations of the form Y=UDV, where D is the input data matrix of 16-bit two's-complement fixed-point numbers and U and V are arbitrary coefficient matrices of the same precision. The data matrix D is input to the chip in raster-scanned order at a maximum sample rate of 40 MHz, and the output matrix is provided in the same order. On a single chip, the maximum dimension of all matrices must be less than eight, but multiple chips can be cascaded to obtain arbitrary dimensions. The MTC consists of sixteen 16-bit parallel multipliers/40-bit accumulators, a kilobyte of dual-ported transposition static RAM, and a kilobyte of coefficient static RAM, arranged to interact in a regular iterative architecture. At peak operation, the MTC is capable of performing 0.64 billion fixed-point multiplies, 0.64 billion 40-bit accumulates, and 1.92 billion pseudorandom memory-access operations per second.

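The computation named in the matrix transform chip abstract above, Y=UDV over 16-bit two's-complement integers, can be illustrated in software. This is only a sketch of the arithmetic, not of the chip's iterative architecture or its 40-bit accumulators; Python integers never overflow, so accumulator width is not modeled. All function and constant names are hypothetical. The last lines check that sixteen multipliers at a 40-MHz rate are consistent with the quoted 0.64 billion multiplies per second.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def check_int16(matrix):
    """Reject entries that do not fit in 16-bit two's complement."""
    for row in matrix:
        for x in row:
            if not INT16_MIN <= x <= INT16_MAX:
                raise ValueError(f"{x} does not fit in 16 bits")


def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def matrix_transform(u, d, v):
    """Compute Y = U * D * V for 16-bit integer inputs."""
    for m in (u, d, v):
        check_int16(m)
    return matmul(matmul(u, d), v)


u = [[1, 1], [1, -1]]
d = [[1, 2], [3, 4]]
v = [[1, 1], [1, -1]]
print(matrix_transform(u, d, v))

# Peak multiply rate implied by the abstract's figures.
NUM_MULTIPLIERS = 16
SAMPLE_RATE_HZ = 40_000_000
print(NUM_MULTIPLIERS * SAMPLE_RATE_HZ)
```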
  • Simulation of MOS circuit performance degradation with emphasis on VLSI design-for-reliability

    Publication Year: 1989 , Page(s): 492 - 495
    Cited by:  Papers (10)

    A framework for a reliability simulation tool to assess the hot-carrier-induced degradation of MOS circuits is presented, and the major components of this framework are examined. A method is introduced for dynamic simulation of hot-carrier-induced transistor degradation within the circuit environment. The approach accounts for the gradual degradation of terminal voltage waveforms of MOS transistors during long-term operation. It is demonstrated that the estimation of individual device lifetimes is not sufficient for circuit reliability assessment. The critical transistors that are most likely to cause circuit performance failures are identified by combining the long-term degradation estimates with the corresponding circuit performance sensitivities.

  • Novel architecture for a high performance full custom graphics processor

    Publication Year: 1989 , Page(s): 410 - 414
    Cited by:  Patents (21)

    The design and architectural features of a 32-b microprogrammed graphics processor are discussed. A description is given of the application of datapath elements and the exploitation of VRAM hardware to achieve state-of-the-art processor performance levels. The host interface is discussed, along with video control, screen refresh control, and emulation and testability.

  • An efficient approach to pseudo-exhaustive test generation for BIST design

    Publication Year: 1989 , Page(s): 576 - 579
    Cited by:  Papers (8)

    In the built-in self-test (BIST) methodology, the two major problems which must be addressed are test generation and response analysis. An efficient, unified solution to the problem of test generation is presented. A design procedure that is computationally efficient and produces test generation circuitry with low hardware overhead is proposed. The effectiveness of this approach is demonstrated by detailed comparisons of its results with those that would be obtained by existing techniques.

  • Systolic L-U decomposition array with a new reciprocal cell

    Publication Year: 1989 , Page(s): 460 - 465
    Cited by:  Papers (3)

    A systolic architecture for L-U decomposition using a recently developed reciprocal cell is presented. The arithmetic of this new cell is based on second-order polynomial interpolation, which results in high speed and inherent stability in inversion. In part, the speed of the cell is realized by use of a special empirical mapping. For the 16-b mantissa of a floating-point number, the lookup table size required is 64 words, and the RMS value of the error is approximately one-third of the LSB of the 16-b result.

  • Locality characteristics of symbolic programs

    Publication Year: 1989 , Page(s): 508 - 511

    By analyzing the virtual address traces of artificial intelligence applications, characteristics have been found in the temporal, spatial, and structural locality of the virtual memory references during symbolic program execution. These locality characteristics differ significantly from those of conventional workloads. This analysis is made using the author's two-state semi-Markov model of memory referencing behavior and not only reveals aspects of temporal and spatial locality that are much stronger in symbolic workloads, but also uncovers a high degree of structural locality in both types of workloads. Based on these findings, a unique memory system design that exploits these special reference locality characteristics of symbolic workloads is proposed.

  • High performance I/O processors for real-time pulse handling

    Publication Year: 1989 , Page(s): 415 - 418
    Cited by:  Papers (1)

    Two peripheral processor LSIs, the FTI (fast timed input port) and the FTO (fast timed output port), have been developed for real-time pulse handling. By using the time-wheel scheme, these processors provide a high-level command interface with the host CPU, thus alleviating the CPU load. New features, such as time difference measurement between channels and user reprogrammability during operation, have been realized using this approach. The prototypes of both the FTI and the FTO were designed and fabricated using a 1.5-μm CMOS sea-of-gates technology, and they demonstrated the effectiveness of the time-wheel scheme.

  • A cached system architecture dedicated for the system IO activity on a CPU board

    Publication Year: 1989 , Page(s): 518 - 522
    Cited by:  Patents (1)

    The architecture of a cached IO subsystem on a CPU board of a high-performance workstation is described. The cached IO subsystem is intended to reduce the memory latency for IO activity and minimize the number of processor cycles stolen by IO traffic, to achieve a better-balanced computer system in terms of both the CPU computation power and the system IO bandwidth. The model of the cached IO subsystem is shown to illustrate the system architecture. The discussion covers arbitration of system IO activity, the architecture of the virtual-address-accessed IO cache, the cache coherency algorithm, and a comparison of theoretical bounds of IO bandwidth and CPU degradation between different caching schemes.

  • A VLSI module for IEEE floating-point multiplication/division/square root

    Publication Year: 1989 , Page(s): 366 - 368
    Cited by:  Patents (5)

    The major objective of this VLSI module design is to determine how to modify a fast floating-point multiplier so that it can perform division and square root in accordance with IEEE standards. This has been achieved by applying the Newton-Raphson iteration only on the mantissa and adjusting the iterated result by a rounding algorithm. Using 1.0-μm CMOS standard cell technology, the total area of this module is approximately 7.0 mm×6.5 mm, which is just 25% larger than the floating-point multiplier. The module can compute multiplication, division, and square root in 3, 31, and 43 cycles, respectively. The cycle time, under nominal conditions, is expected to be 20 ns.

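The Newton-Raphson iteration mentioned in the floating-point module abstract above uses only multiplies and subtractions, which is why it suits a multiplier-based datapath. The following is a minimal sketch of the textbook iterations for a mantissa in [1, 2), using Python floats. The linear starting guesses, the fixed iteration count, and the function names are assumptions made for illustration; the module's actual seed and its IEEE rounding step are not described in the abstract and are not modeled here.

```python
import math


def nr_reciprocal(d, iterations=5):
    """Approximate 1/d for 1 <= d < 2 with x <- x * (2 - d * x)."""
    x = 1.5 - 0.5 * d  # assumed linear seed, exact at d = 1 and d = 2
    for _ in range(iterations):
        x = x * (2.0 - d * x)
    return x


def nr_divide(n, d):
    """Quotient n/d as n times the iterated reciprocal of d."""
    return n * nr_reciprocal(d)


def nr_sqrt(d, iterations=5):
    """Approximate sqrt(d) for 1 <= d < 2 via the reciprocal square root.

    Iterates y <- y * (3 - d * y * y) / 2, then returns d * y.
    """
    y = 1.0 - 0.3 * (d - 1.0)  # assumed linear seed for 1/sqrt(d)
    for _ in range(iterations):
        y = 0.5 * y * (3.0 - d * y * y)
    return d * y


print(nr_divide(1.75, 1.25), 1.75 / 1.25)
print(nr_sqrt(1.69), math.sqrt(1.69))
```

Each iteration roughly squares the relative error, so a few steps reach double precision from these coarse seeds; a hardware design would typically choose the seed and step count to fit its cycle budget.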