Very Large Scale Integration (VLSI) Systems, IEEE Transactions on

Issue 6 • Date Dec. 2000

  • The interpretation and application of Rent's rule

    Page(s): 639 - 648

    This paper provides a review of both Rent's rule and the placement models derived from it. It is proposed that the power-law form of Rent's rule, which predicts the number of terminals required by a group of gates for communication with the rest of the circuit, is a consequence of a statistically homogeneous circuit topology and gate placement. The term "homogeneous" is used to imply that quantities such as the average wire length per gate and the average number of terminals per gate are independent of the position within the circuit. Rent's rule is used to derive a variety of net length distribution models and the approach adopted in this paper is to factor the distribution function into the product of an occupancy probability distribution and a function which represents the number of valid net placement sites. This approach places diverse placement models under a common framework and allows the errors introduced by the modeling process to be isolated and evaluated. Models for both planar and hierarchical gate placement are presented.
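
The power-law form of Rent's rule mentioned in this abstract is commonly written T = k * N^p, where T is the number of terminals of a block, N is the number of gates in it, k is the average number of terminals per gate, and p is the Rent exponent. The following is a minimal sketch of that relation only. The function name and the parameter values are illustrative assumptions and are not taken from the paper.

```python
def rent_terminals(num_gates: int, k: float, p: float) -> float:
    """Return T = k * N**p, the terminal count predicted by Rent's rule."""
    if num_gates < 1:
        raise ValueError("num_gates must be at least 1")
    return k * num_gates ** p


# Illustrative values only (assumptions, not data from the paper).
TERMINALS_PER_GATE = 3.0
RENT_EXPONENT = 0.6

if __name__ == "__main__":
    for n in (1, 16, 256, 4096):
        t = rent_terminals(n, TERMINALS_PER_GATE, RENT_EXPONENT)
        print(f"N = {n:5d}  predicted terminals = {t:.1f}")
```

Because p is below 1 for typical circuits, the terminal count grows more slowly than the gate count, which is what makes the rule useful for estimating wiring demand from block size.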

  • Prediction of net-length distribution for global interconnects in a heterogeneous system-on-a-chip

    Page(s): 649 - 659

    A system-on-a-chip (SoC) contains several pre-designed heterogeneous megacells that have been designed and routed optimally. In this paper a new stochastic net-length distribution for global interconnects in a nonhomogeneous SoC is derived using novel models for netlist, placement, and routing information. The netlist information is rigorously derived based on a heterogeneous Rent's rule, the placement information is modeled by assuming a random placement of terminals for a given net in a bounding area, and the routing information is constructed based on a new model for minimum rectilinear Steiner tree (MRST) construction. The combination of the three models gives an a priori estimate of the global net-length distribution in a heterogeneous SoC. Unlike previous models that empirically relate the average length of the global wires to the chip area, the new distribution provides a complete and accurate distribution of net-length for global interconnects. Through comparison with actual product data, it is shown that the new stochastic model successfully predicts the global net-length distribution of a heterogeneous system.

  • Heterogeneous architecture models for interconnect-motivated system design

    Page(s): 660 - 670

    On-chip interconnect demand is becoming the dominant factor in modern processor performance and must be estimated early in the design process. This paper presents a set of heterogeneous architectural models that combines architecture description and Rent's rule-based wiring models. These architecture models allow flexible heterogeneous system specifications, enabling investigations of prospective designs in different technology scenarios. Comparisons against actual data demonstrate the models' effectiveness for architecture explorations with highly accurate estimations of local and global wiring demand, as well as chip area and cycle time. Simulations of two candidate system designs reveal trends in interconnect delay with increasing architectural complexity, and confirm the need for high computational locality and short global wires for future architectures.

  • System-level performance evaluation of three-dimensional integrated circuits

    Page(s): 671 - 678

    In this paper, the wire (interconnect)-length distribution of three-dimensional (3-D) integrated circuits (ICs) is derived using Rent's rule and following the methodology used to estimate the two-dimensional (2-D) wire-length distribution. Two limiting cases of connectivity between logic gates on different device layers are examined by comparing the wire-length distribution and average and total wire-length. System performance metrics such as clock frequency, chip area, etc., are estimated using wire-length distribution, interconnect delay criteria, and simple models representing the cost or complexity for manufacturing 3-D ICs. The technology requirement for interconnects in 3-D integration is also discussed.

  • Rent exponent prediction methods

    Page(s): 679 - 688

    A wide variety of models for estimating the distribution of on-chip net lengths assume an accurate estimate for an empirical parameter called the Rent exponent. Due to its definition as an exponent, these models are sensitive to its precise value, and careful selection is essential for good estimates of layout requirements and cycle times. In addition, it is also important to be able to predict changes in the Rent exponent with (possibly discontinuous) changes in interconnect technology. This paper presents a range of methods for estimating the Rent exponents of arbitrarily large gate placements as a function of optimization procedure and the level of fan-out present in the netlist. The first part of the paper describes a rapid algorithmic approach which combines the self-similar, or fractal, attributes of small wiring cells with a Monte Carlo sampling method. This method is shown to account accurately for both variations in the wiring signature of the netlist and the effects of most algorithms used for placement optimization. The second part of the paper presents an analytical model for Rent exponent prediction, based on a renormalization group transformation. This transformation is designed to filter out information which does not contribute to the scale-invariant properties of the optimized netlist, enabling the derivation of a closed-form expression for the Rent exponent.
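
For context on what the Rent exponent is, a common empirical way to extract it from measured (gate count, terminal count) pairs is a straight-line fit in log-log space, since T = k * N^p implies log T = log k + p * log N. The sketch below shows that generic regression only. It is not the Monte Carlo or renormalization method the abstract describes, and the function name and sample data are assumptions made for illustration.

```python
import math


def fit_rent_parameters(samples):
    """Least-squares fit of log T = log k + p * log N.

    samples: iterable of (num_gates, num_terminals) pairs, all positive.
    Returns the tuple (k, p).
    """
    pairs = list(samples)
    if len(pairs) < 2:
        raise ValueError("need at least two samples")
    xs = [math.log(n) for n, _ in pairs]
    ys = [math.log(t) for _, t in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must cover more than one gate count")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                          # slope is the Rent exponent
    k = math.exp(mean_y - p * mean_x)      # intercept gives terminals per gate
    return k, p


if __name__ == "__main__":
    # Synthetic data generated from assumed values k = 2.5, p = 0.65.
    synthetic = [(n, 2.5 * n ** 0.65) for n in (4, 16, 64, 256)]
    k, p = fit_rent_parameters(synthetic)
    print(f"k = {k:.3f}, p = {p:.3f}")
```

The sensitivity noted in the abstract follows from the power law itself: for a block of a million gates, a change in p from 0.60 to 0.65 roughly doubles the predicted terminal count.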

  • A compact physical via blockage model

    Page(s): 689 - 692

    Via blockage due to signal interconnects and its impact on wirability of multi-billion-transistor chips are systematically analyzed. Via classifications are introduced. By taking advantage of a stochastic interconnect length distribution and a multi-level interconnect network architecture, a physical via blockage model exploiting channel availability is proposed. This model reveals that the most severe via blockage occurs on the first metal level, wasting more than 10% and up to about 50% of wiring area. A new perspective on chip size limit imposed by via blockage is also provided by using the proposed model.

  • Using dynamic cache management techniques to reduce energy in general purpose processors

    Page(s): 693 - 708

    The memory hierarchy of high-performance and embedded processors has been shown to be one of the major energy consumers. For example, the Level-1 (L1) instruction cache (I-Cache) of the StrongARM processor accounts for 27% of the power dissipation of the whole chip, whereas the instruction fetch unit (IFU) and the I-Cache of Intel's Pentium Pro processor are the single most important power consuming modules with 14% of the total power dissipation [2]. Extrapolating current trends, this portion is likely to increase in the near future, since the devices devoted to the caches occupy an increasingly larger percentage of the total area of the chip. In this paper, we propose a technique that uses an additional mini cache, the L0-Cache, located between the I-Cache and the CPU core. This mechanism can provide the instruction stream to the data path and, when managed properly, it can effectively eliminate the need for high utilization of the more expensive I-Cache. We propose, implement, and evaluate five techniques for dynamic analysis of the program instruction access behavior, which is then used to proactively guide the access of the L0-Cache. The basic idea is that only the most frequently executed portions of the code should be stored in the L0-Cache since this is where the program spends most of its time. We present experimental results to evaluate the effectiveness of our scheme in terms of performance and energy dissipation for a series of SPEC95 benchmarks. We also discuss the performance and energy tradeoffs that are involved in these dynamic schemes. Results for these benchmarks indicate that more than 60% of the dissipated energy in the I-Cache subsystem can be saved.

  • The design of the MGAP-2: a micro-grained massively parallel array

    Page(s): 709 - 716

    The Micro-Grain Array Processor-2 (MGAP-2) is a two-dimensional SIMD array of 49152 fine-grain processors designed primarily for high-performance signal and image processing. Each processor can compute two arbitrary three-input Boolean functions, contains local RAM, and has additional logic for interprocessor communication. The MGAP-2 differs from existing fine-grain arrays in that it has a high degree of integration while incorporating processor level interconnect control. Each processor can independently select its communication direction. This allows a programmer to map algorithms onto the array in a more efficient manner than if the processors communicated in the standard SIMD fashion. Also, the MGAP-2's processor level interconnect allows groups of processors to be clustered into larger computational units, making the basic computational units as powerful as they need to be for a given problem.
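
To make "arbitrary three-input Boolean function" concrete: any such function is fully described by an 8-entry truth table, which fits in one byte. The sketch below evaluates a function stored that way. Whether the MGAP-2 encodes its functions like this is an assumption; the abstract does not say, and the names `eval_lut3`, `MAJORITY`, and `XOR3` are hypothetical.

```python
def eval_lut3(truth_table: int, a: int, b: int, c: int) -> int:
    """Evaluate a three-input Boolean function stored as an 8-bit truth table.

    Bit i of truth_table holds the output for the inputs whose
    index is i = (a << 2) | (b << 1) | c.
    """
    if not 0 <= truth_table <= 0xFF:
        raise ValueError("truth_table must fit in 8 bits")
    index = ((a & 1) << 2) | ((b & 1) << 1) | (c & 1)
    return (truth_table >> index) & 1


MAJORITY = 0b11101000  # output is 1 when at least two inputs are 1
XOR3 = 0b10010110      # output is 1 when an odd number of inputs are 1

if __name__ == "__main__":
    # Two three-input functions are enough for a one-bit full adder:
    # XOR3 yields the sum bit and MAJORITY yields the carry bit.
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                s = eval_lut3(XOR3, a, b, c)
                carry = eval_lut3(MAJORITY, a, b, c)
                print(a, b, c, "-> sum", s, "carry", carry)
```

The full-adder pairing is one illustration of why two functions per processor is a useful granularity for bit-serial arithmetic.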

  • Intrinsic leakage in deep submicron CMOS ICs-measurement-based test solutions

    Page(s): 717 - 723

    The high leakage current in deep submicron, short-channel transistors can increase the stand-by power dissipation of future IC products and threaten well-established quiescent current (I_DDQ)-based testing techniques. This paper reviews transistor intrinsic leakage mechanisms. Then, these well-known device properties are applied to a test application that combines I_DDQ and the IC's maximum operating frequency (F_max) to establish a novel two-parameter test technique for distinguishing intrinsic and extrinsic (defect) leakages in ICs with high background leakage. Results show that I_DDQ along with F_max can be effectively used to screen defects in high performance, low V_T (transistor threshold voltage) CMOS ICs.

  • Highly regular, modular, and cascadable design of cellular automata-based pattern classifier

    Page(s): 724 - 735

    This paper enumerates a new approach to the solution of classification problems based on the properties of Additive Cellular Automata. The classification problem plays a major role in various fields of computer science, such as grouping of the records in database systems, detection of faults in VLSI circuits, image processing, and so on. The state-transition graph of Non-group Cellular Automata (CA) consists of a set of disjoint trees rooted at some cyclic states of unit cycle length, thus forming a natural classifier. First, a scheme for classifying patterns distributed into only two classes is dealt with. This is then extended to solve the multiclass classification problem. The Multiclass Classifier saves on average 34% of memory as compared to the straightforward approach of directly storing the class of each pattern. A regular, modular, and cascadable hardware implementation of the classifier has been presented which is highly suitable for VLSI realization. The design has been specified in Verilog and verified for functional correctness.

  • Improving path delay testability of sequential circuits

    Page(s): 736 - 741

    We analyze the causes of low path delay fault coverage in synchronous sequential circuits and propose a method to improve testability. The three main reasons for low path delay fault coverage are found to be: (A) combinationally false (nonactivatable) paths; (B) sequentially nonactivatable paths; and (C) unobservable fault effects. Accordingly, we classify undetected faults in Groups A, B, and C. Combinationally false paths can be made testable by modifying the circuit or resynthesizing the combinational logic as discussed by other researchers. A majority of the untestable faults are, however, found in Group B, where a signal transition cannot be functionally propagated through a combinational path. A test requires two successive states necessary to create a signal transition and propagate it through the target path embedded in the sequential circuit. We study a partial scan technique in which flip-flops are scanned to break cycles and show that a substantial increase in the coverage of path delay faults is possible.

  • Design of a configurable accelerator for moment computation

    Page(s): 741 - 746

    The method of moments is one of the most powerful techniques for image analysis. However, real-time applications of this method have been prohibited due to the computational intensity in calculating the moments. This paper presents a novel configurable hardware accelerator for expediting the moment computation. The fundamental building block of the proposed accelerator is a custom-designed floating-point moment processing element (MPE). Running at 75 MHz, the MPE can provide a 12X speedup over a 166 MHz TMS320C6701 digital signal processor. On top of this, a linear performance boost can be obtained by connecting up to eight MPEs into a one-dimensional (1-D) array.
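
As a rough picture of the computation being accelerated, the sketch below evaluates raw geometric image moments, m_pq = sum over all pixels of x^p * y^q * I(x, y), and derives the intensity centroid from them. The abstract does not state which family of moments the MPE computes, so the choice of geometric moments is an assumption, as are the function names and the test image.

```python
def raw_moment(image, p: int, q: int) -> float:
    """Raw geometric moment m_pq of a 2-D image.

    image is a list of rows, and image[y][x] is the pixel intensity.
    """
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    )


def centroid(image):
    """Intensity centroid (x_bar, y_bar) = (m10 / m00, m01 / m00)."""
    m00 = raw_moment(image, 0, 0)
    if m00 == 0:
        raise ValueError("image has zero total intensity")
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00


if __name__ == "__main__":
    # Hypothetical 3x3 binary image with a 2x2 block of ones.
    blob = [
        [0, 0, 0],
        [0, 1, 1],
        [0, 1, 1],
    ]
    print("m00 =", raw_moment(blob, 0, 0))
    print("centroid =", centroid(blob))
```

Each moment needs one multiply-accumulate pass over every pixel, and many orders (p, q) are usually required, which is the computational intensity the abstract refers to.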

  • GABIND: a GA approach to allocation and binding for the high-level synthesis of data paths

    Page(s): 747 - 750

    We present here a technique for allocation and binding for data path synthesis (DPS) using a Genetic Algorithm (GA) approach. This GA uses an unconventional crossover mechanism relying on a force-directed data path binding completion algorithm. The data path is synthesized using some supplied design parameters. A bus-based interconnection scheme, use of multi-port memories, and provision for multicycling and pipelining are the main features of this system. The method presented here has been applied to standard benchmark examples and the results obtained are promising.

  • Partitioning algorithm to enhance pseudoexhaustive testing of digital VLSI circuits

    Page(s): 750 - 754

    This brief introduces a partitioning algorithm, which facilitates pseudoexhaustive testing, to detect and locate faults in digital VLSI circuits. The algorithm is based on an analysis of the circuit's primary input cones and fanout (PIFAN) values. An invasive approach is employed, which creates logical and physical partitions by automatically inserting reconfigurable test cells and multiplexers. The test cells are used to control and observe multiple partitioning points, while the multiplexers expand the controllability and observability provided by the test cells. The feasibility and efficiency of our algorithm are evaluated by partitioning numerous ISCAS 1985 and 1989 benchmark circuits containing up to 5597 gates. Our results show that the PIFAN algorithm offers significant reductions in overhead and test time when compared to previous partitioning algorithms.
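
The motivation for partitioning can be seen from simple pattern counts: exhaustively testing a combinational block with n inputs takes 2^n patterns, whereas testing each partition exhaustively takes 2^k patterns for a partition with k inputs. The sketch below compares the two counts. It is not the PIFAN algorithm, it assumes partitions are tested one after another, and the input counts are hypothetical.

```python
def exhaustive_patterns(num_inputs: int) -> int:
    """Patterns needed to test a block with num_inputs inputs exhaustively."""
    return 2 ** num_inputs


def pseudoexhaustive_patterns(partition_input_counts) -> int:
    """Total patterns when each partition is tested exhaustively in turn."""
    return sum(2 ** k for k in partition_input_counts)


if __name__ == "__main__":
    # Hypothetical example: a 32-input circuit split into four
    # partitions that each depend on only 8 inputs.
    print("exhaustive:      ", exhaustive_patterns(32))
    print("pseudoexhaustive:", pseudoexhaustive_patterns([8, 8, 8, 8]))
```

The saving comes at the cost of the inserted test cells and multiplexers, which is why the abstract reports overhead alongside test time.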

  • Author index

    Page(s): 755 - 758
    Freely Available from IEEE
  • Subject index

    Page(s): 758 - 765
    Freely Available from IEEE

Aims & Scope

Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels.

To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers which emphasize the novel system integration aspects of microelectronic systems, including interactions among system design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and system level qualification. Thus, the coverage of this Transactions focuses on VLSI/ULSI microelectronic system integration.

Topics of special interest include, but are not strictly limited to, the following:

  • System Specification, Design and Partitioning
  • System-level Test
  • Reliable VLSI/ULSI Systems
  • High Performance Computing and Communication Systems
  • Wafer Scale Integration and Multichip Modules (MCMs)
  • High-Speed Interconnects in Microelectronic Systems
  • VLSI/ULSI Neural Networks and Their Applications
  • Adaptive Computing Systems with FPGA components
  • Mixed Analog/Digital Systems
  • Cost, Performance Tradeoffs of VLSI/ULSI Systems
  • Adaptive Computing Using Reconfigurable Components (FPGAs)

Meet Our Editors

Editor-in-Chief
Yehea Ismail
CND Director
American University of Cairo and Zewail City of Science and Technology
New Cairo, Egypt
y.ismail@aucegypt.edu