
Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on

Issue 11 • Date Nov 1997

  • Arithmetic built-in self-test for DSP cores

    Publication Year: 1997 , Page(s): 1358 - 1369
    Cited by:  Papers (27)  |  Patents (1)

    A new built-in self-test (BIST) methodology is presented in which all generation and compaction functions are executed by basic building blocks such as adders, ALU's, and multipliers, performing regular arithmetic functions in digital signal processing (DSP) cores. It is demonstrated how these components are themselves tested, and subsequently used to perform more complex testing functions. The need for extra hardware is either entirely eliminated or drastically reduced, test vectors can be easily distributed to different modules of the system, test responses can be collected in parallel, and there is virtually no performance degradation. As an integral part of the proposed BIST environment, arithmetic two-dimensional (2-D) generators of pseudorandom test vectors are also introduced to further integrate the scheme with parallel scan and boundary scan designs used to test peripheral devices of the core.
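
The abstract above does not spell out how the generators are built. As a rough illustration of producing test patterns with nothing but an adder, the sketch below assumes a plain accumulator that repeatedly adds a constant modulo 2^width; the function names, the width, and the increment are hypothetical and are not taken from the paper. With an odd increment such an accumulator visits every state before repeating.

```python
def accumulator_patterns(width, increment, seed=0):
    """Yield successive accumulator states x <- (x + increment) mod 2**width.

    Assumption: a bare adder plus register, not the paper's actual generator.
    """
    mask = (1 << width) - 1
    state = seed & mask
    while True:
        state = (state + increment) & mask
        yield state


def first_patterns(width, increment, count, seed=0):
    """Collect the first `count` patterns from the accumulator."""
    gen = accumulator_patterns(width, increment, seed)
    return [next(gen) for _ in range(count)]


# An odd increment cycles through all 16 four-bit values.
patterns = first_patterns(width=4, increment=3, count=16)
print(patterns)
```

An even increment would shorten the cycle (for example, an increment of 4 on four bits reaches only four distinct states), which is why the sketch uses an odd constant.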

  • Partitioning and analysis of static digital CMOS circuits

    Publication Year: 1997 , Page(s): 1292 - 1310
    Cited by:  Papers (2)  |  Patents (2)

    Performance optimization of automatic test pattern generation (ATPG) algorithms has received considerable attention. While the application of high-performance algorithms is often limited to simple gates such as ANDs, ORs, and XORs, the cell libraries of silicon vendors usually contain more sophisticated structures. To deal with this problem, we present a library-independent algorithm for the partitioning and analysis of static digital CMOS circuits described at the switch level. The algorithm recognizes inverters, NANDs, and NORs. It also checks whether a partition can set its output to a high-impedance state, and is thus capable of partitioning large bus structures with tristate gates. Our approach supports existing ATPG algorithms at the gate level. Moreover, it allows a mixed-level approach for ATPG using detailed fault models at the switch level for whichever partitions require it. Our implementation processes on the order of 2000 transistors per second for circuits containing combinational and sequential logic on a state-of-the-art workstation. We processed complete chips with up to 72000 transistors, which is clearly adequate for practical purposes.

  • The state reduction of nondeterministic finite-state machines

    Publication Year: 1997 , Page(s): 1278 - 1291
    Cited by:  Papers (1)

    Nondeterministic finite-state machines (NFSM's) are a powerful tool for specifying the desired behavior of a sequential system along with its degrees of freedom. Moreover, nondeterministic machines arise naturally when considering the optimization problem of interacting synchronous circuits or deterministic finite-state machines. Yet, NFSM's have been considered only recently in the literature. In this paper, we present algorithms for synthesizing a minimum state implementation of a specification NFSM. The algorithm is a generalization of the classical method for incompletely specified machines, with several modifications. We also introduce the notion of equivalence between compatibility classes and a novel formulation of the closure problem to eliminate unnecessary classes and implications, so as to substantially speed up the search for the optimum solution. A novel BDD-based implementation technique is also presented, which avoids the explicit representation of the transition relation of the original NFSM.

  • Theory and algorithms for state minimization of nondeterministic FSMs

    Publication Year: 1997 , Page(s): 1311 - 1322
    Cited by:  Papers (2)

    This paper addresses state minimization problems of different classes of nondeterministic finite-state machines (NDFSMs). We describe a fully implicit algorithm for state minimization of pseudo nondeterministic FSM's (PNDFSMs). The results of our implementation are reported and shown to be superior to a previous explicit formulation. We could solve exactly all but one problem of a published benchmark, while an explicit program could complete approximately one half of the examples, and in those cases, with longer run times. Then we present a theoretical solution to the problem of exact state minimization of general NDFSMs, based on the proposal of generalized compatibles. This gives an algorithmic framework to explore behaviors contained in a general NDFSM.

  • On variable clock methods for path delay testing of sequential circuits

    Publication Year: 1997 , Page(s): 1237 - 1249
    Cited by:  Papers (12)

    We propose a delay test methodology for general sequential circuits. To test a combinational path between two flip-flops, the source flip-flop is initialized to the appropriate value, followed by creation of a transition, which is propagated through the path. An incorrect logic value is captured in the destination flip-flop if the path delay exceeds the clock period. The state of the destination flip-flop is observed at a primary output through path sensitization. Only one vector that propagates the transition through the path is applied with the rated clock period. All other vectors use a slow speed clock to ensure fault-free initialization and fault effect observation. The test generation method uses a 13-value algebra that represents the relevant transition and hazard states of signals. Since several path delay faults can be activated by the vector applied at the rated clock, only the flip-flops with hazard-free steady values are assumed to have deterministic states. This allows us to generate sequentially robust tests. We present the results of the test generation method on ISCAS benchmark circuits.

  • A parallel standard cell placement algorithm

    Publication Year: 1997 , Page(s): 1342 - 1357
    Cited by:  Papers (3)

    We present a loosely coupled parallel algorithm for the placement of standard cell integrated circuits. Our algorithm is a derivative of simulated annealing. The implementation of our algorithm is targeted toward networks of Unix workstations. This is the very first reported parallel algorithm for standard cell placement which yields as good or better placement results than its serial version. In addition, it is the first parallel placement algorithm reported which offers nearly linear speed-up for small numbers of processors, in terms of the number of processors (workstations) used, over the serial version. Despite using the rather slow local area network as the only means of interprocessor communication, the processor utilization is quite high, up to 98% for two processors and 90% for six processors. The new parallel algorithm has yielded the best overall results ever reported for the set of MCNC standard cell benchmark circuits.

  • A delay budgeting algorithm ensuring maximum flexibility in placement

    Publication Year: 1997 , Page(s): 1332 - 1341
    Cited by:  Papers (20)  |  Patents (10)

    In this paper, we present a new, general approach to the problem of computing upper bounds on net delays. The upper bounds on net delays are computed so that timing constraints between input and output signals are satisfied. The set of delay upper bounds is called a delay budget. The objective of this work is to compute a delay budget that will lead to timing feasible circuit placement and routing. In our formulation, we find a delay budget so that the placement phase has “maximum flexibility.” We formulate this problem as a convex programming problem and prove that it has a special structure. We utilize the special structure of the problem to propose an efficient graph-based algorithm. We present experimental results for our algorithms with the MCNC placement benchmarks. Our experiments use budgeting results as net length constraints for the TimberWolf placement program, which we use to evaluate the budgeting algorithms. We obtain an average of 50% reduction in net length constraint violations over the well-known zero-slack algorithm (ZSA). We also study different delay budgeting objective functions, which yield 2× performance improvements without loss of solution quality. Our results and graph-based formulation show that our proposed algorithm is suitable for modern large-scale budgeting problems.
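
To make the notion of a delay budget concrete, here is a minimal sketch that only checks whether a given budget satisfies the timing constraints; it is not the paper's convex-programming or graph-based algorithm, nor the zero-slack algorithm. It assumes the circuit is a directed acyclic graph whose edges are nets, ignores gate delays, and uses hypothetical names (`longest_path_delays`, `budget_is_feasible`, `required`).

```python
from collections import defaultdict


def longest_path_delays(edges, budget):
    """Worst-case arrival time at each node when every net uses its full budget.

    edges:  list of (src, dst) nets forming a DAG (assumed).
    budget: dict mapping (src, dst) -> delay upper bound for that net.
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for src, dst in edges:
        succ[src].append(dst)
        indeg[dst] += 1
        nodes.update((src, dst))

    arrival = {n: 0.0 for n in nodes}
    ready = [n for n in nodes if indeg[n] == 0]  # primary inputs
    while ready:  # process nodes in topological order
        u = ready.pop()
        for v in succ[u]:
            arrival[v] = max(arrival[v], arrival[u] + budget[(u, v)])
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return arrival


def budget_is_feasible(edges, budget, required):
    """True if every constrained output meets its required arrival time."""
    arrival = longest_path_delays(edges, budget)
    return all(arrival[out] <= limit for out, limit in required.items())


edges = [("a", "b"), ("b", "d"), ("a", "c"), ("c", "d")]
budget = {("a", "b"): 2.0, ("b", "d"): 3.0, ("a", "c"): 1.0, ("c", "d"): 1.0}
print(budget_is_feasible(edges, budget, {"d": 5.0}))
```

In this toy example the path through `c` has slack left over, which is the kind of freedom a budgeting algorithm would distribute among the nets.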

  • Redundancy removal and test generation for circuits with non-Boolean primitives

    Publication Year: 1997 , Page(s): 1370 - 1377
    Cited by:  Papers (1)  |  Patents (1)

    Production VLSI circuits typically consist of primitives like tristate buffers, bidirectional buffers, and bus configurations that assume non-Boolean values like the high-impedance state. We describe a systematic methodology for extending test generation algorithms that work on combinational circuits with only Boolean primitives to full-scan production circuits. Key features of the methodology are illustrated using the energy minimization based test generation algorithm for combinational circuits. The main features of our methodology that make the test generation algorithm practical for large production circuits are: (1) only one Boolean variable is used to represent the value on a signal and all signals assume only Boolean values during the test generation procedure; (2) the function of non-Boolean primitives is separated into Boolean and non-Boolean components with energy functions required only for the Boolean component; and (3) non-Boolean components are implicitly considered in the energy minimization procedure. In this process, no new energy functions other than the normal Boolean gate energy functions are needed. We give a method for identifying and removing redundancies in production circuits using energy minimization. The formulation is also applicable to Boolean satisfiability and BDD methods. We first use the test generation algorithm for identifying undetectable faults and then relax specific constraints in the original test generation problem by ignoring the non-Boolean components. We show that undetectability in the relaxed formulation implies redundancy. We report redundancy removal results for production VLSI circuits, ISCAS 85, and full-scan versions of the ISCAS 89 benchmark circuits.

  • Parameter extraction for statistical IC modeling based on recursive inverse approximation

    Publication Year: 1997 , Page(s): 1250 - 1259
    Cited by:  Papers (4)  |  Patents (2)

    An accurate and efficient parameter extraction methodology, utilizing a new technique called recursive inverse approximation (RIA), is proposed for statistical modeling of integrated circuits. The main features of RIA are (1) linear approximation is used to obtain initial model parameter estimates, (2) reverse verification performs accuracy checking, and (3) error correction functions are constructed in the extracted parameter space to recursively refine the previously extracted parameter values. As a result, an approximate inverse mapping from the measured performance space to the model parameter space is established for statistical parameter extraction. Examples of parameter extraction for MOS transistors and an IC multiplier block demonstrate the high efficiency and accuracy of the new method.

  • Design of minimum and uniform bipartites for optimum connection blocks of FPGA

    Publication Year: 1997 , Page(s): 1377 - 1383
    Cited by:  Papers (1)

    The design of optimum connection blocks of field programmable gate arrays (FPGA's) in number and in distribution of switches is formulated as a bipartite graph design problem and solved. A bipartite graph with vertex sets R and L (|R|⩽|L|) is called totally perfect if there is a perfect matching from Ls to R for any Ls⊂L with |Ls|⩽|R|. The difference of maximum and minimum degrees of the vertices in L or R is called the skew of the respective vertex set. The problem is to construct a minimum totally perfect bipartite graph with the minimum skew. The result shows that a method, biscattering, can construct such a matrix in O(|R|×|L|) time where the lower bound is attained for both skews. This construction also solves the problem of designing optimum direct-concentrators.
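
The definitions of "totally perfect" and "skew" in this abstract can be checked directly on small graphs. The sketch below is a brute-force verifier written from those definitions only, reading "perfect matching from Ls to R" as a matching that covers every vertex of Ls; it is not the biscattering construction, and the function names and the adjacency-dictionary representation (left vertex mapped to a list of right vertices numbered 0 to |R|-1) are assumptions made for illustration.

```python
from itertools import combinations


def max_matching_size(adj, left_nodes):
    """Size of a maximum matching of `left_nodes` into R (augmenting paths)."""
    match_of_right = {}

    def try_assign(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current partner can move elsewhere.
            if v not in match_of_right or try_assign(match_of_right[v], seen):
                match_of_right[v] = u
                return True
        return False

    return sum(try_assign(u, set()) for u in left_nodes)


def is_totally_perfect(adj, num_right):
    """Check every subset Ls of L with |Ls| <= |R| can be fully matched into R."""
    left = list(adj)
    for k in range(1, num_right + 1):
        for subset in combinations(left, k):
            if max_matching_size(adj, subset) != k:
                return False
    return True


def skews(adj, num_right):
    """Return (skew of L, skew of R): max degree minus min degree per side."""
    left_deg = [len(neighbors) for neighbors in adj.values()]
    right_deg = [0] * num_right
    for neighbors in adj.values():
        for v in neighbors:
            right_deg[v] += 1
    return max(left_deg) - min(left_deg), max(right_deg) - min(right_deg)


adj_ok = {"a": [0], "b": [1], "c": [0, 1]}   # 4 switches, |L| = 3, |R| = 2
adj_bad = {"a": [0], "b": [0], "c": [0, 1]}  # "a" and "b" compete for vertex 0
print(is_totally_perfect(adj_ok, 2), skews(adj_ok, 2))
print(is_totally_perfect(adj_bad, 2))
```

The subset enumeration is exponential, so this is only useful for sanity-checking small examples.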

  • Process variation effects on circuit performance: TCAD simulation of 256-Mbit technology [DRAMs]

    Publication Year: 1997 , Page(s): 1383 - 1389
    Cited by:  Papers (11)

    This paper describes the first study of the complete sequence from process simulation to circuit performance and the corresponding sensitivities for 0.25-μm technology. This is made possible by a combination of physically based process models and a systematic calibration involving SIMS, one-dimensional (1-D), and two-dimensional (2-D) device characteristics. Simulated nFET and pFET characteristics match hardware (HW) within 5-10% for both long-channel and nominal length devices. Simulated ring-oscillator performance is in good agreement with HW data. Sensitivities of device characteristics and the inverter gate delay to process variations (within 10%) are quantified. These investigations establish the correlation between process variations and circuit performance.

  • TIGER: an efficient timing-driven global router for gate array and standard cell layout design

    Publication Year: 1997 , Page(s): 1323 - 1331
    Cited by:  Papers (10)

    In this paper, we propose an efficient timing-driven global router, TIGER, for gate array and standard cell layout design. Unlike other conventional global routing techniques, interconnection delays are modeled and included during the routing and rerouting process in order to minimize the maximum channel density for gate arrays or the total track number for standard cells, as well as to satisfy the timing constraints in TIGER. The timing-driven global routing problem is formulated as a multiterminal, multicommodity network flow problem with integer flows under additional timing constraints. Two novel performance-driven Steiner tree algorithms are proposed to generate the initial global routing trees. A critical-path-based timing analysis method is used to guarantee the satisfaction of timing constraints. Experimental results based on MCNC (ISCAS) benchmarks show that TIGER can obtain results better than or comparable to those of TimberWolf 5.6.

  • SCALP: an iterative-improvement-based low-power data path synthesis system

    Publication Year: 1997 , Page(s): 1260 - 1277
    Cited by:  Papers (21)  |  Patents (4)

    In this paper, we present SCALP, a comprehensive low-power data path synthesis system that performs the various high-level synthesis tasks (transformations, scheduling, clock selection, module selection, and hardware allocation and assignment) with an aim of reducing the power consumption in the synthesized data path. Focusing on only one or a small subset of the high-level synthesis tasks makes it difficult to realize the full potential for power savings at the algorithm and architecture levels. Our synthesis algorithms, which are based on an iterative improvement strategy with efficient pruning techniques, are capable of performing the various high-level synthesis tasks (and considering their interactions) in an efficient manner. Supply voltage and clock period pruning strategies are used for quickly eliminating inferior design points during the search for the minimum power solution. Estimating switched capacitance accurately at intermediate stages during high-level synthesis can be challenging, both because the exact structure of the circuit, which affects both physical capacitance and switching activity, may not be available, and because running register-transfer level power analysis tools several times during high-level synthesis is computationally expensive. SCALP overcomes the above problems by maintaining a complete image of the structural register-transfer level (RTL) circuit (this is possible since we have a complete solution at any point during iterative improvement), and employing a very fast switched capacitance estimation technique that is based on the concept of switched capacitance matrices. Our system can handle diverse module libraries and utilize complex scheduling constructs such as multicycling, chaining, and structural pipelining. Retiming and functional pipelining are used in our system to meet tight performance constraints, and to enable the ensuing synthesis steps to better explore the implementation space. Results on several real-life examples are presented to demonstrate the effectiveness of the algorithm. Power estimates obtained using switch-level simulation after layout indicate that up to an order-of-magnitude of power savings can be obtained using our synthesis system.


Aims & Scope

The purpose of this Transactions is to publish papers of interest to individuals in the areas of computer-aided design of integrated circuits and systems.


Meet Our Editors

Editor-in-Chief

VIJAYKRISHNAN NARAYANAN
Pennsylvania State University
Dept. of Computer Science and Engineering
354D IST Building
University Park, PA 16802, USA
vijay@cse.psu.edu