
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Issue 5 • May 2004

  • Table of contents

    Page(s): c1 - 585
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information

    Page(s): c2
  • Equivalence checking of arithmetic circuits on the arithmetic bit level

    Page(s): 586 - 597

    One of the most severe shortcomings of currently available equivalence checkers is their inability to verify arithmetic circuits, and multipliers in particular. In this paper, we present a bit-level reverse-engineering technique that complements standard equivalence-checking frameworks. We propose a Boolean mapping algorithm that extracts a network of half adders from the gate netlist of an addition circuit. Once the arithmetic bit-level representation of the circuit is obtained, equivalence checking can be performed using simple arithmetic operations. We have successfully applied the technique to the verification of a large number of multipliers of different architectures, as well as more general arithmetic circuits such as multiply/add units. The experimental results show the great promise of our approach.

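    A minimal sketch of the comparison idea on a toy circuit, assuming a 4-bit width and an invented gate structure: once an addition circuit is viewed as a composition of half adders, checking it against the word-level specification a + b reduces to simple arithmetic (done exhaustively here; the paper's actual contribution, extracting that representation from a flat netlist, is not shown).

        from itertools import product

        def half_adder(a, b):
            return a ^ b, a & b                      # (sum, carry)

        def full_adder(a, b, cin):
            s1, c1 = half_adder(a, b)                # first half adder
            s, c2 = half_adder(s1, cin)              # second half adder
            return s, c1 | c2

        def ripple_carry_add(abits, bbits):          # LSB-first bit vectors
            carry, out = 0, []
            for a, b in zip(abits, bbits):
                s, carry = full_adder(a, b, carry)
                out.append(s)
            return out + [carry]

        def to_int(bits):
            return sum(bit << i for i, bit in enumerate(bits))

        WIDTH = 4
        for bits in product([0, 1], repeat=2 * WIDTH):
            abits, bbits = bits[:WIDTH], bits[WIDTH:]
            assert to_int(ripple_carry_add(abits, bbits)) == to_int(abits) + to_int(bbits)
        print("toy adder matches its word-level specification")
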
  • Structural FSM traversal

    Page(s): 598 - 619

    This paper discusses a "structural" technique for traversing the state space of a finite state machine (FSM) and its application to equivalence checking of sequential circuits. The key ingredient of a state-space traversal is a data structure for representing state sets. In structural FSM traversal, state sets are represented noncanonically and implicitly as gate netlists. First, we present an exact algorithm based on an iterative expansion of the FSM into time frames and a network-decomposition procedure that serves the same purpose as an existential quantification operation. Then, we discuss approximative algorithms for the application of structural FSM traversal to sequential equivalence checking. We theoretically analyze the properties of both the exact and the approximative algorithms. Finally, we give details on the implementation of a sequential equivalence checker and present experimental results that demonstrate the effectiveness of the proposed approach for equivalence checking of optimized and retimed circuits.

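    The sketch below conveys only the notion of expanding machines into time frames: two hypothetical Moore machines are unrolled for K frames and their output traces compared over all input sequences of that length. The noncanonical netlist representation of state sets and the exact/approximative traversal algorithms of the paper are not reproduced.

        from itertools import product

        # FSM = (initial state, transitions {(state, input): next_state}, outputs {state: value})
        fsm_a = ("s0", {("s0", 0): "s0", ("s0", 1): "s1",
                        ("s1", 0): "s0", ("s1", 1): "s1"}, {"s0": 0, "s1": 1})
        fsm_b = ("t0", {("t0", 0): "t0", ("t0", 1): "t1",
                        ("t1", 0): "t0", ("t1", 1): "t1"}, {"t0": 0, "t1": 1})

        def outputs(fsm, stimulus):
            state, trans, out = fsm
            trace = []
            for x in stimulus:                       # one loop iteration per time frame
                state = trans[(state, x)]
                trace.append(out[state])
            return trace

        K = 6                                        # unrolling depth (bounded check only)
        equivalent = all(outputs(fsm_a, seq) == outputs(fsm_b, seq)
                         for seq in product([0, 1], repeat=K))
        print("equivalent up to", K, "time frames:", equivalent)
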
  • Design of high-performance system-on-chips using communication architecture tuners

    Page(s): 620 - 636

    In this paper, we present a methodology for the design of high-performance system-on-chip communication architectures. The approach is based on the addition of a layer of circuitry, called the communication architecture tuner (CAT) layer, around an existing communication architecture topology. The added layer gives a system the capability to adapt to runtime variability in the communication needs of its constituent components. For example, more critical data may be handled differently, leading to lower communication latencies. The CAT associated with each component monitors its internal state, analyzes the communication transactions it generates, and "predicts" the relative importance of the transactions in terms of their impact on system-level performance metrics. It then configures the protocol parameters of the underlying communication architecture (e.g., priorities, burst modes, etc.) to best suit the system's changing communication needs. We illustrate the issues and tradeoffs involved in the design of CAT-based communication architectures, and present algorithms that automate the key steps. Experiments with example systems indicate that performance metrics (e.g., number of missed deadlines, average processing time) for systems with CAT-based communication architectures are significantly (sometimes over an order of magnitude) better than those of systems with conventional communication architectures.

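    A toy software model, with invented component names and criticality weights, of the kind of runtime adaptation such a tuner layer performs: pending bus transactions are granted in order of a predicted criticality score rather than a fixed priority.

        import heapq

        class TunedArbiter:
            def __init__(self):
                self.queue = []                      # min-heap keyed on (-criticality, arrival order)
                self.counter = 0

            def request(self, component, criticality, payload):
                heapq.heappush(self.queue, (-criticality, self.counter, component, payload))
                self.counter += 1

            def grant(self):
                if not self.queue:
                    return None
                _, _, component, payload = heapq.heappop(self.queue)
                return component, payload

        arbiter = TunedArbiter()
        arbiter.request("dsp", criticality=0.2, payload="bulk transfer")
        arbiter.request("cpu", criticality=0.9, payload="deadline-critical word")
        arbiter.request("dma", criticality=0.5, payload="burst")
        while (grantee := arbiter.grant()) is not None:
            print("bus granted to", grantee)
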
  • LPRAM: a novel low-power high-performance RAM design with testability and scalability

    Page(s): 637 - 651

    To date, proposals for low-power RAM design have focused essentially on circuit-level solutions. Here we propose a novel architecture-level (high-level) solution. Our methodology provides a systematic tradeoff between power and area, and it also allows a tradeoff between test time and the power consumed in test mode. Significantly, the proposed design has the potential to achieve performance improvements while simultaneously reducing power; in this respect, it stands apart from approaches in which power reduction results in speed reduction. The basic approach divides the RAM into modules and interconnects them in a binary tree that can be reconfigured dynamically during both normal operation and test mode. Furthermore, during test mode most of the RAM can be switched off, which provides a major power reduction while also reducing test-application time. The aspect ratio of the modules is allowed to vary as a design parameter, and the chosen aspect ratio affects the power/access-time/area tradeoffs. These features give the proposed methodology potential practical significance. A design tool is also developed that takes various parameters, such as the desired power and performance, as inputs and outputs basic design parameters, such as the required number of modules, the area overhead, and the resulting test speedup.

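    A back-of-the-envelope sketch (all numbers assumed, not the paper's model) of why a tree-of-modules organization saves power: only the addressed leaf module plus the routing nodes on its root-to-leaf path need to be active, so the active fraction of the array shrinks as the tree gets deeper.

        def active_fraction(depth, node_cost=0.02):
            leaves = 2 ** depth
            # one active leaf out of `leaves`, plus a small assumed cost per tree node on the path
            return 1.0 / leaves + depth * node_cost

        for depth in range(0, 5):
            print(f"depth {depth}: ~{active_fraction(depth):.2%} of the array active per access")
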
  • A hybrid energy-estimation technique for extensible processors

    Page(s): 652 - 664

    In this paper, we present an efficient and accurate methodology for estimating the energy consumption of application programs running on extensible processors. Extensible processors, which are getting increasingly popular in embedded system design, allow a designer to customize a base processor core through instruction set extensions. Existing processor energy macromodeling techniques are not applicable to extensible processors, since they assume that the instruction set architecture as well as the underlying structural description of the micro-architecture remain fixed. Our solution to the above problem is a hybrid energy macromodel suitably parameterized to estimate the energy consumption of an application running on the corresponding application-specific extended processor instance, which incorporates any custom instruction extension. Such a characterization is facilitated by careful selection of macromodel parameters/variables that can capture both the functional and structural aspects of the execution of a program on an extensible processor. Another feature of the proposed energy characterization flow is the use of regression analysis to build the macromodel. Regression analysis allows for in-situ characterization, thus allowing arbitrary test programs to be used during macromodel construction. We validated the proposed methodology by characterizing the energy consumption of a state-of-the-art extensible processor (Tensilica's Xtensa). We used the macromodel to analyze the energy consumption of several benchmark applications with custom instructions. The mean absolute error in the macromodel estimates is only 3.3%, when compared to the energy values obtained by a commercial tool operating on the synthesized register-transfer level (RTL) description of the custom processor. Our approach achieves an average speedup of three orders of magnitude over the commercial RTL energy estimator. Our experiments show that the proposed methodology also achieves good relative accuracy, which is essential in energy optimization studies. Hence, our technique is both efficient and accurate.

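    A minimal sketch of the regression step, with invented event counts and energy numbers: per-event energy coefficients are fitted by least squares from characterization runs and then reused to predict the energy of a new program. The actual macromodel variables and characterization flow in the paper are richer than this.

        import numpy as np

        # rows: characterization programs; columns (assumed): base-ISA instruction count,
        # custom-instruction count, cache-miss count
        X = np.array([[1000,  50, 10],
                      [ 400, 200,  5],
                      [ 800,   0, 40],
                      [ 300, 120, 25],
                      [1200,  80,  2]], dtype=float)
        # "measured" energy per program (e.g. from RTL-level estimation), in nJ (invented)
        y = np.array([152.0, 118.0, 132.0, 93.0, 186.0])

        # least-squares fit of per-event energy coefficients plus a constant offset term
        A = np.hstack([X, np.ones((X.shape[0], 1))])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        print("per-event energy coefficients + offset:", np.round(coeffs, 4))

        # estimate the energy of a new program from its event counts (trailing 1 pairs with the offset)
        new_counts = np.array([600.0, 90.0, 15.0, 1.0])
        print("predicted energy (nJ):", round(float(new_counts @ coeffs), 2))
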
  • Minimizing total power by simultaneous Vdd/Vth assignment

    Page(s): 665 - 677

    In this paper, we investigate the effectiveness of simultaneous multiple supply- and threshold-voltage assignment in minimizing the total power (static + dynamic) of generic digital CMOS designs. Achievable power reductions under varying conditions are investigated, including static-power-limited designs and sub-1-V processes. Rules of thumb are developed for the optimal Vdd's and Vth's to be used in future designs. These models show the optimal second Vdd to be approximately half the nominal Vdd, while the potential total power savings are significantly greater than previously anticipated (60%-65%). We describe the impact of level-conversion delays and also demonstrate that the scaling properties of multivoltage systems are very good, particularly when considering impending device-scaling advancements.

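    A back-of-the-envelope sketch of the dynamic-power argument, with assumed supply values and an assumed fraction of logic on low-Vdd paths (leakage, level-converter overhead, and delay constraints are ignored): since dynamic power scales with the square of the supply, gates moved to a second supply near half the nominal Vdd dissipate roughly a quarter of their original dynamic power.

        vdd1, vdd2 = 1.2, 0.6          # nominal and second supply (V), hypothetical
        frac_low = 0.7                 # assumed fraction of switched capacitance on low-Vdd paths

        # dynamic power ~ alpha * C * Vdd^2 * f, so the low-Vdd portion scales by (vdd2/vdd1)^2
        relative_dynamic = (1 - frac_low) * 1.0 + frac_low * (vdd2 / vdd1) ** 2
        print(f"dynamic power vs. single-Vdd design: {relative_dynamic:.2f}x "
              f"({1 - relative_dynamic:.0%} savings)")
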
  • A multiparameter moment-matching model-reduction approach for generating geometrically parameterized interconnect performance models

    Page(s): 678 - 693

    In this paper, we describe an approach for generating accurate geometrically parameterized integrated-circuit interconnect models that are efficient enough for use in interconnect synthesis. The model-generation approach is automatic and is based on a multiparameter moment-matching model-reduction algorithm. A proof of a moment-matching theorem for the algorithm is derived, as well as a complexity analysis of the model-order growth. The effectiveness of the technique is tested on a capacitance-extraction example, where the plate spacing is the geometric parameter, and on a multiline bus example, where both wire spacing and wire width are geometric parameters. Experimental results demonstrate that the generated models accurately predict capacitance values for the capacitor example, and both delay and crosstalk effects over a reasonably wide range of spacing and width variation for the multiline bus example.

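    A sketch of ordinary single-parameter (frequency-only) moment computation for a small RC ladder, to give the flavor of what gets matched; the paper's contribution, treating geometric parameters such as spacing and width as additional expansion variables and building the projection from multiparameter moments, is not shown. The circuit values are arbitrary.

        import numpy as np

        n, g, C = 5, 1.0, 1e-12                     # nodes, conductance (S), node capacitance (F)
        G = np.zeros((n, n))
        for i in range(n - 1):                      # series resistors between neighboring nodes
            G[i, i] += g; G[i + 1, i + 1] += g
            G[i, i + 1] -= g; G[i + 1, i] -= g
        G += g * np.eye(n)                          # a resistor to ground at every node
        Cm = C * np.eye(n)                          # node capacitance matrix
        b = np.zeros(n); b[0] = 1.0                 # current injected at node 0
        c = np.zeros(n); c[-1] = 1.0                # observe the voltage at the far node

        # moments of H(s) = c^T (G + sC)^(-1) b about s = 0: m_k = c^T (-G^(-1) C)^k G^(-1) b
        A = -np.linalg.solve(G, Cm)
        r = np.linalg.solve(G, b)
        for k in range(4):
            print(f"m_{k} = {float(c @ r):.3e}")
            r = A @ r
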
  • Simultaneous floor plan and buffer-block optimization

    Page(s): 694 - 703

    As technology advances and the number of interconnections among modules rapidly increases, timing closure and design convergence are among the most important concerns. Hence, it is desirable to consider interconnect optimization as early as possible. Previous work on this problem falls into two directions: wire planning and buffer-block planning for interconnect-driven floorplanning. Wire planning for interconnect-driven floorplanning does not consider buffer insertion, while buffer-block planning for interconnect-driven floorplanning cannot overcome the limitations of a bad initial floorplan. In this paper, we first address simultaneous floorplanning and buffer-block planning (i.e., integrating buffer-block planning into floorplanning) for interconnect optimization. We adopt simulated annealing to refine a floorplan so that buffers can be inserted more effectively. In each iteration, we construct a routing tree for each net, allocate buffers for all nets, introduce corresponding buffer blocks into the intermediate floorplan, and invoke Lagrangian relaxation to optimize area and satisfy timing requirements. Further, to reduce the problem size, we present supermodule partitioning, which groups modules into supermodules. Experimental results show that our method of integrating buffer-block planning into floorplanning can significantly improve interconnect delay and reduce the number of buffers needed. Based on a set of MCNC benchmark circuits, our approach achieves an average success rate of 86.1% of nets meeting timing constraints, inserts only 272 buffers on average, and consumes an average extra area of only 0.28% over the given floorplan, compared with an average success rate of 62.6%, 1123 buffers, and 1.05% extra area for a well-known recent work presented at ICCAD'99.

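    A generic simulated-annealing skeleton on a toy one-dimensional ordering problem, meant only to show the perturb/evaluate/accept loop into which the paper embeds buffer allocation and Lagrangian relaxation; the nets and cost model below are invented.

        import math, random

        random.seed(1)
        nets = [(0, 3, 2.0), (1, 4, 1.0), (2, 5, 3.0), (0, 5, 1.5)]   # (module a, module b, weight)

        def cost(order):
            pos = {m: i for i, m in enumerate(order)}
            return sum(w * abs(pos[a] - pos[b]) for a, b, w in nets)

        order = list(range(6))
        cur, T = cost(order), 10.0
        while T > 1e-3:
            i, j = random.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]           # perturb: swap two modules
            c = cost(order)
            if c < cur or random.random() < math.exp((cur - c) / T):
                cur = c                                       # accept (possibly uphill) move
            else:
                order[i], order[j] = order[j], order[i]       # reject: undo the swap
            T *= 0.95                                         # cool the temperature
        print("final ordering:", order, "cost:", cur)
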
  • Efficient Steiner tree construction based on spanning graphs

    Page(s): 704 - 710

    The Steiner minimal tree (SMT) problem is a very important problem in very large scale integration computer-aided design. Given n points on a plane, an SMT connects these points through some extra points (called Steiner points) to achieve a minimal total length. Even though many heuristic algorithms exist for this problem, they have either poor performance or expensive running times. This paper presents an implementation of an efficient SMT algorithm that has a worst-case running time of O(n log n) and a performance close to that of the Iterated 1-Steiner algorithm. The algorithm efficiently combines Borah et al.'s edge-substitution concept with Zhou et al.'s spanning graph. Extensive experimental studies are conducted to compare it with other programs.

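    A baseline sketch only: the rectilinear minimum spanning tree (here via a simple O(n^2) Prim construction on the complete Manhattan-distance graph, not the paper's O(n log n) spanning-graph approach) that edge-substitution Steiner heuristics then improve by introducing Steiner points. The pin coordinates are made up.

        def manhattan(p, q):
            return abs(p[0] - q[0]) + abs(p[1] - q[1])

        def rectilinear_mst(points):
            # Prim's algorithm: grow the tree from point 0, always adding the closest outside pin
            edges = []
            best = {i: (manhattan(points[0], points[i]), 0) for i in range(1, len(points))}
            while best:
                j = min(best, key=lambda k: best[k][0])
                d, parent = best.pop(j)
                edges.append((parent, j, d))
                for k in best:
                    dk = manhattan(points[j], points[k])
                    if dk < best[k][0]:
                        best[k] = (dk, j)
            return edges

        pins = [(0, 0), (4, 1), (1, 5), (6, 6), (3, 3)]
        mst = rectilinear_mst(pins)
        print("MST edges:", mst, " total length:", sum(d for *_, d in mst))
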
  • Full-chip, three-dimensional shapes-based RLC extraction

    Page(s): 711 - 727

    In this paper, we report on a full-chip, three-dimensional, shapes-based resistance-inductance-capacitance extraction tool developed as part of a university-industry collaboration. The technique of return-limited inductances is used to provide a sparse, frequency-independent inductance and resistance network with self-inductances that represent sensible "nominal" values in the absence of mutual coupling. Mutual inductances are extracted for accurate crosstalk analysis. The tool exploits high-capacity scan-band techniques and disk caching. Accuracy is validated by comparison with full-wave finite-element field solvers.

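    For flavor only, a textbook closed-form approximation for the partial self-inductance of a straight rectangular conductor (dimensions in centimeters, result in nanohenries); it is not the return-limited extraction method of the paper, and the wire dimensions are invented.

        import math

        def partial_self_inductance_nH(length_cm, width_cm, thickness_cm):
            # classical approximation for a straight rectangular bar (Grover/Ruehli style)
            wt = width_cm + thickness_cm
            return 2.0 * length_cm * (math.log(2.0 * length_cm / wt) + 0.5 + 0.2235 * wt / length_cm)

        # a 1 mm long, 1 um x 0.5 um on-chip wire (illustrative numbers only)
        print(f"{partial_self_inductance_nH(0.1, 1e-4, 0.5e-4):.2f} nH")
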
  • Multigranular parallel algorithms for solving linear equations in VLSI circuit simulation

    Page(s): 728 - 736

    Algorithms for the parallel execution of forward elimination to solve the linear equations arising in very large scale integrated circuit simulation are discussed. A multigranular method is introduced that exploits different levels of potential parallelism. According to these levels, the new method contains four phases, which are dynamically linked. The use of a shared-memory architecture together with multithreaded programming thus enables the parallelization of a well-adapted serial sparse-matrix solver. In order to take system-specific properties into account, an adaptive partitioning is proposed, in which the partition size is enlarged step by step as long as the measured execution time decreases. Applying this method to industrial examples, a processor-usage efficiency of nearly 90% is reached for relevant circuits with up to 12 processors.

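    A generic sketch of one common source of parallelism in sparse triangular solves: rows grouped into dependency levels can be processed by concurrent threads. The 4x4 example is invented, and the paper's multigranular, four-phase scheme with adaptive partitioning is not reproduced.

        import numpy as np
        from concurrent.futures import ThreadPoolExecutor

        L = np.array([[2.0, 0.0, 0.0, 0.0],
                      [1.0, 3.0, 0.0, 0.0],
                      [0.0, 0.0, 4.0, 0.0],
                      [0.0, 2.0, 1.0, 5.0]])
        b = np.array([2.0, 5.0, 8.0, 20.0])

        # group rows into dependency levels: a row's level is one more than its deepest dependency
        level = [0] * len(b)
        for i in range(len(b)):
            deps = [j for j in range(i) if L[i, j] != 0]
            level[i] = 1 + max((level[j] for j in deps), default=-1)
        levels = {}
        for i, lv in enumerate(level):
            levels.setdefault(lv, []).append(i)

        x = np.zeros(len(b))
        def solve_row(i):
            x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]

        with ThreadPoolExecutor() as pool:
            for lv in sorted(levels):
                list(pool.map(solve_row, levels[lv]))   # rows within one level are independent
        print("x =", x, " check:", np.allclose(L @ x, b))
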
  • Linked faults in random access memories: concept, fault models, test algorithms, and industrial results

    Page(s): 737 - 757

    The analysis of linked faults (LFs), i.e., faults that influence each other's behavior such that masking can occur, has proven to be a source of new memory tests with increased fault coverage. However, many newly reported fault models have not been investigated from the point of view of LFs. This paper presents a complete analysis of LFs, based on the concept of fault primitives, such that the whole space of LFs is investigated, accounted for, and validated. Some simulated defective circuits showing linked-fault behavior are also presented. The paper establishes detection conditions along with new tests to detect each fault class. The tests are merged into a single test, March SL, which detects all considered LFs. Preliminary test results, based on advanced Intel caches, show that its fault coverage is high compared with all other traditional tests and that it detects some unique faults; this makes March SL very attractive industrially.

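    A toy memory-test simulation, not the paper's linked-fault analysis or its March SL test: a four-cell memory with one injected (unlinked) coupling fault, in which an up-transition write on an aggressor cell forces a victim cell to 1, is exercised by the classical March C- algorithm, which detects it.

        N, AGGRESSOR, VICTIM = 4, 2, 0

        class FaultyMemory:
            def __init__(self):
                self.cells = [0] * N
            def write(self, addr, val):
                if addr == AGGRESSOR and self.cells[addr] == 0 and val == 1:
                    self.cells[VICTIM] = 1                 # injected coupling fault <up; 1>
                self.cells[addr] = val
            def read(self, addr):
                return self.cells[addr]

        UP, DOWN = range(N), range(N - 1, -1, -1)
        MARCH_C_MINUS = [(UP, [("w", 0)]), (UP, [("r", 0), ("w", 1)]), (UP, [("r", 1), ("w", 0)]),
                         (DOWN, [("r", 0), ("w", 1)]), (DOWN, [("r", 1), ("w", 0)]), (DOWN, [("r", 0)])]

        mem, detected = FaultyMemory(), False
        for order, ops in MARCH_C_MINUS:
            for addr in order:
                for op, val in ops:
                    if op == "w":
                        mem.write(addr, val)
                    elif mem.read(addr) != val:
                        print(f"mismatch at cell {addr}: read {mem.read(addr)}, expected {val}")
                        detected = True
        print("fault detected:", detected)
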
  • Efficient test solutions for core-based designs

    Page(s): 758 - 775

    A test solution for a complex system requires the design of a test access mechanism (TAM), which is used for test-data transportation, and a test schedule for the test-data transportation on the designed TAM. An extensive TAM leads to lower test-application time at the expense of higher routing cost, compared to a simple TAM with low routing cost but long testing time. It is also possible to reduce the testing time of a testable unit by loading the test vectors in parallel, thus increasing the parallelization of a test. However, such a test-time reduction often leads to higher power consumption, which must be kept under control, since exceeding the power budget could damage the system under test. Furthermore, the execution of a test requires resources, and the concurrent execution of tests may not be possible due to resource or other conflicts. In this paper, we propose an integrated technique for test scheduling, test parallelization, and TAM design in which the test-application time and the TAM routing are minimized while test conflicts and power constraints are taken into account. The main features of our technique are its efficiency in terms of computation time and its flexibility in modeling the system's test behavior, as well as its support for the testing of interconnections, unwrapped cores, and user-defined logic. We have implemented our approach and performed several experiments on benchmarks as well as industrial designs to demonstrate that it produces high-quality solutions at low computational cost.

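    A greatly simplified sketch of power-constrained test scheduling, namely greedy packing of tests into concurrent sessions under a power budget, with invented test times and power figures; the paper's integrated TAM design, parallelization, and conflict handling are not modeled. Each test is assumed to fit the budget on its own.

        tests = {"coreA": (100, 3.0), "coreB": (80, 2.5), "coreC": (60, 1.0),
                 "coreD": (40, 2.0), "coreE": (120, 1.5)}        # name: (test time, test power)
        POWER_BUDGET = 5.0

        remaining = dict(tests)
        sessions, total_time = [], 0
        while remaining:
            # greedily pack the longest remaining tests that still fit under the power budget
            session, power = [], 0.0
            for name, (t, p) in sorted(remaining.items(), key=lambda kv: -kv[1][0]):
                if power + p <= POWER_BUDGET:
                    session.append(name)
                    power += p
            for name in session:
                remaining.pop(name)
            length = max(tests[n][0] for n in session)           # session lasts as long as its slowest test
            sessions.append((session, length))
            total_time += length

        for s, length in sessions:
            print("session:", s, "length:", length)
        print("total test time:", total_time)
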
  • Embedded deterministic test

    Page(s): 776 - 792

    This paper presents a novel test-data volume-compression methodology called the embedded deterministic test (EDT), which reduces manufacturing test cost by providing one to two orders of magnitude reduction in scan test data volume and scan test time. The presented scheme is widely applicable and easy to deploy because it is based on the standard scan/ATPG methodology and adopts a very simple flow. It is nonintrusive, as it does not require any modifications to the core logic, such as the insertion of test points or logic bounding unknown states. The EDT scheme consists of logic embedded on a chip and a new deterministic test-pattern generation technique. The main contributions of the paper are test-stimuli compression schemes that allow us to deliver test data to the on-chip continuous-flow decompressor. In particular, this can be done by repeating certain patterns at rates that are adjusted to the requirements of the test cubes. Experimental results show that, for industrial circuits with test cubes having very low fill rates, ranging from 3% to 0.2%, these schemes result in compression ratios of 30 to 500 times. A comprehensive analysis of the encoding efficiency of the proposed compression schemes is also provided.

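    Back-of-the-envelope arithmetic only: a linear decompressor must satisfy roughly one equation per specified (care) bit, so the injected data volume can approach the number of specified bits times a small solvability margin. The 20% margin and the scan-cell count are assumptions, but the resulting ratios line up with the 30x-500x range quoted for 3%-0.2% fill rates.

        scan_cells = 2_000_000                  # scan cells per pattern (assumed)
        margin = 1.2                            # extra injected variables for solvability (assumed)
        for fill_rate in (0.03, 0.01, 0.002):
            injected = scan_cells * fill_rate * margin
            print(f"fill rate {fill_rate:.1%}: compression ~{scan_cells / injected:.0f}x")
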
  • MR: a new framework for multilevel full-chip routing

    Page(s): 793 - 800

    In this paper, we propose MR, a novel framework for multilevel full-chip routing that considers both routability and performance. The two-stage multilevel framework consists of coarsening followed by uncoarsening. Unlike previous multilevel routing, MR integrates global routing, detailed routing, and resource estimation at each level of the framework, leading to more accurate routing-resource estimation during coarsening and thus facilitating solution refinement during uncoarsening. Further, the exact routing information obtained at each level makes MR more flexible in dealing with various routing objectives (such as crosstalk, power, etc.). Experimental results show that MR obtains significantly better routing solutions than previous works. For example, for a set of 11 commonly used benchmark circuits, MR achieves 100% routing completion for all circuits, while the previous multilevel routing, three-level routing, and hierarchical routing complete routing for only 2, 0, and 2 circuits, respectively. In addition, the number of routing layers used by MR is even smaller. We have also performed experiments on timing-driven routing, and the results are very promising.

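    A minimal maze-routing (Lee/BFS) sketch on a toy grid with a few blocked cells, the kind of point-to-point primitive a full-chip router invokes repeatedly; MR's multilevel coarsening/uncoarsening framework and resource estimation are not shown, and the grid size and obstacles are invented.

        from collections import deque

        GRID_W, GRID_H = 8, 6
        blocked = {(3, 1), (3, 2), (3, 3), (5, 4)}          # occupied routing grid cells

        def route(src, dst):
            prev, frontier = {src: None}, deque([src])
            while frontier:
                cell = frontier.popleft()
                if cell == dst:                              # reached the target: trace back the path
                    path = []
                    while cell is not None:
                        path.append(cell); cell = prev[cell]
                    return path[::-1]
                x, y = cell
                for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nxt[0] < GRID_W and 0 <= nxt[1] < GRID_H
                            and nxt not in blocked and nxt not in prev):
                        prev[nxt] = cell
                        frontier.append(nxt)
            return None                                      # unroutable with current resources

        print(route((0, 0), (7, 5)))
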
  • Testing SoC interconnects for signal integrity using extended JTAG architecture

    Page(s): 800 - 811

    As technology shrinks and working frequencies reach the multigigahertz range, designing and testing interconnects are no longer trivial issues. In this paper, we propose an enhanced boundary-scan architecture to test high-speed interconnects for signal integrity. This architecture includes: 1) a modified driving cell that generates patterns according to a multiple-transition fault model and 2) an observation cell that monitors signal-integrity violations. To fully comply with the conventional Joint Test Action Group (JTAG) standard, two new instructions are used to control the cells and scan activities in the integrity test mode.

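    A purely behavioral sketch of what such an observation cell is responsible for (the paper's cell is boundary-scan hardware controlled through new JTAG instructions): flag a signal-integrity violation when a sampled interconnect shows more transitions in a test window than the single clean edge expected. The sample vectors are invented.

        class ObservationCell:
            def __init__(self):
                self.violation = 0                        # sticky flag, later scanned out

            def monitor(self, samples, expected_transitions=1):
                transitions = sum(a != b for a, b in zip(samples, samples[1:]))
                if transitions > expected_transitions:
                    self.violation = 1
                return self.violation

        clean  = [0, 0, 1, 1, 1, 1]                       # one clean transition
        glitch = [0, 1, 0, 1, 1, 1]                       # ringing/glitch before settling
        print("clean line violation flag :", ObservationCell().monitor(clean))
        print("glitchy line violation flag:", ObservationCell().monitor(glitch))
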
  • 2004 IEEE International Workshop on Behavioral Modeling and Simulation (BMAS 2004)

    Page(s): 812
  • IEEE Circuits and Systems Society Information

    Page(s): c3
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Information for authors

    Page(s): c4

Aims & Scope

The purpose of this Transactions is to publish papers of interest to individuals in the areas of computer-aided design of integrated circuits and systems.


Meet Our Editors

Editor-in-Chief

VIJAYKRISHNAN NARAYANAN
Pennsylvania State University
Dept. of Computer Science and Engineering
354D IST Building
University Park, PA 16802, USA
vijay@cse.psu.edu