Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on

Issue 7 • Date July 2004

  • Table of contents

    Page(s): c1
    Freely Available from IEEE
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information

    Page(s): c2
    Freely Available from IEEE
  • Skew measurements in clock distribution circuits using an analytic signal method

    Page(s): 997 - 1009

    This paper presents the application of a new analytic signal method for measuring several different kinds of clock skew in the clock distribution network of microprocessors. First, key terms are defined, and other existing skew measurement methods are reviewed. Then, detailed steps are given for applying the new method for measuring skew between a master and distributed clocks, between two distributed clocks, and between different frequency clocks that are related by frequency division. An indirect procedure for measuring deterministic clock skew is also proposed. Next, the new method is validated with experimental data from a prototype microprocessor test. The performance of the analytic signal method is then compared with that of the two-probe method. Finally, the measurement requirements of the proposed analytic signal method are compared with those of conventional methods.
  • Resource budgeting for multiprocess high-level synthesis

    Page(s): 1010 - 1019

    This paper presents a new high-level synthesis methodology to generate optimized register-transfer level (RTL) implementations for multiprocess behavioral descriptions. The concurrent communicating processes specification paradigm is widely used in digital circuit and system design, and is employed in all popular hardware description languages. It has been shown that interprocess communication and synchronization can result in complex timing interdependencies, which significantly affect the performance of a multiprocess system. In this paper, we demonstrate that state-of-the-art high-level synthesis tools can generate significantly suboptimal implementations for behaviors that contain concurrent communicating processes. We present an analysis of how interprocess communication impacts high-level synthesis steps, and describe a new methodology to adapt existing high-level synthesis tools to optimize multiprocess descriptions. Our methodology is based on executing multiprocess performance analysis and process-by-process scheduling in an iterative manner. We present algorithms for key steps in the proposed methodology. We have performed extensive experiments in the context of a commercial high-level design flow to evaluate the proposed techniques. The results clearly demonstrate the utility of our techniques in synthesizing implementations with superior area, performance, and energy consumption. For example, up to 40% performance improvement (average of 35.6%) was achieved with little or no area overhead (average of 4.8%). In effect, the proposed techniques lead to a shift of the entire area-delay tradeoff curve for a design, to include superior designs that were hitherto infeasible. Our techniques also simultaneously result in up to 50% (average of 33.5%) improvement in energy and up to 69% (average of 58.3%) improvement in the energy-delay product.
  • SPFD-based wire removal in standard-cell and network-of-PLA circuits

    Page(s): 1020 - 1030

    Wire removal is a technique by which the total number of wires between individual circuit nodes is reduced, either by removing wires or replacing them with other new wires. The wire removal techniques we describe in this paper are based on both binary and multivalued sets of pairs of functions to be distinguished (SPFDs). Recently, it was shown that a design style based on a multilevel network of approximately equal-sized programmable logic arrays (PLAs) results in a dense, fast, and crosstalk-resistant layout. This paper describes the application of SPFD-based wire removal techniques for circuit implementations utilizing networks of PLAs as well as standard cells. In our first set of wire removal experiments (which utilize binary SPFD-based wire removal), we demonstrate that the benefit of SPFD-based wire removal is insignificant when the circuit is mapped using standard cells. We demonstrate that this technique is very effective in the context of a network of PLAs. In the next set of wire removal experiments, we focus only on circuits implemented using a network of PLAs. Three separate wire removal experiments are performed. Wire removal is invoked before clustering the original netlist into a network of PLAs, or after clustering, or both before and after clustering. For wire removal before clustering, binary SPFD-based wire removal is used. For wire removal after clustering, multivalued SPFD-based wire removal is used since the multioutput PLAs can be viewed as multivalued single-output nodes. We demonstrate that these techniques are effective. The most effective approach is to perform wire removal both before and after clustering. Using these techniques, we obtain a reduction in placed and routed circuit area of about 11%. This reduction is significantly higher (about 20%) for the larger circuits we used in our experiments.
  • Tag compression for low power in dynamically customizable embedded processors

    Page(s): 1031 - 1047

    We present a methodology for power reduction by instruction/data cache-tag compression for low-power embedded processors. By statically analyzing the code/data memory layouts for the application hot spots, a variety of proposed schemes for effective tag-size reduction can be employed for power minimization in instruction and data caches. The schemes rely on significantly reducing the number of tag bits stored in the tag arrays for cache-conflict identification, thus considerably decreasing the number of active bitlines, sense amps, and comparator cells. We present a set of tag compression techniques and evaluate each of them separately in terms of efficiency and required hardware support. A detailed very large scale integration (VLSI) implementation has been performed, and a number of experimental results on a set of embedded applications are reported for each technique. Energy dissipation decreases of up to 95% can be observed for the tag arrays, implying significant energy reductions in the range of 50% when amortized across the overall cache subsystem.
  • A Markov chain sequence generator for power macromodeling

    Page(s): 1048 - 1062

    In this paper, we present a novel sequence generator based on a Markov chain (MC) model. Specifically, we formulate the problem of generating a sequence of vectors with given average input probability p, average transition density d, and spatial correlation s as a transition matrix computation problem, in which the matrix elements are subject to constraints derived from the specified statistics. We also give a practical heuristic that computes such a matrix and generates a sequence of l n-bit vectors in O(nl + n^2) time. Derived from a strongly mixing MC, our generator yields binary vector sequences with accurate statistics, high uniformity, and high randomness. Experimental results show that our sequence generator can cover more than 99% of the parameter space. Sequences of 2000 48-bit vectors are generated in less than 0.05 s, with average deviations of the signal statistics p, d, and s equal to 1.6%, 1.8%, and 2.8%, respectively. Our generator enables the detailed study of power macromodeling. Using our tool and the ISCAS'85 benchmark circuits, we have assessed the sensitivity of power dissipation to the three input statistics p, d, and s. Our investigation reveals that power is most sensitive to transition density, while only occasionally exhibiting high sensitivity to signal probability and spatial correlation. Our experiments also show that input signal imbalance can cause estimation errors as high as 100% in extreme cases, although errors are usually within 25%.
  • A parallel fast Fourier transform on multipoles (FFTM) algorithm for electrostatics analysis of three-dimensional structures

    Page(s): 1063 - 1072

    A fast algorithm, called the fast Fourier transform on multipoles (FFTM) method, is developed for efficient solution of the integral equation in the boundary element method (BEM). This method employs the multipole and local expansions to approximate far-field potentials, and uses the fast Fourier transform (FFT) to accelerate the multipole-to-local translation operator based on its convolution nature. The series of uncoupled convolutions allows further speedup in the algorithm through parallel computation. In this paper, we present the results of using the FFTM algorithm for solving large-scale three-dimensional electrostatic problems. It is demonstrated that the method can give accurate results with a relatively low order of expansion. It is also found that the serial version of the algorithm has computational complexities of O(N^a), where a ranges from 1.0 to 1.4 for computational time, and from 1.1 to 1.2 for memory storage requirement. Significant speedup is also observed in the parallel implementation of FFTM using up to 16 processors on an IBM-p690 supercomputer.
  • Multilevel circuit clustering for delay minimization

    Page(s): 1073 - 1085

    In this paper, an effective algorithm is presented for multilevel circuit clustering for delay minimization, and is applicable to hierarchical field programmable gate arrays. With a novel graph contraction technique, which allows some crucial delay information of a lower-level clustering to be maintained in the contracted graph, our algorithm recursively divides the lower-level clustering into the next higher-level one in a way that each recursive clustering step is accomplished by applying a modified single-level circuit clustering algorithm based on . We test our algorithm on the two-level clustering problem and compare it with the latest algorithm in . Experimental results show that our algorithm achieves, on average, 12% more delay reduction when compared to the best results (from TLC with full node-duplication) in . In fact, our algorithm is the first one for the general multilevel circuit clustering problem with more than two levels.
  • Area minimization of power distribution network using efficient nonlinear programming techniques

    Page(s): 1086 - 1094

    This paper deals with area minimization of the power network for very large-scale integration designs. A new algorithm based on efficient nonlinear programming techniques is presented to solve this problem. During the optimization, a penalty method, conjugate gradient method, circuit sensitivity analysis, and merging adjoint networks are applied, which enables the algorithm to optimize large circuits. The experimental results prove that this algorithm is robust and can achieve the objective of minimizing the area of the power network in a short runtime.
  • Pseudorandom number generation with self-programmable cellular automata

    Page(s): 1095 - 1101

    We propose a new class of cellular automata, self-programming cellular automata (SPCA), with specific application to pseudorandom number generation. By changing a cell's state transition rules in relation to factors such as its neighboring cells' states, behavioral complexity can be increased and utilized. Interplay between the state transition neighborhood and rule selection neighborhood leads to a new composite neighborhood and state transition rule that is the linear combination of two different mappings with different temporal dependencies. It is proved that when the transitional matrices for both the state transition and rule selection neighborhood are nonsingular, SPCA will not exhibit nongroup behavior. Good performance can be obtained using simple neighborhoods with certain CA length, transition rules, etc. Certain configurations of SPCA pass all DIEHARD and ENT tests with an implementation cost lower than current reported work. Output sampling methods are also suggested to improve output efficiency by sampling the outputs of the new rule selection neighborhoods.
  • Self-referential verification for gate-level implementations of arithmetic circuits

    Page(s): 1102 - 1112

    Verification of gate-level implementations of arithmetic circuits is challenging for a number of reasons: the existence of some hard-to-verify arithmetic operators, the use of different operand ordering, the incorporation of merged arithmetic with cross-operator implementations, and the employment of circuit transformations based on arithmetic relations. It is hence a peculiar problem that does not fit well within the existing register-transfer-level-to-gate equivalence-checking methodology. We propose a self-referential functional verification approach which uses the gate-level implementation of the arithmetic circuit under verification to verify itself. The verification task is decomposed into a sequence of equivalence-checking subproblems, each of which compares structurally similar circuit pairs derived from the implementation under verification. These equivalence-checking subproblems represent the functional equations that uniquely define the intended arithmetic function. Based on these self-referential functional equations, a decomposition heuristic using structural information is employed to guide the verification process for better efficiency. Experimental results on a number of implementations of the multipliers, the multiply-add units, and the inner product units with different architectures demonstrate the versatility of this approach.
  • SAT-based counterexample-guided abstraction refinement

    Page(s): 1113 - 1123

    We describe new techniques for model checking in the counterexample-guided abstraction-refinement framework. The abstraction phase "hides" the logic of various variables, hence considering them as inputs. This type of abstraction may lead to "spurious" counterexamples, i.e., traces that cannot be simulated on the original (concrete) machine. We check whether a counterexample is real or spurious with a satisfiability (SAT) checker. We then use a combination of 0-1 integer linear programming and machine learning techniques for refining the abstraction based on the counterexample. The process is repeated until either a real counterexample is found or the property is verified. We have implemented these techniques on top of the model checker NuSMV and the SAT solver Chaff. Experimental results prove the viability of these new techniques.
  • Second-order approximations for RLC trees

    Page(s): 1124 - 1128

    We propose two-pole one-zero second-order approximations for transfer functions in resistance-inductance-capacitance trees. The approximation matches the first three moments of the original transfer function. Formulas for computing step-response parameters such as delay time, rise time, overshoot, etc., are given. Simulation results show that adding the zero improves accuracy of the approximation.
  • Indirect test architecture for SoC testing

    Page(s): 1128 - 1142

    A generic model for test architectures in the core-based system-on-chip (SoC) designs consists of source/sink, wrapper, and test access mechanism (TAM). Current test architectures for digital cores assume a direct connection between the core and the tester. In these architectures, the tester establishes a physical link between itself and the core, such that it can directly control the core's design-for-testability (DFT), such as the scan chains or primary inputs. This direct connection undermines the modularity in the generic test architecture by tightly coupling its elements. In this paper, we propose a network-oriented indirect and modular architecture (NIMA) for postfabrication test in an SoC design methodology. In NIMA, test stimuli and expected results for digital cores are first compiled into new formats and subsequently encapsulated into packets. These packets are augmented with control and address bits such that they can autonomously be transmitted to their destination through a switching fabric. Owing to the indirect nature of the connection, embedded autonomous blocks at each core are used to apply the test to the core and compare the test results with expected values. This indirect access to the core decouples test data processing at the core from its communication, providing the basis for flexible and modular test design and programming. Moreover, NIMA facilitates remote access of single or multiple testers to an SoC, and enables the sending of test data to an SoC in-field in order to test the chip in its target system. Finally, NIMA contributes toward the development of new test architectures that benefit from network-centric SoCs. We present a first implementation of NIMA when applied to a number of SoC benchmarks.
  • Scan architecture with mutually exclusive scan segment activation for shift- and capture-power reduction

    Page(s): 1142 - 1153

    Power dissipation during scan testing is becoming an important concern as design sizes and gate densities increase. While several approaches have been recently proposed for reducing power dissipation during the shift cycle (minimum-transition don't care fill, special scan cells, and scan chain partitioning), limited work has been carried out toward reducing the peak power during test response capture, and the few existing approaches for reducing capture power rely on complex automatic test pattern generation (ATPG) algorithms. This paper proposes a scan architecture with mutually exclusive scan segment activation which overcomes the shortcomings of previous approaches. The proposed architecture achieves both shift- and capture-power reduction with no impact on the performance of the design, and with minimal impact on area and testing time (typically 2%-3%). An algorithmic procedure for assigning flip-flops to scan segments enables reuse of test patterns generated by standard ATPG tools. An implementation of the proposed method has been integrated into an automated design flow using commercial synthesis and simulation tools, which was used on a wide range of benchmark designs. Reductions up to 57% in average power, and up to 44% and 34% in peak-power dissipation during shift and capture cycles, respectively, were obtained when using two scan segments. Increasing the number of scan segments to six leads to reductions of 96% in average power and 80% in the maximum number of simultaneous transitions.
  • 2004 IEEE International Workshop on Behavioral Modeling and Simulation (BMAS 2004)

    Page(s): 1154
    Freely Available from IEEE
  • 2005 IEEE International Symposium on Circuits and Systems (ISCAS 2005)

    Page(s): 1155
    Freely Available from IEEE
  • Explore IEL IEEE's most comprehensive resource [advertisement]

    Page(s): 1156
    Freely Available from IEEE
  • IEEE Circuits and Systems Society Information

    Page(s): c3
    Freely Available from IEEE
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Information for authors

    Page(s): c4
    Freely Available from IEEE

Aims & Scope

The purpose of this Transactions is to publish papers of interest to individuals in the areas of computer-aided design of integrated circuits and systems.

Meet Our Editors

Editor-in-Chief

VIJAYKRISHNAN NARAYANAN
Pennsylvania State University
Dept. of Computer Science and Engineering
354D IST Building
University Park, PA 16802, USA
vijay@cse.psu.edu