Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on

Issue 12 • Date Dec. 2007

  • Table of contents

    Publication Year: 2007, Page(s): C1 - C4
    Freely Available from IEEE
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information

    Publication Year: 2007, Page(s): C2
    Freely Available from IEEE
  • Hierarchical Harmonic-Balance Methods for Frequency-Domain Analog-Circuit Analysis

    Publication Year: 2007, Page(s): 2089 - 2101
    Cited by: Papers (4)

    As a widely adopted frequency-domain method, harmonic balance (HB) provides efficient steady-state circuit analysis for analog and RF circuits. The conventional matrix-implicit Krylov subspace technique with the block-diagonal (BD) preconditioner has made it possible to compute the steady-state responses of large-scale circuits. However, not all HB problems, particularly strongly nonlinear circuit problems, can be solved reliably or efficiently using the standard BD-preconditioning technique. In this paper, hierarchical HB methods are proposed wherein robust preconditioning is provided via solution of a set of approximate linearized HB problems of progressively smaller size across multiple levels of the problem hierarchy. These subproblems are constructed using the same matrix-implicit formulation to retain the memory efficiency of Krylov subspace methods. Moreover, the number of allocated Krylov subspace matrix solvers, hence the memory usage, is significantly reduced via a recently introduced solver-sharing technique. The efficiency of our hierarchical preconditioning technique is further improved by adopting a one-step correction to the standard BD preconditioner and a multigrid-motivated iterative scheme. It has been shown that the proposed approaches can achieve up to 10x runtime speedup over the popular BD preconditioner and robust convergence even for strongly nonlinear circuits for which the BD preconditioner fails to converge.
  • Automatic Design Space Exploration of Register Bypasses in Embedded Processors

    Publication Year: 2007, Page(s): 2102 - 2115
    Cited by: Papers (5)

    Register bypassing is a popular and powerful architectural feature to improve processor performance in pipelined processors by eliminating certain data hazards. However, extensive bypassing comes with a significant impact on cycle time, area, and power consumption of the processor. Recent research therefore advocates the use of partial bypassing in a processor. However, accurate performance evaluation of partially bypassed processors is still a challenge, primarily due to the lack of bypass-sensitive retargetable compilation techniques. No existing partial bypass exploration framework estimates the power and area overhead of partial bypassing. As a result, the designers end up making suboptimal design decisions during the exploration of partial bypass design space. This paper presents PBExplore, an automatic design-space-exploration framework for register bypasses. PBExplore accurately evaluates the performance of a partially bypassed processor using a bypass-sensitive compilation technique. It synthesizes the bypass control logic and estimates the area and energy overhead of each bypass configuration. PBExplore is thus able to effectively perform multidimensional exploration of the partial bypass design space. We present experimental results of benchmarks from the MiBench suite on the Intel XScale architecture and demonstrate the need, utility, and exploration capabilities of PBExplore.
  • A Piecewise-Linear Moment-Matching Approach to Parameterized Model-Order Reduction for Highly Nonlinear Systems

    Publication Year: 2007, Page(s): 2116 - 2129
    Cited by: Papers (20)

    This paper presents a parameterized reduction technique for highly nonlinear systems. In our approach, we first approximate the nonlinear system with a convex combination of parameterized linear models created by linearizing the nonlinear system at points along training trajectories. Each of these linear models is then projected using a moment-matching scheme into a low-order subspace, resulting in a parameterized reduced-order nonlinear system. Several options for selecting the linear models and constructing the projection matrix are presented and analyzed. In addition, we propose a training scheme which automatically selects parameter-space training points by approximating parameter sensitivities. Results and comparisons are presented for three examples which contain distributed strong nonlinearities: a diode transmission line, a microelectromechanical switch, and a pulse-narrowing nonlinear transmission line. In most cases, we are able to accurately capture the parameter dependence over the parameter ranges of ±50% from the nominal values and to achieve an average simulation speedup of about 10x.
  • BoxRouter: A New Global Router Based on Box Expansion and Progressive ILP

    Publication Year: 2007, Page(s): 2130 - 2143
    Cited by: Papers (6) | Patents (1)

    In this paper, we propose a new global router, BoxRouter, powered by the concept of box expansion, progressive integer linear programming (PILP), and adaptive maze routing (AMR). BoxRouter first uses a simple prerouting strategy to predict and capture the most congested region with high fidelity as compared to the final routing. Based on progressive box expansion initiated from the most congested region, BoxRouting is performed with PILP and AMR. Our PILP is shown to be much more efficient than the traditional ILP in terms of speed and quality, and the AMR based on multisource multitarget with bridge model is effective in minimizing the congestion and wirelength. It is followed by an effective postrouting step, which reroutes without rip-up to enhance the routing solution further and obtain a smooth tradeoff between wirelength and routability. Our experimental results show that the BoxRouter significantly outperforms the state-of-the-art published global routers, e.g., 91% better routability than Labyrinth (with 14% less wirelength and 3.3x speedup), 79% better routability than Chi-dispersion router (with similar wirelength and 2x speedup), and 4.2% less wirelength and 16x speedup than a multicommodity flow-based router (with similar routability). Additional enhancement in box expansion and postrouting further improves the result with similar wirelength but much better routability than the latest work in global routing.
  • Detailed Placement for Enhanced Control of Resist and Etch CDs

    Publication Year: 2007, Page(s): 2144 - 2157
    Cited by: Papers (4) | Patents (2)

    Subresolution assist feature (SRAF) and etch-dummy-insertion techniques have been absolutely essential for process-window enhancement and CD control in photo and etch processes. However, as focus levels change during lithography manufacturing, CDs at a given "legal" pitch can fail to achieve manufacturing tolerances. Placed standard-cell layouts may not have the ideal whitespace distribution to allow for an optimal assist-feature insertion. This paper first describes a novel dynamic-programming-based technique for Assist-Feature Correctness (AFCorr) in detailed placement of standard-cell designs. At the same time, etch-dummy features are used in the mask data preparation flow to reduce CD skew between resist and etch processes and to improve the printability of layouts. However, etch-dummy rules conflict with the SRAF insertion because each of the two techniques requires specific design rules. We further present a novel SRAF-aware etch-dummy-insertion method (SAEDM) which optimizes the etch-dummy insertion to make the layout more conducive to the assist-feature insertion after the etch-dummy features have been inserted. Since placement of cells can create forbidden-pitch violations of resist process and can increase etch skew, the placer must also generate etch-dummy-correct placement. This can be solved by Etch-dummy Correctness (EtchCorr), which is an intelligent whitespace management for etch-dummy-corrected placement, an extension of the AFCorr methodology. These methods for enhanced resist and etch CD controls are validated on industrial test cases with respect to wafer printability, database complexity, and device performance. For benchmark designs, we validate the four methodologies: 1) AFCorr; 2) SAEDM; 3) AFCorr + SAEDM; and 4) AFCorr + EtchCorr + SAEDM. The AFCorr placement perturbation achieves a significant reduction in forbidden pitches between polysilicon shapes. Using flow 1), the forbidden-pitch count of the photo process is reduced by 76%-100% for 130 nm and by 87%-100% for 90 nm. Our novel Corr design-perturbation technique, which combines the AFCorr and EtchCorr methods, facilitates additional SRAF and etch-dummy insertions and, thus, reduces the CD skew between the photo and etch processes. After Corr with SAEDM, edge-placement-error count is also reduced by 91%-100% in the resist CD and by 72%-98% in the etch CD. Our methods provide a substantial improvement in CD control with negligible timing, area, and CPU overhead. The advantages of such correctness methods are expected to increase in future technology nodes.
  • Diffusion-Based Placement Migration With Application on Legalization

    Publication Year: 2007, Page(s): 2158 - 2172
    Cited by: Patents (3)

    Placement migration is the movement of cells within an existing placement to address a variety of postplacement design-closure issues, such as timing, routing congestion, signal integrity, and heat distribution. To fix a design problem, one would like to perturb the design as little as possible while preserving the integrity of the original placement. This paper presents a new diffusion-based placement method based on a discrete approximation to the closed-form solution of the continuous diffusion equation. It has the advantage of smooth spreading, which helps preserve neighborhood characteristics of the original placement. Applying this technique to placement legalization demonstrates significant improvements in wire length and timing compared with other commonly used techniques.
  • ECO-System: Embracing the Change in Placement

    Publication Year: 2007, Page(s): 2173 - 2185
    Cited by: Papers (3) | Patents (2)

    In a realistic design flow, circuit and system optimizations must interact with physical aspects of the design. For example, improvements in timing and power may require the replacement of large modules with variants that have different power/delay tradeoff, shape, and connectivity. New logic may be added late in the design flow, which is subject to interconnect optimization. To support such flexibility in design flows, we develop a robust system for performing Engineering Change Orders (ECOs). In contrast with the existing stand-alone tools that offer poor interfaces to the design flow and cannot handle a full range of modern very large scale integration layouts, our ECO-system reliably handles fixed objects and movable macros in instances with widely varying amounts of whitespace. It detects geometric regions and sections of the netlist that require modification and applies an adequate amount of change in each case. Given a reasonable initial placement, it applies minimal changes but is capable of replacing large regions to handle pathological cases. The ECO-system can be used in the range from high-level synthesis to physical synthesis and detail placement.
  • Fast and Accurate Cosimulation of MPSoC Using Trace-Driven Virtual Synchronization

    Publication Year: 2007, Page(s): 2186 - 2200
    Cited by: Papers (11) | Patents (1)

    As MPSoC has become an effective solution to the ever-increasing design complexity of modern embedded systems, fast and accurate cosimulation of such systems is becoming a tough challenge. Cosimulation performance is in inverse proportion to the number of processor simulators in conventional cosimulation frameworks with lock-step synchronization schemes. To overcome this problem, we propose a novel time synchronization technique called trace-driven virtual synchronization. By separating the cosimulation into event-generation and event-alignment phases, the technique reduces time synchronization overhead to almost zero, boosting cosimulation speed while accuracy is almost preserved. In addition, this technique enables (1) a fast mixed-level cosimulation, where simulators at different abstraction levels are easily integrated and communicate through traces, and (2) a distributed parallel cosimulation, where each simulator can run at its full speed without synchronizing with other simulators too frequently. We compared the performance and the accuracy with MaxSim, a well-known commercial SystemC simulation framework, and the proposed framework showed 11 times faster performance for an H.263 decoder example, while the error was below 5%.
  • Testing Network-on-Chip Communication Fabrics

    Publication Year: 2007, Page(s): 2201 - 2214
    Cited by: Papers (13)

    Network-on-chip (NoC) communication fabrics will be increasingly used in many large multicore system-on-chip designs in the near future. A relevant challenge that arises from this trend is that the test costs associated with NoC infrastructures may account for a significant part of the total test budget. In this paper, we present a novel methodology for testing such NoC architectures. The proposed methodology offers a tradeoff between test time and on-chip self-test resources. The fault models used are specific to deep submicrometer technologies and account for crosstalk effects due to interwire coupling. The novelty of our approach lies in the progressive reuse of the NoC infrastructure to transport test data to the components under test in a recursive manner. It exploits the inherent parallelism of the data transport mechanism to reduce the test time and, implicitly, the test cost. We also describe a suitable test-scheduling approach. In this manner, the test methodology developed in this paper is able to reduce the test time significantly as compared to previously proposed solutions, offering speedup factors ranging from 2x to 34x for the NoCs considered for experimental evaluation.
  • Delay Fault Coverage Enhancement by Partial Clocking for Low-Power Designs With Heavily Gated Clocks

    Publication Year: 2007, Page(s): 2215 - 2221
    Cited by: Papers (2)

    Testing for delay faults in heavily gated clock designs has the major test challenges of reduced fault coverage and high test power consumption. In the scan-test method, gated clocks are often simplified and replaced with global test clocks. As such, partial clocking by the gated clocks is not inherited in test operations. Global clocking suffers from delay fault coverage loss because a sensitization state cannot easily be created due to the increased state dependence in functional paths, as compared to partial clocking. The global clocking scheme in the test mode is not adequate for low-power designs either, because the power consumed during a test operation exceeds that used during a normal operation. The power grid may not be sufficient to support the power drawn during testing, perhaps resulting in overkilled devices. It is therefore critical that power consumption be maintained under a safe limit, even during testing. In the proposed method, partial clocking in gated designs is preserved to the maximum extent possible to create more reachable states, thereby increasing transition fault coverage and reducing test power during launch and capture cycles. A transition fault simulator was developed, and it demonstrated higher transition fault coverage and reduced test power for ISCAS-89 circuits when partial clocking is used.
  • A Bus-Encoding Scheme for Crosstalk Elimination in High-Performance Processor Design

    Publication Year: 2007, Page(s): 2222 - 2227
    Cited by: Papers (5)

    A crosstalk effect leads to increases in delay and power consumption and, in the worst-case scenario, to inaccurate results. With the scale down of technology to deep-submicrometer level, the crosstalk effect between adjacent wires becomes more and more serious, particularly between long on-chip buses. In this paper, we propose a deassembler/assembler technique to eliminate undesirable crosstalk effects on bus transmission. By taking advantage of the prefetch process, where the instruction/data fetch rate is always higher than the instruction/data commit rate, the proposed method incurs almost no penalty in terms of dynamic instruction count. In addition, when the bus width is 128 b, the required number of extra bus wires is only 7 as compared to the 85 extra bus wires needed in the work of Victor and Keutzer.
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Information for authors

    Publication Year: 2007, Page(s): 2228
    Freely Available from IEEE
  • 2007 Index IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Vol. 26

    Publication Year: 2007, Page(s): 2229 - 2256
    Freely Available from IEEE
  • IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems society information

    Publication Year: 2007, Page(s): C3
    Freely Available from IEEE
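
The abstract for "Diffusion-Based Placement Migration With Application on Legalization" above mentions a discrete approximation to the continuous diffusion equation and the smooth spreading it produces. As a rough illustration only, the sketch below applies a standard explicit finite-difference diffusion step to a grid of bin densities. It is not the authors' algorithm: the function name `diffuse_step`, the time step `dt`, the unit bin spacing, and the zero-flux boundary handling are all assumptions made for this example, and the cell-movement step that a real placer needs is left out.

```python
def diffuse_step(density, dt=0.2):
    """One explicit finite-difference step of 2-D diffusion on a bin grid.

    Hypothetical helper for illustration. Assumes unit bin spacing and
    dt <= 0.25, the usual stability limit for this explicit scheme.
    """
    rows, cols = len(density), len(density[0])
    new = [row[:] for row in density]
    for i in range(rows):
        for j in range(cols):
            d = density[i][j]
            # Zero-flux boundaries: a missing neighbour mirrors the current bin,
            # so no density leaves the grid.
            up = density[i - 1][j] if i > 0 else d
            down = density[i + 1][j] if i < rows - 1 else d
            left = density[i][j - 1] if j > 0 else d
            right = density[i][j + 1] if j < cols - 1 else d
            new[i][j] = d + dt * (up + down + left + right - 4 * d)
    return new


# A single overfull bin in the middle of an otherwise empty 5x5 grid.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 4.0
for _ in range(10):
    grid = diffuse_step(grid)

total = sum(sum(row) for row in grid)
peak = max(max(row) for row in grid)
```

After the ten steps, `total` stays at the initial 4.0 (up to floating-point rounding) while `peak` drops well below it: density spreads gradually to neighbouring bins rather than jumping, which is the smooth-spreading property the abstract refers to.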

Aims & Scope

The purpose of this Transactions is to publish papers of interest to individuals in the areas of computer-aided design of integrated circuits and systems.

Meet Our Editors

Editor-in-Chief

VIJAYKRISHNAN NARAYANAN
Pennsylvania State University
Dept. of Computer Science and Engineering
354D IST Building
University Park, PA 16802, USA
vijay@cse.psu.edu