Very Large Scale Integration (VLSI) Systems, IEEE Transactions on

Issue 7 • Date July 2010

  • Table of contents

    Page(s): C1
    Freely Available from IEEE
  • IEEE Transactions on Very Large Scale Integration (VLSI) Systems publication information

    Page(s): C2
    Freely Available from IEEE
  • Discrete Buffer and Wire Sizing for Link-Based Non-Tree Clock Networks

    Page(s): 1025 - 1035

    The clock network is vulnerable to variations and is a major power consumer in many integrated circuits. Recently, link-based non-tree clock networks have attracted attention because of their appealing tradeoff between variation tolerance and power overhead. In this work, we investigate how to optimize such clock networks through buffer and wire sizing. A two-stage hybrid optimization approach is proposed. It considers the realistic constraint of discrete buffer/wire sizes and is based on accurate delay models. To provide reliable and efficient guidance for the optimization, we suggest applying support vector machine (SVM)-based machine learning as a surrogate for expensive circuit-level simulation. Experimental results on benchmark circuits show that our sizing method can reduce clock skew by 45% on average with a very small increase in power dissipation.

  • Low-Complexity All-Digital Sample Clock Dither for OFDM Timing Recovery

    Page(s): 1036 - 1042

    Based on phase adjustment, this work investigates a low-complexity all-digital sample clock dither (ADSCD) to perform coherent sampling for orthogonal frequency-division multiplexing (OFDM) timing recovery. To reduce complexity, only tri-state buffers are required to build a multiphase all-digital clock management (ADCM), which can generate more than 32 phases over gigahertz without phase-locked or delay-locked loops. Following divide-and-conquer search and triangulated approximation, the phase adjustment is simple but efficient, such that four preambles are adequate to make analog-to-digital (A/D) sampling coherent. Performance evaluation indicates that the proposed ADSCD can tolerate ±400-ppm clock offsets with signal-to-noise ratio (SNR) losses of 0.8 to 1.3 dB at 8% packet error rate (PER) in frequency-selective fading. Hence, this scheme involves little overhead to ensure fast recovery and wide offset tolerance for OFDM packet transmissions.

  • An Adaptively Pipelined Mixed Synchronous-Asynchronous Digital FIR Filter Chip Operating at 1.3 Gigahertz

    Page(s): 1043 - 1056

    A high-throughput low-latency digital finite impulse response (FIR) filter has been designed for use in partial-response maximum-likelihood (PRML) read channels of modern disk drives. The filter is a hybrid synchronous-asynchronous design. The speed-critical portion of the filter is designed as a high-performance asynchronous pipeline sandwiched between synchronous input and output portions, making it possible for the entire filter to be embedded within a clocked system. A novel feature of the filter is that the degree of pipelining is dynamically variable, depending upon the input data rate. This feature is critical in obtaining a very low filter latency throughout the range of operating frequencies. The filter is a ten-tap six-bit FIR filter, fabricated in a 0.18-μm CMOS process. The resulting chips were fully functional over a wide range of supply voltages and exhibited throughputs of over 1.3 giga-items/s and latencies of 2-5 clock cycles. Interestingly, the filter throughput was limited by the synchronous portion of the chip; the internal asynchronous pipeline was estimated to be capable of significantly higher throughputs, around 1.8 giga-items/s. More importantly, the adaptively pipelined nature of the filter allows it to offer a worst-case latency of only 10 ns, which is half the worst-case latency of the best previously reported comparable fully synchronous implementation by Rylov et al.

  • Complexity Analysis and Efficient Implementations of Bit Parallel Finite Field Multipliers Based on Karatsuba-Ofman Algorithm on FPGAs

    Page(s): 1057 - 1066

    This paper presents complexity analysis [both in application-specific integrated circuits (ASICs) and on field-programmable gate arrays (FPGAs)] and efficient FPGA implementations of bit parallel mixed Karatsuba-Ofman multipliers (KOM) over GF(2^m). By introducing common expression sharing and complexity analysis on odd-term polynomials, we achieve a lower gate bound than previous ASIC discussions. The analysis is extended by using 4-input/6-input lookup tables (LUTs) on FPGAs. For an arbitrary bit-depth, the optimum iteration step is shown. The optimum iteration steps differ for ASICs, 4-input LUT-based FPGAs and 6-input LUT-based FPGAs. We evaluate the LUT complexity and area-time product tradeoffs on FPGAs with different computer-aided design (CAD) tools. Furthermore, the experimental results on FPGAs for bit parallel modular multipliers are shown and compared with previous implementations. To the best of our knowledge, our bit parallel multipliers consume the least resources among known FPGA implementations to date.
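
For readers unfamiliar with the Karatsuba-Ofman split named in the abstract above, the following is a minimal software sketch of the general technique, not the paper's bit parallel hardware design. Polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i). The function names `clmul_schoolbook`, `kom`, `reduce_mod`, and `gf_mul`, and the `threshold` parameter at which recursion falls back to schoolbook multiplication, are hypothetical choices made for illustration only.

```python
def clmul_schoolbook(a, b):
    """Carry-less (GF(2)[x]) multiplication, one shifted XOR per set bit of b."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def kom(a, b, n, threshold=4):
    """Karatsuba-Ofman product of two GF(2)[x] polynomials with at most n terms.

    threshold (assumed >= 1) is the size at which recursion stops.
    """
    if n <= threshold:
        return clmul_schoolbook(a, b)
    half = (n + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = kom(a_lo, b_lo, half, threshold)
    hi = kom(a_hi, b_hi, n - half, threshold)
    # Three sub-products instead of four: the cross term is mid + lo + hi,
    # and addition in GF(2) is XOR.
    mid = kom(a_lo ^ a_hi, b_lo ^ b_hi, half, threshold)
    return lo ^ ((mid ^ lo ^ hi) << half) ^ (hi << (2 * half))


def reduce_mod(p, modulus, m):
    """Reduce polynomial p modulo a degree-m polynomial (bit m of modulus set)."""
    for i in range(p.bit_length() - 1, m - 1, -1):
        if (p >> i) & 1:
            p ^= modulus << (i - m)
    return p


def gf_mul(a, b, m, modulus):
    """Multiply two elements of GF(2^m) defined by the given modulus."""
    return reduce_mod(kom(a, b, m), modulus, m)


# Example in GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
print(hex(gf_mul(0x57, 0x83, 8, 0x11B)))  # 0xc1
```

The split point `half` and the recursion threshold correspond loosely to the "iteration step" the abstract discusses; which values are optimal for a given target (ASIC gates, 4-input or 6-input LUTs) is the subject of the paper and is not reproduced here.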

  • Adaptive and Deadlock-Free Tree-Based Multicast Routing for Networks-on-Chip

    Page(s): 1067 - 1080

    This paper presents the first synthesizable network-on-chip (NoC) based on a mesh topology, which supports adaptive and deadlock-free tree-based multicast routing without virtual channels. The deadlock-free routing algorithms for unicast and multicast packets are the same. Therefore, the gate-level implementation of the routing function is very efficient. Multicast packets are injected into the network by sending multiple packet headers beforehand. The packet headers contain destination addresses to set up multicast trees connecting a source with multiple destination nodes. An additional locally uniform identification (ID) field is packetized together with flits belonging to the same packet. Therefore, flits of different unicast or multicast packets can be interleaved in the same queue because of the local ID-tags, which are updated and mapped dynamically to support bandwidth scalability of interconnection links. Deadlocks in tree-based multicast routing are handled using a flit-by-flit round arbitration and a fair hold-release tagging mechanism. The effectiveness of the novel mechanism has been evaluated under multiple multicast conflict scenarios, where the experimental results show that all traffic is accepted in order and without loss at its destination nodes, even if adaptive routing functions are used and the multicast messages are very long.

  • X-Filling for Simultaneous Shift- and Capture-Power Reduction in At-Speed Scan-Based Testing

    Page(s): 1081 - 1092

    Power consumption during at-speed scan-based testing can be significantly higher than that during normal functional mode in both shift and capture phases, which can raise circuit reliability concerns during manufacturing test. This paper proposes a novel X-filling technique, namely “iFill”, to address this issue by analyzing the impact of X-bits on the switching activities of the circuit nodes in the two phases. In addition, unlike prior X-filling methods for shift-power reduction, which can only reduce shift-in power, our method is able to cut down power consumption in both shift-in and shift-out processes. Experimental results on benchmark circuits show that the proposed technique can guarantee power safety in both shift and capture phases during at-speed scan-based testing.

  • Asynchronous Data-Driven Circuit Synthesis

    Page(s): 1093 - 1106

    A method is described for synthesizing asynchronous circuits based on the Handshake Circuit paradigm but employing a data-driven, rather than a control-driven, style. This approach attempts to combine the performance advantages of data-driven asynchronous design styles with the handshake circuit style of construction used in existing syntax-directed synthesis. The method is demonstrated on a significant design: a 32-bit microprocessor. This example shows that the data-driven circuit style provides better performance than control-driven synthesized circuits. This paper extends previously reported work by illustrating how conditional execution, oft-cited as a problem for data-driven descriptions, is handled within the system, and by a more detailed analysis of the design example.

  • Asynchronous Current Mode Serial Communication

    Page(s): 1107 - 1117

    An asynchronous high-speed wave-pipelined bit-serial link for on-chip communication is presented as an alternative to standard bit-parallel links. The link employs the differential level encoded dual-rail (LEDR) two-phase asynchronous protocol, avoiding per-bit handshake and eliminating per-bit synchronization, in contrast with synchronous serial links that rely on complex clock recovery. Novel low-power current signaling driver and receiver circuits are presented, enabling high-speed communication at a very low voltage swing over long wires. In contrast, previous methods employed voltage sensing, resulting in higher swing, higher dynamic power, shorter wires or slower operation. The asynchronous current mode driver is designed to support varying data rates, and it eliminates the need for balanced codes and busy toggling that prevent deep discharge. The data cycle time of the link is equal to a single gate delay, enabling 67 Gb/s throughput in 65-nm technology. Wave-pipelining is also employed by the asynchronous SERDES circuits to enable such high-speed operation. The link was SPICE simulated for 65-nm technology, using wire models obtained by a 3-D EM solver. The link incurs lower power and area relative to synchronous and asynchronous bit-parallel communications, and these relative benefits also scale with technology.
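
As a rough illustration of the LEDR protocol mentioned in the abstract above, the sketch below models only the logical encoding, not the paper's current-mode driver or receiver circuits. It assumes the common LEDR convention in which one rail carries the data bit, the other carries a parity bit, and the phase (data XOR parity) alternates with every symbol, so exactly one rail toggles per transmitted bit. The function names and the choice of starting phase are hypothetical.

```python
def ledr_encode(bits):
    """Encode bits as (data, parity) rail pairs with alternating phase.

    Assumed convention: phase = data XOR parity, starting at 0 and
    flipping on every symbol.
    """
    symbols = []
    phase = 0
    for bit in bits:
        symbols.append((bit, bit ^ phase))
        phase ^= 1
    return symbols


def ledr_decode(symbols):
    """Recover the bits and check that the phase alternates as expected."""
    bits = []
    expected_phase = 0
    for data, parity in symbols:
        if data ^ parity != expected_phase:
            raise ValueError("phase error: symbol out of sequence")
        bits.append(data)
        expected_phase ^= 1
    return bits


def rail_toggles(prev, curr):
    """Count how many of the two rails changed between consecutive symbols."""
    return (prev[0] != curr[0]) + (prev[1] != curr[1])


message = [1, 1, 0, 1]
encoded = ledr_encode(message)
print(encoded)               # [(1, 1), (1, 0), (0, 0), (1, 0)]
print(ledr_decode(encoded))  # [1, 1, 0, 1]
```

Because each new symbol changes exactly one rail, a receiver can detect the arrival of a bit from the rails themselves, which is why no separate clock or per-bit handshake is needed at the logical level.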

  • Transistor Variability Modeling and its Validation With Ring-Oscillation Frequencies for Body-Biased Subthreshold Circuits

    Page(s): 1118 - 1129

    This paper presents transistor variability modeling and its validation for body-biased subthreshold circuits based on measurements of a device-array circuit using a 90-nm technology. The device array consists of p/nMOS transistors and ring oscillators. We examine and confirm the correlation between the performance variation model extracted from measured I-V characteristics and the oscillation frequencies of the fabricated ring oscillators. We demonstrate that delay variations in subthreshold circuits are well characterized with two parameters, i.e., the threshold voltage and the subthreshold swing parameter. We also reveal that the threshold voltage shift caused by body biasing can be deterministically modeled and that statistical modeling is less meaningful.

  • An Antiharmonic, Programmable, DLL-Based Frequency Multiplier for Dynamic Frequency Scaling

    Page(s): 1130 - 1134

    This paper describes a new delay-locked loop (DLL)-based frequency multiplier, which includes a lock controller and a phase detector to solve the false lock problem and overcome the limited locking range of conventional DLLs. By using the multiple clock phases of the DLL, the lock controller is able to indicate whether the delay time of the voltage-controlled delay line (VCDL) is within the correct locking range. A differentially controlled edge combiner is also proposed for the frequency multiplication. The antiharmonic DLL-based frequency multiplier, implemented in a 0.18-μm CMOS process, occupies an active area of 0.043 mm² and dissipates 36.7 mW at 1.7 GHz. The measured root mean square jitter and peak-to-peak jitter for the multiplied output clock at 1.7 GHz are 2.64 and 16.8 ps, respectively.

  • On-Chip SOC Test Platform Design Based on IEEE 1500 Standard

    Page(s): 1134 - 1139

    The IEEE 1500 Standard defines a standard test interface for embedded cores of a system-on-a-chip (SOC) to simplify test problems. In this paper, we present a systematic method to employ this standard in an SOC test platform so as to carry out on-chip at-speed testing for embedded SOC cores without using expensive external automatic test equipment. The cores that can be handled include scan-based logic cores, BIST-based memory cores, BIST-based mixed-signal devices, and hierarchical cores. All required test control signals for these cores can be generated on-chip by a single centralized test access mechanism (TAM) controller. These control signals, along with test data formatted in a single buffer, are transferred to the cores via a dedicated test bus, which facilitates parallel core testing. A number of design techniques, including on-chip comparison, direct memory access, hierarchical core test architecture, and hierarchical test bus design, are also employed to enhance the efficiency of the test platform. A sample SOC equipped with the test platform has been designed. Experimental results on both FPGA prototyping and real chip implementation confirm that the test platform can efficiently execute all test procedures and effectively identify potential defect(s) in the target circuit(s).

  • New Architectural Design of CA-Based Codec

    Page(s): 1139 - 1144

    Cellular automata (CA) have already been established as a novel basis for bit and byte error-correcting codes (ECC). The current work identifies weaknesses and limitations of existing CA-based byte ECC and proposes an improved CA-based double-byte ECC that overcomes them. The code is well suited to VLSI design and requires significantly less hardware and power for decoding than the existing techniques employed for Reed-Solomon (RS) codes. It has also been shown that the CA-based scheme can easily be extended to correct more than two byte errors.

  • IEEE Transactions on Very Large Scale Integration (VLSI) Systems society information

    Page(s): C3
    Freely Available from IEEE
  • IEEE Transactions on Very Large Scale Integration (VLSI) Systems Information for authors

    Page(s): C4
    Freely Available from IEEE

Aims & Scope

Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels.

To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers which emphasize the novel system integration aspects of microelectronic systems, including interactions among system design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and system level qualification. Thus, the coverage of this Transactions focuses on VLSI/ULSI microelectronic system integration.

Topics of special interest include, but are not strictly limited to, the following:

  • System Specification, Design and Partitioning
  • System-level Test
  • Reliable VLSI/ULSI Systems
  • High Performance Computing and Communication Systems
  • Wafer Scale Integration and Multichip Modules (MCMs)
  • High-Speed Interconnects in Microelectronic Systems
  • VLSI/ULSI Neural Networks and Their Applications
  • Adaptive Computing Systems with FPGA components
  • Mixed Analog/Digital Systems
  • Cost, Performance Tradeoffs of VLSI/ULSI Systems
  • Adaptive Computing Using Reconfigurable Components (FPGAs)

Meet Our Editors

Editor-in-Chief
Yehea Ismail
CND Director
American University of Cairo and Zewail City of Science and Technology
New Cairo, Egypt
y.ismail@aucegypt.edu