Very Large Scale Integration (VLSI) Systems, IEEE Transactions on

Issue 7 • Date July 2013

  • Table of contents

    Publication Year: 2013 , Page(s): C1 - C4
    Freely Available from IEEE
  • IEEE Transactions on Very Large Scale Integration (VLSI) Systems publication information

    Publication Year: 2013 , Page(s): C2
    Freely Available from IEEE
  • Nonvolatile Nanopipelining Logic Using Multiferroic Single-Domain Nanomagnets

    Publication Year: 2013 , Page(s): 1181 - 1188
    Cited by:  Papers (1)

    Multiferroic single-domain nanomagnetics is a promising emerging nanotechnology poised to usher in ultralow-energy nanomagnetic nonvolatile logic circuits in numerous medical applications, such as implants and prostheses, where battery longevity is paramount. This paper evaluates the fundamental mode of signal propagation over ferromagnetically and antiferromagnetically coupled wires and the interaction between the magnetic nanoparticles to perform nonvolatile logic functions, such as the majority gate, which sets its output to 1 when the majority of the inputs are 1. By taking advantage of magnetic nonvolatility, the paper demonstrates nanopipelining signal processing, data propagation performance, and the functionality of basic building blocks. Our results indicate that effective nanopipelining can be achieved with clock periods approaching 9 ns and energy dissipation of 20 aJ per nanomagnet switch with the device sizes considered.
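
The majority function mentioned in the abstract can be illustrated with a short behavioral sketch. This models only the Boolean behavior, not the nanomagnet physics; the function names and the AND/OR constructions (tying one input to a constant) are illustrative assumptions, not taken from the paper.

```python
def majority(*inputs: int) -> int:
    """Return 1 when more than half of the binary inputs are 1, else 0."""
    if len(inputs) % 2 == 0:
        raise ValueError("use an odd number of inputs so that no tie can occur")
    if any(bit not in (0, 1) for bit in inputs):
        raise ValueError("inputs must be 0 or 1")
    return 1 if sum(inputs) > len(inputs) // 2 else 0


def and_gate(a: int, b: int) -> int:
    """AND obtained by tying the third majority input to 0."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR obtained by tying the third majority input to 1."""
    return majority(a, b, 1)
```

A three-input majority gate with one input held constant behaves as a two-input AND or OR, which is one reason it is commonly treated as a basic building block.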

  • Impact of III–V and Ge Devices on Circuit Performance

    Publication Year: 2013 , Page(s): 1189 - 1200

    III-V and germanium (Ge) field-effect transistors (FETs) have been studied as candidates for post-Si CMOS. In this paper, the performance of various digital blocks and static random access memory (SRAM) with different combinations of Si, III-V, and Ge devices is studied. SPICE-compatible III-V n-channel FET (nFET) and Ge p-channel FET (pFET) models are developed for the analysis. The delay and energy of the different combinations are estimated and compared. In typical digital design, the driving capability of the nFET and pFET should be matched for optimum noise margin and performance. The combination of III-V nFET with low input capacitance and Ge pFET achieves the best energy-delay performance for many digital logic circuits. The read margin of SRAM is maximized with a Si pass-gate, and an inverter of III-V nFET and Ge pFET.

  • Design of Testable Reversible Sequential Circuits

    Publication Year: 2013 , Page(s): 1201 - 1209
    Cited by:  Papers (4)

    In this paper, we propose the design of two-vector testable sequential circuits based on conservative logic gates. The proposed sequential circuits outperform sequential circuits implemented with classical gates in terms of testability. Any sequential circuit based on conservative logic gates can be tested for classical unidirectional stuck-at faults using only two test vectors: all 1's and all 0's. The designs of two-vector testable latches, master-slave flip-flops, and double edge triggered (DET) flip-flops are presented. The importance of the proposed work lies in the fact that it provides the design of reversible sequential circuits completely testable for any stuck-at fault by only two test vectors, thereby eliminating the need for any type of scan-path access to internal memory cells. The reversible design of the DET flip-flop is proposed for the first time in the literature. We also show the application of the proposed approach toward 100% fault coverage for single missing/additional cell defects in the quantum-dot cellular automata (QCA) layout of the Fredkin gate. We also present a new conservative logic gate, called the multiplexer conservative QCA gate (MX-cqca), that is not reversible in nature but, like the Fredkin gate, can work as a 2:1 multiplexer. The proposed MX-cqca gate surpasses the Fredkin gate in terms of complexity (the number of majority voters), speed, and area.
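
The properties of the Fredkin gate that the abstract relies on (it is conservative, it is reversible, and it can act as a 2:1 multiplexer) can be checked with a small behavioral model. This is a minimal sketch of the standard controlled-swap definition; the helper names are hypothetical, and the MX-cqca gate is not modeled because the abstract does not define it.

```python
from itertools import product
from typing import Tuple


def fredkin(c: int, a: int, b: int) -> Tuple[int, int, int]:
    """Controlled swap: c passes through; a and b are swapped when c == 1."""
    return (c, b, a) if c == 1 else (c, a, b)


def mux2(select: int, in0: int, in1: int) -> int:
    """2:1 multiplexer read from the second Fredkin output."""
    return fredkin(select, in0, in1)[1]


def is_conservative(gate) -> bool:
    """True if every input pattern keeps its number of 1s at the output."""
    return all(sum(gate(*bits)) == sum(bits) for bits in product((0, 1), repeat=3))


def is_self_inverse(gate) -> bool:
    """True if applying the gate twice restores every input pattern."""
    return all(gate(*gate(*bits)) == bits for bits in product((0, 1), repeat=3))
```

Because a conservative gate preserves the number of 1s, an all-0's input yields an all-0's output and an all-1's input yields an all-1's output, which gives some intuition for why those two vectors are natural test candidates.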

  • Test Path Selection for Capturing Delay Failures Under Statistical Timing Model

    Publication Year: 2013 , Page(s): 1210 - 1219
    Cited by:  Papers (2)

    This paper proposes a test path selection approach for capturing delay failures caused by accumulated distributed small delay variations. First, a universal path candidate set U, which contains testable long paths, is generated. Second, given a path number threshold, path selection from U is performed with the objective of maximizing the capability to capture potential delay failures. The path selection problem is converted to a minimal space intersection problem, and a greedy path selection heuristic is proposed, the key point of which is to calculate the probability that all the paths in a specified path set meet the delay constraint. Statistical timing analysis techniques and heuristics are used in the calculation. Experimental results show that the proposed approach is time efficient and achieves a higher probability of capturing delay failures than traditional path selection approaches.

  • Automatic Test Program Generation Using Executing-Trace-Based Constraint Extraction for Embedded Processors

    Publication Year: 2013 , Page(s): 1220 - 1233
    Cited by:  Papers (3)

    Software-based self-testing (SBST) has been a promising method for processor testing, but the complexity of state-of-the-art processors still poses great challenges for SBST. This paper utilizes the execution trace collected while executing training programs on the processor under test to simplify mappings and functional constraint extraction for ports of inner components, which facilitate structural test generation with constraints at gate level, and automatic test instruction generation (ATIG) even for hidden control logic (HCL). In addition, for sequential HCL, we present a test routine generation technique on the basis of an extended finite state machine, so that structural patterns for combinational subcircuits in the sequential HCL can be mapped into the test routines to form a test program. Experimental results demonstrate that the proposed ATIG method can achieve good structural fault coverage with compact test programs on modern processors.

  • Piecewise Linear Modulation Technique for Spread Spectrum Clock Generation

    Publication Year: 2013 , Page(s): 1234 - 1245

    We propose a novel modulation profile for a spread spectrum clock generator (SSCG). The proposed piecewise linear (PWL) modulation profile significantly reduces electromagnetic interference with a simple implementation. Two SSCGs with two- and three-slope-PWL modulation profiles are used. Both SSCGs consist of the proposed spread spectrum control profile generator and a phase-locked loop that includes a high-resolution fractional divider to reduce quantization noise from a delta-sigma modulator. The SSCG with the two-slope-PWL modulation profile was fabricated in a 0.18 μm 1P4M CMOS technology. The measured peak power reduction level of the two-slope-PWL modulation profile is 14.2 dB with 5000 ppm down spreading at 1.5 GHz. The SSCG occupies an active area of 0.49 mm² and consumes 40 mW of power at 1.5 GHz. The SSCG with the three-slope-PWL modulation profile was fabricated in a 0.13 μm 1P6M CMOS technology. The measured peak power reduction level of the three-slope-PWL modulation profile is 10.3 and 10.52 dB with 5000 ppm down spreading at 162 and 270 MHz, respectively. The SSCG occupies an active area of 0.096 mm² and dissipates 1 mW of power at 270 MHz.
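
To make the "5000 ppm down spreading" figures concrete, the following sketch converts a spread amount in ppm into the frequency band swept by the clock. It assumes the usual meaning of down-spreading (the clock is modulated only below its nominal frequency); the function name is hypothetical, and the shape of the PWL profile itself is not modeled because the abstract does not specify it.

```python
from typing import Tuple


def down_spread_range(f_nominal_hz: float, spread_ppm: float) -> Tuple[float, float]:
    """Return (f_min, f_max) in Hz for a clock down-spread by spread_ppm."""
    deviation_hz = f_nominal_hz * spread_ppm / 1e6  # ppm = parts per million
    return f_nominal_hz - deviation_hz, f_nominal_hz


# The three operating points quoted in the abstract, all at 5000 ppm.
for f_nominal in (1.5e9, 270e6, 162e6):
    f_min, f_max = down_spread_range(f_nominal, 5000)
    print(f"nominal {f_max / 1e6:.2f} MHz: swept band {f_min / 1e6:.3f} to {f_max / 1e6:.2f} MHz")
```

At 1.5 GHz, 5000 ppm corresponds to a 0.5% deviation, that is, 7.5 MHz below nominal.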

  • Efficient Vector Graphics Rasterization Accelerator Using Optimized Scan-Line Buffer

    Publication Year: 2013 , Page(s): 1246 - 1259
    Cited by:  Papers (1)

    This paper presents a small and fast VLSI architecture of a vector graphics rasterization accelerator. To decide the filling regions of a graphics object, a large on-chip scan-line buffer (SB) is very often used and frequently accessed to derive the pixel's winding count. First, this paper proposes a special 2-bit coding scheme for buffer entries, along with active-edge-table rescan, to record the intersection information of scan lines and the object paths. Second, for AA rendering applications, a coverage buffer is proposed to avoid the duplication of SBs. Compared with the conventional approach, the required buffer size can be reduced by up to 89%. Besides buffer reduction, this paper also proposes a hierarchical SB architecture in which the upper-level buffer indicates which scan-line sections have intersected with objects in order to skip the access to successive buffer entries. The same technique, along with the differential coverage transformation, can also be applied to the coverage buffer. Our experimental results show that more than 87% of memory accesses can be eliminated, which saves 66.4% of clock cycles in a practical hardware implementation. The gate count of the proposed rasterization accelerator is only about 32,232, and the accelerator can run at 250 MHz under UMC 90-nm technology for HDTV applications.
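
The role of a scan-line buffer in deriving each pixel's winding count can be sketched in software. The example below is a generic nonzero-winding fill for a single scan line, with one signed buffer entry per pixel column. It is an illustration of the conventional idea only: it does not implement the paper's 2-bit coding, active-edge-table rescan, or hierarchical buffer, and the function name and data layout are assumptions.

```python
from typing import List, Tuple


def fill_scanline(crossings: List[Tuple[int, int]], width: int) -> List[bool]:
    """Nonzero-rule coverage for one scan line.

    crossings holds (x, direction) pairs: direction is +1 or -1 depending on
    which way a path edge crosses the scan line at pixel column x.
    """
    buffer = [0] * width                # one signed entry per pixel column
    for x, direction in crossings:
        x = max(x, 0)                   # crossings left of the line still count
        if x < width:
            buffer[x] += direction      # record the crossing where it happens
    filled = []
    winding = 0
    for delta in buffer:                # left-to-right sweep accumulates the count
        winding += delta
        filled.append(winding != 0)     # nonzero winding count means "inside"
    return filled
```

For example, `fill_scanline([(2, 1), (5, -1)], 8)` marks columns 2 through 4 as filled. Every pixel on every scan line reads such a buffer, which is why buffer size and access count dominate the cost that the paper targets.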

  • Write Current Self-Configuration Scheme for MRAM Yield Improvement

    Publication Year: 2013 , Page(s): 1260 - 1270

    Magnetic random access memory (MRAM) is an emerging nonvolatile memory, which is widely studied for its high speed, high density, small cell size, and almost unlimited endurance. However, for deep-submicrometer process technologies, significant variation in the MRAM cells' operating condition results in write failures in cells and reduces the production yield. Memory designers have to characterize failed MRAM chips to find a suitable current level for reconfiguring their write current, which is time consuming. In this paper, we propose an efficient operating-current search method and the corresponding built-in circuit for toggle MRAM, which can rapidly find the minimal operating current. With the built-in search circuit, an MRAM chip can dynamically configure its write current through a few tester channels. The resulting chip works correctly and consumes less power. Production yield, thus, can be increased while the test cost is greatly reduced. We also present a generator of the circuit, which determines the circuit parameters according to the memory specifications and user requirements, and automatically generates the corresponding modules.

  • Task Allocation on Nonvolatile-Memory-Based Hybrid Main Memory

    Publication Year: 2013 , Page(s): 1271 - 1284
    Cited by:  Papers (2)

    In this paper, we consider the task allocation problem on a hybrid main memory composed of nonvolatile memory (NVM) and dynamic random access memory (DRAM). Compared to the conventional memory technology DRAM, the emerging NVM has excellent energy performance since it consumes orders of magnitude less leakage power. On the other hand, most types of NVMs come with the disadvantages of much shorter write endurance and longer write latency as opposed to DRAM. By leveraging the energy efficiency of NVM and long write endurance of DRAM, this paper explores task allocation techniques on hybrid memory for multiple objectives such as minimizing the energy consumption, extending the lifetime, and minimizing the memory size. The contributions of this paper are twofold. First, we design the integer linear programming (ILP) formulations that can solve different objectives optimally. Then, we propose two sets of heuristic algorithms including three polynomial time offline heuristics and three online heuristics. Experiments show that compared to the optimal solutions generated by the ILP formulations, the offline heuristics can produce near-optimal results.

  • BilRC: An Execution Triggered Coarse Grained Reconfigurable Architecture

    Publication Year: 2013 , Page(s): 1285 - 1298

    We present Bilkent reconfigurable computer (BilRC), a new coarse-grained reconfigurable architecture (CGRA) employing an execution-triggering mechanism. A control data flow graph language is presented for mapping the applications to BilRC. The flexibility of the architecture and the computation model are validated by mapping several real-world applications. The same language is also used to map applications to a 90-nm field-programmable gate array (FPGA), giving exactly the same cycle count performance. It is found that BilRC reduces the configuration size by a factor of about 33. It is synthesized with 90-nm technology, and typical applications mapped on BilRC run about 2.5 times faster than those on FPGA. It is found that the cycle counts of the applications for a commercial very long instruction word digital signal processor are 1.9 to 15 times higher than those of BilRC. It is also found that BilRC can run the inverse discrete cosine transform algorithm almost 3 times faster than the closest CGRA in terms of cycle count. Although the area required for BilRC processing elements is larger than that of existing CGRAs, this is mainly due to the segmented interconnect architecture of BilRC, which is crucial for supporting a broad range of applications.

  • Fault-Tolerant Embedded-Memory Strategy for Baseband Signal Processing Systems

    Publication Year: 2013 , Page(s): 1299 - 1307

    The growing density of integration and the increasing percentage of system-on-chip area occupied by embedded memories have led to an increase in the expected number of memory faults. The soft memory repair strategy proposed in this paper employs existing forward error correction at the system level and mitigates the impact of memory faults through permutation of high-sensitivity regions. The effectiveness of the proposed repair technique is evaluated on a multi-megabit de-interleaver static random access memory of an ISDB-T digital baseband orthogonal frequency-division multiplexing receiver in 65-nm CMOS. The proposed technique introduces a single multiplexer delay overhead and a configurable area overhead of ⌈M/i⌉ bits, where M is the number of memory rows and i is an integer from 1 to M, inclusive. The repair strategy achieves a measured 0.15 dB gain improvement at 2×10⁻⁴ quasi-error-free bit error rate in the presence of stuck-at memory faults for an additive white Gaussian noise channel.

  • Collaborative Multiobjective Global Routing

    Publication Year: 2013 , Page(s): 1308 - 1321
    Cited by:  Papers (1)

    This paper presents a collaborative procedure for multiobjective global routing. Our procedure takes multiple global routing solutions, which are generated independently (e.g., by one router that runs in different modes concurrently or by different routers running in parallel), as input. It then performs multiobjective optimization based on Pareto algebra and quickly generates multiple global routing solutions with a tradeoff between the considered objectives. The user can control the number of generated solutions and the degree of exploring the tradeoff between them by constraining the maximum allowable degradation in each objective. This paper then considers the following three multiobjective case studies: 1) minimization of interconnect power and wirelength; 2) minimization of routing congestion and wirelength; and 3) minimization of wirelength with respect to the (finite-capacity) routing resources. The maximum allowable degradation in wirelength is specified in all cases. Our multiobjective procedure runs in only a few minutes for each of the International Symposium on Physical Design 2008 benchmarks, even the unroutable ones, which imposes a tolerable overhead in the design flow. In our simulations, we demonstrate the effectiveness of our procedure using five modern academic global routers.
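
The notion of keeping only solutions that represent a genuine tradeoff, subject to a maximum allowable degradation in one objective, can be sketched with a plain Pareto-dominance filter. This is a simplified stand-in, not the paper's Pareto-algebra procedure; the function names, the tuple layout (wirelength, congestion), and the numbers in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple

Solution = Tuple[float, ...]  # one cost per objective; lower is better


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(solutions: Sequence[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


def within_degradation(
    solutions: Sequence[Solution], objective: int, max_fraction: float
) -> List[Solution]:
    """Keep solutions whose chosen objective is within max_fraction of the best value."""
    best = min(s[objective] for s in solutions)
    return [s for s in solutions if s[objective] <= best * (1 + max_fraction)]
```

With candidates `[(100, 5), (110, 3), (120, 3), (105, 6)]`, the front is `[(100, 5), (110, 3)]`; allowing at most 15% wirelength degradation (objective 0) keeps both, while a tighter bound shrinks the set, mirroring how the user-controlled degradation limits the explored tradeoff.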

  • Energy-Efficient Digital Signal Processing via Voltage-Overscaling-Based Residue Number System

    Publication Year: 2013 , Page(s): 1322 - 1332
    Cited by:  Papers (4)

    In this paper, we apply the voltage overscaling (VOS) technique to the residue-number-system (RNS)-based digital signal processing system for achieving high energy efficiency. To mitigate the soft errors caused by VOS, we propose a new method, called joint RNS-RPR (JRR), which is the combination of RNS and the reduced precision redundancy (RPR) technique. The JRR technology inherits the properties of RNS, including shorter critical path, low complexity, and low power. Moreover, JRR can achieve higher power reduction than RNS for VOS applications. Since the soft errors caused by VOS lead to significant performance degradation of RNS, we use the information from RNS and RPR to achieve a high recovering probability of the soft errors with low hardware complexity. From the case study of finite impulse response (FIR) filter design based on the 0.25-μm 2.5-V CMOS technology, we find that JRR can save 62% more energy compared to the traditional FIR with a less than 2-dB signal-to-noise ratio performance loss. We also find that JRR has lower complexity and better performance than the traditional soft error mitigation methods.
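
The shorter critical path attributed to RNS comes from splitting one wide operation into independent narrow operations, one per modulus. The following minimal sketch shows RNS addition and multiplication with reconstruction by the Chinese remainder theorem. The moduli (7, 8, 9) are a hypothetical choice for illustration; the paper's moduli set, the RPR path, and the error recovery are not modeled.

```python
from math import prod
from typing import Tuple

MODULI = (7, 8, 9)  # hypothetical pairwise-coprime moduli; dynamic range 7*8*9 = 504


def to_rns(x: int, moduli: Tuple[int, ...] = MODULI) -> Tuple[int, ...]:
    """Represent x by its residues modulo each modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli: Tuple[int, ...] = MODULI) -> Tuple[int, ...]:
    """Add channel by channel; no carry crosses between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli: Tuple[int, ...] = MODULI) -> Tuple[int, ...]:
    """Multiply channel by channel; each channel stays narrow."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli: Tuple[int, ...] = MODULI) -> int:
    """Reconstruct the integer with the Chinese remainder theorem."""
    total_range = prod(moduli)
    value = 0
    for r, m in zip(residues, moduli):
        partial = total_range // m
        value += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return value % total_range
```

Results are only meaningful while they stay below the dynamic range (504 here), for example `from_rns(rns_mul(to_rns(25), to_rns(17)))` reconstructs 425.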

  • IEEE 1500 Compatible Multilevel Maximal Concurrent Interconnect Test

    Publication Year: 2013 , Page(s): 1333 - 1337

    On-chip interconnect structures have become much more complicated and dominate system performance in multicore system-on-chips. Oscillation ring (OR) test is an efficient test method for most types of faults in the interconnect structures, and previous studies show that both 100% fault coverage and the optimum diagnosis resolution for various fault models are achievable. The cost of OR test is decided by the number of test sessions required to form all the rings. The previous ring generation algorithm tries to generate long rings that usually cannot be put into the same test session, and thus the number of test sessions is not necessarily smaller. In this brief, we study techniques to generate rings that can be tested concurrently, so that the overall test time can be reduced significantly.

  • Block-Circulant RS-LDPC Code: Code Construction and Efficient Decoder Design

    Publication Year: 2013 , Page(s): 1337 - 1341

    This brief presents a method for constructing block-circulant (BC) Reed-Solomon-based low-density parity-check (RS-LDPC) codes and an efficient decoder design. The proposed construction method results in a BC form of a parity-check matrix from a random parity-check matrix for RS-LDPC codes. A decoder architecture and switch network for BC-RS-LDPC code are then developed based on the new BC parity-check matrix. Thus, an efficient decoder architecture dedicated to a promising class of high-performance BC-RS-LDPC codes is presented for the first time. Moreover, a (2048, 1723) BC-RS-LDPC decoder architecture is designed to demonstrate the efficiency of the presented techniques. Synthesis results show that the proposed decoder requires 1.3-M gates and can operate at 450 MHz to achieve a data throughput of 41 Gb/s with eight iterations.

  • Enhanced Secure Architecture for Joint Action Test Group Systems

    Publication Year: 2013 , Page(s): 1342 - 1345

    The implementation of debugging tools through the Joint Test Action Group (JTAG) interface has led to increased exposure of intellectual property through that interface. In this brief, the first hardware implementation of a flexible multilevel access security system for the JTAG interface is detailed. The proposed method is user-privilege aware, which allows finer granularity in controlling user access to individual scan chains. The loading of individual JTAG instructions into scan chains can be blocked based on the credentials of the user. The hardware modifications proposed are compliant with IEEE 1149.1, have minimal timing overhead, and require no modifications to the core logic of the integrated circuit.

  • Throughput/Resource-Efficient Reconfigurable Processor for Multimedia Applications

    Publication Year: 2013 , Page(s): 1346 - 1350

    This brief presents the implementation and evaluation of an 8-bit adaptable processor core to be part of the power-throughput-area efficient multimedia oriented reconfigurable architecture reconfigurable array. The design of the processor core was custom implemented in IBM's 90 nm CMOS technology and occupies 0.115 mm² silicon area with approximately 70% area utilized by core circuits. The processor shows a peak throughput performance of 75 MOPS/mW. Benchmarking results show estimated throughputs of 9.5, 21.36, 39.78, 170.88, and 4.54 MSamples/s for variants of 2-D discrete cosine transform (DCT), 4 × 4 H.264 integer transform, and 2-D discrete wavelet transform, respectively. Our analysis shows that the proposed design provides approximately 4-8 times higher throughput for 2-D DCT when compared against popular architectures.

  • Error Rate-Based Wear-Leveling for NAND Flash Memory at Highly Scaled Technology Nodes

    Publication Year: 2013 , Page(s): 1350 - 1354

    This brief presents a NAND Flash memory wear-leveling algorithm that explicitly uses memory raw bit error rate (BER) as the optimization target. Although NAND Flash memory wear-leveling has been well studied, all the existing algorithms aim to equalize the number of programming/erase cycles among all the memory blocks. Unfortunately, such a conventional design practice becomes increasingly suboptimal as inter-block variation becomes increasingly significant with technology scaling. This brief presents a dynamic BER-based greedy wear-leveling algorithm that uses BER statistics as the measurement of memory block wear-out pace, and guides dynamic memory block data swapping to maximize the wear-leveling efficiency. Simulations have been carried out to quantitatively demonstrate its advantages over existing wear-leveling algorithms.
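
The difference between equalizing program/erase (P/E) cycles and using raw BER as the wear indicator can be shown with a toy block-selection rule. This is only a sketch of the contrast the abstract draws, not the brief's algorithm: the `Block` fields, the selection functions, and the numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    name: str
    pe_cycles: int   # program/erase cycles endured so far
    raw_ber: float   # measured raw bit error rate (hypothetical statistic)


def pick_by_pe_count(free_blocks: List[Block]) -> Block:
    """Conventional rule: write to the block with the fewest P/E cycles."""
    return min(free_blocks, key=lambda b: b.pe_cycles)


def pick_by_ber(free_blocks: List[Block]) -> Block:
    """BER-based greedy rule: write to the block that currently looks healthiest."""
    return min(free_blocks, key=lambda b: b.raw_ber)
```

If block A has 100 P/E cycles but a raw BER of 1e-4, while block B has 200 cycles and a raw BER of 5e-5, the cycle-count rule picks A and the BER rule picks B. With large inter-block variation, cycle count stops tracking actual wear, which is the case the brief addresses.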

  • Reduced Power Transition Fault Test Sets for Circuits With Independent Scan Chain Modes

    Publication Year: 2013 , Page(s): 1354 - 1359

    This brief considers circuits with multiple scan chains where each scan chain can operate in shift, functional, or hold mode independently of the other scan chains. For circuits where the hardware overhead of controlling the scan chains independently is acceptable, this brief describes a procedure whose goal is to generate a test set that achieves the same transition fault coverage as a test set that consists of both broadside and skewed-load tests, but where the shift mode is used as few times as possible during the first patterns of the tests. This allows the circuit to operate closer to its functional operation conditions, and reduces the power dissipation during the second patterns of the tests, which are applied at-speed.

  • Transition Fault Simulation Considering Broadside Tests as Partially-Functional Broadside Tests

    Publication Year: 2013 , Page(s): 1359 - 1363

    The scan-in states of functional broadside tests are reachable states, which are states that the circuit can enter during functional operation. This is used for ensuring functional operation conditions during the functional clock cycles of the tests. For a partially-functional broadside test, the scan-in state has a known Hamming distance to a reachable state. This ensures measurable deviations from functional operation conditions during the functional clock cycles of the test. It is important for addressing overtesting as well as excessive power dissipation. This brief develops a fault-simulation procedure for transition faults under arbitrary (functional and nonfunctional) broadside tests that considers the tests as partially-functional broadside tests. The procedure can be used for evaluating the proximity to functional operation conditions of arbitrary broadside test sets. For illustration, the procedure is used for comparing a low-power test set with an arbitrary broadside test set.
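
The "Hamming distance to a reachable state" used to characterize a partially-functional broadside test can be computed as below. This sketch assumes states are given as equal-length bit strings and that a set of reachable states is already known; how that set is obtained, and the fault simulation itself, are outside its scope. The function names are hypothetical.

```python
from typing import Iterable


def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length state vectors differ."""
    if len(a) != len(b):
        raise ValueError("state vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_to_reachable(scan_in_state: str, reachable_states: Iterable[str]) -> int:
    """Smallest Hamming distance from a scan-in state to any known reachable state."""
    return min(hamming_distance(scan_in_state, s) for s in reachable_states)
```

A distance of 0 means the scan-in state is itself reachable (a functional broadside test); larger distances indicate a larger deviation from functional operation conditions.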

  • Fault Demotion Using Reconfigurable Slack (FaDReS)

    Publication Year: 2013 , Page(s): 1364 - 1368

    We propose an active dynamic redundancy-based fault-handling approach exploiting the partial dynamic reconfiguration capability of static random-access memory-based field-programmable gate arrays. Fault detection is accomplished in a uniplex hardware arrangement while an autonomous fault isolation scheme is employed, which neither requires test vectors nor suspends the computational throughput. The deterministic flow of the fault-handling scheme achieves an improved recovery in a bounded number of reconfigurations. This approach extends existing signal processing properties to accommodate fault handling, and is validated by implementing an H.263 video encoder discrete cosine transform (DCT) block. The peak signal-to-noise ratio measure of the video sequences indicates fault tolerance in the DCT block with only limited quality degradation, during the isolation and recovery phases spanning a few frames.

  • ADDLL for Clock-Deskew Buffer in High-Performance SoCs

    Publication Year: 2013 , Page(s): 1368 - 1373
    Cited by:  Papers (1)

    In this brief, we propose an all-digital delay locked loop (ADDLL) for a clock-deskew buffer. A low static phase offset at a high operating frequency is achieved by adopting a high-resolution window phase detector (PD) and a tristate-inverter-based ladder type coarse delay line (CDL). The proposed PD generates a high-resolution detection window that is adaptive to the process-voltage-temperature variation and reduces the static phase offset to nearly half of the fine delay line (FDL) resolution using a dual-output FDL. A proposed CDL is adopted in order to attain a small coarse delay step using tristate-inverters. The proposed ADDLL is designed using 0.13-μm process technology with a supply voltage of 1.2 V. The operating frequency range is 700 MHz to 2.0 GHz. The maximum static phase offset is less than 14.75 ps at all conditions and the power consumption is 4.0 mW at 2.0 GHz.

  • IEEE Transactions on Very Large Scale Integration (VLSI) Systems information for authors

    Publication Year: 2013 , Page(s): 1374
    Freely Available from IEEE

Aims & Scope

Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels.

To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers which emphasize the novel system integration aspects of microelectronic systems, including interactions among system design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and system level qualification. Thus, the coverage of this Transactions focuses on VLSI/ULSI microelectronic system integration.

Topics of special interest include, but are not strictly limited to, the following:

  • System Specification, Design and Partitioning
  • System-level Test
  • Reliable VLSI/ULSI Systems
  • High Performance Computing and Communication Systems
  • Wafer Scale Integration and Multichip Modules (MCMs)
  • High-Speed Interconnects in Microelectronic Systems
  • VLSI/ULSI Neural Networks and Their Applications
  • Adaptive Computing Systems with FPGA components
  • Mixed Analog/Digital Systems
  • Cost, Performance Tradeoffs of VLSI/ULSI Systems
  • Adaptive Computing Using Reconfigurable Components (FPGAs)

Meet Our Editors

Editor-in-Chief

Krishnendu Chakrabarty
Department of Electrical Engineering
Duke University
Durham, NC 27708 USA
Krish@duke.edu