# IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Publication Year: 2017, Page(s):C1 - C4
• ### IEEE Transactions on Very Large Scale Integration (VLSI) Systems

Publication Year: 2017, Page(s): C2
• ### Editorial

Publication Year: 2017, Page(s):1 - 20
• ### Generation of Customized Accelerators for Loop Pipelining of Binary Instruction Traces

Publication Year: 2017, Page(s):21 - 34

Many embedded applications process large amounts of data using regular computational kernels, amenable to acceleration by specialized hardware coprocessors. To reduce the significant design effort, the dedicated hardware may be automatically generated, usually starting from the application's source or binary code. This paper presents a modulo-scheduled loop accelerator capable of executing multiple...

• ### EBSCam: Background Subtraction for Ubiquitous Computing

Publication Year: 2017, Page(s):35 - 47

Background subtraction (BS) is a crucial machine vision scheme for detecting moving objects in a scene. With the advent of smart cameras, the embedded implementation of BS finds ever-increasing applications. This paper presents a new BS scheme called efficient BS for smart cameras (EBSCam). EBSCam thresholds the change in the estimated background model, which suppresses variance of the estimates, ...

• ### An Efficient Component for Designing Signed Reverse Converters for a Class of RNS Moduli Sets of Composite Form $\{2^{k}, 2^{P}-1\}$

Publication Year: 2017, Page(s):48 - 59

The application of residue number system (RNS) to digital signal processing lies in the ability to operate on signed numbers. However, the available RNS-to-binary (reverse) converters have been designed for unsigned numbers, which means that they do not produce signed outputs. Usually, some additional circuits are introduced at the output of the reverse converter to map the unsigned generated outp...
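As a rough software illustration of the conventional approach this abstract mentions (an unsigned reverse converter followed by an output mapping to signed values), the Python sketch below handles the two-modulus set $\{2^{k}, 2^{P}-1\}$. It is not the component proposed in the paper. The function names, the mixed-radix formulation, and the convention that unsigned results at or above $M/2$ represent negative numbers are all assumptions made for the example; it needs Python 3.8 or later for the modular inverse via `pow`.

```python
def reverse_convert_unsigned(x1: int, x2: int, k: int, p: int) -> int:
    """Recover X in [0, M) from residues x1 = X mod 2^k and x2 = X mod (2^p - 1).

    Uses mixed-radix conversion: X = x1 + 2^k * (((x2 - x1) * inv(2^k)) mod (2^p - 1)).
    The two moduli are always coprime because 2^p - 1 is odd.
    """
    m1 = 1 << k
    m2 = (1 << p) - 1
    inv_m1 = pow(m1, -1, m2)  # multiplicative inverse of 2^k modulo 2^p - 1
    return x1 + m1 * (((x2 - x1) * inv_m1) % m2)


def reverse_convert_signed(x1: int, x2: int, k: int, p: int) -> int:
    """Map the unsigned result onto the signed range [-M/2, M/2 - 1] (assumed convention)."""
    m = (1 << k) * ((1 << p) - 1)
    x = reverse_convert_unsigned(x1, x2, k, p)
    # Upper half of the dynamic range is interpreted as negative values.
    return x - m if x >= m // 2 else x
```

For example, with `k = 4` and `p = 3` the moduli are 16 and 7, the dynamic range is 112, and the residue pair `(15, 6)` is decoded as `-1` under the assumed convention.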

• ### VLSI Extreme Learning Machine: A Design Space Exploration

Publication Year: 2017, Page(s):60 - 74

In this paper, we describe a compact low-power high-performance hardware implementation of extreme learning machine for machine learning applications. Mismatches in current mirrors are used to perform the vector-matrix multiplication that forms the first stage of this classifier and is the most computationally intensive. Both regression and classification (on UCI data sets) are demonstrated and a ...

• ### Sign-Magnitude Encoding for Efficient VLSI Realization of Decimal Multiplication

Publication Year: 2017, Page(s):75 - 86

Decimal X × Y multiplication is a complex operation, where intermediate partial products (IPPs) are commonly selected from a set of precomputed radix-10 X multiples. Some works require only [0, 5] × X via recoding digits of Y to one-hot representation of signed digits in [-5,5]. This reduces the selection logic at the cost of one extra IPP. Two's complement signed-digit (TCSD) encodi...
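The signed-digit recoding mentioned in this abstract can be sketched at the integer level as follows. This is an illustrative model only, not the encoding or hardware proposed in the paper: the function names and the carry rule (a digit above 5 becomes the digit minus 10 with a carry into the next position) are assumptions, and the one-hot representation is not modeled. It shows why only the multiples 0×X to 5×X are needed, and why the final carry can produce one extra partial product.

```python
def recode_signed_digits(y: int) -> list[int]:
    """Recode a non-negative decimal Y into signed digits in [-5, 5], least significant first."""
    if y < 0:
        raise ValueError("y must be non-negative")
    digits = []
    carry = 0
    while y > 0:
        y, d = divmod(y, 10)
        d += carry
        if d > 5:
            d -= 10      # e.g. 8 becomes -2 with a carry of 1 into the next digit
            carry = 1
        else:
            carry = 0
        digits.append(d)
    if carry:
        digits.append(carry)  # extra leading digit, hence one extra partial product
    return digits or [0]


def multiply_with_recoding(x: int, y: int) -> int:
    """Multiply using only the precomputed multiples 0*X .. 5*X, negated when the digit is negative."""
    multiples = [m * x for m in range(6)]
    total = 0
    for position, d in enumerate(recode_signed_digits(y)):
        partial = multiples[abs(d)]
        total += (partial if d >= 0 else -partial) * 10 ** position
    return total
```

For instance, `recode_signed_digits(9)` returns `[-1, 1]`, that is, 9 = 10 - 1, so a one-digit multiplier yields two partial products.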

• ### Efficient Soft Cancelation Decoder Architectures for Polar Codes

Publication Year: 2017, Page(s):87 - 99

The flooding belief propagation (FO-BP) and the soft-cancelation (SCAN) algorithms are the two most popular soft-output BP algorithms for the decoding of capacity-achieving polar codes. The FO-BP algorithm has high throughput at the cost of performance degradation in high signal-to-noise ratio (SNR) region or with large block length. The SCAN algorithm has much better decoding performance while su...

• ### Hybrid Hardware/Software Floating-Point Implementations for Optimized Area and Throughput Tradeoffs

Publication Year: 2017, Page(s):100 - 113

Hybrid floating-point (FP) implementations improve software FP performance without incurring the area overhead of full hardware FP units. The proposed implementations are synthesized in 65-nm CMOS and integrated into small fixed-point processors with a RISC-like architecture. Unsigned, shift carry, and leading zero detection (USL) support is added to a processor to augment an existing instruction ...

• ### A 2.5-ps Bin Size and 6.7-ps Resolution FPGA Time-to-Digital Converter Based on Delay Wrapping and Averaging

Publication Year: 2017, Page(s):114 - 124

A high-resolution time-to-digital converter (TDC) implemented with field programmable gate array (FPGA) based on delay wrapping and averaging is presented. The fundamental idea is to pass a single clock through a series of delay elements to generate multiple reference clocks with different phases for input time quantization. Due to periodicity, those phases will be equivalently wrapped within one ...

• ### Subthreshold Operation of CAAC-IGZO FPGA by Overdriving of Programmable Routing Switch and Programmable Power Switch

Publication Year: 2017, Page(s):125 - 138

A field-programmable gate array (FPGA) using a crystalline oxide semiconductor of c-axis-aligned crystal indium-gallium-zinc oxide (CAAC-IGZO) has been developed, which is capable of subthreshold operation used for energy harvesting. To achieve subthreshold operation, the CAAC-IGZO FPGA has a structure designed as an extension of a boosting pass gate using a CAAC-IGZO FET and employs overdriving o...

• ### Efficient Designs of Multiported Memory on FPGA

Publication Year: 2017, Page(s):139 - 150

The utilization of block RAMs (BRAMs) is a critical performance factor for multiported memory designs on field-programmable gate arrays (FPGAs). Not only does the excessive demand on BRAMs block the usage of BRAMs from other parts of a design, but the complex routing between BRAMs and logic also limits the operating frequency. This paper first introduces a brand new perspective and a more efficien...

• ### Floorplanning Automation for Partial-Reconfigurable FPGAs via Feasible Placements Generation

Publication Year: 2017, Page(s):151 - 164

When dealing with partially reconfigurable designs on field-programmable gate array, floorplanning represents a critical step that highly impacts system's performance and reconfiguration overhead. However, current vendor design tools still require the floorplan to be manually defined by the designer. Within this paper, we provide a novel floorplanning automation framework, integrated in the Xilinx...

• ### High-Speed and Low-Latency ECC Processor Implementation Over GF($2^{m}$) on FPGA

Publication Year: 2017, Page(s):165 - 176

In this paper, a novel high-speed elliptic curve cryptography (ECC) processor implementation for point multiplication (PM) on field-programmable gate array (FPGA) is proposed. A new segmented pipelined full-precision multiplier is used to reduce the latency, and the Lopez-Dahab Montgomery PM algorithm is modified for careful scheduling to avoid data dependency resulting in a drastic reduction in t...

• ### ENFIRE: A Spatio-Temporal Fine-Grained Reconfigurable Hardware

Publication Year: 2017, Page(s):177 - 188

Field programmable gate arrays (FPGAs) are well-established as fine-grained reconfigurable computing platforms. However, FPGAs demonstrate poor scalability in advanced technology nodes due to the large negative impact of the elaborate programmable interconnects (PIs). The need for such vast PIs arises from two key factors: 1) fine-grained bit-level data manipulation in the configurable logic block...

• ### Temporarily Fine-Grained Sleep Technique for Near- and Subthreshold Parallel Architectures

Publication Year: 2017, Page(s):189 - 197

This paper presents a design approach for improving energy-efficiency and throughput of parallel architectures in near- and subthreshold voltage circuits. The focus is to suppress leakage energy dissipation of the idle portions of circuits during active modes, which can allow us to wholly transform the throughput improvement from parallel architectures into energy savings via deep voltage scaling....

• ### OptiFEX: A Framework for Exploring Area-Efficient Floating Point Expressions on FPGAs With Optimized Exponent/Mantissa Widths

Publication Year: 2017, Page(s):198 - 209
Cited by:  Papers (1)

Field-programmable gate arrays (FPGAs) could outperform microprocessors on floating point computations due to massive parallelism, freedom on the selection of exponent/mantissa width, and utilization of simplified adders and multipliers. However, optimized use of resources and accuracy of the final implemented expression are two important issues in the implementation of floating point arithmetic e...

• ### Logic-Base Interconnect Design for Near Memory Computing in the Smart Memory Cube

Publication Year: 2017, Page(s):210 - 223

Hybrid memory cube (HMC) has promised to improve bandwidth, power consumption, and density for the next-generation main memory systems. In addition, 3-D integration gives a second shot for revisiting near memory computation to fill the gap between processors and memories. In this paper, we study the required infrastructure inside the HMC to support near memory computation in a modular and flexible...

• ### A Fault Tolerance Technique for Combinational Circuits Based on Selective-Transistor Redundancy

Publication Year: 2017, Page(s):224 - 237
Cited by:  Papers (2)

With fabrication technology reaching nanolevels, systems are becoming more prone to manufacturing defects with higher susceptibility to soft errors. This paper is focused on designing combinational circuits for soft error tolerance with minimal area overhead. The idea is based on analyzing random pattern testability of faults in a circuit and protecting sensitive transistors, whose soft error dete...

• ### Scalable Approach for Power Droop Reduction During Scan-Based Logic BIST

Publication Year: 2017, Page(s):238 - 246

The generation of significant power droop (PD) during at-speed test performed by Logic Built-In Self Test (LBIST) is a serious concern for modern ICs. In fact, the PD originated during test may delay signal transitions of the circuit under test (CUT): an effect that may be erroneously recognized as delay faults, with consequent erroneous generation of test fails and increase in yield loss. In this...

• ### Soft Error Rate Reduction of Combinational Circuits Using Gate Sizing in the Presence of Process Variations

Publication Year: 2017, Page(s):247 - 260
Cited by:  Papers (1)

Soft errors in combinational logic circuits are emerging as a significant reliability concern for nanoscale VLSI designs. This paper presents a novel sensitivity-based gate sizing methodology to reduce the soft error rate (SER) of combinational circuits in the presence of process variations. The proposed method is based on modeling the statistics of SER of the circuit gates as a random variable to...

• ### Parallel High-Order Envelope-Following Method for Fast Transient Analysis of Highly Oscillatory Circuits

Publication Year: 2017, Page(s):261 - 270

In this paper, a parallel high-order envelope-following (EF) method is presented. The proposed method exploits the high-order and A-stable Obreshkov formula (ObF) to provide superior accuracy and speedup for the EF technique. Utilizing ObF provides accurate and faster analysis while keeping the same accuracy as the conventional low-order integration methods. In addition, a parallel method that is ...

• ### Lithography Defect Probability and Its Application to Physical Design Optimization

Publication Year: 2017, Page(s):271 - 285

Modern standard cells contain intercell margins at the left and right ends for better lithography. We introduce defect probability, which is the probability that a lithography defect occurs if the margins between two adjacent cells are missing. Computing the defect probability of all cell pairs is impractical due to lengthy lithography simulations and huge number of cell pair combinations. Two app...

• ### Modeling Size Limitations of Resistive Crossbar Array With Cell Selectors

Publication Year: 2017, Page(s):286 - 293

Due to recent developments in emerging memory technologies, resistive crossbar arrays have gained increasing importance. The size of the crossbar arrays is, however, limited due to challenges brought by the interconnect resistance, sneak path currents, and the physical area of the peripheral circuitry. In this paper, three figures of merit that characterize the limitations of resistive crossbar ar...

## Aims & Scope

Design and realization of microelectronic systems using VLSI/ULSI technologies requires close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded.

## Meet Our Editors

Editor-in-Chief

Krishnendu Chakrabarty
Department of Electrical Engineering
Duke University
Durham, NC 27708 USA
Krish@duke.edu