• ### Design of Power and Area Efficient Approximate Multipliers

Publication Year: 2017, Page(s):1782 - 1786
Approximate computing can decrease the design complexity with an increase in performance and power efficiency for error resilient applications. This brief deals with a new design approach for approximation of multipliers. The partial products of the multiplier are altered to introduce varying probability terms. Logic complexity of approximation is varied for the accumulation of altered partial pro...

• ### VLSI Implementation of Deep Neural Network Using Integral Stochastic Computing

Publication Year: 2017, Page(s):2688 - 2699
The hardware implementation of deep neural networks (DNNs) has recently received tremendous attention: many applications in fact require high-speed operations that suit a hardware implementation. However, numerous elements and complex interconnections are usually required, leading to a large area occupation and copious power consumption. Stochastic computing (SC) has shown promising results for lo...
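As background for the stochastic computing (SC) mentioned above, the following is a minimal sketch of conventional unipolar SC multiplication, in which a value in [0, 1] is encoded as the fraction of ones in a random bitstream and a bitwise AND of two independent streams approximates their product. It does not reproduce the integral SC scheme of the paper; the function names, stream length, and seed are assumptions chosen for illustration.

```python
import random


def to_bitstream(p, length, rng):
    """Encode a probability p in [0, 1] as a random bitstream (unipolar format)."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def sc_multiply(p_a, p_b, length=4096, seed=0):
    """Approximate p_a * p_b by ANDing two independent stochastic bitstreams."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    stream_a = to_bitstream(p_a, length, rng)
    stream_b = to_bitstream(p_b, length, rng)
    # A single AND gate per bit pair performs the multiplication.
    ones = sum(a & b for a, b in zip(stream_a, stream_b))
    return ones / length
```

Longer streams reduce the random error of the estimate at the cost of latency, which is the basic trade-off of this encoding.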

• ### An Energy-Efficient Architecture for Binary Weight Convolutional Neural Networks

Publication Year: 2018, Page(s):280 - 293
Binary weight convolutional neural networks (BCNNs) can achieve near state-of-the-art classification accuracy and have far less computation complexity compared with traditional CNNs using high-precision weights. Due to their binary weights, BCNNs are well suited for vision-based Internet-of-Things systems being sensitive to power consumption. BCNNs make it possible to achieve very high throughput ...
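To illustrate why binary weights lower the computation cost, the sketch below computes a dot product in which every weight is +1 or -1, so each multiplication collapses into an addition or a subtraction. The bit encoding (1 for +1, 0 for -1) and the function name are assumptions made for this example, not details taken from the paper.

```python
def binary_weight_dot(activations, weight_bits):
    """Dot product with binary weights: bit 1 means +1, bit 0 means -1 (assumed encoding)."""
    if len(activations) != len(weight_bits):
        raise ValueError("activations and weight_bits must have the same length")
    acc = 0
    for x, w in zip(activations, weight_bits):
        # No multiplier is needed: the weight bit only selects add or subtract.
        acc = acc + x if w else acc - x
    return acc
```

For example, `binary_weight_dot([3, -2, 5], [1, 0, 1])` accumulates 3 + 2 + 5.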

• ### A Fully Integrated Discrete-Time Superheterodyne Receiver

Publication Year: 2017, Page(s):635 - 647
The zero/low intermediate frequency (IF) receiver (RX) architecture has enabled full CMOS integration. As the technology scales and wireless standards become ever more challenging, the issues related to time-varying dc offsets, the second-order nonlinearity, and flicker noise become more critical. In this paper, we propose a new architecture of a superheterodyne RX that attempts to avoid such issu...

• ### Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns

Publication Year: 2017, Page(s):2220 - 2233
Deep convolutional neural networks (DCNNs) have been successfully used in many computer vision tasks. Previous works on DCNN acceleration usually use a fixed computation pattern for diverse DCNN models, leading to imbalance between power efficiency and performance. We solve this problem by designing a DCNN acceleration architecture called deep neural architecture (DNA), with reconfigurable computa...

• ### Analysis and Design of a Low-Voltage Low-Power Double-Tail Comparator

Publication Year: 2014, Page(s):343 - 352
The need for ultra low-power, area efficient, and high speed analog-to-digital converters is pushing toward the use of dynamic regenerative comparators to maximize speed and power efficiency. In this paper, an analysis on the delay of the dynamic comparators will be presented and analytical expressions are derived. From the analytical expressions, designers can obtain an intuition about the main c...

• ### VLSI Design of an ML-Based Power-Efficient Motion Estimation Controller for Intelligent Mobile Systems

Publication Year: 2018, Page(s):262 - 271
In this paper, a machine learning (ML)-based power-efficient motion estimation (ME) controller algorithm and VLSI architecture incorporating coding bandwidth and rate-distortion (R-D) cost using convex optimization are proposed to effectuate a smart and bandwidth-efficient ME design for intelligent mobile systems. To be smart and adapt to time-altering coding bandwidth using intelligent power-mana...

• ### An FPGA-Based Phase Measurement System

Publication Year: 2018, Page(s):133 - 142
Phase measurement is required in electronic applications where a synchronous relationship between the signals needs to be preserved. Traditional electronic systems used for time measurement are designed using a classical mixed-signal approach. With the advent of reconfigurable hardware such as field-programmable gate arrays (FPGAs), it is more advantageous for designers to opt for all-digital arch...

• ### Passive Noise Shaping in SAR ADC With Improved Efficiency

Publication Year: 2018, Page(s):416 - 420
This brief reports a passive noise-shaping (PNS) scheme for successive approximation register (SAR) analog-to-digital converter (ADC) based on the two-step integration with passive gain and comparator gain techniques. The analysis shows that the proposed method achieves a better noise-shaping (NS) efficiency than prior arts, which enhances the noise attenuation by 14 dB. A design example is provid...

• ### Performance Analysis of a Low-Power High-Speed Hybrid 1-bit Full Adder Circuit

Publication Year: 2015, Page(s):2001 - 2008
In this paper, a hybrid 1-bit full adder design employing both complementary metal-oxide-semiconductor (CMOS) logic and transmission gate logic is reported. The design was first implemented for 1 bit and then extended to 32 bits. The circuit was implemented using Cadence Virtuoso tools in 180- and 90-nm technology. Performance parameters such as power, delay, and layout area were compared with...

• ### Accelerating Recurrent Neural Networks: A Memory-Efficient Approach

Publication Year: 2017, Page(s):2763 - 2775
Recurrent neural networks (RNNs) have achieved state-of-the-art performance on various sequence learning tasks due to their powerful sequence modeling capability. However, RNNs usually require a large number of parameters and high computational complexity. Hence, it is quite challenging to implement complex RNNs on embedded devices with stringent memory and latency requirements. In this paper, ...

• ### Design of Low-Voltage High-Speed CML D-Latches in Nanometer CMOS Technologies

Publication Year: 2017, Page(s):3509 - 3520
This paper presents the design of a novel low-voltage high-speed D-latch circuit suitable for nanometer CMOS technologies. The proposed topology is compared against the low-voltage triple-tail D-latch and its advantages are demonstrated both by simulations, under different performance/power consumption tradeoffs with a 40-nm CMOS technology, and theoretically, thanks to a simple model of the propa...

• ### An Ultralow Power Subthreshold CMOS Voltage Reference Without Requiring Resistors or BJTs

Publication Year: 2018, Page(s):201 - 205
This brief presents a novel ultralow power CMOS voltage reference (CVR) with only 4.6-nW power consumption. In the proposed CVR circuit, the proportional-to-absolute-temperature voltage is generated by feeding the leakage current of a zero-Vgs nMOS transistor to two diode-connected nMOS transistors in series, both of which are in subthreshold region; while the complementary-to-absolute-temperature...

• ### A 32-nm Subthreshold 7T SRAM Bit Cell With Read Assist

Publication Year: 2017, Page(s):3473 - 3483
The implementation of the six-transistor (6T) static random access memory cell in deep submicrometer region has become difficult due to the compromise between area, power, and performance, with local and global variations only exacerbating the problem further. To impede the read-write conflict of the 6T cell, the seven-transistor (7T) cell with a noise-margin-free read operation has previously bee...

• ### Design and FPGA Implementation of a Reconfigurable Digital Down Converter for Wideband Applications

Publication Year: 2017, Page(s):3548 - 3552
This brief presents a field-programmable gate array-based implementation of a reconfigurable digital down converter (DDC) that can process input bandwidth of up to 3.6 GHz and provide a flexible down-converted output. The proposed DDC consists of a mixer and a resampling filter. The resampling filter can work at much higher clock rate. The reason is that all the single-cycle recursive loops in the...

• ### AES Datapath Optimization Strategies for Low-Power Low-Energy Multisecurity-Level Internet-of-Things Applications

Publication Year: 2017, Page(s):3281 - 3290
Connected devices are getting attention because of the lack of security mechanisms in current Internet-of-Things (IoT) products. The security can be enhanced by using standardized and proven-secure block ciphers such as the advanced encryption standard (AES) for data encryption and authentication. However, these security functions take a large amount of processing power and power/energy consumption. In this...

• ### An External Capacitor-Less Ultralow-Dropout Regulator Using a Loop-Gain Stabilizing Technique for High Power-Supply Rejection Over a Wide Range of Load Current

Publication Year: 2017, Page(s):3006 - 3018
An external capacitor-less ultra low-dropout (LDO) regulator that can continue to provide high power-supply rejection (PSR) over a wide range of the load current is proposed. Using the loop-gain stabilizer (LGS) to fix the dc level of the output voltage of the error amplifier to the optimal value, the LDO can keep maximizing the unity-gain frequency, while the load current changes widely up to 200...

• ### A Low-Power High-Speed Hybrid ADC With Merged Sample-and-Hold and DAC Functions for Efficient Subranging Time-Interleaved Operation

Publication Year: 2017, Page(s):3193 - 3206
An 8-bit 1-GS/s hybrid analog-to-digital converter (ADC) for high-speed low-power applications is introduced. It has a subranging architecture with a 3-bit flash ADC as a first stage and a 5-bit four-channel time-interleaved comparator-based asynchronous binary search (CABS) ADC as a second stage. In each channel, a merged sample-and-hold and capacitive digital-to-analog converter (SHDAC) performs...

• ### Low-Power and Area-Efficient Carry Select Adder

Publication Year: 2012, Page(s):371 - 375
Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. From the structure of the CSLA, it is clear that there is scope for reducing the area and power consumption in the CSLA. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA. Based on this modification 8...
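For readers unfamiliar with the CSLA structure referred to above, here is a behavioral sketch of a conventional carry select adder: each block computes its sum for both possible carry-in values, and the real carry from the previous block selects one of the two. The gate-level modification of the paper is not reproduced, and the function name, default width, and block size are assumptions for illustration.

```python
from typing import Tuple


def carry_select_add(a: int, b: int, width: int = 16, block: int = 4) -> Tuple[int, int]:
    """Behavioral model of a conventional CSLA; width must be a multiple of block."""
    if width % block != 0:
        raise ValueError("width must be a multiple of block")
    mask = (1 << block) - 1
    result = 0
    carry = 0
    for shift in range(0, width, block):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        sum_if_0 = a_blk + b_blk       # block result assuming carry-in = 0
        sum_if_1 = a_blk + b_blk + 1   # block result assuming carry-in = 1
        chosen = sum_if_1 if carry else sum_if_0  # the multiplexer stage
        result |= (chosen & mask) << shift
        carry = chosen >> block        # carry-out feeds the next block's select
    return result, carry
```

In hardware the two candidate sums of every block are computed in parallel, so only the select signal ripples between blocks; the duplicated adders are the source of the area and power overhead that the abstract mentions.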

• ### An Energy-Efficient and Wide-Range Voltage Level Shifter With Dual Current Mirror

Publication Year: 2017, Page(s):3534 - 3538
This brief presents an energy-efficient level shifter (LS) to convert a subthreshold input signal to an above-threshold output signal. In order to achieve a wide range of conversion, a dual current mirror (CM) structure consisting of a virtual CM and an auxiliary CM is proposed. The circuit has been implemented and optimized in SMIC 40-nm technology. The postlayout simulation demonstrates that the...

• ### Embedded DRAM-Based Memory Customization for Low-Cost FFT Processor Design

Publication Year: 2017, Page(s):3484 - 3494
In this paper, we present embedded dynamic random access memory (eDRAM)-based memory customization techniques for low-cost fast Fourier transform (FFT) processor design. The main idea is based on the observation that the FFT processor has regular and predictable memory access patterns, and it can be efficiently exploited for memory customization using eDRAM. The memory customization approaches are...

• ### Low-Power 19-Transistor True Single-Phase Clocking Flip-Flop Design Based on Logic Structure Reduction Schemes

Publication Year: 2017, Page(s):3033 - 3044
In this paper, an ultralow-power true single-phase clocking flip-flop (FF) design achieved using only 19 transistors is proposed. The design follows a master-slave-type logic structure and features a hybrid logic design comprising both static-CMOS logic and complementary pass-transistor logic. In the design, a logic structure reduction scheme is employed to reduce the number of transistors for ach...

• ### A Scalable In-Memory Logic Synthesis Approach Using Memristor Crossbar

Publication Year: 2018, Page(s):355 - 366
Because of their resistive switching properties and ease of controlling the resistive states, memristors have been proposed in nonvolatile storage as well as logic design applications. Memristors can be fabricated in a crossbar and suitable voltages applied to the row and column nanowires to control their states. This makes it possible to move toward new non-von Neumann-type architectures, usually...

• ### Security Beyond CMOS: Fundamentals, Applications, and Roadmap

Publication Year: 2017, Page(s):3420 - 3433
Hardware-oriented security and trust has traditionally relied on the dominant CMOS technology to develop security primitives and provide protection against different attacks and vulnerabilities. With CMOS nearly reaching its fundamental scaling limit and the shortcomings of current solutions, researchers are now looking to exploit emerging nanoelectronic devices for various security applications. ...

• ### An 11-bit 100-MS/s Subranged-SAR ADC in 65-nm CMOS

Publication Year: 2017, Page(s):3434 - 3443
This paper presents an 11-bit successive approximation register (SAR) analog-to-digital converter (ADC). The subranged-SAR ADC architecture is applied to achieve a sampling rate of 100 MHz. The proposed gain error compensation helps attenuate the gain error between coarse and fine ADCs. An up-then-down digital-to-analog converter (DAC) switching scheme is used to maintain a small common-mode varia...
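As a reminder of how a basic SAR ADC arrives at its output code, the following idealized sketch performs the bit-by-bit binary search, deciding one bit per step from the most significant bit downward. It models neither the subranged coarse/fine split nor the DAC switching scheme described in the abstract; the function name, the unipolar input range, and the ideal comparator are assumptions.

```python
def sar_convert(vin: float, vref: float, bits: int = 11) -> int:
    """Ideal unipolar SAR conversion of vin in [0, vref) to a bits-wide code."""
    code = 0
    for i in reversed(range(bits)):
        trial = code | (1 << i)                  # tentatively set the next bit
        dac_voltage = trial * vref / (1 << bits) # ideal DAC output for the trial code
        if vin >= dac_voltage:                   # comparator decision
            code = trial                         # keep the bit, otherwise leave it cleared
    return code
```

An N-bit conversion therefore needs N comparator decisions, which is why architectural changes such as subranging or resolving 2 b per step are used to raise the sampling rate.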

• ### Extending 3-bit Burst Error-Correction Codes With Quadruple Adjacent Error Correction

Publication Year: 2018, Page(s):221 - 229
The use of error-correction codes (ECCs) with advanced correction capability is a common system-level strategy to harden the memory against multiple bit upsets (MBUs). Therefore, the construction of ECCs with advanced error correction and low redundancy has become an important problem, especially for adjacent ECCs. Existing codes for mitigating MBUs mainly focus on the correction of up to 3-bit bu...

• ### A 400-MS/s 10-b 2-b/Step SAR ADC With 52-dB SNDR and 5.61-mW Power Dissipation in 65-nm CMOS

Publication Year: 2017, Page(s):3444 - 3454
We present a single-channel 10-b 400-MS/s successive approximation register (SAR) analog-to-digital converter (ADC) embodying a proposed 2-b/step conversion scheme with single reference voltage for the IEEE 802.11ac. By means of the said scheme, the proposed ADC requires only three capacitor arrays instead of at least four capacitor arrays in other capacitor digital-to-analog converter-based 2-b/s...

• ### Designing Energy-Efficient Intermittently Powered Systems Using Spin-Hall-Effect-Based Nonvolatile SRAM

Publication Year: 2018, Page(s):294 - 307
Intermittently powered systems represent a new class of batteryless devices that operate solely on energy harvested from their environment. Due to the unreliable nature of ambient energy sources, these devices experience frequent intervals of power loss, leading to sudden reboots. Tolerating such power supply disruptions requires the ability to rapidly checkpoint/save system state when power loss i...

• ### VLSI Extreme Learning Machine: A Design Space Exploration

Publication Year: 2017, Page(s):60 - 74
In this paper, we describe a compact low-power high-performance hardware implementation of extreme learning machine for machine learning applications. Mismatches in current mirrors are used to perform the vector-matrix multiplication that forms the first stage of this classifier and is the most computationally intensive. Both regression and classification (on UCI data sets) are demonstrated and a ...

• ### Bandwidth Enhancement to Continuous-Time Input Pipeline ADCs

Publication Year: 2018, Page(s):404 - 415
This paper presents design analysis and insights for a new continuous-time input pipeline (CTIP) analog-to-digital converter (ADC) architecture that has enhanced bandwidth. An all-pass filter-based analog delay in the signal path allows bandwidth extension to Nyquist signal bandwidths. A resetting integrator gain stage provides a signal path delay helping to increase the bandwidth while reducing t...

• ### Automatic Correction of Dynamic Power Management Architecture in Modern Processors

Publication Year: 2018, Page(s):308 - 318
The increasing demand for lower power forces designers to use sophisticated power management strategies such as multivoltage and power gating which are often accompanied with many design bugs. Correcting such bugs can be a time-consuming process that requires considerable manual efforts. In this paper, we propose a scalable automated method for correcting dynamic power management architectures by ...

• ### An FPGA-Based Hardware Accelerator for Traffic Sign Detection

Publication Year: 2017, Page(s):1362 - 1372
Traffic sign detection plays an important role in a number of practical applications, such as intelligent driver assistance and roadway inventory management. In order to process the large amount of data from either real-time videos or large off-line databases, a high-throughput traffic sign detection system is required. In this paper, we propose an FPGA-based hardware accelerator for traffic sign ...

• ### On the Implementation of Computation-in-Memory Parallel Adder

Publication Year: 2017, Page(s):2206 - 2219
Today's computer architectures suffer from many challenges, such as the near end of CMOS downscaling, the memory/communication bottleneck, the power wall, and the programming complexity. As a consequence, these architectures become inefficient in solving big data problems or general data intensive applications. Computation-in-memory (CIM) is a novel architecture that tries to solve/alleviate the i...

• ### Sense-Amplifier-Based Flip-Flop With Transition Completion Detection for Low-Voltage Operation

Publication Year: 2018, Page(s):1 - 12
A novel high-speed and highly reliable sense-amplifier-based flip-flop with transition completion detection (SAFF-TCD) is proposed for low supply voltage (VDD) operation. The SAFF-TCD adopts the internally generated detection signal to indicate the completion of sense-amplifier stage transition. The detection signal gates the pull-down path of the sense-amplifier stage and the slave latch, thus ov...

• ### Application of Machine Learning for Optimization of 3-D Integrated Circuits and Systems

Publication Year: 2017, Page(s):1856 - 1865
The 3-D integration helps improve performance and density of electronic systems. However, since electrical and thermal performance for 3-D integration is related to each other, their codesign is required. Machine learning, a promising approach in artificial intelligence, has recently shown promise for addressing engineering optimization problems. In this paper, we apply machine learning for the op...

• ### Near-Threshold RISC-V Core With DSP Extensions for Scalable IoT Endpoint Devices

Publication Year: 2017, Page(s):2700 - 2713
Endpoint devices for Internet-of-Things not only need to work under extremely tight power envelope of a few milliwatts, but also need to be flexible in their computing capabilities, from a few kOPS to GOPS. Near-threshold (NT) operation can achieve higher energy efficiency, and the performance scalability can be gained through parallelism. In this paper, we describe the design of an open-source RI...

• ### An On-Chip Technique to Detect Hardware Trojans and Assist Counterfeit Identification

Publication Year: 2017, Page(s):3317 - 3330
This paper introduces an embedded solution for the detection of hardware trojans (HTs) and counterfeits. The proposed method, which considers that HTs are necessarily inserted on production lots and not on a single device, is based on the fingerprinting of the static distribution of the supply voltage (Vdd) over the whole surface of an integrated circuit. The measurement of this fingerp...

• ### RoBA Multiplier: A Rounding-Based Approximate Multiplier for High-Speed yet Energy-Efficient Digital Signal Processing

Publication Year: 2017, Page(s):393 - 401
In this paper, we propose an approximate multiplier that is high speed yet energy efficient. The approach is to round the operands to the nearest exponent of two. This way, the computationally intensive part of the multiplication is omitted, improving speed and energy consumption at the price of a small error. The proposed approach is applicable to both signed and unsigned multiplications. We propose ...
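The rounding idea above can be sketched for unsigned operands as follows. Writing Ar and Br for the operands rounded to the nearest power of two, the exact product satisfies A×B = Ar×B + Br×A − Ar×Br + (Ar − A)(Br − B); dropping the last term leaves only products with a power of two, which reduce to shifts in hardware. This decomposition, the tie-breaking rule, and the function names are assumptions made for illustration and are not taken from the truncated abstract.

```python
def round_to_power_of_two(x: int) -> int:
    """Round a positive integer to the nearest power of two (ties round up, by assumption)."""
    if x <= 0:
        raise ValueError("x must be positive")
    lower = 1 << (x.bit_length() - 1)
    upper = lower << 1
    return lower if (x - lower) < (upper - x) else upper


def approx_multiply(a: int, b: int) -> int:
    """Approximate a * b for positive integers by omitting the (ar - a) * (br - b) term."""
    ar = round_to_power_of_two(a)
    br = round_to_power_of_two(b)
    # Each product below has a power-of-two factor, so it is a shift in hardware.
    return ar * b + br * a - ar * br
```

The error of this sketch equals (Ar − A)(Br − B), so it vanishes whenever either operand is already a power of two.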

• ### A Capacitor-Less LDO With High-Frequency PSR Suitable for a Wide Range of On-Chip Capacitive Loads

Publication Year: 2016, Page(s):2970 - 2982
This paper presents an on-chip, low drop-out (LDO) voltage regulator with improved power-supply rejection (PSR) able to drive large capacitive loads. The LDO compensation is achieved via a custom, wide bandwidth capacitance multiplier (c-multiplier) that emulates a nanofarad-range capacitance at the LDO output node. The LDO frequency response resembles that of externally compensated LDOs, leading ...

• ### A Noise-Power-Area Optimized Biosensing Front End for Wireless Body Sensor Nodes and Medical Implantable Devices

Publication Year: 2017, Page(s):2917 - 2928
In this paper, we present a noise, power, and area efficient biosensing front-end application specified integrated circuit (ASIC) for the next-generation wireless body sensor nodes and implantable devices. We identify the key design parameter tradeoffs in the biomedical recording systems and carry out a thorough analysis and optimization to maximize them. Based on our analysis and optimization of ...

• ### An Interference-Robust Reconfigurable Receiver With Automatic Frequency-Calibrated LNA in 65-nm CMOS

Publication Year: 2017, Page(s):3113 - 3124
An interference-robust reconfigurable receiver in 65-nm CMOS is presented. The front end is split into a low-band (LB) RF path (0.1-1.5 GHz) and a high-band (HB) RF path (1-5 GHz). By utilizing a harmonic recombination technique, the LB path could reject the third-/fifth-order harmonic interferences. A tunable narrowband dual-feedback common-gate low-noise amplifier (LNA) with LC resonant load prov...

• ### Low-Latency Successive-Cancellation List Decoders for Polar Codes With Multibit Decision

Publication Year: 2015, Page(s):2268 - 2280
Polar codes, as the first provable capacity-achieving error-correcting codes, have received much attention in recent years. However, the decoding performance of polar codes with traditional successive-cancellation (SC) algorithm cannot match that of the low-density parity-check or Turbo codes. Because SC list (SCL) decoding algorithm can significantly improve the error-correcting performance of po...

• ### DRAM-Based Intrinsic Physically Unclonable Functions for System-Level Security and Authentication

Publication Year: 2017, Page(s):1085 - 1097
A physically unclonable function (PUF) is an irreversible probabilistic function that produces a random bit string. It is simple to implement but hard to predict and emulate. PUFs have been widely proposed as security primitives to provide device identification and authentication. In this paper, we propose a novel dynamic-memory-based PUF [dynamic RAM PUF (DRAM PUF)] for the authentication of elec...

• ### A Study of the Effect of RRAM Reliability Soft Errors on the Performance of RRAM-Based Neuromorphic Systems

Publication Year: 2017, Page(s):3125 - 3137
Resistive RAM (RRAM) device has been extensively used as a scalable nonvolatile memory cell in neuromorphic systems due to its several advantages, including its small size and low-power requirements. However, resulting from the stochastic nature of the oxygen vacancies, the RRAM device suffers from reliability soft errors. In this paper, we provide for the first time a modeling framework to comput...

• ### A Single-Ended With Dynamic Feedback Control 8T Subthreshold SRAM Cell

Publication Year: 2016, Page(s):373 - 377
A novel 8-transistor (8T) static random access memory cell with improved data stability in subthreshold operation is designed. The proposed single-ended with dynamic feedback control 8T static RAM (SRAM) cell enhances the static noise margin (SNM) for ultralow power supply. It achieves write SNM of 1.4× and 1.28× as that of isoarea 6T and read-decoupled 8T (RD-8T), respectively, at 3...

• ### An R2R-DAC-Based Architecture for Equalization-Equipped Voltage-Mode PAM-4 Wireline Transmitter Design

Publication Year: 2017, Page(s):3260 - 3264
This brief presents a wireline transmitter architecture, enabling multilevel signaling with feedforward equalization (FFE) in voltage-mode. A compact R2R-DAC-based front end is proposed and analyzed in terms of its speed, power consumption, and linearity. A voltage-mode PAM-4 transmitter with 2-tap FFE utilizing the proposed architecture is implemented in the 65-nm CMOS technology. It achieves a d...

• ### Low-VDD Operation of SRAM Synaptic Array for Implementing Ternary Neural Network

Publication Year: 2017, Page(s):2962 - 2965
For Internet of Things (IoT) edge devices, it is very attractive to have the local sensemaking capability instead of sending all the data back to the cloud for information processing. For image pattern recognition, neuro-inspired machine learning algorithms have demonstrated enormous powerfulness. To effectively implement learning algorithms on-chip for IoT edge devices, on-chip synaptic memory ar...

• ### Pipelined Radix-$2^{k}$ Feedforward FFT Architectures

Publication Year: 2013, Page(s):23 - 32
The appearance of radix-$2^{2}$ was a milestone in the design of pipelined FFT hardware architectures. Later, radix-$2^{2}$ was extended to radix-$2^{k}$. However, radix-$2^{k}$ was only proposed for single-path delay feedback (SDF) architectures, but not for feedforward ones, also called multi-path delay commutator (MDC). This paper presents the radix-$2^{k}$ feedforward (MDC) ...

• ### The Low Area Probing Detector as a Countermeasure Against Invasive Attacks

Publication Year: 2018, Page(s):392 - 403
Microprobing allows intercepting data from on-chip wires as well as injecting faults into data or control lines. This makes it a commonly used attack technique against security-related semiconductors, such as smart card controllers. We present the low area probing detector (LAPD) as an efficient approach to detect microprobing. It compares delay differences between symmetric lines such as bus line...

• ### Hardware Trojan Detection Through Chip-Free Electromagnetic Side-Channel Statistical Analysis

Publication Year: 2017, Page(s):2939 - 2948
The hardware Trojan (HT) has become a major threat for the integrated circuit (IC) industry and supply chain, and has motivated numerous developments of Trojan detection schemes. Although the side-channel method is the most promising one, nearly all of the side-channel methods require fabricated golden chips, which are very difficult to obtain in reality. In this paper, we propose a novel strategy...

