Woorham Bae - IEEE Xplore Author Profile

We present a 256Gbps WDM transceiver macro for use in co-packaged AI scale-out interconnects. The macro adopts an $8\lambda\times 32\text{Gbps}$ configuration to achieve 256Gbps per fiber. The transceiver macro operates error-free across a dynamic temperature ramp of 50-110°C, has >4dB optical link margin with $4\text{dBm}/\lambda$ input laser power, and consumes 3.45pJ/b ($\text{TX}+\text{RX}$). A...
This letter presents a quadrature error and duty-cycle correction circuit based on an asynchronous sampling technique. Sampling the multiphased clocks provides their phase and duty-cycle information, which are then utilized to compensate for the quadrature and duty-cycle errors. The intrinsic downconversion operation of the proposed sampling approach enhances the correction accuracy by mitigating ...
A conventional figure-of-merit (FOM) for a phase-locked loop (PLL) has served as the most powerful indicator to compare and to normalize performance of different PLL designs. Simply, the conventional FOM is based on the jitter-power trade-off. With a few assumptions, theoretically, it provides a fair comparison. However, as the PLL design techniques have advanced, the assumptions have started brea...
This article proposes an in-memory Hamming error-correcting code (ECC) in the memristor crossbar array (CBA). Based on the unique $I$-$V$ characteristic of the complementary resistive switching (CRS) memristor, this work discovers that a combination of three memristors behaves as a stateful EXCLUSIVE-OR (XOR) logic device. In addition, a two-step (build-up and fire) current-mode CBA driving sche...
This work presents a RISC-V system-on-chip (SoC) with eight application cores containing programmable-precision vector accelerators. The SoC is built by using a generator-based design methodology, which enables the integration of open-source and project-specific building blocks to develop differentiated functionality. The digital component generators use Chisel, the analog component generators use...
For the first time, we demonstrate an error-free, 128Gbps (8x16Gbps) optical transceiver using a microring-based wavelength-division multiplexed (WDM) architecture. The optical transceiver ran for 12 hours with zero errors, resulting in a measured bit-error rate of <1.45e-15 per optical lane. The total number of bits sent during this time was ~691 terabits per lane and ~5.5 petabits aggregate acro...
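The figures quoted in the 128Gbps entry above can be cross-checked with a short calculation. This is a minimal sketch, not the authors' method: it assumes each of the 8 lanes ran at exactly 16 Gbps for the full 12 hours, and it uses the simplest zero-error bound (one over the number of bits sent), which appears consistent with the quoted value. All variable names are illustrative.

```python
# Assumed inputs, read off the abstract: 8 lanes, 16 Gbps per lane, 12-hour run.
LANES = 8
LANE_RATE_BPS = 16 * 10**9
RUN_SECONDS = 12 * 3600

# Total bits carried by one lane and by all lanes over the run.
bits_per_lane = LANE_RATE_BPS * RUN_SECONDS
aggregate_bits = LANES * bits_per_lane

# With zero observed errors, a simple upper bound on the bit-error rate
# is 1 / (bits sent). This is an assumption about how the bound was derived.
ber_bound_per_lane = 1 / bits_per_lane

print(f"bits per lane: {bits_per_lane / 1e12:.1f} Tb")
print(f"aggregate bits: {aggregate_bits / 1e15:.2f} Pb")
print(f"BER bound per lane: {ber_bound_per_lane:.3e}")
```

Under these assumptions the script reports about 691.2 Tb per lane, 5.53 Pb in aggregate, and a per-lane bound near 1.447e-15, in line with the ~691 terabits, ~5.5 petabits, and <1.45e-15 stated in the abstract.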
We demonstrate 128 Gbps/port (8-λ×16 Gbps/λ) natively error-free transmission across eight optical ports using an 8-port, 8-λ/port WDM remote laser source and a pair of monolithically integrated CMOS optical I/O chiplets with 4.96-5.56 pJ/bit optical Tx+Rx chiplet energy efficiency.
A conventional figure-of-merit for a phase-locked loop (PLL) based on integrated RMS jitter and power consumption has been a strong indicator to compare and to normalize PLL performance over different designs. However, it has some limitations because the impact of the reference clock is not reflected. As a result, it is not enough to evaluate state-of-the-art PLL designs such as injection-locked PL...
Modern workloads, such as deep neural networks (DNNs), increasingly rely on dense arithmetic compute patterns that are ill-suited for general-purpose processors, leading to a rise in domain-specific compute accelerators [1]. Many of these workloads can benefit from varying precision during computation, e.g. different precisions among layers and between training and inference for DNNs has been show...
LAYout with Gridded Objects (LAYGO), a Python-based layout-generation engine for enhancing the design productivity of custom circuit layouts in advanced CMOS processes, is presented and verified by implementing a time-interleaved SAR (TI-SAR) ADC instance in a 16 nm CMOS FinFET technology. LAYGO supports rapid generation by placing customized templates on process-specific placement grids, thereby ...
We demonstrate an electro-optic platform enabling a direct optical I/O interface in an ASIC package. The 5.5×8.9 mm² chiplet uses the Advanced Interface Bus (AIB), a parallel digital interface, to communicate to a host ASIC and integrates high-speed digital/analog circuits, optical modulators, photodetectors, and waveguides. Transmitters and receivers demonstrate data-rates up to 25Gbps at 4.9pJ/bi...
The importance of understanding noise in electronics cannot be overemphasized. However, it has not been deeply considered in memristor-based circuits and systems, even though there have been many efforts to use memristors to improve present computing systems. This article presents comprehensive read margin and bit-error-rate (BER) analyses, where the noise contributions from the memr...
This paper presents a 7GS/s 8-bit time-interleaved SAR ADC instance produced from a generator-based design flow in a 16nm CMOS FinFET technology. Design techniques such as cross-coupled routing and clock delay modulation are utilized for compatibility with a FinFET process. The time-interleaved SAR ADC layout is automatically generated by placing templates and wires on a grid to abstract design ru...
This paper demonstrates a signal analysis system-on-chip (SoC) consisting of a general-purpose RISC-V core with vector extensions and a fixed-function signal-processing accelerator. Both the application core and the accelerators are design instances produced through an agile design-space exploration process by generators that allow for a wide range of parameter configurations. The signal processing...
A 1.89-GHz bandwidth, 175-kHz resolution spectral analysis system-on-chip (SoC), integrating a subsampling analog-to-digital converter (ADC) frontend with a digital reconstruction backend and implementing a 21 600-point sparse Fourier transform based on the fast Fourier aliasing-based sparse transform (FFAST) algorithm has been co-designed by using the Constructing Hardware in a Scala Embedded Lan...
This paper presents reference spur reduction techniques for an analog phase-locked loop (PLL). A simple leakage compensation loop is proposed, which cancels the leakage current of the PLL loop filter with a negligible power overhead. This leakage compensation loop senses the leakage current of the loop filter from the up and down pulse widths in the steady state and compensates for the charge ...
This paper demonstrates a signal analysis SoC consisting of a general-purpose RISC-V core with vector extensions and a fixed-function signal-processing accelerator. Both the core and the accelerators are instances produced by novel generators that allow for a wide range of parameter configurations and rapid design space exploration. The signal processing chain consists of generated instances of a ...
A 1.89-GHz bandwidth, 175-kHz resolution spectral analysis SoC, integrating a subsampling ADC frontend with a digital reconstruction backend and implementing a 21,600-point FFAST sparse FFT [1] has been generated using the Chisel [2] and BAG [3] frameworks in 16-nm CMOS. Three sets of 25×, 27×, and 32× subsampling SAR ADCs acquire the signal with ~5.4-6.3 ENOB/slice. The digital backend consists of mix...
A variation-tolerant and sneak-current-free readout technique for cross-point non-volatile memory is presented. The proposed readout circuit has a sneak current compensation port, which collects sneak current information from multiple unselected cells, and efficiently cancels it out without delay and area overhead. In addition, this scheme averages out the random mismatches in the unselected cells...
A single-loop referenceless clock and data recovery (CDR) with a compact frequency acquisition scheme is presented. A bang-bang phase-frequency detector (BBPFD) is proposed that tracks the frequency difference by detecting the drift direction of the non-return-to-zero bit stream with respect to the multi-phase clock and generates UP/DN output signals accordingly. When frequency locked, the BBPFD i...
We present BAG2, a framework for the development of process-portable Analog and Mixed Signal (AMS) circuit generators. Such generators are parametrized design procedures that produce schematics, layouts, and verification testbenches for a circuit given input specifications. This paper expands on previous work by introducing a universal AMS circuit verification framework into BAG2, as well as two n...
A 2.5-5.6 GHz low-phase-noise subharmonically injection-locked sub-sampling all-digital phase-locked loop with a dual-edge complementary switched injection technique is presented. While previously reported injection-locked phase-locked loops (ILPLLs) require additional circuitry for resolving a phase alignment mismatch between the PLL loop and injection path, the presented ILPLL exhibits a simplif...
Supporting a wide operating range for industrial-standard backward-compatible transmitters often results in energy inefficiency. This paper describes an energy-efficient voltage-mode-serializing transmitter with an operating range of up to 32 Gb/s. The proposed transmitter uses a programmable internal supply to set the voltage level for various data rates optimally, thus improving overall energy e...
This brief presents a 32 Gb/s driver for a Mach-Zehnder modulator (MZM) and an electro-absorption modulator (EAM). A push-pull current-mode logic driver is chosen to achieve a better power efficiency and a large voltage swing. A double cascode with thin oxide transistors is employed to mitigate the over-voltage stress associated with a large output voltage swing. At the same time, shunt-peaking in...