Abstract:
The widespread and ever-increasing demand for performing in situ inference, signal processing, and other computationally intensive applications in mobile Internet-of-Things (IoT) devices requires fast, compact, and energy-efficient vector-by-matrix multipliers (VMMs). Time-domain VMMs based on emerging nonvolatile memory devices exhibit significantly higher circuit density and energy efficiency than their current-mode counterparts. However, the load capacitors used to accumulate the weighted summation of the inputs in time-domain circuits dominate their energy dissipation and footprint area. The true potential of time-domain VMMs can be realized only when this overhead is minimized. To this end, in this brief, we propose a novel successive integration and rescaling (SIR) approach for implementing a highly efficient mixed-signal time-domain VMM for low-to-medium-precision computing. As a proof of concept, we quantitatively evaluated the performance of the proposed SIR VMM and compared it with that of the conventional time-domain VMM, using a similar 1T-1R array. Preliminary simulation results for a 4-bit 200 × 200 VMM, implemented in a 55-nm technology node, show area and energy efficiencies of 1.33 bits/μm² and ~1.3 POp/J, which are, respectively, ~2.5× and ~2.65× higher than those of the prior-work time-domain VMM. Furthermore, we analyze the system-level performance of the proposed SIR VMM engine in neuromorphic accelerator architectures and provide preliminary estimates for various deep/recurrent neural network (DNN/RNN) applications.
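The core operation such an accelerator implements is the weighted summation y = Wx at reduced precision. As a rough functional reference (not the paper's circuit), a minimal NumPy sketch of a 4-bit 200 × 200 multiply is shown below; `quantize` is a hypothetical helper introduced here for illustration:

```python
import numpy as np

def quantize(x, bits=4):
    # Hypothetical helper: map values in [0, 1) onto 2**bits uniform
    # integer levels, mimicking low-precision inputs/weights.
    levels = 2 ** bits
    return np.clip(np.floor(x * levels), 0, levels - 1).astype(int)

rng = np.random.default_rng(0)
x = quantize(rng.random(200))           # 4-bit input vector (values 0..15)
W = quantize(rng.random((200, 200)))    # 4-bit weight matrix (values 0..15)

# Each output element accumulates a weighted summation of all 200 inputs --
# the quantity the time-domain circuit integrates on its load capacitors.
y = W @ x
print(y.shape)
```

A hardware SIR implementation would perform these accumulations in the time domain and rescale intermediate sums between integration steps; the sketch only captures the end-to-end arithmetic, not the analog mechanism.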
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems ( Volume: 28, Issue: 3, March 2020)