
The marriage of training and inference for scaled deep learning analog hardware



Abstract:

Resistive crossbar arrays are promising candidates for efficient execution of deep neural network (DNN) inference workloads. The weight matrices of a neural network are mapped to the conductance values on crossbar arrays and then used as vector-matrix multiply engines. Although this mapping seems straightforward, we show that for large-scale DNNs the weights must come from a training procedure that accounts for hardware-induced constraints, such as ADC, DAC, noise, and device fails, for the inference task to run successfully on analog hardware composed of crossbar arrays.
Date of Conference: 07-11 December 2019
Date Added to IEEE Xplore: 13 February 2020

Conference Location: San Francisco, CA, USA

I. Introduction

The inference of a DNN mainly involves computing a series of vector-matrix multiplies, \begin{equation*}\mathbf{y} = \mathbf{W}\mathbf{x}\tag{1}\end{equation*}

and nonlinear transformations, such as pooling, ReLU, and softmax. The vector-matrix multiply is performed on resistive crossbar arrays, fully in parallel and in constant time, using Ohm's and Kirchhoff's laws; this approach was proposed more than 50 years ago [1]. The weight matrix $\mathbf{W}$ is stored as differential conductance values in the crossbar array. The input vector $\mathbf{x}$ is transmitted as voltage pulses through each of the rows, and the resulting vector $\mathbf{y}$ is read as an integrated current signal from the columns, as illustrated in Fig. 1. However, the vector-matrix multiply performed by the crossbar arrays and the supporting peripheral circuitry is only an approximation of (1) and can be written as \begin{equation*}\mathbf{y} = \mathrm{ADC}\left(\mathbf{W}_{\mathrm{r}}\,\mathrm{DAC}(\mathbf{x}) + \mathrm{noise}\right)\tag{2}\end{equation*}
where DAC discretizes the input vector $\mathbf{x}$, the noise term is introduced by the analog computation (such as thermal, 1/f, or op-amp noise), and ADC clips and discretizes the output. In addition, $\mathbf{W}_{\mathrm{r}}$, the weight matrix stored on the crossbar array, deviates from the original weight matrix $\mathbf{W}$ of the model due to transfer errors, limited conductance ranges, device fails, and variability. All of these hardware-induced constraints and variations make inference of DNNs on analog hardware a non-trivial task.

As we show here, naïvely trying to obtain a better approximation of (1) by designing low-noise, high-accuracy analog hardware leads to very challenging hardware specifications that make this approach impractical, especially for large-scale networks. Instead, if the hardware non-idealities are introduced into the training process, high accuracy can be maintained during inference. When the hardware-induced non-idealities are introduced into the training process, the system solves a constrained optimization problem whose solution is guaranteed to perform better once run on analog hardware subject to the same constraints applied during training. Furthermore, noise terms introduced during training act as regularizers and improve the robustness of DNN inference to hardware fails. The training process of a DNN should therefore be tightly coupled (married) to the hardware on which the trained network will run during the inference task, as illustrated in Fig. 2. If a conventional model trained with floating-point arithmetic is used for the inference task on analog hardware, its performance degrades significantly compared to a model obtained using hardware-aware (HWA) training [2]–[3].
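To make (2) concrete, the following is a minimal NumPy sketch of the non-ideal crossbar multiply. The function names (dac, adc, crossbar_mvm) and all numerical choices (bit widths, conductance range, noise level, fail probability) are illustrative assumptions, not values from this paper.

```python
import numpy as np

def dac(x, bits=8, x_max=1.0):
    """Clip and discretize the input vector (DAC stage)."""
    levels = 2 ** bits - 1
    x = np.clip(x, -x_max, x_max)
    return np.round(x / x_max * (levels / 2)) * (2 * x_max / levels)

def adc(y, bits=8, y_max=10.0):
    """Clip and discretize the output currents (ADC stage)."""
    levels = 2 ** bits - 1
    y = np.clip(y, -y_max, y_max)
    return np.round(y / y_max * (levels / 2)) * (2 * y_max / levels)

def crossbar_mvm(W, x, g_max=1.0, noise_std=0.02, fail_prob=0.001, rng=None):
    """Approximate y = ADC(W_r DAC(x) + noise) per Eq. (2)."""
    rng = rng or np.random.default_rng()
    # Map weights into the available conductance range.
    w_scale = g_max / np.abs(W).max()
    # W_r: transferred weights with programming variability and
    # stuck-at-zero device fails.
    W_r = W * w_scale * (1 + noise_std * rng.standard_normal(W.shape))
    W_r *= rng.random(W.shape) > fail_prob          # failed devices read as 0
    y_analog = W_r @ dac(x)
    y_analog += noise_std * rng.standard_normal(y_analog.shape)  # read noise
    return adc(y_analog) / w_scale
```

Passing weights from a conventionally trained model through such a forward path, rather than through the ideal multiply of (1), is what exposes the accuracy degradation discussed above.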
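Likewise, a minimal sketch of the HWA training idea, assuming a PyTorch setting: non-idealities (here, multiplicative weight noise standing in for conductance variability, and a straight-through output quantizer standing in for the ADC) are injected into the forward pass during training, so the optimizer sees the same constraints the hardware will impose. The layer name and hyperparameters are hypothetical.

```python
import torch

class NoisyQuantLinear(torch.nn.Linear):
    """Linear layer with hardware-style noise and output quantization."""

    def __init__(self, *args, weight_noise=0.02, out_bits=8, **kwargs):
        super().__init__(*args, **kwargs)
        self.weight_noise = weight_noise
        self.out_bits = out_bits

    def forward(self, x):
        w = self.weight
        if self.training:
            # Multiplicative noise, resampled every forward pass, models
            # conductance variability and acts as a regularizer.
            w = w * (1 + self.weight_noise * torch.randn_like(w))
        y = torch.nn.functional.linear(x, w, self.bias)
        # Straight-through quantizer models the ADC: quantized in the
        # forward pass, identity in the backward pass.
        levels = 2 ** self.out_bits - 1
        y_max = y.detach().abs().max().clamp(min=1e-8)
        y_q = torch.round(y / y_max * (levels / 2)) / (levels / 2) * y_max
        return y + (y_q - y).detach()
```

Because the noise is resampled at every forward pass, the network is pushed toward weight configurations that are insensitive to perturbations, which is the regularization effect described above; at inference time on the analog hardware, the actual device noise and quantization take the place of these injected terms.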
