Abstract:
Deep learning has recently seen extensive use in a wide variety of applications. To accelerate deep learning in hardware, non-volatile memory (NVM) technologies have recently been used to perform neural network (NN) computation, exploiting their unique crossbar structure and the multiple resistance states available in a cell. We observe that the weight matrices of convolutional layers, although small, are applied to the input data many times. As a result, they are good candidates to be replicated and co-located in the same crossbar array to improve data-processing parallelism. The first scheme proposed in this paper therefore exploits the shared input by replicating weight matrices and overlapping them to improve parallelism. Furthermore, this paper proposes a heterogeneous accelerator consisting of both large and small crossbar arrays: fully-connected layers are mapped to large crossbar arrays to obtain area/power reductions, while convolutional layers are kept in conventional (small) crossbar arrays to retain performance. Experimental results show significant benefits of the proposed schemes in performance, energy efficiency, and area.
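The replication idea described in the abstract can be illustrated with a minimal numerical sketch. Since a crossbar array computes one matrix-vector product per analog read, placing several copies of a small convolutional kernel matrix block-diagonally in one array lets several input windows be processed in a single read. The function and matrix shapes below are illustrative assumptions, not the paper's actual mapping:

```python
import numpy as np

def replicate_kernels(kernel, crossbar_rows, crossbar_cols):
    """Tile copies of a small kernel matrix block-diagonally into a
    crossbar-sized weight matrix (a sketch of the replication scheme)."""
    k_rows, k_cols = kernel.shape
    copies = min(crossbar_rows // k_rows, crossbar_cols // k_cols)
    xbar = np.zeros((crossbar_rows, crossbar_cols))
    for i in range(copies):
        xbar[i * k_rows:(i + 1) * k_rows, i * k_cols:(i + 1) * k_cols] = kernel
    return xbar, copies

# Hypothetical sizes: 3 weights per filter, 2 filters, in a 12x8 crossbar.
kernel = np.arange(6.0).reshape(3, 2)
xbar, copies = replicate_kernels(kernel, 12, 8)   # 4 copies fit

# Two sliding-window inputs concatenated into one crossbar input vector,
# so both windows are evaluated in one matrix-vector product.
windows = [np.array([1.0, 0.0, 2.0]), np.array([0.5, 1.0, 0.0])]
in_vec = np.zeros(12)
for i, w in enumerate(windows):
    in_vec[i * 3:(i + 1) * 3] = w
out = in_vec @ xbar   # one "analog" read
# out[0:2] and out[2:4] hold the two windows' per-filter outputs.
```

The block-diagonal placement is what makes the outputs separable: each replica's rows receive only its own window's values, so one array read yields as many window results as there are replicas.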
Published in: 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC)
Date of Conference: 17-19 November 2018
Date Added to IEEE Xplore: 13 May 2019
Index Terms:
- Deep Learning
- Efficient Allocation
- Compositional Heterogeneity
- Non-volatile Memory
- Crossbar Array
- Neural Network
- Energy Efficiency
- Convolutional Layers
- Weight Matrix
- Large Array
- Fully-connected Layer
- Wide Variety Of Applications
- Neural Network Approximation
- Small Array
- Convolutional Neural Network
- Deep Neural Network
- Feature Maps
- Input Vector
- Weight Vector
- Large Matrix
- Output Feature Map
- Synaptic Weights
- Multiple Arrays
- Data Window
- Output Pixel
- Output Feature Vector