Abstract:
This work observes that a large fraction of the computations performed by Deep Neural Networks (DNNs) are intrinsically ineffectual, as they involve a multiplication where one of the inputs is zero. This observation motivates Cnvolutin (CNV), a value-based approach to hardware acceleration that eliminates most of these ineffectual operations, improving performance and energy over a state-of-the-art accelerator with no accuracy loss. CNV uses hierarchical data-parallel units, allowing groups of lanes to proceed mostly independently and thus to skip over the ineffectual computations. A co-designed data storage format encodes the computation elimination decisions, taking them off the critical path while avoiding control divergence in the data-parallel units. Combined, the units and the data storage format result in a data-parallel architecture that maintains wide, aligned accesses to its memory hierarchy and keeps its data lanes busy. By loosening the criterion for identifying ineffectual computations, CNV enables further performance and energy efficiency improvements, and more so if a loss in accuracy is acceptable. Experimental measurements over a set of state-of-the-art DNNs for image classification show that, by removing zero-valued operand multiplications alone, CNV improves performance over a state-of-the-art accelerator by 1.24x to 1.55x, and by 1.37x on average, without any loss in accuracy. While CNV incurs an area overhead of 4.49%, it improves overall EDP (Energy Delay Product) and ED2P (Energy Delay Squared Product) on average by 1.47x and 2.01x, respectively. With a broader policy for identifying ineffectual computations, the average performance improvement increases to 1.52x, still without any loss in accuracy. Further improvements are demonstrated with a loss in accuracy.
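As an illustration of the idea only, the following minimal software sketch shows how a storage format can carry the skip decision so that zero-operand multiplications never reach the multiplier. The (value, offset) pair encoding, the function names, the sample data, and the `threshold` parameter are all assumptions made for this example; the abstract does not specify CNV's actual format or hardware organization.

```python
from typing import List, Tuple


def encode_nonzero(activations: List[float],
                   threshold: float = 0.0) -> List[Tuple[float, int]]:
    """Hypothetical encoding: keep (value, offset) pairs for activations
    whose magnitude exceeds `threshold`. With threshold=0.0 only exact
    zeros are dropped, so the result of a dot product is unchanged."""
    return [(a, i) for i, a in enumerate(activations) if abs(a) > threshold]


def dense_dot(activations: List[float],
              weights: List[float]) -> Tuple[float, int]:
    """Baseline: multiply every activation, including zeros.
    Returns (sum, number of multiplications performed)."""
    total = 0.0
    for a, w in zip(activations, weights):
        total += a * w
    return total, len(activations)


def sparse_dot(encoded: List[Tuple[float, int]],
               weights: List[float]) -> Tuple[float, int]:
    """Zero-skipping: iterate only over stored pairs; the offset selects
    the matching weight, so no zero test happens inside the loop."""
    total = 0.0
    for value, offset in encoded:
        total += value * weights[offset]
    return total, len(encoded)


# Illustrative data, not taken from the paper.
activations = [0.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.5, 0.0]
weights = [0.3, -1.0, 0.7, 0.2, 0.5, -0.4, 2.0, 0.1]

dense_sum, dense_mults = dense_dot(activations, weights)
sparse_sum, sparse_mults = sparse_dot(encode_nonzero(activations), weights)
print(dense_sum, dense_mults)    # same sum, 8 multiplications
print(sparse_sum, sparse_mults)  # same sum, 3 multiplications
```

Raising `threshold` above zero mimics a loosened identification criterion: more multiplications are skipped, but the sum may then differ from the dense result.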
Date of Conference: 18-22 June 2016
Date Added to IEEE Xplore: 25 August 2016
ISBN Information:
Print ISSN: 1063-6897
Index Terms:
- Neural Network
- Deep Neural Network
- Accuracy Loss
- Improve Energy Efficiency
- Critical Path
- Hardware Accelerators
- Convolutional Neural Network
- Convolutional Layers
- Per Cycle
- Sparse Matrix
- Single Neuron
- Output Neurons
- Number Of Filters
- Input Neurons
- Partial Sums
- Single Lane
- Fraction Of Neurons
- External Memory
- Work Assignments
- 3D Array
- Baseline Architecture
- Current Window
- Sparse Format