
Computation Reuse in DNNs by Exploiting Input Similarity


Abstract:

In recent years, Deep Neural Networks (DNNs) have achieved tremendous success for diverse problems such as classification and decision making. Efficient support for DNNs on CPUs, GPUs and accelerators has become a prolific area of research, resulting in a plethora of techniques for energy-efficient DNN inference. However, previous proposals focus on a single execution of a DNN. Popular applications, such as speech recognition or video classification, require multiple back-to-back executions of a DNN to process a sequence of inputs (e.g., audio frames, images). In this paper, we show that consecutive inputs exhibit a high degree of similarity, causing the inputs/outputs of the different layers to be extremely similar for successive frames of speech or images of a video. Based on this observation, we propose a technique to reuse some results of the previous execution, instead of computing the entire DNN. Computations related to inputs with negligible changes can be avoided with minor impact on accuracy, saving a large percentage of computations and memory accesses. We propose an implementation of our reuse-based inference scheme on top of a state-of-the-art DNN accelerator. Results show that, on average, more than 60% of the inputs of any neural network layer tested exhibit negligible changes with respect to the previous execution. Avoiding the memory accesses and computations for these inputs results in 63% energy savings on average.
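The core idea of avoiding computations for inputs with negligible changes can be illustrated with a small sketch. The snippet below is a hypothetical illustration, not the paper's accelerator implementation: for a fully connected layer y = W x, it compares the current input against the previous one, and only the contributions of inputs whose change exceeds a threshold are recomputed and added to the previously stored output. The function name, the threshold value, and the dense-matrix setup are assumptions made for clarity.

```python
import numpy as np

def reuse_linear_layer(W, x_prev, y_prev, x_curr, threshold=0.05):
    """Sketch of reuse-based inference for one fully connected layer.

    If most elements of x_curr are (nearly) identical to x_prev, the new
    output can be obtained by adding the contribution of the few changed
    inputs to the previously computed output y_prev = W @ x_prev.
    """
    delta = x_curr - x_prev
    changed = np.abs(delta) > threshold      # inputs with non-negligible change
    # Only the weight columns for changed inputs are read and multiplied,
    # skipping the memory accesses and MACs for the unchanged inputs.
    y_curr = y_prev + W[:, changed] @ delta[changed]
    return y_curr, changed.mean()            # new output, fraction recomputed

# Example: first frame is computed in full, the next frame reuses it.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))
x0 = rng.standard_normal(512)
y0 = W @ x0                                  # full computation for the first frame
x1 = x0.copy()
x1[:40] += 0.5                               # only a few inputs change in the next frame
y1, frac_recomputed = reuse_linear_layer(W, x0, y0, x1)
```

Under this sketch, the fraction of recomputed inputs directly tracks the fraction of weight fetches and multiply-accumulates that must be performed, which is the quantity the paper's energy savings derive from.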
Date of Conference: 01-06 June 2018
Date Added to IEEE Xplore: 23 July 2018
Electronic ISSN: 2575-713X
Conference Location: Los Angeles, CA, USA
