Abstract:
Deep learning neural networks are of critical importance to enabling next-generation IoT devices. However, due to limited computation power, memory space, and energy, deploying these algorithms efficiently on IoT devices remains a grand challenge, as they demand high computation, energy, and memory footprints. Numerous pruning methods for deep learning algorithms have been proposed to minimize latency, energy, and weight counts. However, few consider the run-time memory footprint and the overhead caused by data movement between volatile and non-volatile memory. This paper proposes four novel memory-aware mechanisms for implementing CNN models on resource-constrained embedded systems. The proposed techniques maximize the use of high-speed volatile memory and provide three implementation choices that respectively achieve the minimum energy cost, SRAM usage, and inference latency, as well as a hybrid choice that trades off all three. The experimental evaluation compares their energy cost, latency, and required run-time memory footprint and demonstrates high implementation efficiency.
Published in: 2021 IEEE 32nd International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Date of Conference: 07-09 July 2021
Date Added to IEEE Xplore: 23 August 2021