E-UPQ: Energy-Aware Unified Pruning-Quantization Framework for CIM Architecture

Abstract:

The wide adoption of convolutional neural networks (CNNs) in many applications has given rise to unrelenting computational demand and memory requirements. Computing-in-Memory (CIM) architecture has demonstrated great potential to break the memory wall and effectively execute CNN workloads. Ongoing research focuses on pruning or quantizing CNNs to achieve higher efficiency on CIM. However, prior works preclude the possibility of integrating both techniques in the same framework. Moreover, directly incorporating energy estimation during the model compression process has not been well explored in the literature. In this paper, we present an Energy-aware Unified Pruning-Quantization (E-UPQ) mechanism, a novel framework for automated compression (pruning + quantization) of CNNs while considering the energy-accuracy trade-off. Specifically, E-UPQ interweaves pruning and quantization seamlessly by viewing pruning as a special case of “0-bit” quantization during the mixed-precision search. In addition, E-UPQ introduces a set of trainable parameters to incorporate energy information during the compression process, closing the gap between compression policy and energy optimization. Experimental results evaluated on DNN+NeuroSim show that E-UPQ reduces energy consumption by up to 79.3% and 66.6% for VGG-16 and ResNet-18, respectively, compared with the state-of-the-art work, while achieving similar accuracy on CIFAR-100. Layer-wise analysis and ablation studies are provided to validate the effectiveness of E-UPQ. We also present the corresponding CIM architecture to support the proposed E-UPQ framework.
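
To illustrate the idea of treating pruning as “0-bit” quantization, the following is a minimal sketch and not the E-UPQ implementation. It assumes a plain symmetric uniform quantizer with 2^b levels over [-w_max, w_max]; the function name `quantize_uniform`, the choice of quantizer, and the example bit-width policy are all hypothetical. With b = 0 only one level remains, fixed here at zero, so a single bit-width search space covers both pruning and quantization.

```python
def quantize_uniform(weights, bits):
    """Quantize a list of floats to 2**bits uniform levels in [-w_max, w_max].

    Assumption for illustration: bits == 0 leaves a single level, which is
    fixed at zero, so the whole weight group is pruned.
    """
    if bits < 0:
        raise ValueError("bits must be non-negative")
    w_max = max((abs(w) for w in weights), default=0.0)
    if bits == 0 or w_max == 0.0:
        return [0.0] * len(weights)      # "0-bit" quantization == pruning
    n_steps = 2 ** bits - 1              # number of intervals between levels
    step = 2.0 * w_max / n_steps
    # Snap each weight to the nearest level on the uniform grid.
    return [round((w + w_max) / step) * step - w_max for w in weights]


# Hypothetical mixed-precision policy: one bit-width per weight group.
groups = {"g0": [0.5, -1.0, 0.25], "g1": [0.02, -0.01, 0.03]}
policy = {"g0": 8, "g1": 0}              # g1 is pruned by assigning 0 bits
compressed = {name: quantize_uniform(w, policy[name]) for name, w in groups.items()}
```

In this sketch a search procedure only has to pick an integer per weight group; assigning 0 removes the group, and any positive value sets its precision.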

Published in: IEEE Journal on Emerging and Selected Topics in Circuits and Systems (Volume: 13, Issue: 1, March 2023)

Page(s): 21 - 32

Date of Publication: 06 February 2023

DOI: 10.1109/JETCAS.2023.3242761

I. Introduction

Recent years have witnessed the steady improvement of convolutional neural networks (CNNs) in many applications, such as computer vision and natural language processing [1]. However, the high accuracy of modern CNNs comes at the cost of a large number of weights, which leads to massive data movement between memory and computing units. This so-called von Neumann bottleneck consumes enormous energy and hinders the deployment of CNNs on conventional hardware platforms and resource-constrained devices.
