I. Introduction
Compact neural networks (NNs) must run on low-power devices with limited memory and computational resources [1], [2]. To provide hardware-efficient models, neural architecture search (NAS) can account for estimates of latency and energy [3], [4], [5], [6], parameter count [7], [8], [9], number of operations [7], [8], [10], as well as weight [11], [12] and activation [13], [14], [15] bit widths. Indeed, model quantization enables drastic memory savings while maintaining remarkable performance, especially with quantization-aware training (QAT) [16]. This letter therefore introduces LBQ-NAS (latent Bayesian quantized NAS), which optimizes the architecture of convolutional NN (CNN) image classifiers with quantized weights by combining hardware-aware cost functions and QAT. First, LBQ-NAS trains a Wasserstein autoencoder (WAE) [17] to encode and decode CNN architectures through a low-dimensional, continuous latent space embedding (LSE). Second, Bayesian optimization (BO) is used to discover efficient weight-quantized models in this LSE (Fig. 1), with a cost function accounting for accuracy, parameter count, and operation count. Finally, we perform an architecture retraining (AR) phase on the best candidates to achieve competitive performance.
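For concreteness, one plausible way to aggregate these three terms is a weighted sum; the exact cost function of LBQ-NAS is not reproduced in this excerpt, so the weights $\lambda_{P}$ and $\lambda_{Op}$ below are illustrative assumptions:
$$
C_i \;=\; -\,\mathrm{Acc}_i \;+\; \lambda_{P}\,\#P_i \;+\; \lambda_{Op}\,\#Op_i ,
$$
where $\mathrm{Acc}_i$, $\#P_i$, and $\#Op_i$ denote the validation accuracy, parameter count, and operation count of the $i$-th candidate. Minimizing $C_i$ then rewards accuracy while penalizing model size and compute.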
Given a pretrained LSE, our NAS works as follows: at iteration $i$, the latent point $z_i$ is decoded into a neural architecture $\mathcal{A}_i$, which undergoes QAT. Its validation accuracy $\mathrm{Acc}_i$, parameter count $\#P_i$, and operation count $\#Op_i$ are then measured. Finally, these metrics are aggregated into the cost function $C_i$, whose value is fed to the BO algorithm to determine the next point $z_{i+1}$.
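The following Python sketch mirrors this decode, QAT, evaluate, and propose cycle under stated assumptions: decode_architecture, train_with_qat, and count_params_and_ops are hypothetical stand-ins rather than the authors' code, the latent dimensionality, bounds, and cost weights are illustrative, and scikit-optimize's ask/tell interface is only an assumed BO backend.

```python
# Minimal sketch of the per-iteration search loop described above.
# All helper functions and constants are hypothetical placeholders.
import numpy as np
from skopt import Optimizer  # assumed BO backend (ask/tell interface)

LATENT_DIM = 16                      # assumed dimensionality of the WAE latent space
LAMBDA_P, LAMBDA_OP = 1e-7, 1e-10    # assumed trade-off weights for #P and #Op


def decode_architecture(z):
    # Hypothetical stand-in for the WAE decoder: latent point z_i -> architecture A_i.
    return {"depth": 3 + int(5 * abs(z[0])), "width": 16 + int(48 * abs(z[1]))}


def train_with_qat(arch):
    # Hypothetical stand-in for quantization-aware training; returns a dummy
    # validation accuracy Acc_i in [0, 1].
    return max(0.0, 0.95 - 0.005 * arch["depth"] - 2.0 / arch["width"])


def count_params_and_ops(arch):
    # Hypothetical proxies for the parameter count #P_i and operation count #Op_i.
    n_params = 9 * arch["depth"] * arch["width"] ** 2
    n_ops = n_params * 32 * 32       # e.g., one multiply-accumulate per pixel of a 32x32 input
    return n_params, n_ops


def cost(z):
    # Aggregate accuracy, #P_i, and #Op_i as in the illustrative weighted sum above.
    arch = decode_architecture(z)
    acc = train_with_qat(arch)
    n_params, n_ops = count_params_and_ops(arch)
    return -acc + LAMBDA_P * n_params + LAMBDA_OP * n_ops


opt = Optimizer(dimensions=[(-3.0, 3.0)] * LATENT_DIM)   # assumed bounds on the LSE
for _ in range(50):                                      # BO iterations
    z_i = opt.ask()                                      # propose the next latent point z_i
    opt.tell(z_i, cost(np.asarray(z_i)))                 # report C_i to update the surrogate
best_z = opt.Xi[int(np.argmin(opt.yi))]                  # best candidate kept for architecture retraining
```

In an actual run, the best latent points found by this loop would be decoded and passed to the AR phase for full retraining.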