FAT: Frequency-Aware Transformation for Bridging Full-Precision and Low-Precision Deep Representations


Abstract:

Learning low-bitwidth convolutional neural networks (CNNs) is challenging because performance may drop significantly after quantization. Prior work often quantizes the network weights by carefully tuning hyperparameters such as nonuniform stepsize and layerwise bitwidths, a complicated process because the full- and low-precision representations have large discrepancies. This work presents a novel quantization pipeline, named frequency-aware transformation (FAT), that offers three benefits: 1) instead of designing complicated quantizers, FAT learns to transform network weights in the frequency domain to remove redundant information before quantization, making them amenable to training in low bitwidth with simple quantizers; 2) FAT readily embeds CNNs in low bitwidths using standard quantizers without tedious hyperparameter tuning, and theoretical analyses show that FAT minimizes the quantization errors in both uniform and nonuniform quantizations; and 3) FAT can be easily plugged into various CNN architectures. Using FAT with a simple uniform/logarithmic quantizer achieves state-of-the-art performance in different bitwidths on various model architectures. Consequently, FAT provides a novel frequency-based perspective for model quantization.
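
The pipeline the abstract describes (transform the weights in the frequency domain, then apply a simple quantizer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example weights, and above all the fixed hand-picked mask are assumptions made for illustration, whereas FAT learns its transform during training. The sketch uses a naive DFT from the Python standard library and a symmetric uniform quantizer.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns only the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def frequency_transform(weights, mask):
    """Scale each frequency component of the weights by a per-frequency mask."""
    spectrum = dft(weights)
    return idft([m * c for m, c in zip(mask, spectrum)])

def uniform_quantize(weights, bits):
    """Symmetric uniform quantizer (assumes bits >= 2)."""
    levels = 2 ** (bits - 1) - 1                  # positive levels per sign
    scale = max(abs(w) for w in weights) or 1.0   # avoid dividing by zero
    return [round(w / scale * levels) / levels * scale for w in weights]

# Hypothetical flattened weights, for illustration only.
weights = [0.9, -0.4, 0.35, -0.8, 0.1, 0.6, -0.2, 0.05]

# Assumption: a fixed low-pass mask stands in for the learned transform.
# It is symmetric (mask[k] == mask[n - k]) so the result stays real-valued.
mask = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0]

transformed = frequency_transform(weights, mask)
quantized = uniform_quantize(transformed, bits=4)
```

With `bits=4` the quantizer has 7 positive and 7 negative levels plus zero, so each transformed weight moves by at most half a quantization step. An all-ones mask leaves the weights unchanged, which is a convenient sanity check for the transform itself.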
Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 35, Issue: 2, February 2024)
Page(s): 2640 - 2654
Date of Publication: 22 July 2022

PubMed ID: 35867358
