PSQ: An Automatic Search Framework for Data-Free Quantization on PIM-based Architecture | IEEE Conference Publication | IEEE Xplore

Abstract:

Crossbar-based Process-In-Memory (PIM) architecture is considered a promising solution for accelerating Deep Neural Networks (DNNs). Because the model size and computational budget of DNNs keep growing, model compression is a critical step in deploying them. However, when DNNs are deployed on PIM architectures, fine-grained quantization of the weight matrices is difficult because of the inflexible data path inside the crossbar. To this end, we study the feasibility and efficiency of PSQ, a novel fine-grained quantization scheme for PIM-based designs. The scheme tightly couples the quantization search principle with the PIM architecture to provide smooth, hardware-friendly quantization. We leverage weight locality and the variety of weight distributions across blocks to facilitate fine-grained quantization. We also propose a lightweight search framework that adaptively allocates the quantization parameters (e.g., scale and bitwidth). During the search, suitable quantization parameters are assigned directly to each fine-grained block so that the weight distributions before and after quantization stay as close as possible, which minimizes the quantization error. Our evaluation shows that PSQ achieves a 3.5× reduction in occupied crossbars with negligible accuracy loss. Moreover, PSQ completes this process in just a few seconds on a single CPU, without model retraining or expensive computation.
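The per-block search described in the abstract can be illustrated with a minimal sketch. This is not the PSQ algorithm itself, whose details the abstract does not give. It assumes symmetric uniform quantization, uses mean squared error as a simple stand-in for "keeping the weight distributions close", and runs a plain grid search over candidate scales. All function names, the candidate bitwidths, the tolerance, and the 2×2 block size are hypothetical choices made for illustration.

```python
import random


def split_blocks(matrix, bh, bw):
    """Tile a 2-D weight matrix (list of rows) into flat bh x bw blocks."""
    rows, cols = len(matrix), len(matrix[0])
    blocks = []
    for r in range(0, rows, bh):
        for c in range(0, cols, bw):
            blocks.append([matrix[i][j]
                           for i in range(r, min(r + bh, rows))
                           for j in range(c, min(c + bw, cols))])
    return blocks


def quantize_block(block, scale, bits):
    """Symmetric uniform quantization: round to the grid, clip, rescale."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in block]


def mse(a, b):
    """Mean squared error, used here as an assumed closeness measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_block(block, bit_choices=(2, 4, 8), tol=1e-4, n_scales=20):
    """Return (bits, scale, error) for one block.

    Tries bitwidths from smallest to largest and stops at the first one
    whose best scale meets the error tolerance. If none does, the
    lowest-error candidate seen is returned. Only the weights are used,
    so no calibration data or retraining is involved.
    """
    max_abs = max(abs(w) for w in block) or 1.0  # avoid a zero scale
    best = None
    for bits in sorted(bit_choices):
        qmax = 2 ** (bits - 1) - 1
        for i in range(1, n_scales + 1):
            # Candidate clipping range is a fraction of the block maximum.
            scale = (max_abs * i / n_scales) / qmax
            err = mse(block, quantize_block(block, scale, bits))
            if best is None or err < best[2]:
                best = (bits, scale, err)
        if best[2] <= tol:
            return best
    return best


if __name__ == "__main__":
    random.seed(0)
    # Toy 4x4 weight matrix; rows differ in magnitude so blocks differ too.
    weights = [[random.gauss(0.0, 0.01 * (r + 1)) for _ in range(4)]
               for r in range(4)]
    for idx, blk in enumerate(split_blocks(weights, 2, 2)):
        bits, scale, err = search_block(blk)
        print(f"block {idx}: bits={bits} scale={scale:.5f} mse={err:.2e}")
```

Because each block is searched independently, blocks with a narrow weight range can settle on a lower bitwidth than blocks with a wide one, which is the intuition behind assigning parameters per fine-grained block.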
Date of Conference: 06-08 November 2023
Date Added to IEEE Xplore: 22 December 2023
Conference Location: Washington, DC, USA
