
A Convolution Neural Network Accelerator Design with Weight Mapping and Pipeline Optimization


Abstract:

Pipelining is an efficient way to boost performance in non-volatile memory-based computing-in-memory (nvCIM) convolutional neural network (CNN) accelerators. However, previous works seldom optimize the pipeline from a whole-system perspective and, in particular, overlook the effect of buffer access. In this work, we propose a high-performance NVM-based CNN accelerator with a balanced pipeline design that accounts for both macro computing and buffer access. At the operator level, a matrix-based weight mapping method is proposed to reduce buffer access delay. At the macro level, a decoupled access-and-execution design is introduced to shorten single-layer latency. At the system level, a hybrid inter-/intra-tile design is presented to balance the overall latency across CNN layers. Through the collaboration of these three methods, we construct a well-balanced pipeline for the nvCIM accelerator at a lower hardware cost. Experiments show that our pipeline design achieves 3.7×, 7.5×, and 3.5× throughput improvements for ImageNet recognition with the ResNet18, VGG19, and ResNet34 models, respectively.
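The full paper text is not included here, so the following is only a minimal sketch of the latency model the abstract implies: in a layer-wise CIM pipeline, each layer's stage time is bounded by the slower of macro computing and buffer access, and steady-state throughput is set by the slowest stage. The `LayerCost` structure, the per-layer cycle counts, and the "balanced" numbers below are hypothetical placeholders, not the authors' data or implementation.

```python
# Sketch of a balanced-pipeline throughput model for an nvCIM CNN accelerator.
# All layer names and cycle counts are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class LayerCost:
    name: str
    macro_cycles: int    # time spent on nvCIM macro computation
    buffer_cycles: int   # time spent on buffer access for activations


def stage_latency(layer: LayerCost) -> int:
    """With decoupled access and execution, macro compute and buffer access
    overlap, so the stage time is bounded by the slower of the two."""
    return max(layer.macro_cycles, layer.buffer_cycles)


def pipeline_throughput(layers: list[LayerCost]) -> float:
    """In a layer-wise pipeline, steady-state throughput is the reciprocal
    of the slowest (bottleneck) stage."""
    return 1.0 / max(stage_latency(layer) for layer in layers)


# Hypothetical per-layer costs before and after balancing, e.g. via weight
# mapping that reduces buffer access and inter-/intra-tile scheduling that
# evens out stage times across layers.
unbalanced = [
    LayerCost("conv1", macro_cycles=40, buffer_cycles=120),
    LayerCost("conv2", macro_cycles=60, buffer_cycles=60),
    LayerCost("conv3", macro_cycles=90, buffer_cycles=30),
]
balanced = [
    LayerCost("conv1", macro_cycles=40, buffer_cycles=55),
    LayerCost("conv2", macro_cycles=60, buffer_cycles=50),
    LayerCost("conv3", macro_cycles=55, buffer_cycles=30),
]

speedup = pipeline_throughput(balanced) / pipeline_throughput(unbalanced)
print(f"throughput improvement from balancing: {speedup:.1f}x")  # 2.0x here
```

Under these made-up numbers the bottleneck stage drops from 120 to 60 cycles, which is the kind of effect the paper's reported 3.5x to 7.5x throughput gains correspond to at the system level.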
Date of Conference: 09-13 July 2023
Date Added to IEEE Xplore: 15 September 2023
Conference Location: San Francisco, CA, USA
