FPGA-Based High-Throughput CNN Hardware Accelerator With High Computing Resource Utilization Ratio


Abstract:

The field-programmable gate array (FPGA)-based CNN hardware accelerator adopting a single-computing-engine (CE) architecture or a multi-CE architecture has attracted great attention in recent years. The actual throughput of such accelerators keeps increasing but still falls far below the theoretical throughput because of inefficient computing-resource mapping, data-supply problems, and other issues. To solve these problems, a novel composite hardware CNN accelerator architecture is proposed in this article. To perform the convolution layer (CL) efficiently, a novel multi-CE architecture based on a row-level pipelined streaming strategy is proposed. For each CE, an optimized mapping mechanism is proposed to improve its computing-resource utilization ratio, and an efficient data system with continuous data supply is designed to avoid idle states of the CE. In addition, to relieve off-chip bandwidth pressure, a weight data allocation strategy is proposed. To perform the fully connected layer (FCL), a single-CE architecture based on a batch-based computing method is proposed. Based on these design methods and strategies, visual geometry group network-16 (VGG-16) and ResNet-101 are both implemented on the XC7VX980T FPGA platform. The VGG-16 accelerator consumes 3395 multipliers and achieves a throughput of 1 TOPS at 150 MHz, about 98.15% of the theoretical throughput (2 × 3395 × 150 MOPS). Similarly, the ResNet-101 accelerator achieves 600 GOPS at 100 MHz, about 96.12% of the theoretical throughput (2 × 3121 × 100 MOPS).
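
As a rough sanity check on the quoted utilization figures, the short Python sketch below reproduces the reported ratios. It assumes each multiplier performs one multiply-accumulate (2 operations) per clock cycle, as implied by the 2 × multipliers × frequency formula in the abstract; the helper name utilization is illustrative, not taken from the paper.

# Theoretical peak throughput = 2 ops per multiplier per cycle * clock frequency.
def utilization(measured_gops, multipliers, freq_mhz):
    peak_gops = 2 * multipliers * freq_mhz / 1e3  # 2*N*f MOPS converted to GOPS
    return measured_gops / peak_gops

print(utilization(1000, 3395, 150))  # VGG-16:     ~0.982 (paper reports 98.15%)
print(utilization(600, 3121, 100))   # ResNet-101: ~0.961 (paper reports 96.12%)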
Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 33, Issue: 8, August 2022)
Page(s): 4069 - 4083
Date of Publication: 15 February 2021

PubMed ID: 33587711
