
Pruning Binarized Neural Networks Enables Low-Latency, Low-Power FPGA-Based Handwritten Digit Classification



Abstract:

As neural networks are increasingly deployed on mobile and distributed computing platforms, there is a need to lower latency and increase computational speed while decreasing power and memory usage. Rather than using FPGAs as accelerators in tandem with CPUs or GPUs, we directly encode individual neural network layers as combinational logic within FPGA hardware. Utilizing binarized neural networks minimizes the arithmetic computation required, shrinking latency to only the signal propagation delay. We evaluate size-optimization strategies and demonstrate network compression via weight quantization and weight-model unification, achieving 96% of the accuracy of baseline MNIST digit classification models while using only 3% of the memory. We further achieve an 86% decrease in model footprint, 8 mW dynamic power consumption, and <9 ns latency, validating the versatility and capability of feature-strength-based pruning approaches for binarized neural networks in flexibly meeting performance requirements amid application resource constraints.
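
As context for the approach the abstract describes, below is a minimal sketch of feature-strength-based pruning applied to an XNOR-popcount binarized layer. It is illustrative only: the function names, the magnitude-quantile pruning rule, and the 784-to-10 layer shape are assumptions, not the paper's implementation.

import numpy as np

def binarize(w):
    # Map real-valued weights to {-1, +1}, as in standard BNN training schemes.
    return np.where(w >= 0, 1, -1).astype(np.int8)

def prune_by_strength(w_real, keep_fraction=0.25):
    # Keep only the strongest weights by magnitude (one plausible reading of
    # "feature strength"); pruned weights become 0 and need no FPGA logic.
    threshold = np.quantile(np.abs(w_real), 1.0 - keep_fraction)
    mask = np.abs(w_real) >= threshold
    return binarize(w_real) * mask

def bnn_dense(x_bin, w_bin):
    # With activations and weights in {-1, +1} (0 = pruned), each product is
    # an XNOR and the accumulation is a popcount, which synthesizes to pure
    # combinational logic whose latency is just signal propagation delay.
    pre = x_bin[None, :] * w_bin
    return np.where(pre.sum(axis=1) >= 0, 1, -1).astype(np.int8)

# Example: prune a hypothetical random 784->10 layer to 25% density.
rng = np.random.default_rng(0)
w_bin = prune_by_strength(rng.normal(size=(10, 784)), keep_fraction=0.25)
x_bin = binarize(rng.normal(size=784))
print(bnn_dense(x_bin, w_bin))

Because the pruned weights contribute no terms to the popcount, this kind of sparsification directly shrinks the synthesized logic, which is consistent with the footprint, power, and latency reductions the abstract reports.
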
Date of Conference: 25-29 September 2023
Date Added to IEEE Xplore: 25 December 2023
Conference Location: Boston, MA, USA
