dCSR: A Memory-Efficient Sparse Matrix Representation for Parallel Neural Network Inference

Abstract:

Reducing the memory footprint of neural networks is a crucial prerequisite for deploying them in small and low-cost embedded devices. Network parameters can often be reduced significantly through pruning. We discuss how to best represent the indexing overhead of sparse networks for the coming generation of Single Instruction, Multiple Data (SIMD)-capable microcontrollers. From this, we develop Delta-Compressed Storage Row (dCSR), a storage format for sparse matrices that allows for both low overhead storage and fast inference on embedded systems with wide SIMD units. We demonstrate our method on an ARM Cortex-M55 MCU prototype with M-Profile Vector Extension (MVE). A comparison of memory consumption and throughput shows that our method achieves competitive compression ratios and increases throughput over dense methods by up to $2.9\times$ for sparse matrix-vector multiplication (SpMV)-based kernels and $1.06\times$ for sparse matrix-matrix multiplication (SpMM). This is accomplished through handling the generation of index information directly in the SIMD unit, leading to an increase in effective memory bandwidth.
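The general idea of delta-compressing row indices can be illustrated with a minimal scalar sketch. This is not the dCSR layout from the paper: the abstract does not specify bit widths, how large deltas are handled, or how indices are generated in the SIMD unit. The function names `to_delta_csr` and `spmv_delta_csr` are hypothetical, and the sketch assumes a plain CSR structure in which each column index is replaced by its distance from the previous nonzero in the same row.

```python
from typing import List, Sequence, Tuple


def to_delta_csr(
    dense: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense row-major matrix to CSR with delta-encoded columns.

    Assumption for illustration: each delta is the column distance from
    the previous nonzero in the same row (the first delta is the column
    index itself). Small deltas need fewer bits than absolute indices.
    """
    values: List[float] = []
    deltas: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        prev_col = 0
        for col, v in enumerate(row):
            if v != 0:
                values.append(v)
                deltas.append(col - prev_col)
                prev_col = col
        row_ptr.append(len(values))
    return values, deltas, row_ptr


def spmv_delta_csr(
    values: Sequence[float],
    deltas: Sequence[int],
    row_ptr: Sequence[int],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x, rebuilding column indices by a running sum."""
    y: List[float] = []
    for r in range(len(row_ptr) - 1):
        col = 0
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            col += deltas[k]  # absolute column = prefix sum of deltas
            acc += values[k] * x[col]
        y.append(acc)
    return y


dense = [
    [0, 2, 0, 0, 3],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 4, 0],
]
values, deltas, row_ptr = to_delta_csr(dense)
y = spmv_delta_csr(values, deltas, row_ptr, [1, 2, 3, 4, 5])
```

In this sketch the running sum over `deltas` plays the role of index generation; on a SIMD target the same reconstruction would be done on several lanes at once, which is the step the abstract credits for the higher effective memory bandwidth.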

Published in: 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD)

Date of Conference: 01-04 November 2021

Date Added to IEEE Xplore: 23 December 2021

DOI: 10.1109/ICCAD51958.2021.9643506

Conference Location: Munich, Germany
