C3-Flow: Compute Compression Co-Design Flow for Deep Neural Networks | IEEE Conference Publication | IEEE Xplore

Abstract:

Existing approaches to neural network compression have failed to holistically address the algorithmic (training accuracy) and computational (inference performance) demands of real-world systems, particularly on resource-constrained devices. We present C3-Flow, a new approach that adds non-uniformity to low-rank approximations and is designed specifically to enable highly efficient computation on common hardware architectures while retaining more accuracy than competing methods. Evaluation on two state-of-the-art acoustic models (versus existing work, empirical limit study approaches, and hand-tuned models) demonstrates up to 60% lower error. Finally, we show that our co-design approach achieves up to 14x inference speedup across three Haswell- and Broadwell-based platforms.
Date of Conference: 02-06 June 2019
Date Added to IEEE Xplore: 22 August 2019
ISBN Information:
Print on Demand(PoD) ISSN: 0738-100X
Conference Location: Las Vegas, NV, USA
