Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs


Abstract:

We present dataflow mirroring, architectural support for low-overhead, fine-grained systolic-array allocation that overcomes the limitations of prior coarse-grained spatial-multitasking Neural Processing Unit (NPU) architectures. The key idea of dataflow mirroring is to reverse the dataflows of co-located Neural Networks (NNs) in the horizontal and/or vertical directions, which allows allocation boundaries to be set between any adjacent rows and columns of a systolic array and supports up to four-way spatial multitasking. Our detailed experiments, using MLPerf NNs and an NPU prototype that extends Google’s TPU with dataflow mirroring, show that dataflow mirroring improves multitasking performance by up to 46.4%.
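
The allocation scheme described in the abstract can be pictured with a small toy model. The sketch below is not the paper's hardware design: the `Partition` type, the `partition_array` function, the direction labels, and the choice of which side is mirrored are all hypothetical names and assumptions introduced here for illustration. It shows only how a single row boundary and a single column boundary, placed between any adjacent rows and columns, yield up to four partitions whose dataflows run in opposite horizontal and/or vertical directions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    """A rectangular slice of the array plus its (hypothetical) flow directions."""
    rows: range        # processing-element rows owned by this tenant
    cols: range        # processing-element columns owned by this tenant
    horizontal: str    # "left_to_right" or "right_to_left" (mirrored)
    vertical: str      # "top_to_bottom" or "bottom_to_top" (mirrored)


def partition_array(n_rows: int, n_cols: int,
                    row_cut: int, col_cut: int) -> List[Partition]:
    """Split an n_rows x n_cols array at (row_cut, col_cut).

    A cut of 0 or n leaves that dimension unsplit; empty partitions are
    dropped, so the result holds between one and four partitions.
    Assumption: bands below/right of the cut use the mirrored direction.
    """
    if not (0 <= row_cut <= n_rows and 0 <= col_cut <= n_cols):
        raise ValueError("cuts must lie on a row/column boundary")

    row_bands = [(range(0, row_cut), "top_to_bottom"),
                 (range(row_cut, n_rows), "bottom_to_top")]
    col_bands = [(range(0, col_cut), "left_to_right"),
                 (range(col_cut, n_cols), "right_to_left")]

    parts = []
    for rows, vertical in row_bands:
        for cols, horizontal in col_bands:
            if len(rows) > 0 and len(cols) > 0:
                parts.append(Partition(rows, cols, horizontal, vertical))
    return parts
```

For example, `partition_array(128, 128, 40, 100)` returns four partitions that together cover every processing element exactly once, while `partition_array(128, 128, 128, 100)` returns two side-by-side partitions with opposite horizontal directions. The 128x128 array size is an illustrative choice, not a figure taken from the abstract.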
Date of Conference: 05-09 December 2021
Date Added to IEEE Xplore: 08 November 2021
ISBN Information:
Print on Demand (PoD) ISSN: 0738-100X
Conference Location: San Francisco, CA, USA
