High-Utilization GPGPU Design for Accelerating GEMM Workloads: An Incremental Approach


Abstract:

General Purpose Graphics Processing Units (GPGPUs) have historically been employed primarily in domains such as graphics acceleration and high-performance computing. However, the rise of artificial intelligence (AI), and particularly the matrix multiplications at the core of AI models, has placed formidable demands on the computational power of GPGPUs. Consequently, the design of matrix multiplication units within GPGPUs, and ensuring their high utilization, have become key issues in optimizing AI workloads. This paper explores an incremental design approach, building upon a Single Instruction Multiple Threads (SIMT) GPGPU architecture, to facilitate General Matrix Multiply (GEMM) acceleration. This approach encompasses not only the design of matrix units within the stream processors but, more crucially, the optimization of the data path within the GPGPU to maximize the utilization of the matrix units. We present a practical demonstration of our approach through the fabrication of a GPGPU on a 12 nm CMOS process node, achieving a core clock speed of 1 GHz and INT8 peak performance of 8 TOPS, with memory bandwidth limited to 32 GB/s (LPDDR4-4000). Notably, this design results in only a 6.57% increase in chip area compared to the original GPGPU design. In a series of fair GEMM workload tests, the GPGPU implemented in this work outperforms three recent generations of NVIDIA GPGPUs (V100, T4, and A100) in terms of matrix unit utilization.
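To make the utilization metric concrete, the following is a minimal sketch of how matrix unit utilization for a GEMM could be estimated from the figures quoted in the abstract (8 TOPS INT8 peak, 32 GB/s memory bandwidth). It is not the paper's measurement methodology. The function names, the convention of counting one multiply-accumulate as two operations, the INT32 output width, and the elapsed time in the usage note are all assumptions made for illustration.

```python
# Figures quoted in the abstract.
PEAK_INT8_OPS_PER_S = 8e12         # 8 TOPS INT8 peak performance
MEM_BANDWIDTH_BYTES_PER_S = 32e9   # 32 GB/s (LPDDR4-4000)


def gemm_ops(m, n, k):
    """Operations in an (m x k) by (k x n) GEMM.

    Assumption: each multiply-accumulate counts as 2 operations.
    """
    return 2 * m * n * k


def gemm_min_bytes(m, n, k, in_bytes=1, out_bytes=4):
    """Lower bound on off-chip traffic: read A and B once, write C once.

    Assumption: INT8 inputs (1 byte) and INT32 outputs (4 bytes).
    """
    return (m * k + k * n) * in_bytes + m * n * out_bytes


def ridge_point_ops_per_byte(peak=PEAK_INT8_OPS_PER_S,
                             bandwidth=MEM_BANDWIDTH_BYTES_PER_S):
    """Arithmetic intensity above which peak compute, not memory, is the limit."""
    return peak / bandwidth


def matrix_unit_utilization(m, n, k, elapsed_s, peak=PEAK_INT8_OPS_PER_S):
    """Achieved throughput as a fraction of peak throughput."""
    return gemm_ops(m, n, k) / elapsed_s / peak
```

With these figures the ridge point is 250 operations per byte, so a GEMM can only approach peak throughput if the data path reuses each fetched byte many times on chip. As a usage example, `matrix_unit_utilization(1024, 1024, 1024, 0.5e-3)` evaluates the metric for a hypothetical elapsed time of 0.5 ms; that time is a made-up input, not a result reported in the paper.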
Date of Conference: 19-22 May 2024
Date Added to IEEE Xplore: 02 July 2024
Conference Location: Singapore, Singapore
