A Data-Centric Software-Hardware Co-Designed Architecture for Large-Scale Graph Processing


Abstract:

Graph processing plays an important role in many practical applications. However, its inherent characteristics, including random memory access and a low computation-to-communication ratio, make it difficult to execute efficiently on traditional computing architectures such as CPUs and GPUs. Near-memory computing offers low latency and high bandwidth, and is widely regarded as a promising direction for designing graph processing accelerators. However, the storage capacity of a single device cannot meet the demands of large-scale graph processing, and using multiple devices introduces substantial inter-device data transmission, which may counteract the benefits of near-memory computing. To fundamentally reduce the data transmission overhead, we propose a data-centric graph processing framework for systems with multiple near-memory computing devices. The framework uses a data-centric programming model as the software-hardware interface. For software, we propose an optimized data flow and a heuristic multi-step weighted maximum matching algorithm to achieve efficient inter-device communication and ensure load balancing. For hardware, we design a data-reuse-driven task controller and a data-type-aware on-chip memory, which effectively improve the utilization of the on-chip memory. Compared with the two most recent near-memory graph accelerators, our framework significantly reduces energy consumption and inter-device communication.
Published in: IEEE Transactions on Computers (Volume: 74, Issue: 4, April 2025)
Page(s): 1109 - 1122
Date of Publication: 09 December 2024


I. Introduction

Graph processing mines hidden information in graphs by traversing the connections between objects. It is used in numerous practical applications, including genomics [1], social networks [2], and machine learning [3]. The irregular structure of graphs poses great challenges to the performance of graph processing. These challenges are mainly reflected in two aspects: (1) the irregular connections in graphs lead to irregular memory accesses, and (2) graph processing typically exhibits a low computation-to-communication ratio. That is, during graph processing, the processor may access any vertex yet performs only a small amount of computation on it. These inherent characteristics make graph processing inefficient on traditional computing architectures such as central processing units (CPUs) and graphics processing units (GPUs). Therefore, dedicated graph processing accelerators have been proposed to address this problem (e.g., [4], [5], [6], [7], [8], [9]).
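The two challenges above can be illustrated with a small sketch. It is not taken from the paper: the compressed sparse row (CSR) layout, the function name `count_work`, and the toy graph are all assumptions chosen for illustration. The sketch performs one pass that sums neighbor values for each vertex, and counts how many vertex reads occur, how many arithmetic operations they feed, and how many consecutive reads jump to a non-adjacent vertex ID.

```python
def count_work(offsets, neighbors, values):
    """One pass over a CSR graph: new[v] = sum of values[u] over neighbors u of v.

    Returns the new values plus three counters (all hypothetical metrics
    used only to illustrate the access pattern).
    """
    num_vertices = len(offsets) - 1
    new_values = [0.0] * num_vertices
    vertex_reads = 0     # reads of values[u], indexed by a neighbor ID
    additions = 0        # arithmetic operations performed
    non_sequential = 0   # consecutive reads whose vertex IDs are not adjacent
    last = None
    for v in range(num_vertices):
        for e in range(offsets[v], offsets[v + 1]):
            u = neighbors[e]
            new_values[v] += values[u]   # one read, one addition
            vertex_reads += 1
            additions += 1
            if last is not None and abs(u - last) > 1:
                non_sequential += 1
            last = u
    return new_values, vertex_reads, additions, non_sequential


# Toy graph with 6 vertices and 9 edges (assumed data, for illustration only).
offsets = [0, 2, 4, 5, 7, 8, 9]
neighbors = [3, 5, 0, 4, 1, 0, 5, 2, 1]
values = [1.0] * 6

new_values, reads, adds, jumps = count_work(offsets, neighbors, values)
print(f"reads={reads} additions={adds} non-sequential reads={jumps}")
```

Each vertex read feeds exactly one addition, and most reads land on an ID far from the previous one. On a real memory hierarchy, such scattered reads tend to defeat caching and prefetching, which is the behavior the paragraph above describes.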
