The Intel Programmable and Integrated Unified Memory Architecture Graph Analytics Processor


Abstract:

High-performance, large-scale graph analytics is essential for the timely analysis of relationships in big datasets. Conventional processor architectures suffer from inefficient resource usage and poor scaling on these workloads. To enable efficient and scalable graph analysis, Intel developed the Programmable Integrated Unified Memory Architecture (PIUMA) as part of the DARPA Hierarchical Identify Verify Exploit (HIVE) program. PIUMA consists of many multithreaded cores, fine-grained memory and network accesses, a globally shared address space, powerful offload engines, and a tightly integrated optical interconnection network. This article presents the PIUMA architecture and documents our experience in designing and building a prototype chip and its bring-up process. PIUMA silicon has successfully powered on, demonstrating key aspects of the architecture, some of which will be incorporated into future Intel products.
Published in: IEEE Micro (Volume: 43, Issue: 5, Sept.-Oct. 2023)
Page(s): 78-87
Date of Publication: 20 July 2023


Current practices in data analytics and artificial intelligence (AI) perform tasks such as object classification on unending streams of data. Computing infrastructure for classification is predominantly oriented toward “dense” compute, such as matrix computations. However, the next step in both AI and data analytics is reasoning about the relationships between these classified objects, typically represented as a graph. Determining the relationships between entities in a graph is the basis of graph analytics. Graph analytics poses significant challenges for existing processor architectures because graphs are sparse. This sparsity leads to scattered, irregular memory accesses and communication, which defeat the optimizations that decades of work have built into traditional dense compute solutions. Consider the common case of pushing data along the graph edges, as in the example graph in Figure 1. Every vertex initially stores a value locally and then adds that value to each neighbor along its outgoing edges. This basic computation is ubiquitous in graph algorithms such as PageRank. The resulting access stream [Figure 1(b)] is irregular and has no locality, rendering conventional prefetching and caching ineffective.
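The push computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph is stored in compressed sparse row (CSR) form, the six-vertex edge list is a made-up example (it is not the graph of Figure 1), and the names `to_csr` and `push_along_edges` are hypothetical helpers. The sketch also records the sequence of destination indices that get written, which shows why the access stream has no locality: the write targets jump around the value array according to the edge structure rather than advancing sequentially.

```python
from typing import List, Tuple


def to_csr(num_vertices: int, edges: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Build CSR arrays (row offsets, neighbor list) from (src, dst) pairs."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1              # count out-degree of each source
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]       # prefix sum -> start of each row
    neighbors = [0] * len(edges)
    cursor = offsets[:-1]                  # slicing copies: next free slot per row
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors


def push_along_edges(
    offsets: List[int], neighbors: List[int], values: List[float]
) -> Tuple[List[float], List[int]]:
    """Each vertex adds its initial value to every out-neighbor.

    Returns the updated values and the trace of destination indices written,
    i.e. the (irregular) access stream into the value array.
    """
    result = list(values)                  # every vertex keeps its own value
    trace: List[int] = []
    for v in range(len(values)):
        for i in range(offsets[v], offsets[v + 1]):
            dst = neighbors[i]
            result[dst] += values[v]       # scattered read-modify-write
            trace.append(dst)
    return result, trace


# Hypothetical 6-vertex example graph (an assumption, not Figure 1 of the article).
EXAMPLE_EDGES = [(0, 3), (0, 5), (1, 2), (2, 0), (2, 4), (3, 1), (4, 5), (5, 0)]

if __name__ == "__main__":
    offsets, neighbors = to_csr(6, EXAMPLE_EDGES)
    updated, trace = push_along_edges(offsets, neighbors, [1, 2, 3, 4, 5, 6])
    print("updated values:", updated)      # [10, 6, 5, 5, 8, 12]
    print("write targets: ", trace)        # [3, 5, 2, 0, 4, 1, 5, 0]
```

Even in this tiny example the write targets (3, 5, 2, 0, 4, 1, 5, 0) follow no stride a hardware prefetcher could predict. On a graph with billions of edges, each such update typically touches a different cache line, which is the behavior the paragraph above attributes to sparse workloads.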
