Scalable Hybrid Designs for Linear Algebra on Reconfigurable Computing Systems

Authors: Ling Zhuo (Ming Hsieh Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA); V.K. Prasanna

Recently, high-end reconfigurable computing systems have been built that employ Field Programmable Gate Arrays (FPGAs) as hardware accelerators for general-purpose processors. These systems not only provide new opportunities for high-performance computing, but also pose new challenges to application developers. In this paper, we build a design model for hybrid designs that utilize both the processors and the FPGAs. The model characterizes a reconfigurable computing system using various parameters. Based on the model, we propose a design methodology for hardware/software co-design. The methodology partitions workload between the processors and the FPGAs, maintains load balance in the system, and realizes scalability over multiple nodes. Designs are proposed for several computationally intensive applications: matrix multiplication, matrix factorization and the Floyd-Warshall algorithm for the all-pairs shortest-paths problem. To illustrate our ideas, the proposed hybrid designs are implemented on a Cray XD1. Experimental results show that our designs utilize both the processors and the FPGAs efficiently, and overlap most of the data transfer overheads and network communication costs with the computations. Our designs achieve up to 90% of the total performance of the nodes, and 90% of the performance predicted by the design model. In addition, our designs scale over a large number of nodes.
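As a rough illustration of the load-balancing idea mentioned in the abstract, the sketch below splits the rows of a workload between a processor and an FPGA in proportion to their sustained throughput, so that both finish at about the same time. This is not the authors' design model: the function names, the single-rate throughput parameters, and the row-wise split are assumptions made for illustration, and data transfer and network costs are ignored.

```python
def partition_rows(n_rows, cpu_rate, fpga_rate):
    """Split n_rows between CPU and FPGA in proportion to throughput.

    cpu_rate and fpga_rate are hypothetical sustained rates (for example
    GFLOPS); only their ratio matters here.
    """
    if n_rows < 0 or cpu_rate <= 0 or fpga_rate <= 0:
        raise ValueError("n_rows must be >= 0 and rates must be positive")
    # The faster unit receives a proportionally larger share of the rows.
    fpga_rows = round(n_rows * fpga_rate / (cpu_rate + fpga_rate))
    cpu_rows = n_rows - fpga_rows
    return cpu_rows, fpga_rows


def finish_times(cpu_rows, fpga_rows, work_per_row, cpu_rate, fpga_rate):
    """Estimated compute time of each unit, ignoring data transfer costs."""
    return (cpu_rows * work_per_row / cpu_rate,
            fpga_rows * work_per_row / fpga_rate)


if __name__ == "__main__":
    cpu_rows, fpga_rows = partition_rows(1000, cpu_rate=1.0, fpga_rate=3.0)
    print(cpu_rows, fpga_rows)
    print(finish_times(cpu_rows, fpga_rows, 2.0, 1.0, 3.0))
```

With a 1:3 throughput ratio, the FPGA receives three quarters of the rows and the two estimated finish times coincide. A fuller model would also have to account for the data transfer and communication overheads that the abstract says are overlapped with computation.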

Published in: IEEE Transactions on Computers (Volume 57, Issue 12)