Linpack evaluation on a supercomputer with heterogeneous accelerators

4 Author(s)
Endo, T. (Grad. Sch. of Inf. Sci. & Eng., Tokyo Inst. of Technol., Tokyo, Japan); Nukada, A.; Matsuoka, S.; Maruyama, N.

We report Linpack benchmark results on the TSUBAME supercomputer, a large-scale heterogeneous system equipped with NVIDIA Tesla GPUs and ClearSpeed SIMD accelerators. Using all 10,480 Opteron cores, 640 Xeon cores, 648 ClearSpeed accelerators, and 624 NVIDIA Tesla GPUs, we achieved 87.01 TFlops, which ranks third among heterogeneous systems in the world. This paper describes the careful tuning and load-balancing methods required to achieve this performance. However, since the peak speed is 163 TFlops, the efficiency is 53%, which is lower than that of other systems. This paper also analyzes this gap from the perspective of system architecture.
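
The efficiency figure quoted in the abstract is the ratio of achieved Linpack performance to theoretical peak. A minimal sketch of that arithmetic, using only the two numbers stated in the abstract; the function name `linpack_efficiency` is illustrative and does not come from the paper:

```python
def linpack_efficiency(achieved_tflops: float, peak_tflops: float) -> float:
    """Return achieved performance as a fraction of theoretical peak."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return achieved_tflops / peak_tflops


# Figures taken from the abstract: 87.01 TFlops achieved, 163 TFlops peak.
efficiency = linpack_efficiency(87.01, 163.0)
print(f"Efficiency: {efficiency:.0%}")  # rounds to the 53% stated in the abstract
```

The remaining fraction, roughly 47% of peak, is the gap the paper attributes to system architecture.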

Published in:

Parallel & Distributed Processing (IPDPS), 2010 IEEE International Symposium on

Date of Conference:

19-23 April 2010