Tuning system-dependent applications with alternative MPI calls: a case study

1 Author(s)
Le, T.T. ; Dept. of Electr. Eng., San Jose State Univ., CA, USA

This paper shows the effectiveness of using optimized MPI calls for MPI-based applications on different architectures. Using optimized MPI calls can yield a reasonable performance gain for most MPI-based applications running on most high-performance distributed systems. Because the relative performance of different MPI function calls can be uncorrelated with the system architecture, tuning system-dependent MPI applications by exploring alternative MPI calls is the simplest yet most effective optimization method. The paper first shows that, on a particular system, there are noticeable performance differences between various MPI calls that produce the same communication pattern. These performance differences are not consistent across systems. The paper then shows that good performance for an MPI application on different systems can be obtained by using different MPI calls on each system. The communication patterns examined in this paper include point-to-point and collective communications. The MPI-based application used for this study is a general-purpose transient dynamic finite element application, and the benchmark problems are public-domain 3D car crash problems. The experimental results show that, for the same communication purpose, using alternative MPI calls can result in quite different communication performance on the Fujitsu HPC2500 system and the 8-node AMD Athlon cluster, but nearly identical performance on other systems such as the Intel Itanium2 and AMD Opteron clusters.

Published in:

Software Engineering Research, Management and Applications, 2005. Third ACIS International Conference on

Date of Conference:

11-13 Aug. 2005