A Novel Process Migration Method for MPI Applications

3 Author(s)
Tiantian Liu ; Dept. of Dependability, Comput., Wuhan Digital Eng. Inst., Wuhan, China ; Zhong Ma ; Zhonghong Ou

Although much research has addressed fault tolerance for MPI applications, process migration has not gained widespread use because of one complex requirement: the location of a migrated process must be made known to every other process in the MPI application. In this paper, we present a novel and effective process migration method for MPI applications. We implement a prototype called LAM/Migration, based on LAM/MPI + BLCR, that provides transparent process migration for MPI applications; the migration mechanism is built into LAM/MPI. With our method, all processes in an MPI application, including mpirun and the MPI processes, can be migrated to any different set of spare nodes in the cluster, as specified by the user, in case of node failure. Performance evaluation results show that the checkpoint overhead is similar to that of plain LAM/MPI + BLCR, and that the migration method is feasible and promising for overcoming node failures in large-scale parallel computing. By using LAM/Migration, high availability and reliability of parallel computation can be achieved.

Published in:

Dependable Computing, 2009. PRDC '09. 15th IEEE Pacific Rim International Symposium on

Date of Conference:

16-18 Nov. 2009