Fault-tolerant scheduling with dynamic number of replicas in heterogeneous systems

4 Author(s)
Laiping Zhao (Dept. of Inf., Kyushu Univ., Fukuoka, Japan); Yizhi Ren; Yang Xiang; K. Sakurai

In existing studies on fault-tolerant scheduling, the active replication scheme uses ε + 1 replicas for each task to tolerate ε failures. However, in this paper we show that more replicas do not always lead to higher reliability. Moreover, more replicas imply more resource consumption and higher economic cost. To address this problem, with the goal of satisfying the user's reliability requirement with minimum resources, this paper proposes a new fault-tolerant scheduling algorithm: MaxRe. The algorithm incorporates reliability analysis into the active replication scheme and uses a dynamic number of replicas for different tasks. Both theoretical analysis and experiments show that the schedule produced by MaxRe satisfies the user's reliability requirements, and that MaxRe achieves the corresponding reliability with up to 70% fewer resources than the FTSA algorithm.
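
The idea of choosing a different number of replicas per task can be illustrated with a minimal sketch. This is not the MaxRe algorithm itself, whose details the abstract does not give. It assumes that replicas fail independently, that a task succeeds if at least one replica succeeds, and that an overall reliability requirement over n tasks can be split evenly as its n-th root. The function names `per_task_target` and `replicas_needed`, and the greedy "most reliable processor first" rule, are hypothetical choices made for illustration only.

```python
def per_task_target(overall_target: float, num_tasks: int) -> float:
    """Split an overall reliability requirement evenly across tasks.

    Assumption: all tasks must succeed and fail independently, so
    the per-task target is the num_tasks-th root of the overall target.
    """
    return overall_target ** (1.0 / num_tasks)


def replicas_needed(processor_reliabilities, target):
    """Pick replicas for one task until its reliability meets the target.

    Assumption: replicas fail independently, so the task reliability is
    1 - product(1 - r_i) over the chosen processors. Processors are taken
    greedily, most reliable first (an illustrative rule, not MaxRe's).
    """
    chosen = []
    failure_prob = 1.0  # probability that every chosen replica fails
    for r in sorted(processor_reliabilities, reverse=True):
        if 1.0 - failure_prob >= target:
            break
        chosen.append(r)
        failure_prob *= 1.0 - r
    if 1.0 - failure_prob < target:
        raise ValueError("target reliability not reachable with these processors")
    return chosen


if __name__ == "__main__":
    # One highly reliable processor is enough for this task ...
    print(len(replicas_needed([0.9, 0.99, 0.8], 0.95)))
    # ... while less reliable processors need more replicas.
    print(len(replicas_needed([0.9, 0.8, 0.7], 0.95)))
```

Under these assumptions the replica count depends on the reliability of the processors available to each task rather than on a fixed ε + 1, which is the kind of saving in resources the abstract describes.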

Published in:

High Performance Computing and Communications (HPCC), 2010 12th IEEE International Conference on

Date of Conference:

1-3 Sept. 2010