Fault-tolerant task management and load re-distribution on massively parallel hypercube systems

2 Author(s)
Ahmad, I. ; Sch. of Comput. & Inf. Sci., Syracuse Univ., NY, USA ; Ghafoor, A.

The authors present a scheme for managing real-time task allocation and load redistribution with fault tolerance on hypercube systems. A set of processors, called fault-control processors (FCPs), keeps duplicate copies of tasks and reallocates those tasks if their original processors fail. Two-level task redundancy is achieved by grouping the FCPs as primary and secondary for each processor. The scheme provides a high degree of fault tolerance because each FCP is itself monitored by other FCPs. Assuming a failure-repair system environment, the performance of the proposed strategy is evaluated through simulation experiments on 256-node and 512-node hypercubes and compared with that of a fault-free environment. The authors also introduce a measure of goodness, success probability, which represents the probability that reallocated tasks meet their deadlines despite processor failures. It is shown that, under the proposed scheme, a large percentage of the rescheduled tasks can still meet their deadlines. The probability of a task being lost altogether because of multiple failures is shown to be extremely low.
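
The two-level redundancy and the success-probability measure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how primary and secondary FCPs are chosen, so the `Task` fields, the `reallocate` fallback order, and the `success_fraction` estimator are hypothetical names and assumptions introduced only for illustration. The one standard fact used is that hypercube neighbors differ in exactly one address bit.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set


def hypercube_neighbors(node: int, dim: int) -> List[int]:
    """Return the neighbors of `node` in a `dim`-dimensional hypercube.

    Two nodes are adjacent when their binary addresses differ in one bit.
    """
    return [node ^ (1 << k) for k in range(dim)]


@dataclass
class Task:
    task_id: int
    home: int           # processor originally running the task
    primary_fcp: int    # assumed holder of the first duplicate copy
    secondary_fcp: int  # assumed holder of the second duplicate copy


def reallocate(task: Task, failed: Set[int]) -> Optional[int]:
    """Pick where the task runs: home, then primary FCP, then secondary FCP.

    Returns None when all three have failed, i.e. the task is lost.
    The fallback order is an assumption, not taken from the paper.
    """
    for candidate in (task.home, task.primary_fcp, task.secondary_fcp):
        if candidate not in failed:
            return candidate
    return None


def success_fraction(finish_times: Sequence[float],
                     deadlines: Sequence[float]) -> float:
    """Empirical estimate of success probability: the fraction of
    reallocated tasks whose finish time does not exceed the deadline."""
    if not finish_times:
        return 0.0
    met = sum(1 for f, d in zip(finish_times, deadlines) if f <= d)
    return met / len(finish_times)
```

In a simulation along these lines, `reallocate` would be called for every task whose home processor appears in the failed set, and `success_fraction` would be computed over the reallocated tasks only; a return value of `None` corresponds to the rare case of a task lost to multiple failures.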

Published in:

Supercomputing '92, Proceedings

Date of Conference:

16-20 Nov 1992