
Systematic incorporation of efficient fault tolerance in systems of cooperating parallel programs

Authors:

I-Ling Yen (Dept. of Comput. Sci., Michigan State Univ., East Lansing, MI, USA); F.B. Bastani

Cooperating parallel programs are being increasingly used in critical applications that require both high performance and high reliability. A promising technique for simultaneously achieving these objectives is to embed the fault tolerance within the program instead of superimposing it via external mechanisms. We develop one such approach for a group of processes that cooperate via shared data structures. The scheme uses data structures having two or more invariant assertions. When the strong invariant is true, the performance is good. When it is false, the performance may be adversely affected, but it is guaranteed that the system will operate correctly provided the weak invariant is true. The algorithms are designed to ensure that processor failures will never cause the weak invariant to be false and to restore the strong invariant within a finite number of recovery actions. We develop a robust task handling mechanism to support the approach and illustrate it for three common data structures.
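
The weak/strong invariant idea in the abstract can be illustrated with a small sketch. The example below is not taken from the paper: the class `LaggingTailList`, its methods, and the `fail_before_tail_update` flag are hypothetical names chosen for illustration, and the code is single-threaded, so it only simulates a processor failure rather than modeling real concurrency. It assumes a linked list with a tail pointer, where the weak invariant is "the tail pointer refers to some node reachable from the head" and the strong invariant is "the tail pointer refers to the last node".

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class LaggingTailList:
    """Hypothetical linked list with a sentinel head and a tail pointer."""

    def __init__(self):
        self.head = Node(None)  # sentinel node, holds no value
        self.tail = self.head

    def weak_invariant(self):
        # Weak invariant (assumed): tail is reachable from head.
        node = self.head
        while node is not None:
            if node is self.tail:
                return True
            node = node.next
        return False

    def strong_invariant(self):
        # Strong invariant (assumed): tail is exactly the last node.
        return self.weak_invariant() and self.tail.next is None

    def append(self, value, fail_before_tail_update=False):
        node = Node(value)
        last = self.tail
        # Fast when the strong invariant holds (loop body never runs);
        # slower, but still correct, when only the weak invariant holds.
        while last.next is not None:
            last = last.next
        last.next = node  # step 1: link the new node
        if fail_before_tail_update:
            return  # simulated processor failure between the two steps
        self.tail = node  # step 2: advance the tail pointer

    def recover(self):
        # Restore the strong invariant in a finite number of steps.
        steps = 0
        while self.tail.next is not None:
            self.tail = self.tail.next
            steps += 1
        return steps

    def values(self):
        out = []
        node = self.head.next
        while node is not None:
            out.append(node.value)
            node = node.next
        return out
```

In this sketch, a simulated failure after step 1 leaves the tail pointer lagging by one node: the strong invariant is false, the weak invariant still holds, later appends still produce the correct list, and `recover()` restores the strong invariant after walking the lagging distance.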

Published in:

Fault-Tolerant Computing, 1994. FTCS-24. Digest of Papers., Twenty-Fourth International Symposium on

Date of Conference:

15-17 June 1994