Reaching efficient fault-tolerance for cooperative applications


1 Author(s)
Sobe, P. ; Inst. of Comput. Eng., Med. Univ. of Lubeck, Germany

Cooperative applications are widely used, e.g. as parallel calculations or distributed information processing systems. While such applications meet users' demands and offer a performance improvement, they are also more susceptible to a fault in any of the computer nodes they use. Often a single fault may cause a complete application failure. On the other hand, the redundancy in distributed systems can be utilized for fast fault detection and recovery. We therefore followed an approach that is based on duplication of each application process to detect crashes and faulty functions of single computer nodes. We concentrate on two aspects of efficient fault tolerance: fast fault detection and recovery without significantly delaying the application progress. The contribution of this work is, first, a new fault-detecting protocol for duplicated processes. Second, we enhance a roll-forward recovery scheme so that it is applicable to a set of cooperative processes in conformity with the protocol.
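
The general idea of detecting faults by duplicating a process can be illustrated with a small sketch. This is not the protocol from the paper, whose details the abstract does not give. It is a generic duplicate-and-compare check with a spare execution as tie-breaker, run sequentially in one process. All names (`ReplicaCrash`, `run_replica`, `duplicated_step`, and the `primary`, `shadow`, `spare` roles) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Optional, Tuple


class ReplicaCrash(Exception):
    """Hypothetical exception used to simulate a crashed computer node."""


def run_replica(step: Callable[[int], int], state: int) -> Optional[int]:
    """Run one replica of a computation step; return None if it crashed."""
    try:
        return step(state)
    except ReplicaCrash:
        return None


def duplicated_step(
    primary: Callable[[int], int],
    shadow: Callable[[int], int],
    spare: Callable[[int], int],
    state: int,
) -> Tuple[int, str]:
    """Run a step on two replicas and compare their results.

    If both replicas agree, the result is accepted. A crash or a
    mismatch counts as a detected fault; the spare then repeats the
    step and the result shared by two replicas is accepted.
    """
    a = run_replica(primary, state)
    b = run_replica(shadow, state)
    if a is not None and a == b:
        return a, "ok"

    # Fault detected (crash or differing results): ask the spare.
    c = run_replica(spare, state)
    if c is None:
        raise RuntimeError("spare replica failed; cannot recover")
    if a == c:
        return c, "shadow faulty"
    if b == c:
        return c, "primary faulty"
    raise RuntimeError("no two replicas agree")
```

In this simplified form the application waits for the spare before continuing. A roll-forward scheme, as named in the abstract, aims to avoid that delay; how the paper achieves this is not described here, so the sketch covers only the detection and tie-break steps.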

Published in:

Computer Performance and Dependability Symposium, 2000. IPDS 2000. Proceedings. IEEE International

Date of Conference: