Recent results in checkpointing and failure recovery in distributed systems and wireless networks

1 Author(s)
Mukesh Singhal; Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA

Summary form only given. Distributed systems today are ubiquitous and enable many applications, including client-server systems, transaction processing, the World Wide Web, and scientific computing. Distributed systems are not inherently fault-tolerant, and their vast computing potential is often hampered by their susceptibility to failures. Many techniques, such as transactions, group communication, and rollback recovery, have been developed to add reliability and high availability to distributed systems. This talk deals with rollback recovery protocols, which restore the system to a consistent state after a failure. Fault tolerance is achieved by periodically saving the state of a process during failure-free execution and, upon a failure, restarting from a saved state to reduce the amount of lost work. The speaker will present his recent results in checkpointing and failure recovery in distributed systems and wireless networks. Specifically, he will present a classification of checkpointing algorithms, a communication-induced checkpointing algorithm that prevents useless checkpoints by tracking and preventing potential Z-cycles, and the concept of mutable checkpoints for efficient checkpointing in wireless networks. He will conclude the talk with some open problems.
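The basic rollback-recovery idea in the summary (save process state periodically, restart from the last saved state after a failure) can be illustrated with a minimal single-process sketch. This is not the speaker's algorithm and does not cover coordination between processes, Z-cycles, or mutable checkpoints. The names `CheckpointedCounter`, `run_with_failure`, `checkpoint_interval`, and `fail_at_step` are hypothetical, and the "failure" is simulated in memory rather than by a real crash.

```python
import copy


class CheckpointedCounter:
    """Hypothetical process whose state is a step count and a running total."""

    def __init__(self, checkpoint_interval):
        self.state = {"step": 0, "total": 0}
        self.checkpoint_interval = checkpoint_interval
        # The initial state acts as the first checkpoint.
        self._saved = copy.deepcopy(self.state)

    def step(self, value):
        """Do one unit of work, then checkpoint every `checkpoint_interval` steps."""
        self.state["total"] += value
        self.state["step"] += 1
        if self.state["step"] % self.checkpoint_interval == 0:
            self._saved = copy.deepcopy(self.state)

    def recover(self):
        """Discard the current state and restart from the last checkpoint."""
        self.state = copy.deepcopy(self._saved)
        return self.state["step"]


def run_with_failure(values, checkpoint_interval, fail_at_step):
    """Sum `values`, simulating one failure after `fail_at_step` completed steps.

    Returns the final total and the number of steps that had to be redone.
    """
    proc = CheckpointedCounter(checkpoint_interval)
    i = 0
    failed = False
    redone = 0
    while i < len(values):
        if not failed and i == fail_at_step:
            failed = True
            resume = proc.recover()  # roll back to the last checkpoint
            redone = i - resume      # work lost since that checkpoint
            i = resume
            continue
        proc.step(values[i])
        i += 1
    return proc.state["total"], redone


if __name__ == "__main__":
    total, redone = run_with_failure(list(range(1, 11)), 3, 7)
    print(total, redone)
```

In this sketch a shorter checkpoint interval reduces the work redone after a failure at the cost of saving state more often, which is the trade-off the summary alludes to with "reduce the amount of lost work".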

Published in:

Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on

Date of Conference:

19-23 April 2010