Abstract:
This paper presents a new algorithm for supporting fault-tolerant objects in distributed systems. The fault tolerance provided by the algorithm is fully transparent to the user. The algorithm uses a variation of the object replication scheme, which we call the Hot Replication Scheme, and it supports nested object invocations. The chief advantages of the scheme are: a) no action is needed when a secondary replica fails, b) the time to recover from a primary failure is minimal, and c) the replication protocol is separated from the reliable communication protocol. To recover from a primary failure, the system needs to detect the failure and select one of the secondaries to become the primary. The designated secondary can become the primary once it has made sure that its current state is equivalent to the state of the failed primary; it can do so by processing outstanding requests, if any. This is in contrast with the checkpointing and rollback recovery scheme, where the recovery time can be substantial. Our algorithm exploits the general features and concepts associated with objects and object interactions.
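The recovery path described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Replica` and `ReplicaGroup` classes, the counter-style state, and the rule of promoting the first live secondary are all assumptions made for illustration, and failure detection and the reliable communication layer are simply taken as given.

```python
from collections import deque


class Replica:
    """Hypothetical replica of a simple counter object (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.state = 0
        self.pending = deque()  # requests delivered but not yet applied
        self.alive = True

    def deliver(self, request):
        # Assumption: a separate reliable communication layer delivers
        # every request to every live replica in the same order.
        self.pending.append(request)

    def process_outstanding(self):
        # Apply all delivered-but-unprocessed requests in order.
        while self.pending:
            self.state += self.pending.popleft()


class ReplicaGroup:
    """Hypothetical group with one primary and several secondaries."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self.primary = self.replicas[0]

    def invoke(self, amount):
        # Every live replica receives the request; only the primary
        # processes it immediately and produces the reply.
        for replica in self.replicas:
            if replica.alive:
                replica.deliver(amount)
        self.primary.process_outstanding()
        return self.primary.state

    def fail(self, replica):
        replica.alive = False
        # A failed secondary needs no action; a failed primary
        # triggers recovery.
        if replica is self.primary:
            self.recover()

    def recover(self):
        # Assumed selection rule: promote the first live secondary.
        new_primary = next(r for r in self.replicas if r.alive)
        # The secondary catches up by processing outstanding requests,
        # then takes over as primary.
        new_primary.process_outstanding()
        self.primary = new_primary


group = ReplicaGroup(["r0", "r1", "r2"])
group.invoke(5)
group.invoke(7)
group.fail(group.replicas[2])  # secondary failure: nothing to do
group.fail(group.primary)      # primary failure: a secondary takes over
print(group.primary.name, group.primary.state)
```

In this sketch the promoted secondary only has to drain its queue of delivered requests, which is why recovery involves no rollback to an earlier checkpoint.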
Published in: Conference Proceedings of the 1996 IEEE Fifteenth Annual International Phoenix Conference on Computers and Communications
Date of Conference: 27-29 March 1996
Date Added to IEEE Xplore: 06 August 2002
Print ISBN: 0-7803-3255-5