The design and implementation of an experimental fault-tolerant distributed database management system are described. The system provides a logically integrated view of data with distribution transparency and controlled data replication. A commitment protocol used to guarantee atomicity of update operations is discussed. Efficient algorithms used to recover a site from a failure and restore data consistency are described. Recovery can be interleaved with the processing of regular database transactions and does not seriously limit the availability of data. The proposed solutions to the problems of fault recovery are designed to take advantage of the properties of a high-bandwidth local area network.
Published in:
Reliable Distributed Systems, 1988. Proceedings., Seventh Symposium on
Date of Conference: 10-12 Oct 1988