Lessons from FTM: an experiment in design and implementation of a low-cost fault tolerant system

Authors: G. Muller (IRISA/INRIA, Rennes, France); M. Banatre; N. Peyrouze; B. Rochat

This paper describes an experiment in the design of a general-purpose fault-tolerant system, FTM. The main objective of the FTM design was to implement a low-cost fault-tolerant system that could be used on standard workstations. At the operating system level, the authors' goal was to make fault tolerance transparent to user applications: porting an application to FTM should require only recompiling its source code, not modifying it. These objectives were achieved using the Mach micro-kernel and a modular set of reliable servers that implement application checkpoints and provide continuous system functions despite machine crashes. At the architectural level, the approach relies on a high-performance stable storage implementation, called stable transactional memory (STM), which can be implemented in either hardware or software. The authors first motivate their design choices, then detail the FTM implementation at both the architectural and operating system levels. They discuss the reasons for the evolution of their stable memory technology from hardware to software, evaluate the performance of the FTM prototype, and conclude with lessons learned and some assessments.

Published in: IEEE Transactions on Reliability (Volume: 45, Issue: 2)