IEE Proceedings - Software

Issue 6 • Date Dec 1998

  • Experimental investigation of message latencies in the Totem protocol in the presence of faults

    Publication Year: 1998, Page(s):219 - 227
    IEEE is not the copyright holder of this material

    Group communication is a powerful and easy-to-use abstraction for distributed applications. The Totem protocol is a popular and efficient implementation of group communication primitives. To use Totem in soft real-time environments, the distribution of message latencies is an important performance measure, in particular, when fault tolerance is required. An experimental study of these distribution...

  • Analysis and evaluation of distributed checkpoint algorithms to avoid rollback propagation

    Publication Year: 1998, Page(s):212 - 218

    Checkpointing is a very well known mechanism to achieve fault tolerance. In distributed applications where processes can checkpoint independently of each other, a local checkpoint is useful for fault tolerance purposes only if it belongs to at least one consistent global checkpoint. In this case, execution can be restarted from it without needing to roll back the execution in the past. The paper ex...

  • Software tool combining fault masking with user-defined recovery strategies

    Publication Year: 1998, Page(s):203 - 211
    Cited by:  Papers (10)

    The voting farm, a tool which implements a distributed software voting mechanism for a number of parallel message passing systems, is described. The tool, developed in the framework of EFTOS (embedded fault tolerant supercomputing), can be used in standalone mode or in conjunction with other EFTOS fault tolerance tools. In the former case, exploitation of the mechanism is described, e.g. to implem...

  • Using two-level stable storage for efficient checkpointing

    Publication Year: 1998, Page(s):198 - 202
    Cited by:  Papers (8)  |  Patents (2)

    Checkpointing and rollback recovery is a very effective technique to tolerate the occurrence of failures. Usually, checkpoint data is saved on disk; however, in some situations the time to write the data to disk can represent a considerable performance overhead. Alternative solutions would make use of main memory to maintain the checkpoint data. The paper starts by presenting two main memory check...

  • Generating diverse software versions with genetic programming: an experimental study

    Publication Year: 1998, Page(s):228 - 236
    Cited by:  Papers (9)  |  Patents (2)

    Software fault-tolerance schemes often employ multiple software versions developed to meet the same specification. If the versions fail independently of each other, they can be combined to give high levels of reliability. Although design diversity is a means to develop these versions, it has been questioned because it increases development costs and because reliability gains are limited by common-...

  • Reliability-oriented software engineering: design, testing and evaluation techniques

    Publication Year: 1998, Page(s):191 - 197
    Cited by:  Papers (1)

    Software reliability engineering involves techniques for the design, testing and evaluation of software systems, focusing on reliability attributes. Design for reliability is achieved by fault-tolerance techniques that keep the system working in the presence of software faults. Testing for reliability is achieved by fault-removal techniques that detect and correct software faults before the system...
