Reliable Distributed Systems, 1995. Proceedings., 14th Symposium on

Date 13-15 Sept. 1995

  • Proceedings. 14th Symposium on Reliable Distributed Systems

    Publication Year: 1995
    Freely Available from IEEE
  • Author index

    Publication Year: 1995
    Freely Available from IEEE
  • An integer programming approach for assigning votes in a distributed system

    Publication Year: 1995, Page(s):128 - 134
    Cited by:  Papers (2)

    Voting is a general approach to maintain consistency of replicated data under node failures and network partitions. In voting, each node is assigned a particular number of votes, and any group with a majority of the votes can perform operations. Votes assigned to the nodes have a significant impact on the performance of a voting system. In this report, we propose an integer programming approach for dete...
  • Configurable highly available distributed services

    Publication Year: 1995, Page(s):118 - 127
    Cited by:  Papers (6)

    The paper addresses the problem of providing highly available services in distributed systems. In particular, we examine the situation where a service may be used by a large continuously changing set of clients. The requirements for providing services in this environment are analysed and an architecture and partial implementation for a replicated server group meeting a range of client requirements...
  • A hierarchy of totally ordered multicasts

    Publication Year: 1995, Page(s):106 - 115
    Cited by:  Papers (5)

    The increased interest in protocols that provide a total order on message delivery has led to several different definitions of total order. In this paper we investigate these different definitions and propose a hierarchy that helps to better understand the implications of the different possibilities in terms of guarantees and communication cost. We identify two definitions: weak total order and st...
  • The performance of consistent checkpointing in distributed shared memory systems

    Publication Year: 1995, Page(s):96 - 105
    Cited by:  Papers (20)

    This paper presents the design and implementation of a consistent checkpointing scheme for distributed shared memory (DSM) systems. Our approach relies on the integration of checkpoints within synchronization barriers already existing in applications; this avoids the need to introduce an additional synchronization mechanism. The main advantage of our checkpointing mechanism is that performance deg...
  • Supporting semantics-based transaction processing in mobile database applications

    Publication Year: 1995, Page(s):31 - 40
    Cited by:  Papers (25)

    Advances in computer and telecommunication technologies have made mobile computing a reality. However, greater mobility implies a more tenuous network connection and a higher rate of disconnection. In order to tolerate disconnections as well as to reduce the delays and cost of wireless communication, it is necessary to support autonomous mobile operations on data shared by stationary hosts. This w...
  • Maximum and minimum consistent global checkpoints and their applications

    Publication Year: 1995, Page(s):86 - 95
    Cited by:  Papers (11)  |  Patents (1)

    This paper considers the problem of constructing the maximum and the minimum consistent global checkpoints that contain a target set of checkpoints, and identifies it as a generic issue in recovery-related applications. We formulate the problem as a reachability analysis problem on a directed rollback-dependency graph, and develop efficient algorithms to calculate the two consistent global checkpoin...
  • Membership and system diagnosis

    Publication Year: 1995, Page(s):208 - 217
    Cited by:  Papers (10)

    A membership service is a service in a distributed system that maintains and provides information about which sites are functioning and which have failed at any given time. System diagnosis, on the other hand, is a method for detecting faulty processing elements and distributing this information to non-faulty elements. In spite of the apparent similarity of goals, these two fields have been consid...
  • Experimental evaluation of the impact of processor faults on parallel applications

    Publication Year: 1995, Page(s):10 - 19
    Cited by:  Papers (2)

    This paper addresses the problem of processor faults in distributed memory parallel systems. It shows that transient faults injected at the processor pins of one node of a commercial parallel computer, without any particular fault-tolerant techniques, can cause erroneous application results for up to 43% of the injected faults (depending on the application). In addition to these very subtle faults...
  • A synchronization strategy for a time-triggered multicluster real-time system

    Publication Year: 1995, Page(s):154 - 161
    Cited by:  Papers (9)  |  Patents (13)

    The provision of a system-wide global time base with a good precision and sufficient accuracy is a fundamental prerequisite for the design of a multicluster distributed real-time system. We investigate the issues of clock synchronization in a multicluster system, where every node can have a different oscillator. Based on the parameter of a typical automotive distributed system we show that a preci...
  • System support for robust collaborative applications

    Publication Year: 1995, Page(s):62 - 71

    Traditional transaction models ensure robustness for distributed applications through the properties of view and failure atomicity. It has generally been felt that such atomicity properties are restrictive for a wide range of application domains; this is particularly true for robust, collaborative applications because such applications have concurrent components that are inherently long-lived and ...
  • TMR processing without explicit clock synchronisation

    Publication Year: 1995, Page(s):186 - 195
    Cited by:  Papers (2)  |  Patents (3)

    Replicated processing with majority voting is a well-known method for achieving fault tolerance. Triple Modular Redundant (TMR) processing is the most commonly used version of that method. Replicated processing requires that the replicas reach agreement on the order in which messages are to be processed. Synchronous and deterministic ordering protocols published in the literature require that the ...
  • A paradigm for user-defined security policies

    Publication Year: 1995, Page(s):135 - 144
    Cited by:  Papers (1)

    One of today's major challenges in computer security is the ever-increasing multitude of individual, application-specific security requirements. As a positive consequence, a wide variety of security policies has been developed, each policy reflecting the specific needs of individual applications. As a negative consequence, the integration of the multitude of policies into today's system platforms ...
  • Non blocking atomic commitment with an unreliable failure detector

    Publication Year: 1995, Page(s):41 - 50
    Cited by:  Papers (18)  |  Patents (1)

    In a transactional system, an atomic commitment protocol ensures that for any transaction, all data manager processes agree on the same outcome (commit or abort). A non-blocking atomic commitment protocol enables an outcome to be decided at every correct process despite the failure of others. In this paper we apply, for the first time, the fundamental result of T. Chandra and S. Toueg (1991) on so...
  • Self diagnosis of processor arrays using a comparison model

    Publication Year: 1995, Page(s):218 - 228
    Cited by:  Papers (9)

    This paper introduces a diagnosing algorithm for bidimensional processor arrays, where processors are interconnected in horizontal and vertical meshes. For the purpose of diagnosis, the array is considered to be partitioned into square clusters of processors. The algorithm is based on interprocessor tests, using a comparison model. The algorithm, which is divided into four steps, called intracluster d...
  • A correctness criterion for advanced transaction models

    Publication Year: 1995, Page(s):22 - 30

    The transaction concept was originally applied to database applications. Serializability theory captured transaction correctness and database object consistency properties in a single notion. Today, increasingly sophisticated information requires new correctness criteria due to the limitation of classical serializability theory, which allows only limited cooperation between its components. Sever...
  • MUSE: a message passing concurrent computer for on-board space systems

    Publication Year: 1995, Page(s):162 - 170

    Satellite payloads of the near future will raise the need for very powerful and dependable computers. No embeddable monoprocessor will be able to satisfy such computing power needs. Thus those computers will be multi-processor systems. Satellite payloads must meet high availability requirements rather than reliability ones. This allows the use of fail stop reconfigurable computers. This paper descri...
  • A method for the construction and interpretation of high level models for distributed fault-tolerant systems

    Publication Year: 1995, Page(s):72 - 81
    Cited by:  Papers (1)

    Traditional solutions for achieving fault-tolerance are intended for use at design time and they generally capture system information at a very low (hardware or machine instruction) level. Increasing reliability of complex information systems containing many (perhaps many thousands) of autonomous components requires different solutions. This article presents a new methodology for the implementatio...
  • Performance analysis of a regeneration-based dynamic voting algorithm

    Publication Year: 1995, Page(s):196 - 205
    Cited by:  Papers (2)

    RVC2 is a consistency control algorithm for replicated data objects in a distributed computing system. It is a dynamic voting algorithm which utilizes selective regeneration and recovery mechanisms for failed copies. Virtual copies which record information about the current state of a data object, but do not contain actual data, are used to reduce network and storage overhead. Experimental results...
  • A longitudinal survey of Internet host reliability

    Publication Year: 1995, Page(s):2 - 9
    Cited by:  Papers (54)

    An accurate estimate of host reliability is important for correct analysis of many fault-tolerance and replication mechanisms. In a previous study, we estimated host system reliability by querying a large number of hosts to find how long they had been functioning, estimating the mean time-to-failure (MTTF) and availability from those measures, and in turn deriving an estimate of the mean time-to-r...
  • A new deadlock detection algorithm for distributed real-time database systems

    Publication Year: 1995, Page(s):146 - 153
    Cited by:  Papers (3)  |  Patents (1)

    Recently, the concurrency control of real-time transactions has been gaining increasing attention from researchers in the database community. One of the major design issues in concurrency control of real-time transactions is the resolution of local as well as distributed deadlocks while at the same time meeting the timing requirements of the transactions. In this paper, a new deadlock detection algori...
  • On the design of systems of cooperating functional processes

    Publication Year: 1995, Page(s):52 - 61

    This paper describes a design concept for systems of cooperating distributed processes based on a variant of coloured Petri-nets. It cleanly separates graphical specification of processes and their interaction (or communication) from the algorithmic specifications of the computations that need to be performed by the individual processes. Designing complex process systems is aided by abstractions s...
  • Failure detection algorithms for a reliable execution of parallel programs

    Publication Year: 1995, Page(s):229 - 238
    Cited by:  Papers (3)

    We report on the design and simulation of novel algorithms which will ensure that application software runs correctly on a MIMD system in which processing units (PU) can fail. The effect of these algorithms is evaluated for random task graphs using simulation as failure rates increase. An example of a specific application is also examined (the Fast Fourier Transform) for which we construct the tas...
  • Designing masking fault-tolerance via nonmasking fault-tolerance

    Publication Year: 1995, Page(s):174 - 185
    Cited by:  Papers (5)

    Masking fault-tolerance guarantees that programs continually satisfy their specification in the presence of faults. By way of contrast, nonmasking fault-tolerance does not guarantee as much: it merely guarantees that when faults stop occurring, program executions converge to states from where programs continually (re)satisfy their specification. In this paper, we show that a practical method to de...