[1992] Proceedings 11th Symposium on Reliable Distributed Systems

5-7 Oct. 1992

  • Proceedings 11th Symposium on Reliable Distributed Systems (Cat. No.92CH3187-2)

    Publication Year: 1992
  • Dependability analysis of distributed computing systems using stochastic Petri nets

    Publication Year: 1992, Page(s):85 - 92
    Cited by:  Papers (1)

    Models based on stochastic Petri nets are developed to estimate the reliability/availability of distributed systems. The modeling approach discussed combines stochastic Petri net modeling and previous reliability algorithms so that not only can failures of nodes and links be modeled, but also the effect of global repairs can be considered. The models can be used to derive estimations on the succes...
  • Sensitivity and uncertainty analysis in performability modelling

    Publication Year: 1992, Page(s):93 - 102
    Cited by:  Papers (2)

    The authors present an approximate Taylor series model of steady state performability measures and discuss the validity region of this approximation. They also describe an uncertainty analysis approach based on Monte Carlo simulation, apply both approaches to a performability model, and discuss their relative merits. They vary the dependencies between model parameters and discuss their influence. For the...
  • Managing replicated data in heterogeneous database systems

    Publication Year: 1992, Page(s):12 - 19

    In a heterogeneous database system, supporting transaction semantics is expensive, and sometimes impossible. Thus, in such systems, traditional methods such as quorum consensus cannot be used directly for managing replicated data. A method to manage replicated data in a heterogeneous database system is proposed. The method is based on the idea of quorum consensus but does not rely on transaction s...
  • Availability of coding based replication schemes

    Publication Year: 1992, Page(s):103 - 110
    Cited by:  Patents (1)

    The availability of a coding-based replication scheme where simple voting is used to maintain correctness of replicated data is evaluated. It is shown that the storage requirement for maintaining the data with a given availability is reduced significantly. The ways that some of the extensions of the voting scheme can be modified to manage this coding-based replication are also described. The avail...
  • Dynamic management of highly replicated data

    Publication Year: 1992, Page(s):20 - 27
    Cited by:  Papers (4)  |  Patents (2)

    An efficient replication control protocol, called the dynamic group protocol, for managing replicated data objects that have more than five replicas is presented. Like the grid protocol, the dynamic group protocol requires only O(√n) messages per access to enforce mutual consistency among n replicas. Unlike other protocols aimed at providing fast access, this proto...
  • xAMp: a multi-primitive group communications service

    Publication Year: 1992, Page(s):112 - 121
    Cited by:  Papers (17)

    The xAMp is a highly versatile group communications service aimed at supporting the development of distributed applications with different dependability, functionality, and performance requirements. These range from unreliable and nonordered to atomic multicast, and are enhanced by efficient group addressing and management support. The basic protocols are synchronous, clockless and design...
  • The two-phase commit performance of the DECdtm services

    Publication Year: 1992, Page(s):29 - 38
    Cited by:  Patents (1)

    The performance of the two-phase commit protocols implemented by Digital Equipment Corporation's (DEC's) distributed transaction manager (DECdtm V1.1) is characterized. Throughput, response time, resource utilization and other performance aspects of reliable distributed transactions are analyzed. Efficient implementation and effective group-commit allowed DECdtm to reach up to 176 transactions per...
  • Optimistic message logging for independent checkpointing in message-passing systems

    Publication Year: 1992, Page(s):147 - 154
    Cited by:  Papers (40)  |  Patents (5)

    Message-passing systems with a communication protocol transparent to the applications typically require message logging to ensure consistency between checkpoints. A periodic independent checkpointing scheme with optimistic logging to reduce performance degradation during normal execution while keeping the recovery cost acceptable is described. Both time and space overhead for message logging can b...
  • On the verification and validation of protocols with high fault coverage using UIO sequences

    Publication Year: 1992, Page(s):196 - 203
    Cited by:  Papers (1)  |  Patents (2)

    Various new classes of unique input/output (UIO) sequences for verification and validation (conformance testing) of protocols modeled as finite state machines (FSMs) are presented. The proposed sequences are referred to as adaptive because test sequence generation is not a mere concatenation of test subsequences for all edges of the FSM, but rather subsequences are concatenated using appropriate c...
  • Reliable broadcasting in faulty hypercube computers

    Publication Year: 1992, Page(s):122 - 129
    Cited by:  Papers (10)

    A nonredundant broadcasting algorithm for faulty hypercube computers is proposed. The concept of unsafe nodes is introduced to identify those nonfaulty nodes that will cause a detour or backtracking because of their proximity to faulty nodes. It is assumed that each healthy node, safe or unsafe, knows the status of all the neighboring nodes. The broadcasting is optimal, meaning that a message is s...
  • Error attenuation in distributed systems

    Publication Year: 1992, Page(s):48 - 55

    A method for achieving fault-tolerance in distributed systems is proposed and compared with the known method of error masking based on distributed voting over the results of several variants. The method is called error attenuation, since the errors produced by nonpermanent faults attenuate eventually during the further execution of the variants. The term attenuation is used to represent the fact t...
  • Distributed problem solving in spite of processor failures

    Publication Year: 1992, Page(s):164 - 171

    Processor failures not leading to a network partition are considered, and the issue of computing associative functions in spite of processor failures is addressed. An intuitive and fundamental result formally proved is that failure detection and computing associative functions are equivalent in faulty networks: one can be performed if and only if the other can be performed. Protocols and impossibi...
  • Dataflow-like languages for real-time systems: issues of computational models and notation

    Publication Year: 1992, Page(s):214 - 221
    Cited by:  Papers (2)

    The use of dataflow-like models for the in-the-large design of real-time applications is discussed. In these models, modules can only communicate by (asynchronously) receiving messages when activated and transmitting result messages when terminating. This rather restrictive computational model allows the description of typical, cyclic control programs, with predictable, well-verifiable behavior. I...
  • An efficient decentralized approach to processor-group membership maintenance in real-time LAN systems: the PRHB/ED scheme

    Publication Year: 1992, Page(s):74 - 83
    Cited by:  Papers (9)

    In constructing highly reliable LAN systems, a mechanism that enables every active node to maintain timely and consistent knowledge about the health status of all cooperating nodes can be used as a cornerstone. The authors consider the case where maintenance of such knowledge is achieved in a decentralized manner and timely and consistent recognition of newly joining nodes is also facilitated. The...
  • A replicated object server for a distributed object-oriented system

    Publication Year: 1992, Page(s):4 - 11
    Cited by:  Papers (11)

    The design and implementation of a replicated object server called Goofy, which is intended to provide a fault-tolerant storage system for distributed object-oriented applications, are described. Goofy supplies the object storage of the GUIDE (Grenoble Universities Integrated Distributed Environment) distributed object-oriented system with integrity, reliability and availability, while preserving ...
  • KITLOG: a generic logging service

    Publication Year: 1992, Page(s):139 - 146
    Cited by:  Papers (8)  |  Patents (3)

    A generic logging service should cater to the variable and even antagonistic needs of clients, without imposing unnecessary overhead on clients that do not use all of its functions. A solution to this problem, called KITLOG, is described in detail. It solves the problem by separating logging characteristics into five mechanisms: buffering policy, distri...
  • Neural networks for the design of distributed, fault-tolerant, computing environments

    Publication Year: 1992, Page(s):189 - 195
    Cited by:  Papers (1)

    Binary optimization models for the design of distributed, fault-tolerant computing systems are considered, with a focus on the task allocation and file assignment modeling schema proposed by J. Bannister and K. Trivedi (Proc. Second Symp. on Reliability in Distributed Software and Database Systems, 1982). It is shown that R. Graham's (1969) partitioning algorithm, S, when applied to this ...
  • The performance of consistent checkpointing

    Publication Year: 1992, Page(s):39 - 47
    Cited by:  Papers (154)  |  Patents (14)

    Consistent checkpointing provides transparent fault tolerance for long-running distributed applications. Performance measurements of an implementation of consistent checkpointing are described. The measurements show that consistent checkpointing performs remarkably well. Eight computation-intensive distributed applications were executed on a network of 16 diskless Sun-3/60 workstations, and the pe...
  • Global checkpointing for distributed programs

    Publication Year: 1992, Page(s):155 - 162
    Cited by:  Papers (41)  |  Patents (5)

    A novel algorithm for checkpointing and rollback recovery in distributed systems is presented. Processes belonging to the same program must periodically take a nonblocking coordinated global checkpoint, but only a minimum overhead is imposed during normal computation. Messages can be delivered out of order, and the processes are not required to be deterministic. The nonblocking structure is an imp...
  • A state machine approach to reliable distributed systems

    Publication Year: 1992, Page(s):204 - 212
    Cited by:  Papers (2)  |  Patents (1)

    In many distributed applications, processes synchronize with one another in a complex way and execute for a long period of time. Atomic transactions are inadequate for designing reliable applications with these characteristics, because transactions restrict the types of synchronization that can be specified. An alternative approach that exploits behavior specified in a hierarchical finite-state ma...
  • The triangular lattice protocol: a highly fault tolerant and highly efficient protocol for replicated data

    Publication Year: 1992, Page(s):66 - 73
    Cited by:  Papers (8)

    A protocol for managing replicated data in which data copies are organized as a triangular lattice is introduced. The smallest quorum size is O(√N), where N is the number of data copies, which is currently considered optimal for a fully distributed environment. The protocol has the property of graceful degradation. The quorum sizes increase gradually as data copy f...
  • Communication protocols for fault-tolerant clock synchronization in not-completely connected networks

    Publication Year: 1992, Page(s):130 - 137
    Cited by:  Papers (3)

    Communications protocols for not-completely-connected networks are presented, and their cost is evaluated in terms of message exchanges. An efficient protocol tailored to convergence function clock synchronization is introduced. The number of messages used by this approach is equal to a proven lower bound on the number of messages and, hence, the approach is optimal. This protocol can be combined ...
  • Simulation of the Adapt on-line diagnosis algorithm for general topology networks

    Publication Year: 1992, Page(s):180 - 187
    Cited by:  Papers (17)  |  Patents (2)

    As dependence on wide-area and other point-to-point networks increases, the need for diagnosis of the distributed resources becomes critical. Continuous online distributed diagnosis at the system level provides a desirable solution. The Adapt algorithm, which performs online adaptive distributed diagnosis in arbitrary networks in the presence of node failures, is examined. Simulation results depic...
  • Optimal replica control protocols for ring networks

    Publication Year: 1992, Page(s):57 - 65
    Cited by:  Papers (1)  |  Patents (2)

    Distributed computing environments are expected to offer highly available services. Replication of data is one of the main techniques used to achieve this goal. Protocols that achieve optimal performance in replicating data for ring networks are discussed. Coteries, proposed by H. Garcia-Molina and D. Barbara (1985), provide the most general framework for analyzing static pessimistic protocols. It...