Proceedings of the Seventh Symposium on Reliable Distributed Systems, 1988

Date: 10-12 Oct. 1988

  • Proceedings. Seventh Symposium on Reliable Distributed Systems (IEEE Cat. No.88CH2612-0)

    Publication Year: 1988
    Freely Available from IEEE
  • Recovery in the Clouds kernel

    Publication Year: 1988, Page(s):167 - 176
    Cited by:  Papers (1)  |  Patents (1)

    The Clouds kernel is a native-layer distributed kernel supporting the Clouds operating system. Clouds provides atomic actions to support reliable computation. The data-recovery mechanism supporting atomic actions in the Clouds kernel is the responsibility of a component called the storage manager. This recovery mechanism uses a pessimistic shadowing technique and is designed to be an efficient, lo...
  • An implementation of reliable broadcast using an unreliable multicast facility

    Publication Year: 1988, Page(s):101 - 111
    Cited by:  Papers (15)

    The authors consider the problem of reliable broadcast in a point-to-point asynchronous network. Such a network consists of host computers and a communication subnetwork. The latter, in turn, is a collection of switches (special-purpose computers that have the ability to store and forward messages), interconnected by point-to-point bidirectional communication links. The subnetwork is unreliable, i...
  • A fault-tolerant protocol for atomic broadcast

    Publication Year: 1988, Page(s):112 - 126
    Cited by:  Papers (3)

    A novel general protocol for atomic broadcast in networks is presented. The protocol tolerates loss, duplication, reordering, delay of messages, and network partitioning in an arbitrary network of 'fail-stop' sites (i.e. no Byzantine site behavior is tolerated). The protocol is fully decentralized and is based on majority-consensus decisions to commit on unique ordering of received broadcast messa...
  • An experimental investigation of software diversity in a fault-tolerant avionics application

    Publication Year: 1988, Page(s):63 - 70
    Cited by:  Papers (4)  |  Patents (1)

    Highly reliable and effective failure detection and isolation (FDI) software is crucial in modern avionics systems that tolerate hardware failures in real time. The FDI function is an excellent opportunity for applying the principle of software design diversity to the fullest, i.e., algorithm diversity, in order to provide gains in functional performance as well as potentially enhancing the reliab...
  • Transaction management in a distributed database system for local area networks

    Publication Year: 1988, Page(s):177 - 182
    Cited by:  Papers (1)  |  Patents (3)

    The design and implementation of an experimental fault-tolerant distributed database management system is described. The system provides a logically integrated view of data with distribution transparency and a controlled data replication. A commitment protocol used to guarantee atomicity of update operations is discussed. Efficient algorithms used to recover a site from a failure and restore data ...
  • Implementation of RAID

    Publication Year: 1988, Page(s):157 - 166
    Cited by:  Papers (3)

    RAID is a robust and adaptable distributed system for transaction processing. It is a message-passing system, with server processes on each site. A high-level, layered communications package provides a clean, location-independent interface between servers. RAID processes concurrent updates and retrievals on multiple sites. The servers manage concurrent processing, consistent replicated copies duri...
  • Recovering imprecise transactions with real-time constraints

    Publication Year: 1988, Page(s):185 - 193
    Cited by:  Papers (8)

    In real-time database systems, a transaction may not have enough time to complete. In such cases, partial, or imprecise, results can still be produced. The authors have proposed an imprecise result mechanism for producing partial results, which is used to implement timing error recovery in real-time database systems. They also present a model of real-time systems that distinguishes the external da...
  • Vote assignments in weighted voting mechanisms

    Publication Year: 1988, Page(s):138 - 143
    Cited by:  Papers (16)

    Majority voting is commonly used in distributed computing systems to control mutual exclusion, and in fault-tolerant computing to achieve reliability. Different vote assignments may result in different reliabilities. The authors present vote assignment algorithms aimed at maximizing the reliability. In their approach the voting weight assigned to each node is readily determined if the link failure...
  • The commit/abort problem in type-specific locking

    Publication Year: 1988, Page(s):204 - 213
    Cited by:  Papers (1)

    Type-specific locking is designed to increase performance of distributed transactions by allowing competing transactions to alter shared objects concurrently, provided their changes are commutative. When one of the set of compatible transactions commits or aborts, a problem arises due to the indeterminacy of the alterations of the others. A tree of nineteen solutions to the problem ha...
  • Pessimistic protocols for quasi-partitioned distributed database systems

    Publication Year: 1988, Page(s):35 - 43
    Cited by:  Papers (4)

    The authors propose two protocols for transaction processing in quasi-partitioned databases. The protocols are pessimistic in that they permit the execution of update transactions in exactly one partition. The first protocol is defined for a fully partition-replicated database in which every partition contains a copy of every data object. The second protocol is defined for a partially partition-re...
  • Interactive consistency with multiple failure modes

    Publication Year: 1988, Page(s):93 - 100
    Cited by:  Papers (50)  |  Patents (2)

    The authors address the problem of reaching Byzantine agreement in a distributed system in the presence of different types of faults and show that significant improvements in reliability and performance are possible if faults can be partitioned into disjoint classes. They show that, in a distributed system, to guarantee Byzantine agreement requires N>2a+2s+b+...
  • A robust, distributed election protocol

    Publication Year: 1988, Page(s):54 - 60
    Cited by:  Papers (6)  |  Patents (1)

    The authors present an election protocol that does not assume an underlying ring structure and that tolerates failures, including lost messages and network partitioning, during the execution of the protocol itself. The major problem to be solved is that when nodes cannot communicate with one another or messages are lost, a conflict in resolving the election will often arise. In the authors' approa...
  • Distributed locking: a mechanism for constructing highly available objects

    Publication Year: 1988, Page(s):194 - 203
    Cited by:  Patents (1)

    A description is given of the results of a study of methods of achieving fault tolerance in the Clouds system and, in particular, of achieving increased availability of objects. The problems explored in this work, the model of distributed computation in which the problems posed by the research were examined (the Clouds system), the tools that were used to address these problems (the Aeolus program...
  • On the diagnosis of Byzantine faults

    Publication Year: 1988, Page(s):144 - 153
    Cited by:  Papers (4)  |  Patents (3)

    The class of evidence-based diagnosis algorithms is developed to identify Byzantine (and any other faulty) processors. Such algorithms are said to be fair if they identify no failure-free processor as faulty. This paper makes two significant contributions: (i) it introduces a very general and simple formal model of the evidence-based diagnosis algorithms; and (ii) it derives a simple fair diagnosi...
  • A commit protocol for checkpointing transactions

    Publication Year: 1988, Page(s):22 - 31
    Cited by:  Papers (6)  |  Patents (4)

    A commit protocol is described for checkpointing distributed transactions. Commit protocols are used by distributed transaction management systems to ensure that the multiple nodes participating in a distributed transaction will commit or abort together. This commit protocol is different from others in that a process executing on behalf of a transaction can be interrupted and restarted at some pre...
  • A realistic evaluation of optimistic dynamic voting

    Publication Year: 1988, Page(s):129 - 137
    Cited by:  Papers (7)

    When data are replicated, an access protocol must be chosen to ensure the presentation of a consistent view of the data. Protocols based on quorum consensus provide good availability with the added benefit of mutual exclusion. Of the protocols based on quorum consensus, the dynamic voting protocols provide the highest known availability. A dynamic voting protocol that does not need the instantaneou...
  • Quorum consensus algorithms for secure and reliable data

    Publication Year: 1988, Page(s):44 - 53
    Cited by:  Papers (2)

    The authors address the issue of maintaining security in a fault-tolerant replicated database. They present a data-management protocol that integrates the information-dispersal algorithm (for security) and the quorum-consensus algorithm (for reliability). Although this protocol provides the desired level of security, it does not achieve the same level of availability for both read and write operat...
  • Independent checkpointing and concurrent rollback for recovery in distributed systems - an optimistic approach

    Publication Year: 1988, Page(s):3 - 12
    Cited by:  Papers (84)  |  Patents (4)

    A checkpoint algorithm is presented that benefits from the research in concurrency control, commit, and site recovery algorithms in transaction processing. In the authors' approach a number of checkpointing processes, a number of rollback processes, and computations on operational processes can proceed concurrently while tolerating the failure of an arbitrary number of processes. Each process take...
  • Task allocation for optimized system reliability

    Publication Year: 1988, Page(s):82 - 90
    Cited by:  Papers (1)

    The authors deal with the task allocation problem in distributed software design, with the goal of maximizing the system reliability. A quantitative problem model, algorithms for optimal and suboptimal solutions, and simulation results are provided and discussed. Because the authors use a new allocation goal (to maximize system reliability), this paper complements the existing body of knowledge in ta...
  • Checkpointing and rollback recovery in a distributed system using common time base

    Publication Year: 1988, Page(s):13 - 21
    Cited by:  Papers (9)

    An approach to checkpointing and rollback recovery in a distributed computing system using a common time base is proposed. First, a common time base is established in the system using a hardware clock synchronization algorithm. This common time base is coupled with a pseudorecovery block approach to develop a checkpointing algorithm that has the following advantages: (i) maximum process autonomy, ...
  • An analysis of the performance impacts of lookahead execution in the conversation scheme

    Publication Year: 1988, Page(s):71 - 81
    Cited by:  Papers (5)

    The lookahead execution approach, which allows early finishing participant processes to exit from a conversation before other participants finish their conversation activities, is adopted as a fundamental approach to reducing the synchronization overhead. Queueing network models are developed for both the system operating under the basic conversation scheme and the system operating under the conve...