
Proceedings 19th IEEE Symposium on Reliable Distributed Systems SRDS-2000

16-18 Oct. 2000


  • Proceedings 19th IEEE Symposium on Reliable Distributed Systems SRDS-2000

    Publication Year: 2000
    Freely Available from IEEE
  • Deterministic scheduling for transactional multithreaded replicas

    Publication Year: 2000, Page(s):164 - 173
    Cited by:  Papers (29)  |  Patents (4)

    One way to implement a fault-tolerant service is by replicating it at sites that fail independently. One of the replication techniques is active replication where each request is executed by all the replicas. Thus, the effects of failures can be completely masked, resulting in an increase of service availability. In order to preserve consistency among replicas, replicas must exhibit a deterministi...
  • Dynamic node management and measure estimation in a state-driven fault injector

    Publication Year: 2000, Page(s):248 - 257
    Cited by:  Papers (2)

    The following topics were dealt with: visual querying and data exploration; graphs and hierarchies; taxonomies, frameworks and methodology; document visualization and collaborative visualization; algorithm visualization; and 3D navigation.
  • Index of authors

    Publication Year: 2000, Page(s): 263
    Freely Available from IEEE
  • Performance analysis of the CORBA event service using stochastic reward nets

    Publication Year: 2000, Page(s):238 - 247
    Cited by:  Papers (11)

    The event service is the earliest CORBA solution to the message queue model of communication in distributed systems. Typical implementations, however, suffer from the lack of event delivery guarantees. The loss of messages is aggravated by the presence of burstiness in the input to the event service, and occurrences of isolated bursts of traffic could also have serious effects. In this paper, we d...
  • On the use of model checking techniques for dependability evaluation

    Publication Year: 2000, Page(s):228 - 237
    Cited by:  Papers (23)  |  Patents (2)

    Over the last two decades, many techniques have been developed to specify and evaluate Markovian dependability models. Most often, these Markovian models are automatically derived from stochastic Petri nets, stochastic process algebras or stochastic activity networks. However, whereas the model specification has become very comfortable, the specification of the dependability measures of interest m...
  • Issues insufficiently resolved in Century 20 in the fault-tolerant distributed computing field

    Publication Year: 2000, Page(s):106 - 115
    Cited by:  Papers (7)

    As the 21st Century has just opened up, it is a fitting time to reflect on the evolution of the fault-tolerant distributed computing technology that occurred in the last century. The author's view of that evolution is sketched in this paper, with emphasis on the major issues that were insufficiently resolved in the 20th Century. Such issues are naturally among what the author believes to be the pr...
  • Implementing a reflective fault-tolerant CORBA system

    Publication Year: 2000, Page(s):154 - 163
    Cited by:  Papers (11)

    The use of reflection is becoming popular today for the implementation of non-functional mechanisms such as fault tolerance. The main benefits of reflection are separation of concerns between the application and the mechanisms and transparency from the application programmer point of view. Unfortunately, metaobject protocols (MOPs) available today are not satisfactory with respect to necessary fea...
  • Proxy-based recovery for applications on wireless hand-held devices

    Publication Year: 2000, Page(s):2 - 10
    Cited by:  Papers (3)  |  Patents (10)

    The low communication bandwidth, slow processor and limited memory of hand-held devices make it undesirable for them to store their own checkpoints or send process state information over a wireless network. The paper describes an approach to failure recovery for three-tier client and server application environments where the client applications execute on wireless handheld devices. The key idea is...
  • High availability of the memory hierarchy in a cluster

    Publication Year: 2000, Page(s):134 - 143
    Cited by:  Papers (2)

    A single-level store (SLS) integrating a shared virtual memory and a parallel file system with file mapping as its interface is attractive for the execution of high-performance applications in a cluster. However, the probability of a node reboot or failure is quite high. In this paper, we present the design of a highly available SLS system. Our approach combines checkpointing in memory and permane...
  • Optimistic Virtual Synchrony

    Publication Year: 2000, Page(s):42 - 51
    Cited by:  Papers (12)

    We present Optimistic Virtual Synchrony (OVS), a new form of group communication which provides the same capabilities as Virtual Synchrony with better performance. It does so by allowing applications to send messages during periods in which services implementing Virtual Synchrony block. OVS also allows applications to determine the policy as to when messages sent optimistically should be delivered...
  • Performance of mobile, single-object, replication protocols

    Publication Year: 2000, Page(s):218 - 227
    Cited by:  Papers (3)

    Discusses the implementation and performance of bounded voting, which is a new object replication protocol designed for use in mobile and weakly-connected environments. We show that the protocol eliminates several restrictions of previous work, such as the need for (1) strong or complete connectivity, (2) complete knowledge of system membership, and (3) low update rates. The protocol implements an...
  • Consistent detection of global predicates under a weak fault assumption

    Publication Year: 2000, Page(s):94 - 103
    Cited by:  Papers (3)

    We study the problem of detecting general global predicates in distributed systems where all application processes and at most t<m monitor processes may be subject to crash faults, where m is the total number of monitor processes in the system. We introduce two new observation modalities called negotiably and discernibly (which correspond to possibly and definitely in fault-free systems) and pr...
  • Pronto: a fast failover protocol for off-the-shelf commercial databases

    Publication Year: 2000, Page(s):176 - 185
    Cited by:  Papers (12)  |  Patents (2)

    Enterprise applications typically store their state in databases. If a database fails, the application is unavailable while the database recovers. Database recovery is time consuming because it involves replaying the persistent transaction log. To isolate end users from database failures, we introduce Pronto, a protocol to orchestrate the transaction processing by multiple, standard databases so t...
  • An investigation of membership and clique avoidance in TTP/C

    Publication Year: 2000, Page(s):118 - 124
    Cited by:  Papers (19)  |  Patents (2)

    Avoiding the partitioning of a cluster into cliques that are not able to communicate with each other is an important issue in the time-triggered communication protocol TTP/C. This is achieved by a mechanism called clique avoidance. The clique avoidance algorithm always selects one partition (clique) to win and causes all nodes of other partitions to shut down. In this paper, we investigate the pro...
  • Modeling fault-tolerant mobile agent execution as a sequence of agreement problems

    Publication Year: 2000, Page(s):11 - 20
    Cited by:  Papers (27)  |  Patents (1)

    Fault tolerance is fundamental to the further development of mobile agent applications. In the context of mobile agents, fault tolerance prevents a partial or complete loss of the agent, i.e. ensures that the agent arrives at its destination. Simple approaches such as checkpointing are prone to blocking. Replication can in principle improve solutions based on checkpointing. However existing soluti...
  • Abstractions for devising Byzantine-resilient state machine replication

    Publication Year: 2000, Page(s):144 - 153
    Cited by:  Papers (4)

    State machine replication is a common approach for making a distributed service highly available and resilient to failures, by replicating it on different processes. It is well known, however, that the difficulty of ensuring the safety and liveness of a replicated service increases significantly when no synchrony assumptions are made, and when processes can exhibit Byzantine behaviors. The contribu...
  • Optimal implementation of the weakest failure detector for solving consensus

    Publication Year: 2000, Page(s):52 - 59
    Cited by:  Papers (46)

    The concept of unreliable failure detector was introduced by T.D. Chandra and S. Toueg (1996) as a mechanism that provides information about process failures. Depending on the properties which the failure detectors guarantee, they proposed a taxonomy of failure detectors. It has been shown that one of the classes of this taxonomy, namely Eventually Strong (◇S), is the weakest class allowing ...
  • A pragmatic implementation of e-transactions

    Publication Year: 2000, Page(s):186 - 195
    Cited by:  Papers (14)  |  Patents (2)

    Three-tier applications have nice properties, which make them scalable and manageable: clients are thin and servers are stateless. However, it is challenging to implement or even define end-to-end reliability for such applications. Furthermore, it is especially hard to make these applications reliable without violating their nice properties. In previous work, we identified e-transactions as a desi...
  • Continuous clock synchronization in wireless real-time applications

    Publication Year: 2000, Page(s):125 - 132
    Cited by:  Papers (60)  |  Patents (10)

    Continuous clock synchronization avoids unpredictable instantaneous corrections of clock values. This is usually achieved by spreading the clock correction over the synchronization interval. In the context of wireless real time applications, a protocol achieving continuous clock synchronization must tolerate message losses and should have a low overhead in terms of the number of messages. The pape...
  • Improvement of the QoS via an adaptive and dynamic distribution of applications in a mobile environment

    Publication Year: 2000, Page(s):21 - 29
    Cited by:  Patents (4)

    Mobile computing is a domain in great expansion. Wireless networks and Portable Information Appliances (PIAs) are developing very rapidly. More and more mobile users would like to perform their multimedia applications with the same facility as on their desktop station. Use of such applications in a mobile environment raises new challenges. These applications are interactive and extremely costly in...
  • Semantically reliable multicast protocols

    Publication Year: 2000, Page(s):60 - 69
    Cited by:  Papers (6)

    Reliable multicast protocols can strongly simplify the design of distributed applications. However, it is hard to sustain a high multicast throughput when groups are large and heterogeneous. In an attempt to overcome this limitation, previous work has focused on weakening reliability properties. The authors introduce a novel reliability model that exploits semantic knowledge to decide in which spec...
  • Using multicast communication to reduce deadlock in replicated databases

    Publication Year: 2000, Page(s):196 - 205
    Cited by:  Papers (6)  |  Patents (1)

    Obtaining good performance from a distributed replicated database that allows update transactions to originate at any site while ensuring one-copy serializability is a challenge. A popular analysis of deadlock probabilities in replicated databases shows that the deadlock rate for the system is high and increases as the third power of the number of replicas. We show how a replica management protoco...
  • Reliable broadcast in the crash-recovery model

    Publication Year: 2000, Page(s):32 - 41
    Cited by:  Papers (4)

    The paper addresses the problem of broadcasting messages in a reliable manner within a practical asynchronous system where processes and channels may crash and recover. In this crash-recovery model, we present meaningful specifications of reliable broadcast and we describe algorithms that implement those specifications. Our approach is modular and incremental. It is modular in the sense that we gi...
  • An evolutionary algorithm for identifying faults in t-diagnosable systems

    Publication Year: 2000, Page(s):74 - 83
    Cited by:  Papers (9)

    The paper describes a novel approach to the problem of system-level fault diagnosis using genetic algorithms. Consider a system composed of n independent units, each of which tests a subset of the others. It is assumed that at most t of these units are permanently faulty. Such a system is said to be t-diagnosable if, given any complete collection of test results, the set of faulty units can be uni...