Reliable Distributed Systems, 2001. Proceedings. 20th IEEE Symposium on

Date 31-31 Oct. 2001

  • Proceedings 20th IEEE Symposium on Reliable Distributed Systems

    Publication Year: 2001
  • Author index

    Publication Year: 2001, Page(s): 267
  • Why is it so hard to predict software system trustworthiness from software component trustworthiness?

    Publication Year: 2001
    Cited by:  Papers (1)

    When software is built from components, nonfunctional properties such as security, reliability, fault-tolerance, performance, availability, safety, etc. are not necessarily composed. The problem stems from our inability to know a priori, for example, that the security of a system composed of two components can be determined from knowledge about the security of each. This is because the security of...
  • Message logging optimization for wireless networks

    Publication Year: 2001, Page(s):182 - 185
    Cited by:  Papers (2)

    This paper describes a message logging optimization that improves performance for failure recovery protocols where messages exchanged between mobile hosts are logged at base stations. The algorithm described and evaluated in this paper does not generate orphan processes in spite of base station failures and achieves run-time performance similar to that of asynchronous logging.
  • Detecting heap smashing attacks through fault containment wrappers

    Publication Year: 2001, Page(s):80 - 89
    Cited by:  Papers (7)  |  Patents (2)

    Buffer overflow attacks are a major cause of security breaches in modern operating systems. Not only are overflows of buffers on the stack a security threat; overflows of buffers kept on the heap can be too. A malicious user might be able to hijack the control flow of a root-privileged program if the user can initiate an overflow of a buffer on the heap when this overflow overwrites a function poi...
  • Efficient recovery information management schemes for the fault tolerant mobile computing systems

    Publication Year: 2001, Page(s):202 - 205
    Cited by:  Papers (2)

    This paper presents region-based storage management schemes, which support the efficient implementation of checkpointing and message logging for fault tolerant mobile computing systems. In the proposed schemes, a recovery manager assigned to a group of cells takes care of the recovery for the mobile hosts within the region. As a result, the recovery information of a mobile host, which may be disp...
  • Efficient update diffusion in Byzantine environments

    Publication Year: 2001, Page(s):90 - 98
    Cited by:  Papers (8)

    We present a protocol for diffusion of updates among replicas in a distributed system where up to b replicas may suffer Byzantine failures. Our algorithm ensures that no correct replica accepts spurious updates introduced by faulty replicas, by requiring that a replica accepts an update only after receiving it from at least b+1 distinct replicas (or directly from the update source). Our algorithm ...
  • Continental Pronto

    Publication Year: 2001, Page(s):46 - 55
    Cited by:  Papers (1)

    Continental Pronto unifies high availability and disaster resilience at the specification and implementation levels. At the specification level, Continental Pronto formalizes the client's view of a system addressing local-area and wide-area data replication within a single framework. At the implementation level, Continental Pronto makes data highly available and disaster resilient. The algorithm p...
  • Performance analysis of the CORBA notification service

    Publication Year: 2001, Page(s):227 - 236
    Cited by:  Papers (1)

    As CORBA (Common Object Request Broker Architecture) gains popularity as a standard for portable, distributed, object-oriented computing, the need for a CORBA messaging solution is being increasingly felt. This led the Object Management Group (OMG) to specify a Notification Service that aims to provide a more flexible and robust messaging solution than the earlier Event Service. The Notification S...
  • Using the timely computing base for dependable QoS adaptation

    Publication Year: 2001, Page(s):208 - 217
    Cited by:  Papers (12)

    In open and heterogeneous environments, where an unpredictable number of applications compete for a limited amount of resources, executions can be affected by delays that are likewise unpredictable and may not even be bounded. Since many of these applications have timeliness requirements, they can only be implemented if they are able to adapt to the existing conditions. We present a novel approach, called d...
  • Polynomial time synthesis of Byzantine agreement

    Publication Year: 2001, Page(s):130 - 139
    Cited by:  Papers (5)

    We present a polynomial time algorithm for automatic synthesis of fault-tolerant distributed programs, starting from fault-intolerant versions of those programs. Since this synthesis problem is known to be NP-hard, our algorithm relies on heuristics to reduce the complexity. We demonstrate that our algorithm is able to synthesize an agreement program that tolerates a Byzantine fault.
  • Can reliability and security be joined reliably and securely?

    Publication Year: 2001, Page(s):72 - 73

    The combined topics of reliability and security are briefly traced in relation to the past and present endeavors of the Air Force Research Laboratory's Information Directorate. It is concluded that in the realm of information assurance, system features created to tolerate benign failures and to respond to attack must be stressed and tested beforehand and their effectiveness predicted; otherwise th...
  • Chasing the FLP impossibility result in a LAN: or, How robust can a fault tolerant server be?

    Publication Year: 2001, Page(s):190 - 193
    Cited by:  Papers (6)

    Fault tolerance can be achieved in distributed systems by replication. However, Fischer, Lynch and Paterson (1985) have proven an impossibility result about consensus in the asynchronous system model, and similar impossibility results exist for atomic broadcast and group membership. We investigate, with the aid of an experiment conducted in a LAN, whether these impossibility results set limits to t...
  • Primary-backup replication: from a time-free protocol to a time-based implementation

    Publication Year: 2001, Page(s):14 - 23
    Cited by:  Papers (4)  |  Patents (1)

    Fault-tolerant control systems can be built by replicating critical components. However, replication raises the issue of inconsistency. Multiple protocols for ensuring consistency have been described in the literature. PADRE (Protocol for Asymmetric Duplex REdundancy) is such a protocol, and an interesting case study of a complex and sensitive problem: the management of replicated traffic controlle...
  • Comparison-based system-level fault diagnosis in ad hoc networks

    Publication Year: 2001, Page(s):257 - 266
    Cited by:  Papers (42)

    The problem of identifying faulty mobiles in ad-hoc networks is considered. Current diagnostic models were designed for wired networks; thus they do not take advantage of the shared nature of communication typical of ad-hoc networks. In this paper we introduce a new comparison-based diagnostic model based on the one-to-many communication paradigm. Two implementations of the model are presented. In...
  • Compiler-assisted heterogeneous checkpointing

    Publication Year: 2001, Page(s):56 - 65
    Cited by:  Papers (5)

    We consider the problem of heterogeneous checkpointing in distributed systems. We propose a new solution to the problem that is truly heterogeneous in that it can support new architectures without any information about the architecture. The ability to support new architectures without additional knowledge or custom configuration is an important contribution of this work. This ability is particular...
  • A microkernel middleware architecture for distributed embedded real-time systems

    Publication Year: 2001, Page(s):218 - 226
    Cited by:  Papers (5)

    Today more and more embedded real-time systems are implemented in a distributed way. These distributed embedded systems consist of a few controllers up to several hundred. Distribution and parallelism in the design of embedded real-time systems increase the engineering challenges and require a new methodological framework based on middleware. Our research work focuses on the development of a middle...
  • Research in high-confidence distributed information systems

    Publication Year: 2001, Page(s):76 - 77
    Cited by:  Papers (1)

    A high-confidence system is one in which the designers, implementers, and users have a high degree of assurance that the system will not fail or misbehave due to errors in the system, faults in the environment, or hostile attempts to compromise the system. Consequences of such system behavior are well understood and are predictable under an operational context envisioned by its creators. High-conf...
  • Looking ahead in atomic actions with exception handling

    Publication Year: 2001, Page(s):142 - 151
    Cited by:  Papers (2)

    An approach to introducing exception handling into object-oriented N is presented. A novel atomic action scheme is developed that does not impose any participant synchronisation on action exit. In order to use cooperative exception handling at the action level as the main fault tolerance mechanism, we develop a distributed protocol that finds, for any exception raised, an action containing all pot...
  • Designing a robust namespace for distributed file services

    Publication Year: 2001, Page(s):162 - 171
    Cited by:  Papers (7)  |  Patents (51)

    A number of ongoing research projects follow a partition-based approach to provide highly scalable distributed storage services. These systems maintain namespaces that reference objects distributed across multiple locations in the system. Typically, atomic commitment protocols, such as 2-phase commit, are used for updating the namespace, in order to guarantee its consistency even in the presence o...
  • Incorporation of security and fault tolerance mechanisms into real-time component-based distributed computing systems

    Publication Year: 2001, Page(s):74 - 75
    Cited by:  Papers (1)

    The volume and size of real-time (RT) distributed computing (DC) applications are now growing faster than in the last century. The mixture of application tasks running on such systems is growing, as is the shared use of computing and communication resources by multiple applications, including RT and non-RT applications. The increased use of shared resources brings with it the need for e...
  • Consensus with written messages under link faults

    Publication Year: 2001, Page(s):194 - 197

    This paper shows that deterministic consensus with written messages is possible in the presence of link faults and compromised signatures. Relying upon a suitable perception-based hybrid fault model that provides different categories for both node and link faults, we prove that the authenticated Byzantine agreement algorithms OMHA and ZA of Gong, Lincoln and Rushby (1995) can be made resilient to f
  • Quantifying rollback propagation in distributed checkpointing

    Publication Year: 2001, Page(s):36 - 45
    Cited by:  Papers (5)

    Proposes a new classification of executions with checkpoints that is based on the notion of k-rollback, indicating the maximal number of checkpoints that may need to be rolled back during recovery. The relation between known execution classes is explored, and it is shown that coordinated checkpointing, SZPF (strictly Z-path free) and ZPF (Z-path free) are 1-rollback mechanisms, while ZCF (Z-cycle ...
  • Efficient TDMA synchronization for distributed embedded systems

    Publication Year: 2001, Page(s):198 - 201
    Cited by:  Papers (10)

    A desired attribute in safety-critical embedded real-time systems is a system time/event synchronization capability on which predictable communication can be established. Focusing on bus-based communication protocols in TDMA environments, we present a novel, efficient, and low-cost synchronization approach with bounded start-up time. This approach utilizes information about each node's unique mess...
  • Optimizing file availability in a secure serverless distributed file system

    Publication Year: 2001, Page(s):4 - 13
    Cited by:  Papers (23)  |  Patents (3)

    Farsite is a secure, scalable, distributed file system that logically functions as a centralized file server but that is physically realized on a set of client desktop computers. Farsite provides security, reliability and availability by storing replicas of each file on multiple machines. It continuously monitors machine availability and relocates replicas as necessary to maximize the effective av...