Proceedings of the 20th IEEE Symposium on Reliable Distributed Systems, 2001

Date: 31 Oct. 2001

  • Proceedings 20th IEEE Symposium on Reliable Distributed Systems

    Publication Year: 2001
  • Author index

    Publication Year: 2001, Page(s): 267
  • Message logging optimization for wireless networks

    Publication Year: 2001, Page(s):182 - 185
    Cited by:  Papers (2)

    This paper describes a message logging optimization that improves performance for failure recovery protocols where messages exchanged between mobile hosts are logged at base stations. The algorithm described and evaluated in this paper does not generate orphan processes in spite of base station failures and achieves run-time performance similar to that of asynchronous logging.
  • Why is it so hard to predict software system trustworthiness from software component trustworthiness?

    Publication Year: 2001
    Cited by:  Papers (1)

    When software is built from components, nonfunctional properties such as security, reliability, fault-tolerance, performance, availability, safety, etc. are not necessarily composed. The problem stems from our inability to know a priori, for example, that the security of a system composed of two components can be determined from knowledge about the security of each. This is because the security of...
  • Application of commercial-grade digital equipment in nuclear power plant safety systems

    Publication Year: 2001, Page(s):176 - 178

    Due to obsolescence, increasing maintenance costs, and the lack of qualified spare parts for the equipment and components of the analog instrumentation and control (I&C) systems in operating domestic nuclear power plants, nuclear utilities are replacing equipment and upgrading certain I&C systems. These activities generally involve changing from analog to digital technology. In many cases ...
  • Comparison-based system-level fault diagnosis in ad hoc networks

    Publication Year: 2001, Page(s):257 - 266
    Cited by:  Papers (59)

    The problem of identifying faulty mobiles in ad-hoc networks is considered. Current diagnostic models were designed for wired networks, thus they do not take advantage of the shared nature of communication typical of ad-hoc networks. In this paper we introduce a new comparison-based diagnostic model based on the one-to-many communication paradigm. Two implementations of the model are presented. In...
  • High-quality customizable embedded software from COTS components

    Publication Year: 2001, Page(s):174 - 175

    Dramatic advances in computer and communication technologies have greatly promoted the growth of embedded telecommunication systems. More and more critical applications, such as banking and financial services, remote patient monitoring systems, transportation, etc., are being developed. The software for these applications is becoming increasingly sophisticated and complex and this trend will accel...
  • Polynomial time synthesis of Byzantine agreement

    Publication Year: 2001, Page(s):130 - 139
    Cited by:  Papers (6)

    We present a polynomial time algorithm for automatic synthesis of fault-tolerant distributed programs, starting from fault-intolerant versions of those programs. Since this synthesis problem is known to be NP-hard, our algorithm relies on heuristics to reduce the complexity. We demonstrate that our algorithm is able to synthesize an agreement program that tolerates a Byzantine fault.
  • On the effectiveness of a counter-based cache invalidation scheme and its resiliency to failures in mobile environments

    Publication Year: 2001, Page(s):247 - 256
    Cited by:  Papers (4)  |  Patents (1)

    Caching frequently accessed data items on the client side is an effective technique to improve the performance of data dissemination in mobile environments. Classical cache invalidation strategies are not suitable for mobile environments due to the disconnection and mobility of the mobile clients. One attractive cache invalidation technique is based on invalidation reports (IRs). However, IR-based...
  • Designing a robust namespace for distributed file services

    Publication Year: 2001, Page(s):162 - 171
    Cited by:  Papers (7)  |  Patents (51)

    A number of ongoing research projects follow a partition-based approach to provide highly scalable distributed storage services. These systems maintain namespaces that reference objects distributed across multiple locations in the system. Typically, atomic commitment protocols, such as 2-phase commit, are used for updating the namespace, in order to guarantee its consistency even in the presence o...
  • Research in high-confidence distributed information systems

    Publication Year: 2001, Page(s):76 - 77
    Cited by:  Papers (1)

    A high-confidence system is one in which the designers, implementers, and users have a high degree of assurance that the system will not fail or misbehave due to errors in the system, faults in the environment, or hostile attempts to compromise the system. Consequences of such system behavior are well understood and are predictable under an operational context envisioned by its creators. High-conf...
  • How to select a replication protocol according to scalability, availability and communication overhead

    Publication Year: 2001, Page(s):24 - 33
    Cited by:  Papers (6)

    Data replication is playing an increasingly important role in the design of parallel information systems. In particular, the widespread use of cluster architectures in high-performance computing has created many opportunities for applying data replication techniques in new areas. For instance, as part of work related to cluster computing in bioinformatics, we have been confronted with the problem ...
  • A consensus protocol based on a weak failure detector and a sliding round window

    Publication Year: 2001, Page(s):120 - 129
    Cited by:  Papers (1)

    The paper revisits the "sliding window" notion commonly encountered in communication protocols and applies it to the round numbers of round-based asynchronous protocols. This approach is novel. To illustrate its benefits, the paper presents an original weak failure detector-based consensus protocol that allows each process to be simultaneously involved in several rounds. The rounds in which a proc...
  • Can reliability and security be joined reliably and securely?

    Publication Year: 2001, Page(s):72 - 73
    Cited by:  Papers (1)

    The combined topics of reliability and security are briefly traced in relation to the past and present endeavors of the Air Force Research Laboratory's Information Directorate. It is concluded that in the realm of information assurance, system features created to tolerate benign failures and to respond to attack must be stressed and tested beforehand and their effectiveness predicted, otherwise th...
  • Continental Pronto

    Publication Year: 2001, Page(s):46 - 55
    Cited by:  Papers (1)

    Continental Pronto unifies high availability and disaster resilience at the specification and implementation levels. At the specification level, Continental Pronto formalizes the client's view of a system addressing local-area and wide-area data replication within a single framework. At the implementation level, Continental Pronto makes data highly available and disaster resilient. The algorithm p...
  • Reliable real-time cooperation of mobile autonomous systems

    Publication Year: 2001, Page(s):238 - 246
    Cited by:  Papers (7)

    Autonomous systems are expected to provide increasingly complex and safety-critical services that will, sooner or later, require the cooperation of several autonomous systems for their fulfillment. In particular, coordinating the access to shared physical and information technological resources will become a general problem. Scheduling these resources is subject to strong real-time and reliability...
  • Detecting heap smashing attacks through fault containment wrappers

    Publication Year: 2001, Page(s):80 - 89
    Cited by:  Papers (8)  |  Patents (2)

    Buffer overflow attacks are a major cause of security breaches in modern operating systems. Not only are overflows of buffers on the stack a security threat, overflows of buffers kept on the heap can be too. A malicious user might be able to hijack the control flow of a root-privileged program if the user can initiate an overflow of a buffer on the heap when this overflow overwrites a function poi...
  • Assessing inter-modular error propagation in distributed software

    Publication Year: 2001, Page(s):152 - 161
    Cited by:  Papers (12)

    With the functionality of most embedded systems based on software (SW), interactions amongst SW modules arise, resulting in error propagation across them. During SW development, it would be helpful to have a framework that clearly demonstrates the error propagation and containment capabilities of the different SW components. In this paper, we assess the impact of inter-modular error propagation. A...
  • Incorporation of security and fault tolerance mechanisms into real-time component-based distributed computing systems

    Publication Year: 2001, Page(s):74 - 75
    Cited by:  Papers (1)

    The volume and size of real-time (RT) distributed computing (DC) applications are now growing faster than in the last century. The mixture of application tasks running on such systems is growing as well as the shared use of computing and communication resources for multiple applications including RT and non-RT applications. The increase in use of shared resources accompanies with it the need for e...
  • Efficient update diffusion in Byzantine environments

    Publication Year: 2001, Page(s):90 - 98
    Cited by:  Papers (10)

    We present a protocol for diffusion of updates among replicas in a distributed system where up to b replicas may suffer Byzantine failures. Our algorithm ensures that no correct replica accepts spurious updates introduced by faulty replicas, by requiring that a replica accepts an update only after receiving it from at least b+1 distinct replicas (or directly from the update source). Our algorithm ...
  • Applying fault-tolerance principles to security research

    Publication Year: 2001, Page(s):68 - 69

    There has been much focus on building secure distributed systems. The CERIAS center has been established at Purdue along with 14 other such centers in USA. We note that many of the ideas, concepts, algorithms being proposed in security have many common threads with reliability. We need to apply the science and engineering of reliability research to the research in security and vice versa. We brief...
  • The challenge of creating productive collaborating information assurance communities via Internet research and standards

    Publication Year: 2001, Page(s):70 - 71
    Cited by:  Papers (2)

    Overviews the challenging 5-year process leading to the design, specification, and implementation of the Internet Engineering Task Force (IETF) Intrusion Detection Working Group (IDWG) Intrusion Detection Exchange Protocol (IDXP). IDXP seeks to facilitate the ubiquitous interoperability of intrusion detection components across Internet enterprises. This capability is a critical enabler of successful intrus...
  • Performance analysis of the CORBA notification service

    Publication Year: 2001, Page(s):227 - 236
    Cited by:  Papers (1)

    As CORBA (Common Object Request Broker Architecture) gains popularity as a standard for portable, distributed, object-oriented computing, the need for a CORBA messaging solution is being increasingly felt. This led the Object Management Group (OMG) to specify a Notification Service that aims to provide a more flexible and robust messaging solution than the earlier Event Service. The Notification S...
  • Efficient recovery information management schemes for the fault tolerant mobile computing systems

    Publication Year: 2001, Page(s):202 - 205
    Cited by:  Papers (2)

    This paper presents region-based storage management schemes, which support the efficient implementation of checkpointing and message logging for fault tolerant mobile computing systems. In the proposed schemes, a recovery manager assigned for a group of cells takes care of the recovery for the mobile hosts within the region. As a result, the recovery information of a mobile host, which may be disp...
  • Chasing the FLP impossibility result in a LAN: or, How robust can a fault tolerant server be?

    Publication Year: 2001, Page(s):190 - 193
    Cited by:  Papers (6)

    Fault tolerance can be achieved in distributed systems by replication. However Fischer, Lynch and Paterson (1985) have proven an impossibility result about consensus in the asynchronous system model, and similar impossibility results exist for atomic broadcast and group membership. We investigate, with the aid of an experiment conducted in a LAN, whether these impossibility results set limits to t...