
Reliable Distributed Systems, 2002. Proceedings. 21st IEEE Symposium on

Date 13-16 Oct. 2002


  • Proceedings 21st IEEE Symposium on Reliable Distributed Systems

    Publication Year: 2002
  • Author index

    Publication Year: 2002, Page(s):423 - 424
  • Fault-tolerant virtual private networks within an autonomous system

    Publication Year: 2002, Page(s):41 - 50
    Cited by:  Papers (1)

    This paper proposes the concept of a fault-tolerant virtual private network (FVPN) within an autonomous system: a framework for supporting seamless network fail-over by leveraging the inherent redundancy of the underlying Internet infrastructure. The proposed architecture includes an application-level module, which is integrated into gateways at VPN end-points. This module enables fail-over to a re...

  • Collaborative networking in an uncooperative Internet

    Publication Year: 2002, Page(s):51 - 60
    Cited by:  Papers (1)

    Collaborative applications often require peer-to-peer interaction and peer discovery mechanisms. In today's Internet, firewall and NAT technology, together with a lack of support for IP multicast, have made it very difficult to support such applications. Application Level Gateways and Directory Services can solve these problems to some extent, but have scalability problems and should be used as a last resort...

  • Dependability of CORBA systems: service characterization by fault injection

    Publication Year: 2002, Page(s):276 - 285
    Cited by:  Papers (16)

    The dependability of CORBA systems is a crucial issue for the development of today's distributed platforms and applications. This paper analyzes various techniques that can be applied to the dependability evaluation of CORBA systems. Due to the complexity of a middleware platform like CORBA and its various types of software components, experiments using several fault injection techniques are requi...

  • Availability models with age-dependent checkpointing

    Publication Year: 2002, Page(s):130 - 139
    Cited by:  Papers (6)

    In this paper, we consider a new stochastic model for file recovery action with checkpointing when a system failure occurs according to a homogeneous Poisson process. The present checkpoint model strongly depends on the system age and is quite different from the models by Gelenbe (1979) and Goes and Sumita (1995). We propose three kinds of approximation schemes to determine the optimal checkpoint ...

  • Asynchronous Byzantine group communication

    Publication Year: 2002, Page(s):352 - 357
    Cited by:  Papers (1)

    This paper summarizes our work on group communication in a fully asynchronous Byzantine environment. Instead of failure detectors or timing information, our protocols use randomization to circumvent the impossibility result by Fischer, Lynch and Paterson. This is the first time this technique has been used for a real system; thanks to modern cryptography, our protocols are practical and fast enough to b...

  • Fault-local stabilization: the shortest path tree

    Publication Year: 2002, Page(s):62 - 69
    Cited by:  Papers (2)

    We present a fault-local solution to the shortest path tree problem in a rooted network. We consider the case where a transient fault corrupts f nodes (f is unknown, but less than half the size of the network) after the tree has been constructed. Our solution allows recovery in less than O(f) time units. If an upper bound k on the number of corrupted nodes is known, the memory space needed de...

  • A lower bound on dynamic k-stabilization in asynchronous systems

    Publication Year: 2002, Page(s):212 - 221
    Cited by:  Papers (5)

    It is desirable that the smaller the number of faults hitting a network, the faster a network protocol recovers. We study the scenario where up to k (for a given k) faults hit processors of a synchronous distributed system by corrupting their state undetectably. In this context, we show that the well known step complexity model is not appropriate to study time complexity of time-adaptive protocols...

  • Analysis of inspection-based preventive maintenance in operational software systems

    Publication Year: 2002, Page(s):286 - 295
    Cited by:  Papers (7)

    Recently, the phenomenon of "software aging", in which the state of a software system gradually degrades with time and eventually leads to performance degradation or crash/hang failure, has been reported. Preventive maintenance of operational software systems is used specifically to counteract this phenomenon. However, preventive maintenance incurs an overhead in terms of downtime and cost and ...

  • Building a reliable mutable file system on peer-to-peer storage

    Publication Year: 2002, Page(s):324 - 329
    Cited by:  Papers (1)

    This paper sketches the design of the Eliot File System (Eliot), a mutable file system that maintains the pure immutability of its peer-to-peer (P2P) substrate by isolating mutation in an auxiliary metadata service. The immutability of address-to-content bindings has several advantages in P2P systems. However, mutable file systems are desirable because they allow clients to update existing files; a...

  • Management of mobile agent systems using social insect metaphors

    Publication Year: 2002, Page(s):410 - 415
    Cited by:  Papers (7)

    The management of mobile agent systems that solve problems in a network is an issue that must be addressed if mobile agents are to be deployed industrially. It is clear that insufficient or excessive numbers of agents can impair the problem-solving capabilities of an agent-based system. Also, agents, being software entities, are almost always flawed, therefore requiring the upgrade prob...

  • Power-aware epidemics

    Publication Year: 2002, Page(s):358 - 361
    Cited by:  Papers (2)

    Epidemic protocols have been heralded as appropriate for wireless sensor networks. The nodes in such networks have limited battery resources. In this paper we investigate the use of power in three styles of epidemic protocols: basic epidemics, neighborhood flooding epidemics, and hierarchical epidemics. Basic epidemics turn out to be highly power hungry, and are not appropriate for power-aware app...

  • OBIGrid: towards a new distributed platform for bioinformatics

    Publication Year: 2002, Page(s):380 - 381
    Cited by:  Papers (2)

    This paper describes the design philosophy for the grid system being developed by the Japan Committee on High-Performance Computing for Bioinformatics and Initiative for Parallel Bioinformatics (IPAB). The grid is an attractive solution to achieve a distributed bioinformatics environment with high-performance parallel computers, large genomic databases, computation-intensive applications such as h...

  • A self-stabilizing algorithm for the Steiner tree problem

    Publication Year: 2002, Page(s):396 - 401

    Self-stabilization is a theoretical framework of non-masking fault-tolerant distributed algorithms. In this paper, we investigate the Steiner tree problem in distributed systems, and propose a self-stabilizing solution to the problem. Our solution is based on the pruned-MST technique, a heuristic technique to find a minimal cost Steiner tree by pruning unnecessary nodes and edges in a minimum cost...

  • Efficient distributed precision control in symmetric replication environments

    Publication Year: 2002, Page(s):119 - 128

    Maintaining strict consistency of replicated data can be prohibitively expensive for many distributed applications and environments. In order to alleviate this problem, some systems allow applications to access stale, imprecise data. Due to relaxed correctness requirements, many applications can tolerate stale data but require that the imprecision be properly bounded. This paper describes ReBound,...

  • A fault-tolerant approach to secure information retrieval

    Publication Year: 2002, Page(s):12 - 21
    Cited by:  Papers (5)

    Several private information retrieval (PIR) schemes have been proposed to protect users' privacy when sensitive information stored in database servers is retrieved. However, existing PIR schemes assume that an attack on the servers changes neither the stored information nor any computational results. We present a novel fault-tolerant PIR scheme (called FT-PIR) that protects users' privacy and at the s...

  • Optimistic Byzantine agreement

    Publication Year: 2002, Page(s):262 - 267
    Cited by:  Papers (6)

    The paper considers the Byzantine agreement problem in a fully asynchronous network, where some participants may be actively malicious. This is an important building block for fault-tolerant applications in a hostile environment, and a non-trivial problem: an early result by Fischer et al. (1985) shows that there is no deterministic solution in a fully asynchronous network subject to even a single...

  • Non-intrusive, parallel recovery of replicated data

    Publication Year: 2002, Page(s):150 - 159
    Cited by:  Papers (7)

    The increasingly widespread use of cluster architectures has resulted in many new application scenarios for data replication. While data replication is, in principle, a well-understood problem, recovery of replicated systems has not yet received enough attention. In the case of clusters, recovery procedures are particularly important since they have to keep a high level of availability even during...

  • Service time optimal self-stabilizing token circulation protocol on anonymous unidirectional rings

    Publication Year: 2002, Page(s):80 - 89
    Cited by:  Papers (2)

    We present a self-stabilizing token circulation protocol on unidirectional anonymous rings. This protocol requires no processor identifiers or distinguished processor (i.e., all processors perform the same algorithm). The protocol is randomized and self-stabilizing, meaning that starting from an arbitrary configuration (in response to an arbitrary perturbation modifying the memory state), it reache...

  • The guardian model for exception handling in distributed systems

    Publication Year: 2002, Page(s):304 - 313
    Cited by:  Papers (4)

    We present an abstraction called guardian for exception handling in distributed systems. The guardian can address several limitations of existing distributed exception handling techniques. To understand these limitations, we analyze distributed exception handling with respect to sequential exception handling and identify the significant differences between them. This leads to the fundamental probl...

  • A unified proof of minimum time complexity for reaching consensus and uniform consensus - an oracle-based approach

    Publication Year: 2002, Page(s):102 - 108
    Cited by:  Papers (1)

    In this paper, we offer new proofs of two lower-bound results in distributed computing: a minimum of f+1 and f+2 rounds for reaching consensus and uniform consensus, respectively, when at most f fail-stop faults can happen. Here the computation model is synchronous message passing. Both proofs are based on a novel oracle argument. These two induction proofs are unified in the following sense: the in...

  • Implementation of threshold-based diagnostic mechanisms for COTS-based applications

    Publication Year: 2002, Page(s):296 - 303
    Cited by:  Papers (2)

    This work investigates feasibility issues that must be addressed when threshold-based mechanisms are to be used for diagnostic purposes in COTS-based distributed systems. Threshold-based mechanisms have typically been used for such purposes in embedded systems. A variety of solutions exist, with different characteristics of completeness, accuracy, and induced overhead. We first discuss the challen...

  • The performance of checkpointing and replication schemes for fault tolerant mobile agent systems

    Publication Year: 2002, Page(s):256 - 261
    Cited by:  Papers (13)

    We evaluate the performance of checkpointing and replication schemes for fault-tolerant mobile agent systems. For the quantitative comparison, we have implemented an experimental system on top of the Mole mobile agent system and also built a simulation system to include various failure cases. Our experiment aims to gain insight into the behavior of agents under the two schemes and provide a gui...

  • Using file-grain connectivity to implement a peer-to-peer file system

    Publication Year: 2002, Page(s):318 - 323
    Cited by:  Patents (5)

    Recent work has demonstrated a peer-to-peer storage system that locates data objects using O(log N) messages by placing objects on nodes according to pseudo-randomly chosen IDs. While elegant, this approach constrains system functionality and flexibility: files are immutable, directories and symbolic names are not supported, data location is fixed, and access locality is not exploited. This paper p...
