21st IEEE Symposium on Reliable Distributed Systems, 2002. Proceedings.

13-16 Oct. 2002

  • Proceedings 21st IEEE Symposium on Reliable Distributed Systems

    Publication Year: 2002
    Freely Available from IEEE
  • Author index

    Publication Year: 2002, Page(s):423 - 424
    Freely Available from IEEE
  • Management of mobile agent systems using social insect metaphors

    Publication Year: 2002, Page(s):410 - 415
    Cited by:  Papers (8)

    The management of mobile agent systems that solve problems in a network is an issue that must be addressed if mobile agents are to be deployed industrially. It is clear that insufficient or excessive numbers of agents can cause the problem-solving capabilities of an agent-based system to be impaired. Also, agents, being software entities, are almost always flawed, therefore requiring the upgrade prob...
  • Failure detectors for large-scale distributed systems

    Publication Year: 2002, Page(s):404 - 409
    Cited by:  Papers (33)

    This paper discusses the problem of implementing a scalable failure detection service for grid systems. More specifically, traditional implementations of failure detectors are often tuned for running over local networks and fail to address important problems found in wide-area distributed systems, such as grid systems. We identify some of the most important problems raised in the context of grids....
  • A self-stabilizing algorithm for the Steiner tree problem

    Publication Year: 2002, Page(s):396 - 401

    Self-stabilization is a theoretical framework of non-masking fault-tolerant distributed algorithms. In this paper, we investigate the Steiner tree problem in distributed systems, and propose a self-stabilizing solution to the problem. Our solution is based on the pruned-MST technique, a heuristic technique to find a minimal cost Steiner tree by pruning unnecessary nodes and edges in a minimum cost...
  • A self-stabilizing algorithm for finding cliques in distributed systems

    Publication Year: 2002, Page(s):390 - 395
    Cited by:  Papers (4)

    Self-stabilization is a theoretical framework of non-masking fault-tolerant algorithms in distributed systems. In this paper, we consider the problem of finding fully connected subgraphs (cliques) in a network. In our problem setting, each process P in a network G is given a set of its neighbor processes as input, and must find a set of neighbors that are fully connected together with P. As constraints...
  • Analysis of inspection-based preventive maintenance in operational software systems

    Publication Year: 2002, Page(s):286 - 295
    Cited by:  Papers (11)

    Recently, the phenomenon of "software aging", one in which the state of a software system gradually degrades with time and eventually leads to performance degradation or crash/hang failure, has been reported. Preventive maintenance of operational software systems is used specifically to counteract this phenomenon. However, preventive maintenance incurs an overhead in terms of downtime and cost and ...
  • Self-stabilizing distributed file systems

    Publication Year: 2002, Page(s):384 - 389
    Cited by:  Papers (4)

    A self-stabilizing distributed file system is presented. The system constructs and maintains a spanning tree for each file volume. The spanning tree consists of the servers that have volume replicas and caches for the specific file volume. The spanning trees are constructed and maintained by self-stabilizing distributed algorithms. File system updates use the tree to implement file read and write ...
  • Dependability of CORBA systems: service characterization by fault injection

    Publication Year: 2002, Page(s):276 - 285
    Cited by:  Papers (17)

    The dependability of CORBA systems is a crucial issue for the development of today's distributed platforms and applications. This paper analyzes various techniques that can be applied to the dependability evaluation of CORBA systems. Due to the complexity of a middleware platform like CORBA and its various types of software components, experiments using several fault injection techniques are requi...
  • OBIGrid: towards a new distributed platform for bioinformatics

    Publication Year: 2002, Page(s):380 - 381
    Cited by:  Papers (2)

    This paper describes the design philosophy for the grid system being developed by the Japan Committee on High-Performance Computing for Bioinformatics and Initiative for Parallel Bioinformatics (IPAB). The grid is an attractive solution to achieve a distributed bioinformatics environment with high-performance parallel computers, large genomic databases, computation-intensive applications such as h...
  • Modeling communication delays in distributed systems using time series

    Publication Year: 2002, Page(s):268 - 273
    Cited by:  Papers (6)

    The design of dependable distributed applications is a hard task, mainly because of the indefinable statistic behavior of the communication delays. Despite this feature, in practice, most system monitors make use of timeouts (a maximum waiting time) to ensure some termination properties in their protocols. To have better results, some monitors dynamically predict new timeout values based on observ...
  • A fault-tolerant approach to secure information retrieval

    Publication Year: 2002, Page(s):12 - 21
    Cited by:  Papers (6)

    Several private information retrieval (PIR) schemes were proposed to protect users' privacy when sensitive information stored in database servers is retrieved. However, existing PIR schemes assume that any attack to the servers does not change the information stored and any computational results. We present a novel fault-tolerant PIR scheme (called FT-PIR) that protects users' privacy and at the s...
  • Self-stabilizing local mutual exclusion on networks in which process identifiers are not distinct

    Publication Year: 2002, Page(s):202 - 211
    Cited by:  Papers (6)

    A self-stabilizing system is a system such that it autonomously converges to a legitimate system state, regardless of the initial system state. The local mutual exclusion problem is the problem of guaranteeing that no two processes neighboring each other execute their critical sections at a time. The process identifiers are said to be chromatic if no two processes neighboring each other have the s...
  • Implementing IPv6 as a peer-to-peer overlay network

    Publication Year: 2002, Page(s):347 - 351
    Cited by:  Papers (3)

    This paper proposes to implement an IPv6 routing infrastructure as a self-organizing overlay network on top of the current IPv4 infrastructure. The overlay network builds upon a distributed IPv6 edge router with a master/slave architecture. We show how different slaves can be constructed to tunnel through NATs and firewalls, as well as to improve robustness of the routing infrastructure and to pro...
  • Collaborative networking in an uncooperative Internet

    Publication Year: 2002, Page(s):51 - 60
    Cited by:  Papers (1)

    Collaborative applications often require peer-to-peer interaction and peer discovery mechanisms. In today's Internet, Firewall and NAT technology, and a lack of support of IP multicast, have made it very difficult to support such applications. Application Level Gateways and Directory Services can solve these problems to some extent, but have scalability problems and should be used as a last resort...
  • From Byzantine agreement to practical survivability: a position paper

    Publication Year: 2002, Page(s):374 - 379
    Cited by:  Papers (1)  |  Patents (1)

    Only a decade ago, issues of replication, high availability and load balancing were the focus of small, closely coupled cluster projects. Consequently, techniques for cluster management and small replication systems are abundant. However, the advent of the Internet led to widespread and highly decentralized access of services and content that bring issues of scale and ubiquitous deployment. In pa...
  • A unified proof of minimum time complexity for reaching consensus and uniform consensus - an oracle-based approach

    Publication Year: 2002, Page(s):102 - 108
    Cited by:  Papers (1)

    In this paper, we offer new proofs to two lower bound results in distributed computing: a minimum of f+1 and f+2 rounds for reaching consensus and uniform consensus respectively when at most f fail-stop faults can happen. Here the computation model is synchronous message passing. Both proofs are based on a novel oracle argument. These two induction proofs are unified in the following sense: the in...
  • Optimistic Byzantine agreement

    Publication Year: 2002, Page(s):262 - 267
    Cited by:  Papers (8)

    The paper considers the Byzantine agreement problem in a fully asynchronous network, where some participants may be actively malicious. This is an important building block for fault-tolerant applications in a hostile environment, and a non-trivial problem: An early result by Fischer et al. (1985) shows that there is no deterministic solution in a fully asynchronous network subject to even a single...
  • The guardian model for exception handling in distributed systems

    Publication Year: 2002, Page(s):304 - 313
    Cited by:  Papers (5)

    We present an abstraction called guardian for exception handling in distributed systems. The guardian can solve several limitations with existing distributed exception handling techniques. To understand these limitations, we analyze distributed exception handling with respect to sequential exception handling and identify the significant differences between them. This leads to the fundamental probl...
  • Efficient Byzantine-resilient reliable multicast on a hybrid failure model

    Publication Year: 2002, Page(s):2 - 11
    Cited by:  Papers (15)  |  Patents (1)

    The paper presents a new reliable multicast protocol that tolerates arbitrary faults, including Byzantine faults. This protocol is developed using a novel way of designing secure protocols which is based on a well-founded hybrid failure model. Despite our claim of arbitrary failure resilience, the protocol need not necessarily incur the cost of "Byzantine agreement", in number of participants and ...
  • Efficient distributed precision control in symmetric replication environments

    Publication Year: 2002, Page(s):119 - 128

    Maintaining strict consistency of replicated data can be prohibitively expensive for many distributed applications and environments. In order to alleviate this problem, some systems allow applications to access stale, imprecise data. Due to relaxed correctness requirements, many applications can tolerate stale data but require that the imprecision be properly bounded. This paper describes ReBound,...
  • Heterogeneous checkpointing for multithreaded applications

    Publication Year: 2002, Page(s):140 - 149
    Cited by:  Papers (4)  |  Patents (6)

    We present the first heterogeneous checkpointing scheme for applications using POSIX threads. The scheme relies on source code instrumentation to achieve heterogeneity. It supports various types of synchronization primitives, such as locks, semaphores, condition variables, and join operations. Unlike other non-heterogeneous checkpointing schemes proposed in the literature, our scheme supports both...
  • Optimistic total order in wide area networks

    Publication Year: 2002, Page(s):190 - 199
    Cited by:  Papers (17)

    Total order multicast greatly simplifies the implementation of fault-tolerant services using the replicated state machine approach. The additional latency of total ordering can be masked by taking advantage of spontaneous ordering observed in LANs: A tentative delivery allows the application to proceed in parallel with the ordering protocol. The effectiveness of the technique rests on the optimist...
  • A search for routing strategies in a peer-to-peer network using genetic programming

    Publication Year: 2002, Page(s):341 - 346
    Cited by:  Papers (1)

    Results taken from a simulated peer-to-peer network are described, in which genetic programming is utilized to evolve routing strategies that optimize resource location in various traffic flow scenarios. In all cases the evolved strategies result in more numerous resource locations than a pure, non-adaptive peer-to-peer protocol such as the Gnutella protocol. The resulting evolved strategies are d...
  • Fault-tolerant virtual private networks within an autonomous system

    Publication Year: 2002, Page(s):41 - 50
    Cited by:  Papers (1)

    This paper proposes the concept of a fault-tolerant virtual private network (FVPN) within an autonomous system, a framework for supporting seamless network fail-over by leveraging the inherent redundancy of the underlying Internet infrastructure. The proposed architecture includes an application-level module, which is integrated into gateways at VPN end-points. This module enables fail-over to a re...