
Proceedings 1999 Pacific Rim International Symposium on Dependable Computing

16-17 Dec. 1999


Displaying Results 1 - 25 of 34
  • Proceedings 1999 Pacific Rim International Symposium on Dependable Computing

    Publication Year: 1999
    Freely Available from IEEE
  • Index of authors

    Publication Year: 1999, Page(s): 277
    Freely Available from IEEE
  • Fault-tolerant routing algorithms based on optimal path matrices

    Publication Year: 1999, Page(s):227 - 233

    Presents a new concept - optimal path matrices (OPMs) - for fault-tolerant routing on hypercube multicomputers. OPMs stored on each node of a hypercube hold the fault information and indicate whether there is an optimal path from the node to a destination. Two fault-tolerant routing algorithms based on OPMs are proposed in order to maintain the matrices and to route messages from sources to destin...
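
    As a rough illustration of the question an OPM answers (does an optimal path to the destination still exist, given the known faults?), the sketch below searches only along dimensions in which the current node and the destination differ, so any path it finds has length equal to the Hamming distance. The function name `has_optimal_path` and the single global fault set are assumptions made for illustration; the paper's per-node matrices and its two routing algorithms are not reproduced here.

```python
def has_optimal_path(src: int, dst: int, faulty: set[int]) -> bool:
    """Return True if a fault-free path of Hamming-distance length links src to dst.

    Nodes are hypercube addresses (integers); `faulty` holds faulty node addresses.
    """
    if src in faulty or dst in faulty:
        return False
    frontier = {src}  # all nodes here are equally far from dst
    while frontier:
        if dst in frontier:
            return True
        next_frontier = set()
        for node in frontier:
            diff = node ^ dst  # bits still to be corrected
            bit = 1
            while bit <= diff:
                if diff & bit:
                    neighbor = node ^ bit  # one hop closer to dst
                    if neighbor not in faulty:
                        next_frontier.add(neighbor)
                bit <<= 1
        frontier = next_frontier
    return False
```

    In a 3-cube, for example, `has_optimal_path(0, 7, {1, 2})` succeeds through node 4, while marking all three neighbors of node 0 as faulty leaves no optimal path.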

  • Empirical-Bayesian availability indices of safety and time critical software systems with corrective maintenance

    Publication Year: 1999, Page(s):84 - 91
    Cited by:  Papers (3)

    If the recovery or remedial time is not incorporated in the reliability of a software module in a safety and time-critical integrated system operation, then a mere reliability index based on failure characteristics is simply not adequate and realistic. In deriving the probability density function (pdf) of the software availability, empirical Bayesian procedures will be used to employ expert engine...

  • Performance of message logging protocols for NOWs with MPI

    Publication Year: 1999, Page(s):252 - 259

    Among the various systems developed for parallel and distributed computing, networks of workstations (NOWs) based on the Message Passing Interface (MPI) have been recognized as an efficient platform. In this paper, we implement and compare two important message logging protocols, pessimistic and optimistic, for a NOW employing MPI. An experiment reveals that the total execution time is not signifi...

  • Reliable probabilistic checkpointing

    Publication Year: 1999, Page(s):153 - 160

    Recently proposed probabilistic checkpointing has one drawback, namely aliasing. When analyzed, 64-bit signatures show negligible possibility of aliasing. But in practice, the shift-XOR signature generation function used with probabilistic checkpointing shows a high aliasing rate, which limits the practicality of probabilistic checkpointing. In this paper, two enhancements are considered to make p...
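
    To make the aliasing problem concrete, the sketch below folds a block of 64-bit words into one signature with a shift followed by an XOR, then shows two different blocks that collide. The exact signature function used with probabilistic checkpointing is not given in this abstract, so `shift_xor_signature` is a hypothetical stand-in rather than the paper's function.

```python
MASK64 = (1 << 64) - 1  # keep the signature to 64 bits


def shift_xor_signature(words: list[int]) -> int:
    """Fold 64-bit words into one signature: shift left by one, then XOR the next word.

    Hypothetical stand-in for a shift-XOR signature; not the paper's exact function.
    """
    sig = 0
    for word in words:
        sig = ((sig << 1) & MASK64) ^ (word & MASK64)
    return sig


# Two different memory blocks that produce the same signature (an alias).
block_a = [1, 0]
block_b = [0, 2]
aliased = shift_xor_signature(block_a) == shift_xor_signature(block_b)
```

    Here `aliased` is true even though the blocks differ, so a checkpointer that compares signatures alone would wrongly treat a modified block as unchanged.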

  • Self-validating diagnosis of hypercube systems

    Publication Year: 1999, Page(s):218 - 226
    Cited by:  Papers (4)

    A novel approach to the diagnosis of hypercubes, called self-validating diagnosis (SVD), is introduced. An algorithm based on this approach, called the SVD algorithm, is presented and evaluated. Given any fault set and the resulting syndrome, the algorithm returns a diagnosis and a syndrome-dependent bound, Tσ, with the property that the diagnosis is correct (although possibly inc...

  • Testing-resource allocation for redundant software systems

    Publication Year: 1999, Page(s):78 - 83
    Cited by:  Papers (8)

    For many safety critical systems, redundancy is the only acceptable method to achieve high operational reliability as individual modules can hardly be certified to have reached that level. When limited resources are available in the testing of a redundant software system, it is important to allocate the testing-time efficiently so that the maximum reliability of the complete system is achieved. In...

  • A new placement algorithm dedicated to parallel computers: bases and application

    Publication Year: 1999, Page(s):242 - 249
    Cited by:  Patents (4)

    One way to improve reliability in parallel computers consists of adding supplementary processors and interconnections to the functional structure in order to replace faulty processors with respect to the network structure. This approach is named structural fault tolerance (SFT). Very integrated parallel computers are one way to implement a parallel structure. The material structure is then compose...

  • The effect of interconnect schemes on the dependability of a modular multi-processor system with shared resources

    Publication Year: 1999, Page(s):103 - 110

    AlliedSignal's Avionics & Lighting business unit is expanding the performance of its flight safety avionics by means of functional integration (added functionality enabled by exchanging information between traditionally stand-alone subsystems), as well as physical integration (sharing of system resources) and full dual redundancy. Major performance goals of this integrated modular architecture...

  • A simulated fault injection tool for dependable VoD application design

    Publication Year: 1999, Page(s):170 - 177

    This work presents a simulation-based tool for dependability-oriented design of Video on Demand (VoD) applications. The tool is organized in a layered architecture, so that simulation models can be built and detailed according to a hierarchical and modular approach. The higher layer, namely the Application Level, provides a variety of objects to rapidly model fundamental components typically found...

  • Availability and performance evaluation for automatic protection switching in TDMA wireless system

    Publication Year: 1999, Page(s):15 - 22

    In this paper, we compare the availability and performance of a wireless TDMA system with and without automatic protection switching. Stochastic reward net models are constructed and solved by SPNP (Stochastic Petri Net Package). Hierarchical decomposition is adopted to simplify the analysis. The optimization of the number of guard channels reserved for the handoff calls is studied. Numerical resu...

  • A novel NMR structure with concurrent output error location capability

    Publication Year: 1999, Page(s):32 - 39
    Cited by:  Papers (2)

    This paper proposes a novel N-modular redundancy (NMR) structure with concurrent output error location (COEL) capability. The concurrent output error locatable NMR structure consists of a conventional NMR structure and a totally self-checking (TSC) extra circuit with N+1 two-rail code outputs. This extra circuit is used for locating the output error produced by any replicated module or the voter, ...
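
    A software analogy may help fix the idea of voting plus error location: the sketch below takes the outputs of N replicated modules, returns the majority value, and reports which modules disagreed with it. The name `vote_and_locate` is an assumption, and the sketch treats the voter itself as fault-free, whereas the circuit described in the abstract also locates errors produced by the voter.

```python
from collections import Counter


def vote_and_locate(outputs: list[int]) -> tuple[int, list[int]]:
    """Majority-vote over replicated module outputs and list the dissenting modules.

    Returns (majority value, indices of modules whose output differs from it).
    Raises ValueError when no value is held by a strict majority.
    """
    if not outputs:
        raise ValueError("no module outputs supplied")
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority among module outputs")
    dissenting = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenting
```

    For a triple-modular configuration, `vote_and_locate([5, 5, 7])` yields the voted value 5 and flags module 2 as the source of the error.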

  • An architecture-based software reliability model

    Publication Year: 1999, Page(s):143 - 150
    Cited by:  Papers (33)

    We present an analytical model for estimating architecture-based software reliability, according to the reliability of each component, the operational profile, and the architecture of software. Our approach is based on Markov chain properties and architecture view to state view transformations to perform reliability analysis on heterogeneous software architectures. We demonstrate how this analytic...
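
    One generic way to combine component reliabilities with a Markov model of control flow is sketched below. It assumes that execution starts in component 0, ends in the last component, and that `P[i][j]` is the probability that control passes from component i to component j. These conventions and the name `system_reliability` are assumptions for illustration; the paper's transformation from architecture views to state views is not reproduced.

```python
def system_reliability(R: list[float], P: list[list[float]], iterations: int = 1000) -> float:
    """Estimate the probability that a run starting in component 0 finishes correctly.

    R[i] is the reliability of component i; P[i][j] is the transfer probability i -> j.
    The last component is the terminal one, so its row in P is ignored.
    """
    n = len(R)
    # x[i]: probability of correct completion given that control enters component i
    x = [0.0] * n
    for _ in range(iterations):
        x = [
            R[i] * (1.0 if i == n - 1 else sum(P[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
    return x[0]
```

    For a purely sequential three-component architecture the result reduces to the product of the component reliabilities, which is a convenient sanity check on the model.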

  • A simple and efficient deadlock recovery scheme for wormhole routed 2-dimensional meshes

    Publication Year: 1999, Page(s):210 - 217
    Cited by:  Papers (2)

    In order to avoid deadlocks, prevention-based routing algorithms impose certain routing restrictions which lead to high hardware complexity or low adaptability. If deadlock occurrences are extremely rare, recovery-based routing algorithms become more attractive with respect to hardware complexity and routing adaptability. A simple architecture where each router is provided with an additional speci...

  • Hardware fault tolerance in arithmetic coding for data compression

    Publication Year: 1999, Page(s):70 - 77

    New fault tolerance techniques are presented for protecting a lossless compression algorithm, arithmetic coding, whose recursive nature makes it vulnerable to temporary hardware failures. The fundamental arithmetic operations are protected by low-cost residue codes, employing fault tolerance in multiplications and additions. Additional fault-tolerant design techniques are developed to protect othe...
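
    The residue-code idea can be shown in a few lines: the residue of a sum or product must equal the sum or product of the operand residues, so a cheap check on small residues can expose a corrupted result. The function names and the choice of modulus 3 below are assumptions for illustration, not details taken from the paper.

```python
def residue_ok_mul(a: int, b: int, product: int, modulus: int = 3) -> bool:
    """Check a claimed product against the product of the operand residues."""
    return product % modulus == ((a % modulus) * (b % modulus)) % modulus


def residue_ok_add(a: int, b: int, total: int, modulus: int = 3) -> bool:
    """Check a claimed sum against the sum of the operand residues."""
    return total % modulus == ((a % modulus) + (b % modulus)) % modulus


correct = 12 * 7           # 84
bit_flipped = correct ^ 4  # the same value with one bit inverted
```

    With modulus 3, any single-bit error changes the result by a power of two, which is never a multiple of 3, so `residue_ok_mul(12, 7, bit_flipped)` fails. An error that happens to be a multiple of the modulus goes undetected, which is the price of the low cost.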

  • Reconfiguration of two-dimensional meshes embedded in hypercubes

    Publication Year: 1999, Page(s):234 - 241

    Proposes a method of reconfiguring 2D meshes embedded in hypercubes. Our reconfiguration for link failures consists of two stages. The first stage assigns the d dimensions of the hypercubes to two directions with respect to rows and columns in the mesh, so that the number of disconnected pairs with adjacent rows and columns becomes smaller. The second stage re-establishes the mesh communication by...
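
    For background, the fault-free starting point can be sketched with the standard Gray-code embedding, in which some hypercube dimensions encode the row and the rest encode the column, so that neighboring mesh nodes map to neighboring hypercube nodes. The function names below are assumptions, and the sketch does not attempt the paper's two-stage reconfiguration for link failures.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code: consecutive integers differ in exactly one bit."""
    return i ^ (i >> 1)


def mesh_to_cube(row: int, col: int, col_bits: int) -> int:
    """Map mesh coordinates to a hypercube address.

    The high bits carry the Gray code of the row and the low `col_bits` bits
    carry the Gray code of the column.
    """
    return (gray(row) << col_bits) | gray(col)
```

    In a 4 x 8 mesh embedded in a 5-cube this way, each horizontal or vertical mesh link corresponds to a single hypercube link, so the failure of that one link disconnects the corresponding pair of adjacent mesh nodes.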

  • FBD: a fault-tolerant buffering disk system for improving write performance of RAID5 systems

    Publication Year: 1999, Page(s):95 - 102
    Cited by:  Papers (1)  |  Patents (1)

    The parity calculation technique of the RAID5 provides high reliability, efficient disk space usage, and good read performance for parallel-disk-array configurations. However, it requires four disk accesses for each write request. The write performance of a RAID5 is therefore poor compared with its read performance. We propose a buffering system to improve write performance while maintaining the r...
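
    The four accesses come from the read-modify-write cycle of a small write: read the old data, read the old parity, write the new data, and write the new parity. The sketch below shows the parity arithmetic only. The helper names are assumptions, and the proposed buffering disk system itself is not modelled.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity for a single-block update: old parity XOR old data XOR new data.

    Obtaining the two inputs costs two reads; storing the new data and the
    new parity costs two writes.
    """
    return xor_blocks(old_parity, old_data, new_data)
```

    The result matches the parity recomputed from the whole stripe, which is why the other data disks in the stripe do not need to be read.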

  • Measurement and modeling of burst packet losses in Internet end-to-end communications

    Publication Year: 1999, Page(s):260 - 267
    Cited by:  Papers (17)

    We have measured the packet loss ratio, its time dependency, and the frequency of burst packet losses in Internet end-to-end communications. To do this, we developed a tool that sends and receives UDP (User Datagram Protocol) packets. Our measurements showed that long burst losses are more likely when the packet loss ratio is high. We then examined two models for calculating the burst packet loss,...
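
    The two quantities measured here can be computed from a simple per-packet trace. The sketch below assumes a hypothetical trace format, a list of booleans in send order with `True` meaning the packet arrived, and returns the loss ratio together with the length of every run of consecutive losses. It is not the authors' measurement tool, and it does not implement either of the models they examine.

```python
from itertools import groupby


def loss_stats(received: list[bool]) -> tuple[float, list[int]]:
    """Return (packet loss ratio, lengths of consecutive-loss bursts) for a trace.

    `received[i]` is True if packet i arrived and False if it was lost.
    """
    if not received:
        raise ValueError("empty trace")
    bursts = [
        sum(1 for _ in run)
        for lost, run in groupby(received, key=lambda ok: not ok)
        if lost
    ]
    return sum(bursts) / len(received), bursts
```

    For the trace `[True, False, False, True, False, True, True, True]` this gives a loss ratio of 0.375 and bursts of length 2 and 1. A histogram of the burst lengths is then the natural input for fitting a burst-loss model.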

  • Optimal checkpointing and rollback strategies with media failures: statistical estimation algorithms

    Publication Year: 1999, Page(s):161 - 168
    Cited by:  Papers (4)

    This paper considers two stochastic models for a file recovery action with checkpoint generations when two kinds of failures, system failure and media failure, occur according to a homogeneous Poisson process and a renewal process, respectively. For the unknown media failure time distribution, we develop statistical nonparametric algorithms to estimate the optimal checkpoint intervals which maximi...

  • Dependability issues in mobile distributed system

    Publication Year: 1999, Page(s):7 - 14
    Cited by:  Papers (4)

    The article discusses dependability issues in Distributed Systems comprising mobile hosts and wireless data communications (Mobile Distributed Systems). By enabling motion and location independence, wireless data communications and mobile hosts allow information access that may occur any time and any place. We show that location and time significantly affect the dependability concept of distributed...

  • A fuzzy based approach for the design and evaluation of dependable systems using the Markov model

    Publication Year: 1999, Page(s):112 - 119
    Cited by:  Papers (1)

    Dependability is a subject of great importance in the development of critical systems and may be quantified in terms of various factors, such as reliability, maintainability and availability, whose significance may vary between different applications. A generic technique based on the Markov model is proposed in this paper, using fuzzy theory, for the reliability and safety assessment of fault-tole...

  • Networked Windows NT system field failure data analysis

    Publication Year: 1999, Page(s):178 - 185
    Cited by:  Papers (28)  |  Patents (1)

    This paper presents a measurement-based dependability study of a Networked Windows NT system based on field data collected from NT System Logs from 503 servers running in a production environment over a four-month period. The event logs at hand contain only system reboot information. We study individual server failures and domain behavior in order to characterize failure behavior and explore erro...

  • LLT and LTn schemes: error recovery schemes in mobile environments

    Publication Year: 1999, Page(s):23 - 30

    Recently, various mobile stations have become widely available due to advances in communication and computer technologies. Because of the mobility of terminal devices and the bandwidth limitation of wireless networks, it is difficult to apply traditional error recovery schemes for fixed networks to mobile network environments directly. So, several recovery schemes for mobile environments have been pr...

  • A novel fault tolerant approach for SRAM-based FPGAs

    Publication Year: 1999, Page(s):40 - 44
    Cited by:  Papers (1)

    This paper presents a novel fault tolerant approach for SRAM-based FPGAs. The proposed approach includes a fault tolerant architecture and its related routing procedure. In the approach, both the overheads for CLBs and interconnects are considered. The fault tolerant routing procedure under this novel approach is simple and less time-consuming. We provide the simulation results and show that the p...
