Proceedings 2001 Pacific Rim International Symposium on Dependable Computing

17-19 Dec. 2001

  • Proceedings 2001 Pacific Rim International Symposium on Dependable Computing

    Publication Year: 2001
    Freely Available from IEEE
  • Middleware of real-time object based fault tolerant distributed computing systems: issues and some approaches

    Publication Year: 2001, Page(s):3 - 8
    Cited by:  Papers (6)  |  Patents (1)

    At this turn of the century the object-oriented (OO) distributed real-time (RT) programming movement is growing rapidly along with the networked embedded systems market. The motivations are reviewed and then a brief overview is given of the particular programming scheme which this author and his collaborators have been establishing. The scheme is called the time-triggered message-triggered object ...

  • Novel fault-tolerant techniques for high capacity RAMs

    Publication Year: 2001, Page(s):11 - 18
    Cited by:  Papers (3)  |  Patents (2)

    In the area of high capacity RAMs, the memory columns (rows), including the redundancies, are partitioned into column blocks (row blocks), respectively. If the replacement is performed at the row-block level, then a row block-based FTM (RBFTM) system is used. Alternatively, if the replacement is performed at the column-block level, then a column block-based FTM (CBFTM) system is used. If both appr...

  • Connectivity-based multichip module repair

    Publication Year: 2001, Page(s):19 - 26
    Cited by:  Papers (3)

    This paper presents a new model for analyzing the yield of MCM systems with a repair process. It exploits the connectivity of the interconnected chips, in which yield degradation due to both neighboring chips and interconnect structure is taken into account. Based on the connectivity, two MCM repair scheduling strategies, Smallest Number of Interconnections First (SNIF) and Smallest Number of Neighb...

  • SSD: an affordable fault tolerant architecture for superscalar processors

    Publication Year: 2001, Page(s):27 - 34
    Cited by:  Papers (5)

    The paper proposes an integrity checking architecture for superscalar processors that can achieve fault tolerance capability of a duplex system at much less cost than the traditional duplication approach. The pipeline of the CPU core (P-pipeline) is combined in series with another pipeline (V-pipeline), which re-executes instructions processed in the P-pipeline. Operations in the two pipelines are...

  • A design-diversity based fault-tolerant COTS avionics bus network

    Publication Year: 2001, Page(s):35 - 42
    Cited by:  Papers (6)  |  Patents (3)

    The paper describes a COTS bus network architecture consisting of the IEEE 1394 and SpaceWire buses. This architecture is based on the multi-level fault tolerance design methodology proposed by S.N. Chau et al. (1999) but has much less overhead than the original IEEE 1394/I²C implementation. The simplifications are brought about by the topological flexibility and high performance of the Spac...

  • On the choice of checkpoint interval using memory usage profile and adaptive time series analysis

    Publication Year: 2001, Page(s):45 - 48
    Cited by:  Papers (2)

    This paper presents a new checkpoint scheme that utilizes the memory usage profile and time series analysis for low-overhead checkpointing. The proposed checkpoint scheme checks current and future checkpoint overhead based on the changes of the memory size and the expected checkpoint overhead using memory profile and adaptive time series analysis when it decides whether or not to take a checkp...

  • A secure checkpointing system

    Publication Year: 2001, Page(s):49 - 56
    Cited by:  Papers (2)

    Fault-tolerant computer systems are being used increasingly in such applications as e-commerce, banking, and stock trading, where privacy and integrity of data are as important as the uninterrupted operation of the service provided. While much attention has been paid to the protection of data explicitly communicated over the Internet, there are also other sources of information leakage that must b...

  • Optimal checkpoint interval analysis using stochastic Petri net

    Publication Year: 2001, Page(s):57 - 60
    Cited by:  Papers (2)

    While various checkpointing schemes have been widely used to reduce the recovery time when a fault occurs, the problem of evaluating the optimal checkpoint interval that maximizes the availability of the system has been a critical research issue for decades. The evaluation can be done by developing analytical models with restrictive assumptions. However, the analytical model has reached its limitatio...

  • Dependability analysis of a fault-tolerant processor

    Publication Year: 2001, Page(s):63 - 67
    Cited by:  Papers (2)

    Advances in semiconductor technology have improved the performance of integrated circuits in general, and microprocessors in particular, at a dazzling pace. However, smaller transistor dimensions, lower power voltages and higher operating frequencies have significantly increased the circuit sensitivity to transient and intermittent faults. In this paper we present the architecture of a fault-to...

  • Modeling the dependability of N-modular redundancy on demand under malicious agreement

    Publication Year: 2001, Page(s):68 - 75
    Cited by:  Papers (5)

    In a multiprocessor under normal loading conditions, idle processors naturally offer spare capacity. Previous work attempted to utilize this redundancy to overcome the limitations of classic diagnosability and modular redundancy techniques while providing significant fault tolerance. A popular approach is task duplexing. The usefulness of this approach for critical applications, unfortunately, is ...

  • OpenSESAME: an intuitive dependability modeling environment supporting inter-component dependencies

    Publication Year: 2001, Page(s):76 - 83
    Cited by:  Papers (5)

    The paper proposes a novel modeling method for the evaluation of dependability measures of highly available systems. The proposed method, which has been implemented in the tool OpenSESAME (Simple but Extensive Structured Availability Modeling Environment), combines the advantages of Boolean methods and state space based methods. The tool supports the modeler with a set of well-defined, structured,...

  • Optimal software rejuvenation policy with discounting

    Publication Year: 2001, Page(s):87 - 94
    Cited by:  Papers (8)

    Software rejuvenation is a preventive maintenance technique that has been extensively studied in the recent literature. We consider a generalized problem to estimate the optimal software rejuvenation schedule. More precisely, the software rejuvenation model is formulated via the semi-Markov process, and the optimal software rejuvenation schedule which minimizes the expected total discounted cost o...

  • Automatic verification of fault tolerance using model checking

    Publication Year: 2001, Page(s):95 - 102
    Cited by:  Papers (5)  |  Patents (2)

    Model checking is a technique that can make the verification of finite-state systems fully automatic. We propose a method for automatic verification of fault-tolerant systems using this technique. Unlike other related work, which is tailored to specific systems, we aim to provide a general approach to verification of fault tolerance. The main obstacle in model checking is state explosi...

  • Analysis of periodic preventive maintenance with general system failure distribution

    Publication Year: 2001, Page(s):103 - 107
    Cited by:  Papers (10)

    Preventive maintenance is applied to improve system availability or decrease operational cost. Preventive maintenance with generally distributed parameters is discussed, and a steady-state solution is obtained by solving the underlying semi-Markov process (SMP). Specifically, periodic preventive maintenance with deterministically distributed repair time and maintenance time is discussed in detail,...

  • Fault-tolerant routing in two-dimensional mesh networks with less-restricted fault patterns

    Publication Year: 2001, Page(s):111 - 118
    Cited by:  Papers (1)

    Wormhole routing in networks is prone to deadlocks. Several techniques have been provided to solve the problem, including virtual channels and restriction on the fault patterns. We relax the fault patterns to one that does not contain the column-surrounded fault pattern. In our routing scheme, the concept of off-node is proposed to help messages leave the visited f-ring at an appropriate node such...

  • Availability considerations in network design

    Publication Year: 2001, Page(s):119 - 126
    Cited by:  Papers (4)

    One of the performance factors considered during network design is availability. There are many ways of defining network availability. We explain why a new measure could be useful, establish requirements that a new availability measure should satisfy, propose a new measure satisfying these requirements and give examples of its application to a 3-tier architecture. We show how to calculate the meas...

  • Escape and restoration routing: suspensive deadlock recovery in interconnection networks

    Publication Year: 2001, Page(s):127 - 134

    A routing strategy for suspensive deadlock recovery called escape-restoration routing is proposed and its performance is evaluated. In the proposed technique, a small amount of exclusive buffer (escape-buffer) at each router is prepared for handling one of the deadlocked packets. The transmission of the packet is suspended by temporarily escaping it to the escape-buffer. After...

  • Impact of a failure detection mechanism on the performance of consensus

    Publication Year: 2001, Page(s):137 - 145
    Cited by:  Papers (14)  |  Patents (2)

    The paper considers a consensus algorithm for an asynchronous system augmented with failure detectors, and analyzes the impact on its termination time of various implementations of failure detectors. The study shows that the design of fault-tolerant distributed algorithms in the asynchronous system model augmented with failure detectors is orthogonal to implementing the actual failure detectors. T...

  • An adaptive failure detection protocol

    Publication Year: 2001, Page(s):146 - 153
    Cited by:  Papers (48)

    The detection of process failures is a crucial problem system designers have to cope with in order to build fault-tolerant distributed platforms. Unfortunately, it is impossible to distinguish with certainty a crashed process from a very slow process in a purely asynchronous distributed system. This prevents some problems from being solved in such systems. That is why failure detector oracles have...

  • Rejuvenation and failure detection in partitionable systems

    Publication Year: 2001, Page(s):154 - 161
    Cited by:  Papers (5)

    Certain gateways (e.g., some cable or DSL modems) are known to have low reliability and low availability. Most failures of these devices can however be "fixed" by rejuvenating the device after a failure has been detected. Such a detection based rejuvenation strategy permits increasing the availability of these gateways. In the considered scenario, rejuvenation is non-trivial since a failure of suc...

  • Implications of dependable computing in banking industry

    Publication Year: 2001, Page(s): 165

    Summary form only given. Information systems (IS) play a crucial role in banking institutions' day-to-day business, and the level of service customers expect from the banking industry is becoming ever more demanding. Satisfying those demands is one of the most important critical success factors for sustaining a competitive edge, especially in today's hard economic times. In this fiercely competitive world...

  • Automatic reconfiguration of an autonomous disk cluster

    Publication Year: 2001, Page(s):169 - 172
    Cited by:  Papers (1)  |  Patents (2)

    Recently, storage-centric configurations, such as NAS and SAN architectures, have attracted attention in advanced data processing. For these configurations, scalability, flexibility, and availability are key features, and central control is unsuitable. We propose autonomous disks to enable distributed control in the storage-centric configurations. Autonomous disks configure a cluster in a network,...

  • Partial order reduction in verification of wheel structured parameterized circuits

    Publication Year: 2001, Page(s):173 - 182

    It is known that many systems have a regular structure constructed from several kinds of basic modules. We focus on parameterized asynchronous circuits with a wheel structure, which consists of one kernel module and many identical symmetry modules, and aim at verifying such systems of arbitrary sizes. In this paper we propose a fully automatic state enumeration procedure for wheel structured sys...

  • ECC: extended condition coverage for design verification using excitation and observation

    Publication Year: 2001, Page(s):183 - 190
    Cited by:  Papers (2)

    An important issue in register-transfer-level (RTL) hardware verification is the ability to check specified functions and to determine the presence of an error. Code-level coverage is often used to measure the success in verification at this level. However, existing code-level coverage inaccurately estimates the verification result by considering only the excitations of functional blocks. While it ...
