Computers, IEEE Transactions on

Issue 3 • Date March 1973

  • IEEE Transactions on Computers - Table of contents

    Page(s): c1
    Freely Available from IEEE
  • IEEE Computer Society

    Page(s): c2
    Freely Available from IEEE
  • Fault-Tolerant Computing: An Introduction and a Viewpoint

    Page(s): 225 - 229

    After approximately 20 years of obscurity, the field of fault-tolerant computing was revived by the formation of the IEEE Technical Committee on Fault-Tolerant Computing, by a series of articles in COMPUTER for January/February 1971, by the 1971 International Symposium on Fault-Tolerant Computing in Pasadena, Calif., and by an IEEE TRANSACTIONS ON COMPUTERS Special Issue on Fault-Tolerant Computing in November 1971. Interest and activity continued apace, and the 1972 International Symposium on Fault-Tolerant Computing was held in Newton, Mass. Most of the excellent papers in this second IEEE TRANSACTIONS Special Issue on Fault-Tolerant Computing were presented at that symposium. As an introduction to these papers, consideration of the ultimate goals of this discipline and the universe in which our work is being done is most appropriate.

  • Use of SPOOF's in the Analysis of Faulty Logic Networks

    Page(s): 229 - 234

    In general, one cannot predict the effects of possible failures on the functional characteristics of a logic network without knowledge of the structure of that network.

  • Multiple Fault Detection in Combinational Circuits: Algorithms and Computational Results

    Page(s): 235 - 240

    A new approach is developed for finding multiple fault detection tests under quite arbitrary fault models. Computational results are reported and discussed.

  • Testing for Intermittent Faults in Digital Circuits

    Page(s): 241 - 246

    In this paper we present a few fundamental results related to the problems of characterization and detection of intermittent faults in digital circuits, which, up to now, have been almost totally ignored. This problem is important since in many technologies intermittency is a predominant mode of failure.

  • Figure of Merit for Fault-Tolerant Space Computers

    Page(s): 246 - 251

    To aid the computer designer in addressing the proper reliability goals, an economic model is proposed in which any increase in cost for reliability is evaluated against the expected decrease in cost of failure. These criteria are applied in an example to optimize the memory configuration of a computer that has a five-year lifetime requirement in space. An internal figure of merit is developed that can be used by the computer designer when little specific spacecraft and mission data are available. A more general external figure of merit can be used when mission parameters are defined. This latter is useful both for the computer designer and the spacecraft systems engineer.

  • The Concept of Coverage and Its Effect on the Reliability Model of a Repairable System

    Page(s): 251 - 254

    Duplication is a technique frequently employed to achieve high reliability for a repairable system. Although the philosophy of duplication is that it takes two faults to place a system out of service, there are generally some critical single faults that cause a system failure. This paper considers the effect of such a set of faults on a repairable system's reliability. It is shown that even a small number of such faults may severely degrade the mean time to system failure and the expected downtime for an otherwise highly reliable system.
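
The sensitivity described in this abstract can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes a hypothetical three-state Markov model of a duplicated system in which each unit fails at a constant rate, a failed unit is repaired at a constant rate, and a fraction `coverage` of single faults is handled without bringing the system down. The function name and parameters are illustrative only.

```python
def duplex_mttf(failure_rate, repair_rate, coverage):
    """Mean time to system failure, starting with both units working.

    Assumed (hypothetical) model, with lam = failure_rate, mu = repair_rate,
    c = coverage:
      both up --(2*lam*c)-----> one up, other under repair
      both up --(2*lam*(1-c))-> system failure (critical single fault)
      one up  --(mu)----------> both up
      one up  --(lam)---------> system failure
    Solving the two first-passage equations
      T2 = 1/(2*lam) + c*T1
      T1 = 1/(lam+mu) + (mu/(lam+mu))*T2
    for T2 gives the closed form returned below.
    """
    lam, mu, c = failure_rate, repair_rate, coverage
    if lam <= 0 or mu < 0 or not 0.0 <= c <= 1.0:
        raise ValueError("need failure_rate > 0, repair_rate >= 0, 0 <= coverage <= 1")
    return (lam + mu + 2.0 * lam * c) / (2.0 * lam * (lam + mu * (1.0 - c)))


if __name__ == "__main__":
    # Illustrative rates only (per hour); they are not data from the paper.
    for c in (1.0, 0.999, 0.99, 0.9):
        print(f"coverage={c}: MTTF={duplex_mttf(1e-3, 1.0, c):.1f} h")
```

Under these assumptions, lowering the coverage slightly below 1 shrinks the mean time to failure sharply whenever repair is much faster than failure, which is the qualitative effect the abstract reports.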

  • Design of a Self-Checking Microprogram Control

    Page(s): 255 - 262

    In designing a self-checking processor, it is essential to recognize the types of failures that are most probable. Matching the checking techniques with the type of faults that are expected to occur should yield the best result with the least amount of hardware. The microprogram control will consist of integrated circuits: large-scale integration (LSI) for the memory and small-scale integration (SSI) for the associated control logic. Because of the density of chips on a plug-in package and the physical proximity of the devices on an integrated circuit, multiple faults within a single circuit are highly probable. The types of faults within a circuit have been analyzed and found to be of the type which would tend to affect the bits in a unidirectional manner. Also the failed bits would probably be adjacent rather than randomly dispersed throughout the microprogram store word.

  • Design of Totally Self-Checking Check Circuits for m-Out-of-n Codes

    Page(s): 263 - 269

    The design of totally self-checking check circuits for m-out-of-n codes is described. Totally self-checking m-out-of-n checkers provide an error indication whenever the input is not an m-out-of-n code or whenever a fault occurs within the checker itself. Since the checker checks itself, there is no need for additional maintenance access or periodic exercise of the checker to verify its ability to detect errors. The basic structure of the checker relies on the use of majority detection circuits. Various gate level implementations for the majority detection circuits are also presented, although the self-checking capability of the checker does not depend on their particular implementation since they are exhaustively tested by code inputs. The self-testing checkers for k-out-of-2k codes are discussed in the most detail since the totally self-checking checkers for 1-out-of-n and arbitrary m-out-of-n codes are constructed by first translating the code to a k-out-of-2k code via a totally self-checking translator.
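
For readers unfamiliar with the code family, the following minimal sketch shows what membership in an m-out-of-n code means: a word of n bits is a code word exactly when m of its bits are 1. It models only the input predicate a checker must recognize, not the totally self-checking hardware structure described in the paper. The function names are hypothetical.

```python
from itertools import combinations


def is_m_out_of_n(word, m, n):
    """Return True if `word` (a sequence of 0/1 ints) has length n and exactly m ones."""
    if len(word) != n:
        raise ValueError(f"expected {n} bits, got {len(word)}")
    if any(bit not in (0, 1) for bit in word):
        raise ValueError("bits must be 0 or 1")
    return sum(word) == m


def codewords(m, n):
    """Enumerate all m-out-of-n code words as tuples of bits."""
    words = []
    for ones in combinations(range(n), m):
        words.append(tuple(1 if i in ones else 0 for i in range(n)))
    return words


if __name__ == "__main__":
    print(is_m_out_of_n([1, 0, 1, 0], 2, 4))  # a 2-out-of-4 code word
    print(is_m_out_of_n([1, 1, 1, 0], 2, 4))  # a 0 -> 1 change leaves the code
    print(len(codewords(2, 4)))               # number of 2-out-of-4 code words
```

Because every code word has the same weight, any error that only turns 0s into 1s, or only 1s into 0s, changes the count of ones and is therefore detected by this predicate.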

  • Modeling of a Bubble-Memory Organization with Self-Checking Translators to Achieve High Reliability

    Page(s): 269 - 275

    This paper reports a study on the design and modeling of a highly reliable bubble-memory system. This system has the capability of correcting a single 16-adjacent bit-group error resulting from failures in a single basic storage module (BSM), and detecting with a probability greater than 0.99 any double errors resulting from failures in BSM's. The encoding/decoding network (memory translator) is designed to be self-checking, i.e., a single circuit failure in the translator will not produce an erroneous output that goes undetected. The system is able to perform reliable configuration in the event of uncorrectable BSM failures, memory translator failures, and dual-memory buffer failures; even in the presence of a single failure in the status registers controlling the configuration network. The bubble memory under study permits serial accessing of the store with 64 x 1024 bit blocks at a 100-kHz rate. The objective of this study is to develop good fault-tolerant design and analysis methods adequate for newly emerging technologies and prove the practicality by example. The reliability modeling study justifies the design philosophy adopted of employing memory data encoding and a translator to correct single group errors and detect double group errors to enhance the overall system reliability. By a proper design of the memory translator based on a new checking technique, a uniformly high percentage of multiple b-adjacent bit-group error detection is achieved through the use of a proposed code (detects 99.99695 percent of double b-adjacent bit-group errors and 99.9985 percent of triple or more b-adjacent bit-group errors).

  • Switch Complexity in Systems with Hybrid Redundancy

    Page(s): 276 - 282

    The combination of N-modular redundancy (NMR) and standby sparing has resulted in a promising redundancy technique for protecting those portions of a fault-tolerant system whose continuous real-time operation is essential. This technique, known as hybrid redundancy, uses N + Sp identical modules. N of these are connected to a majority voter to form an NMR core. The remaining Sp are used as standby spares. Disagreement detectors instruct a switch to replace with a standby spare any of the N modules that disagrees with the majority consensus. Since the switch and disagreement detector, as well as the modules, must function properly for the system to perform its designed task, the overall system reliability depends on the disagreement detector, switch, and hybrid reliabilities. Hence the overall reliability is a function of switch reliability. A highly reliable, thus simple, switch is desirable. First, strategies where every spare can be switched into every voter position (totally assigned) are considered. An optimal strategy is developed where the number of states in the switch is the criterion used for optimality. Formulas for the switch state count for the various strategies are derived and the strategies compared. Next, partially assigned switching strategies, strategies where every spare need not be capable of occupying every voter position, are examined. It is shown that designs where all spares are assigned to the same t + 2 of the 2t + 1 voter positions have as good reliability as the more complex totally assigned switching strategy.
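
The behavior of a hybrid redundancy scheme can be sketched in software. The toy model below assumes a totally assigned strategy (any spare may take any voter position) and treats modules as plain Python callables. The class and function names are hypothetical, and nothing here reflects the switch designs or state counts analyzed in the paper.

```python
from collections import Counter


def majority(values):
    """Return the value produced by more than half of the modules."""
    value, count = Counter(values).most_common(1)[0]
    if 2 * count <= len(values):
        raise ValueError("no majority among module outputs")
    return value


class HybridRedundancy:
    """N-module voted core plus standby spares (toy model, totally assigned)."""

    def __init__(self, modules, n):
        if n % 2 == 0 or n > len(modules):
            raise ValueError("n must be odd and no larger than the module count")
        self.core = list(modules[:n])    # modules wired to the voter
        self.spares = list(modules[n:])  # standby spares, used in order

    def step(self, x):
        outputs = [module(x) for module in self.core]
        voted = majority(outputs)
        # Disagreement detection: any core module that differs from the
        # voted output is switched out while a spare remains.
        for position, output in enumerate(outputs):
            if output != voted and self.spares:
                self.core[position] = self.spares.pop(0)
        return voted


def good_module(x):
    return x + 1


def stuck_module(x):
    return -1  # models a module whose output is stuck at a wrong value


if __name__ == "__main__":
    system = HybridRedundancy([good_module, stuck_module, good_module, good_module], n=3)
    print(system.step(1), len(system.spares))
```

In this model the voter masks the faulty module on the same step in which the disagreement is detected, and the spare takes over that voter position for later steps.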

  • Lookaside Techniques for Minimum Circuit Memory Translators

    Page(s): 283 - 289

    This paper demonstrates two improvements in coding techniques that could be used for memory word coding. First, within the fixed structure of a Hamming single-error-correcting, double-error-detecting (SEC/DED) code, an improvement can be obtained in circuit cost and operational speed over more conventional code implementations. Second, the mechanics of error correction in a fault-tolerant computer may be carried out via conventional hardware means or by use of the existing system facilities, such as the combination of the microprogram unit, local store, and the arithmetic-logic unit. These improvements may be obtained by the use of Rotational Coding schemes in conjunction with a technique called "lookaside correction." This paper first shows a generalized algorithm for specifying the parity check matrix of Rotational Codes. The structure implemented by the parity check matrix in this paper is not merely encoding and decoding circuitry, but translates between Rotational Code forms and byte-parity encoded forms. The unique feature of these translators is that use of the Rotational Code permits the error correction to be performed on only a subset of the data word bits, and only if a single-error condition has been detected. The correction mechanism may be either a hardware logic circuit or firmware. The paper concludes with a comparison of the circuit requirements and correctional speed of the Hamming (72, 64) single-error-correcting, double-error-detecting code, as it normally would be implemented, and a Rotational Code translator also operating on 64 data bits and 8 check bits.
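
As background for the SEC/DED behavior mentioned in this abstract, here is a minimal sketch of a textbook extended Hamming code on 4 data bits and 4 check bits. It is deliberately much smaller than the (72, 64) code the paper compares against, and it is not a Rotational Code or a lookaside translator. The bit layout and function names are assumptions chosen for illustration.

```python
def encode(data):
    """Encode 4 data bits into an 8-bit extended Hamming word.

    Layout (assumed for this sketch): index 0 is the overall parity bit,
    indices 1, 2, 4 are Hamming check bits, indices 3, 5, 6, 7 hold data.
    """
    if len(data) != 4 or any(bit not in (0, 1) for bit in data):
        raise ValueError("expected four bits")
    word = [0] * 8
    word[3], word[5], word[6], word[7] = data
    for p in (1, 2, 4):
        # Check bit p makes the parity of all positions containing bit p even.
        word[p] = sum(word[i] for i in range(1, 8) if i & p) % 2
    word[0] = sum(word[1:]) % 2
    return word


def decode(word):
    """Return (data, status); status is 'ok', 'corrected', or 'double-error'."""
    word = list(word)
    syndrome = 0
    for position in range(1, 8):
        if word[position]:
            syndrome ^= position
    overall_odd = sum(word) % 2 == 1
    if overall_odd:
        # Odd overall parity: treated as a single error located at `syndrome`
        # (syndrome 0 means the overall parity bit itself was hit).
        word[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even overall parity but nonzero syndrome: two bits are in error.
        return None, "double-error"
    else:
        status = "ok"
    return [word[3], word[5], word[6], word[7]], status


if __name__ == "__main__":
    sent = encode([1, 0, 1, 1])
    received = list(sent)
    received[5] ^= 1  # inject a single-bit error
    print(decode(received))
```

Note that correction runs only when a single-error condition is flagged, which is the general property the abstract relies on when it speaks of correcting "only if a single-error condition has been detected."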

  • An Iterative Cell Switch Design for Hybrid Redundancy

    Page(s): 290 - 297

    A marriage of N-modular redundancy (NMR) and standby sparing has resulted in a promising redundancy technique for protecting those portions of a fault-tolerant system whose continuous real-time operation is essential. This technique, known as hybrid redundancy, consists of N identical modules connected to a majority voter to form an NMR core. Disagreement detectors instruct a switch to replace any of the N modules that disagree with the majority consensus by a standby spare. The switch is essential to the operation of the hybrid redundancy scheme. The system reliability is a product of the reliabilities of the switch and the hybrid redundancy scheme assuming a perfect switch. To realize the demonstrated potential of the latter a highly reliable, thus simple, switch is required. An iterative cell switch is proposed and demonstrated to save at least 25 percent, and more than 80 percent in some instances, of the complexity of a switch design presented elsewhere in the literature. The use of threshold, rather than majority, voters is considered and shown to yield a simpler design in some cases. Three techniques for decreasing the propagation delay through the iterative cell switch are presented as well as a scheme to implement retry of failed modules. Finally, five different switch designs are compared on a cost and complexity basis.
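
The product relationship stated in this abstract can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that all modules (active or standby) are fault-free with the same independent probability, and that with a perfect switch the scheme survives as long as a majority-sized group of fault-free modules remains among the N core modules and the spares. These modeling assumptions and the function names are not taken from the paper.

```python
from math import comb


def hybrid_reliability_perfect_switch(module_reliability, n, spares):
    """P(at least (n + 1) // 2 of the n + spares modules are fault-free).

    Simplifying assumptions: independent modules, identical reliability for
    active and standby modules, ideal voter and switch.
    """
    if n % 2 == 0 or n < 1 or spares < 0:
        raise ValueError("n must be odd and positive, spares non-negative")
    r = module_reliability
    total = n + spares
    needed = (n + 1) // 2
    return sum(comb(total, k) * r**k * (1.0 - r) ** (total - k)
               for k in range(needed, total + 1))


def system_reliability(switch_reliability, module_reliability, n, spares):
    """Product form from the abstract: switch times hybrid scheme with perfect switch."""
    return switch_reliability * hybrid_reliability_perfect_switch(
        module_reliability, n, spares)


if __name__ == "__main__":
    # Illustrative values only; they are not results from the paper.
    for r_switch in (1.0, 0.99, 0.9):
        print(r_switch, system_reliability(r_switch, 0.9, n=3, spares=2))
```

Because the switch reliability multiplies the whole expression, gains from extra spares are capped by the switch, which is why the abstract argues for a switch that is as simple as possible.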

  • Shared Logic Realizations of Dynamically Self-Checked and Fault-Tolerant Logic

    Page(s): 298 - 306

    Dynamically self-checked or fault-tolerant realizations of switching functions and sequential machines are proposed under a fault model that permits arbitrary logic faults in a single-logic module, where the modules are explicitly defined. These realizations permit considerable logic sharing, organized around an (n, m, r)-basis for decomposing switching functions. The logic sharing permits more economical realizations than can be obtained using classical parity and triple-modular redundancy schemes for obtaining logic circuits with the corresponding property.

  • Error Correcting Properties of Redundant Residue Number Systems

    Page(s): 307 - 315

    The error correcting properties of the redundant residue number systems (RRNS) are investigated through a more natural approach than was previously known. The necessary and sufficient condition for the correction of a given error affecting a single residue digit of any legitimate number in an RRNS is determined. The minimal redundancy allowing the correction of the whole class of the single residue digit errors is derived and an efficient procedure for error correction is given. Moreover, it is shown that a smaller redundancy and a single redundant modulus may allow the correction of certain important subclasses of single residue digit errors, e.g., the set of errors affecting a single bit in the code. Examples are given.
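
A small worked example may help make single-residue-digit correction concrete. The sketch below uses a brute-force projection search, not the efficient procedure from the paper: it drops one residue at a time, reconstructs the number from the rest with the Chinese remainder theorem, and accepts a reconstruction that falls in the legitimate range. The moduli (3, 5, 7 nonredundant; 11, 13 redundant) and all function names are assumptions chosen for illustration; it requires Python 3.8 or later.

```python
from math import prod


def crt(residues, moduli):
    """Reconstruct x modulo prod(moduli) from its residues (pairwise coprime moduli)."""
    total = prod(moduli)
    x = 0
    for residue, modulus in zip(residues, moduli):
        partial = total // modulus
        x += residue * partial * pow(partial, -1, modulus)  # modular inverse
    return x % total


def to_residues(x, moduli):
    return [x % m for m in moduli]


def correct_single_error(residues, moduli, num_nonredundant):
    """Recover a legitimate number when at most one residue digit is wrong.

    Assumes two redundant moduli, each larger than every nonredundant one,
    so that distinct legitimate numbers differ in at least three digits.
    """
    legitimate_range = prod(moduli[:num_nonredundant])
    value = crt(residues, moduli)
    if value < legitimate_range:
        return value  # no error detected
    for skip in range(len(moduli)):
        kept_residues = [r for i, r in enumerate(residues) if i != skip]
        kept_moduli = [m for i, m in enumerate(moduli) if i != skip]
        candidate = crt(kept_residues, kept_moduli)
        if candidate < legitimate_range:
            return candidate
    raise ValueError("uncorrectable: more than one residue digit in error")


if __name__ == "__main__":
    MODULI = (3, 5, 7, 11, 13)  # hypothetical example moduli
    word = to_residues(52, MODULI)
    word[2] = 5                 # corrupt the residue for modulus 7
    print(correct_single_error(word, MODULI, num_nonredundant=3))
```

The search is safe under the stated assumption because any legitimate reconstruction agrees with the received word in all but one digit, and only the original number can be that close to it.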

  • IEEE Computer Society Membership & Publications

    Page(s): 315
    Freely Available from IEEE
  • Information for authors

    Page(s): 315
    Freely Available from IEEE

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

