
IEEE Transactions on Computers

Issue 10 • October 1996


Displaying Results 1 - 12 of 12
  • Harvest rate of reconfigurable pipelines

    Page(s): 1200 - 1203

    For a reconfigurable architecture, the harvest rate is the expected percentage of defect-free processors that can be connected into the desired topology. The authors give an analytical estimate of the harvest rate of reconfigurable multipipelines based on the following model: there are n pipelines, each with m stages; each stage of a pipeline is defective independently with identical probability 0.5, and spare wires are provided for reconfiguration. By formulating the “shifting” reconfiguration as weighted chains in a partially ordered set, they prove that when n = Θ(m), the harvest rate is between 34% and 72%.
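
    The abstract's model is concrete enough to sanity-check numerically. The sketch below (my illustration, not the paper's method) Monte Carlo-estimates an idealized upper bound in which any defect-free stage can be wired to any pipeline at its stage position; the paper's "shifting" reconfiguration is more restricted, so its harvest rate can only be lower.

```python
import random

def harvest_rate_upper_bound(n, m, p_defect=0.5, trials=2000, seed=1):
    """Monte Carlo estimate of an idealized harvest-rate upper bound.

    With completely unrestricted rewiring, the number of assemblable
    pipelines is the minimum, over the m stage positions, of the number
    of defect-free stages at that position; the harvest rate is then the
    fraction of defect-free stages actually used.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # working[j] = defect-free stages at stage position j (out of n)
        working = [sum(rng.random() >= p_defect for _ in range(n))
                   for _ in range(m)]
        total += m * min(working) / max(1, sum(working))
    return total / trials

rate = harvest_rate_upper_bound(16, 16)
assert 0.0 < rate <= 1.0
```

For large n = Θ(m) this idealized bound approaches 1, which is consistent with the paper's shifting scheme harvesting at most 72%.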

  • A performance evaluation of RAID architectures

    Page(s): 1116 - 1130

    In today's computer systems, the disk I/O subsystem is often identified as the major bottleneck to system performance. One proposed solution is the so-called redundant array of inexpensive disks (RAID). We examine the performance of two of the most promising RAID architectures: the mirrored array and the rotated parity array. First, we propose several scheduling policies for the mirrored array and a new data layout, group-rotate declustering, and compare their performance with each other and in combination with other data layout schemes. We observe that a policy that routes reads to the disk with the smallest number of requests provides the best performance, especially when the load on the I/O system is high. Second, through a combination of simulation and analysis, we compare the performance of this mirrored array architecture to the rotated parity array architecture. This latter study shows that: 1) given the same storage capacity (approximately double the number of disks), the mirrored array considerably outperforms the rotated parity array; and 2) given the same number of disks, the mirrored array still outperforms the rotated parity array in most cases, even for applications where I/O requests are for large amounts of data. The only exception occurs when the I/O size is very large, most of the requests are writes, and most of these writes perform full-stripe write operations.
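
    The winning read policy, routing each read to the replica disk with the fewest pending requests, is simple to state in code. A minimal sketch over queue lengths only (`route_read` is a hypothetical name, not from the paper):

```python
def route_read(queues):
    """Shortest-queue routing for a mirrored array: send the read to the
    replica disk with the fewest pending requests (ties go to the
    lowest-numbered disk)."""
    disk = min(range(len(queues)), key=lambda d: queues[d])
    queues[disk] += 1  # the chosen disk now holds one more request
    return disk

queues = [3, 1]                  # pending requests on each mirror copy
assert route_read(queues) == 1   # disk 1 has the shorter queue
assert queues == [3, 2]
```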

  • An analytical model for designing memory hierarchies

    Page(s): 1180 - 1194

    Memory hierarchies have long been studied by many means: system building, trace-driven simulation, and mathematical analysis. Yet little help is available for the system designer wishing to quickly size the different levels in a memory hierarchy to a first-order approximation. We present a simple analysis for providing this practical help, along with some unexpected results and intuition that come out of the analysis. By applying a specific, parameterized model of workload locality, we are able to derive a closed-form solution for the optimal size of each hierarchy level. We verify the accuracy of this solution against exhaustive simulation with two case studies: a three-level I/O storage hierarchy and a three-level processor cache hierarchy. In all but one case, the configuration recommended by the model performs within 5% of optimal. One result of our analysis is that the first place to spend money is the cheapest (rather than the fastest) cache level, particularly with small system budgets. Another is that money spent on an n-level hierarchy is spent in a fixed proportion until another level is added.

  • Simulation and generation of IDDQ tests for bridging faults in combinational circuits

    Page(s): 1131 - 1140

    In the absence of information about the layout, test generation and fault simulation systems must target all bridging faults. A novel algorithm, both time- and space-efficient, for simulating IDDQ tests for all two-line bridging faults in combinational circuits is presented. Simulation results using randomly generated test sets point to the computational feasibility of targeting all two-line bridging faults. On a more theoretical note, we show that the problem of computing IDDQ tests for all two-line bridging faults, even in some restricted classes of circuits, is intractable, and that, even under some pessimistic assumptions, a complete IDDQ test set for all two-line bridging faults also covers all multiple-line, single-cluster bridging faults.
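
    The detection condition underlying IDDQ test simulation is that a test detects a two-line bridging fault (a, b) exactly when it drives lines a and b to opposite logic values, producing a measurable quiescent supply current through the bridge. A minimal sketch over a hypothetical gate-level netlist (an illustration of that condition, not the paper's efficient algorithm):

```python
from itertools import combinations

def line_values(test, gates):
    """Evaluate a combinational circuit. `gates` maps an output line to
    (op, input_lines); `test` assigns the primary inputs. Returns the
    logic value of every line in the circuit."""
    vals = dict(test)
    def ev(line):
        if line not in vals:
            op, ins = gates[line]
            bits = [ev(i) for i in ins]
            vals[line] = {'AND': all, 'OR': any,
                          'NOT': lambda b: not b[0]}[op](bits)
        return vals[line]
    for out in gates:
        ev(out)
    return vals

def detected_bridges(test, gates):
    """A test detects bridge (a, b) iff it sets a and b to opposite values."""
    vals = line_values(test, gates)
    return {(a, b) for a, b in combinations(sorted(vals), 2)
            if vals[a] != vals[b]}

# Hypothetical two-gate netlist: c = a AND b, d = NOT a
gates = {'c': ('AND', ['a', 'b']), 'd': ('NOT', ['a'])}
assert detected_bridges({'a': True, 'b': False}, gates) == \
    {('a', 'b'), ('a', 'c'), ('a', 'd')}
```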

  • Scheduling master-slave multiprocessor systems

    Page(s): 1195 - 1199

    The author defines the master-slave multiprocessor scheduling model, in which a master processor coordinates the activities of several slave processors. O(n log n) centralized, deterministic, batch-oriented algorithms are developed for some of the problems formulated; others are shown to be NP-hard.

  • Adaptive system-level diagnosis for hypercube multiprocessors

    Page(s): 1157 - 1170

    System-level diagnosis is an important technique for fault detection and location in multiprocessor computing systems. Efficient diagnosis is highly desirable for sustaining the system's original computing power, and effective diagnosis is particularly important for a multiprocessor system with high scalability but low connectivity. Most existing results are not applicable in practice because of their high diagnosis cost and limited diagnosability. Over-d fault diagnosis, where d is the diagnosability, has only been addressed with probabilistic methods in the literature. Aiming at these two issues, we propose a hierarchical adaptive system-level diagnosis approach for hypercube systems using a divide-and-conquer strategy. We first propose a conceptual algorithm, HADA, to permit a rigorous analysis, and then present its practical variant, IHADA. In HADA and IHADA, the over-d fault problem is inherently tackled through a deterministic method. Three measures of diagnosis cost (diagnosis time, number of tests, and number of test links) are analyzed for the proposed algorithms. It is proved that the diagnosis cost required by our approach is lower than that of previous diagnosis algorithms. It is shown that the diagnosis cost of the proposed algorithms depends on the number and location of faulty units in the system, and that the cost is extremely low when only a small number of faulty units exist. It is also shown that our algorithms incur lower costs than a pessimistic diagnosis algorithm, which trades a lower degree of accuracy for lower diagnosis cost. Experimental results on the nCUBE are provided.

  • Architecture technique trade-offs using mean memory delay time

    Page(s): 1089 - 1100

    Many architectural features are available for improving the performance of a cache-based system. These hardware techniques include cache memories, processor stalling characteristics, memory cycle time, the external data-bus width of a processor, and pipelined memory systems. Each of these techniques affects the cost, design, and performance of a system. We present a powerful approach to assessing the performance trade-offs of these architectural techniques based on the equivalence of mean memory delay time. For the same performance point, we demonstrate how each of these features can be traded off, and we rank the achievable performance obtained by using them.
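
    The abstract does not define mean memory delay time, but the classic average-memory-access-time form t = t_hit + miss_rate × t_miss illustrates the equivalence argument: two different techniques land on the same performance point whenever they produce the same mean delay. (The specific formula is my assumption, not necessarily the paper's model.)

```python
def mean_memory_delay(hit_time, miss_rate, miss_penalty):
    """Average memory access time in cycles: t = t_hit + m * t_penalty."""
    return hit_time + miss_rate * miss_penalty

# Equivalent performance points: a larger cache that halves the miss rate
# can be traded against a faster memory system that halves the penalty.
assert mean_memory_delay(1.0, 0.05, 20.0) == mean_memory_delay(1.0, 0.10, 10.0)
```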

  • A bidirectional associative memory based on optimal linear associative memory

    Page(s): 1171 - 1179

    A bidirectional associative memory (BAM) is presented. Unlike many existing BAM algorithms, the presented BAM uses an optimal associative memory matrix in place of the standard Hebbian or quasi-correlation matrix. The optimal associative memory matrix is determined using only simple correlation learning, requiring no pseudoinverse calculation. The present BAM guarantees recall of all training pairs. The designs of a linear BAM (LBAM) and a nonlinear BAM (NBAM) are given, and the stability and other properties of the BAMs are analyzed. The introduction of a nonlinear characteristic considerably enhances the BAM's ability to suppress noise in the output pattern and largely reduces spurious memories, thereby greatly improving recall performance. Owing to the asymmetry of the network's connection matrix, the capacities of the present BAMs are far higher than those of existing BAMs. The excellent performance of the present BAMs is demonstrated by simulation results.
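
    The abstract does not give the optimal matrix construction, so the sketch below shows only the plain correlation-learned (Hebbian) BAM baseline that the paper improves on, using bipolar patterns; the function names are mine:

```python
import numpy as np

def bam_train(X, Y):
    """Correlation (Hebbian) learning: W = sum_i x_i y_i^T over
    bipolar (+1/-1) pattern pairs."""
    return sum(np.outer(x, y) for x, y in zip(X, Y))

def bam_recall(W, x):
    """One forward pass of the bidirectional memory: y = sign(x W)."""
    return np.sign(x @ W).astype(int)

X = [np.array([1, -1, 1, -1]), np.array([1, 1, -1, -1])]
Y = [np.array([1, -1]), np.array([-1, 1])]
W = bam_train(X, Y)
assert list(bam_recall(W, X[0])) == [1, -1]   # recalls the stored pair
```

Correlation learning alone does not guarantee recall of all training pairs for arbitrary pattern sets; the paper's contribution is achieving that guarantee without a pseudoinverse.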

  • Load sharing in hypercube-connected multicomputers in the presence of node failures

    Page(s): 1203 - 1211

    The paper addresses two important issues associated with load sharing (LS) in hypercube-connected multicomputers: (1) ordering fault-free nodes as preferred receivers of “overflow” tasks for each overloaded node, and (2) developing an LS mechanism to handle node failures. Nodes are arranged into preferred lists of receivers of overflow tasks in such a way that each node will be selected as the kth preferred node of one and only one other node. Such lists are proven to allow the overflow tasks to be evenly distributed throughout the entire system. However, the occurrence of node failures will destroy the original structure of a preferred list if the failed nodes are simply dropped from the list, forcing some nodes to be selected as the kth preferred node of more than one other node. The authors propose three algorithms to modify the preferred list so that its original features are retained regardless of the number of faulty nodes in the system. It is shown that the number of adjustments and the communication overhead of these algorithms are minimal. Using the modified preferred lists, they also propose a simple mechanism to tolerate node failures: each node is equipped with a backup queue which stores and updates information on the tasks arriving at, and completing at, its most preferred node.
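
    The uniqueness property, every node being the kth preferred receiver of exactly one other node, falls out naturally on a hypercube if the kth preferred receiver of node i is i XOR b_k for a fixed sequence of nonzero masks b_k, since x ↦ x XOR b_k is a bijection. (This construction is an illustration consistent with the abstract, not necessarily the authors' exact list.)

```python
def preferred_list(node, dim):
    """Preferred receivers of `node` in a `dim`-cube: node XOR b_k for a
    fixed sequence of nonzero masks b_k (here simply 1 .. 2**dim - 1).
    Because XOR with a fixed mask is a bijection, each node is the k-th
    preferred receiver of exactly one other node."""
    return [node ^ mask for mask in range(1, 2 ** dim)]

n = 3
for k in range(2 ** n - 1):
    # exactly one node in the 3-cube has node 0 as its k-th preferred receiver
    senders = [i for i in range(2 ** n) if preferred_list(i, n)[k] == 0]
    assert len(senders) == 1
```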

  • On linear dependencies in subspaces of LFSR-generated sequences

    Page(s): 1212 - 1216

    The probability of linear dependency in subsequences generated by linear feedback shift registers is examined. It is shown that this probability for a short subsequence, e.g., a sequence defined by the length of a scan chain, can be much higher than that for an entire m-sequence.
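
    The effect is easy to reproduce: windows of an m-sequence all live in a space whose dimension is bounded by the LFSR degree, so once the number of windows exceeds the window length, linear dependence is forced. A small sketch with the degree-4 primitive polynomial x^4 + x^3 + 1 (my example, not the paper's analysis):

```python
def lfsr_bits(n):
    """Output bits of the Fibonacci LFSR for the primitive polynomial
    x^4 + x^3 + 1 (an m-sequence of period 15)."""
    state, out = 0b1000, []
    for _ in range(n):
        out.append(state & 1)
        fb = (state ^ (state >> 3)) & 1        # feedback = bit0 XOR bit3
        state = (state >> 1) | (fb << 3)
    return out

def gf2_rank(vectors):
    """Rank of integer bit-vectors over GF(2) by Gaussian elimination."""
    basis, rank = {}, 0
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                rank += 1
                break
            v ^= basis[lead]
    return rank

def windows(bits, length, offsets):
    """Length-`length` subsequences (packed into ints) at the given offsets."""
    return [int(''.join(map(str, bits[o:o + length])), 2) for o in offsets]

seq = lfsr_bits(32)
short = windows(seq, 3, range(5))    # 5 windows of scan-chain-like length 3
long_ = windows(seq, 8, range(4))    # 4 windows of length 8
assert gf2_rank(short) < len(short)  # short windows: dependence is forced
assert gf2_rank(long_) == len(long_) # longer windows stay independent here
```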

  • Theory of transparent BIST for RAMs

    Page(s): 1141 - 1156

    I present the theoretical aspects of a technique called transparent BIST for RAMs. This technique applies to any RAM test algorithm and transforms it into a transparent one. The advantage of transparent test algorithms is that testing preserves the contents of the RAM. The transparent test algorithm is then used to implement a transparent BIST, which is very suitable for periodic testing of RAMs. The theoretical analysis shows that this transparent BIST technique does not decrease the fault coverage for modeled faults, behaves better for unmodeled ones, and does not increase aliasing with respect to the initial test algorithm. Furthermore, transparent BIST involves only slightly higher area overhead than standard BIST. Thus, transparent BIST becomes more attractive than standard BIST, since it can be used for both fabrication testing and periodic testing.
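
    The core idea of a transparent test, exercising each cell with data derived from its current contents and then restoring them, can be sketched with a single march element. (This toy element is my illustration of the principle, not the paper's general transformation of an arbitrary test algorithm.)

```python
def transparent_march(ram):
    """A minimal transparent march element: for each cell, read its value a,
    write and verify the complement, then restore a. The RAM contents are
    preserved, and a cell stuck at either value would be reported."""
    faults = []
    for addr in range(len(ram)):
        a = ram[addr]
        ram[addr] = a ^ 1           # write the complement of the cell data
        if ram[addr] != a ^ 1:      # fails for stuck-at cells in a faulty RAM
            faults.append(addr)
        ram[addr] = a               # restore the original content
    return faults

ram = [1, 0, 1, 1, 0]
assert transparent_march(ram) == []   # fault-free RAM: nothing reported
assert ram == [1, 0, 1, 1, 0]         # contents preserved by the test
```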

  • An architecture for tolerating processor failures in shared-memory multiprocessors

    Page(s): 1101 - 1115

    This paper focuses on the problem of fault tolerance in shared-memory multiprocessors and describes an architecture designed for transparently tolerating processor failures. The Recoverable Shared Memory (RSM) is the novel component of this architecture, providing a hardware-supported backward error recovery mechanism which minimizes the propagation of recovery when a processor fails. The RSM permits a shared-memory multiprocessor to be constructed using standard caches and cache coherence protocols, and it does not require any changes to application software. The performance of the recovery scheme supported by the RSM is evaluated and compared with other schemes that have been proposed for fault-tolerant shared-memory multiprocessors. The performance study was conducted by simulation using address traces collected from real parallel applications.


Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.


Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au