Computers, IEEE Transactions on

Issue 12 • Date Dec 1990

  • Adaptive fault-tolerant routing in hypercube multicomputers

    Page(s): 1406 - 1416

    A connected hypercube with faulty links and/or nodes is called an injured hypercube. A distributed adaptive fault-tolerant routing scheme is proposed for an injured hypercube in which each node is required to know only the condition of its own links. Despite its simplicity, this scheme is shown to be capable of routing messages successfully in an injured n-dimensional hypercube as long as the number of faulty components is less than n. Moreover, it is proved that this scheme routes messages via shortest paths with a rather high probability, and the expected length of a resulting path is very close to that of a shortest path. Since the assumption that the number of faulty components is less than n in an n-dimensional hypercube might limit the usefulness of the above scheme, a routing scheme based on depth-first search which works in the presence of an arbitrary number of faulty components is introduced. Due to the insufficient information on faulty components, however, the paths chosen by this scheme may not always be the shortest. To guarantee that all messages are routed via shortest paths, the authors propose to equip every node with more information than that on its own links. The effects of this additional information on routing efficiency are analyzed, and the additional information to be kept at each node for the shortest path routing is determined. Several examples and remarks are given to illustrate the results.
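
    The depth-first idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' scheme: the function name `dfs_route`, the representation of faults as pairs of node labels, and the rule of trying distance-reducing dimensions first are all assumptions. Each step consults only the links of the current node, and the set of visited nodes is assumed to travel with the message.

```python
def dfs_route(n, src, dst, faulty_links):
    """Depth-first routing sketch in an n-dimensional hypercube.

    Nodes are integers 0 .. 2**n - 1; two nodes are adjacent when their
    labels differ in exactly one bit. faulty_links is an iterable of
    (u, v) pairs (hypothetical representation). Returns the final route
    as a list of nodes, or None if dst cannot be reached.
    """
    faulty = {frozenset(link) for link in faulty_links}
    visited = {src}
    path = [src]

    def visit(node):
        if node == dst:
            return True
        diff = node ^ dst
        # Try dimensions that reduce the Hamming distance first,
        # then the remaining (detour) dimensions.
        dims = sorted(range(n), key=lambda d: not (diff >> d) & 1)
        for d in dims:
            nxt = node ^ (1 << d)
            if nxt in visited or frozenset((node, nxt)) in faulty:
                continue
            visited.add(nxt)
            path.append(nxt)
            if visit(nxt):
                return True
            path.pop()  # dead end: backtrack to the previous node
        return False

    return path if visit(src) else None


if __name__ == "__main__":
    print(dfs_route(3, 0, 7, []))
    print(dfs_route(3, 0, 7, [(0, 1)]))
```

    A faulty node can be modelled by marking all of its links as faulty. The returned list is only the final route; an actual message would also traverse the hops that were later backtracked, which is one reason such a route need not be a shortest path.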

  • Radix-16 signed-digit division

    Page(s): 1424 - 1433

    A two-stage algorithm for fixed point, radix-16 signed-digit division is presented. The algorithm uses two limited precision radix-4 quotient digit selection stages to produce the full radix-16 quotient digit. The algorithm requires a two-digit estimate of the (initial) partial remainder and a three-digit estimate of the divisor to correctly select each successive quotient digit. The normalization of redundant signed-digit numbers requires accommodation of some fuzziness at one end of the range of numeric values that are considered normalized. A set of general equations for determining the ranges of normalized signed-digit numbers is derived. Another set of general equations for determining the precisions of estimates of the divisor and dividend is derived. These two sets of equations permit design tradeoff analyses to be made with respect to the complexity of the model division. The specific case of a two-stage radix-16 signed-digit division is presented. The staged division algorithm used can be extended to other radices as long as the signed-digit number representation used has certain properties.

  • A simulation-based method for generating tests for sequential circuits

    Page(s): 1456 - 1463

    In a recent work of the authors (1987), a simulation-based directed search approach for generating test vectors for combinational circuits was proposed. In this method, the search for a test vector is guided by a cost function computed by the simulator. Event-driven simulation deals with circuit delays in a very natural manner. Signal controllability information required for the cost function is incorporated in a new form of logic model called the threshold-value model. These concepts are extended to meet the needs of sequential circuit test generation. Such extensions include handling of unknown values, analysis of feedback loops, and analysis of race conditions in the threshold-value model. A threshold-value sequential test generation program, TVSET, is implemented. It automatically initializes the circuit and generates race-free tests for synchronous and asynchronous circuits.

  • Modular architecture for high performance implementation of the FFT algorithm

    Page(s): 1464 - 1468

    A novel VLSI-oriented architecture to compute the discrete Fourier transform is presented. It consists of a homogeneous structure of processing elements. The structure has a performance equal to 1/t transforms per second, where t is the time needed for the execution of a single butterfly computation or the time needed for the collection of a complete vector of samples, whichever is longer. Although the system is not optimal (it achieves O(N^3 log^4 N) area × time^2 performance), the architecture is modular and makes it possible to design a system which performs FFT of any size without any extra circuitry. Moreover, the system can provide a built-in self-test and self-restructuring. The modular system is easy to integrate. Processing elements (PEs) are connected to the neighboring PEs only, and form a linear network easy to implement in two and three dimensions. The number of pins required for a chip does not depend on the number of PEs integrated on it, nor on the size of the transform. The system consists of only one type of integrated circuit with a structure irrespective of the transform size, which considerably reduces the cost of implementation.

  • An upper bound on expected clock skew in synchronous systems

    Page(s): 1475 - 1477

    A statistical model is considered for clock skew in which the propagation delays on every source-to-processor path are sums of independent contributions, and are identically distributed. Upper bounds are derived for expected skew, and its variance, in tree distribution systems with N synchronously clocked processing elements. The results are applied to two special cases of clock distribution. In the first, the metric-free model, the total delay in each buffer stage is Gaussian with a variance independent of stage number. In this case, the upper bound on skew grows as Θ(log N). The second, the metric model, is meant to reflect VLSI constraints. Here, the clock delay in a stage is Gaussian with a variance proportional to wire length, and the distribution tree is an H-tree embedded in the plane. In this case, the upper bound on expected skew is Θ(N^(1/4) (log N)^(1/2)).

  • The performance of parallel Prolog programs

    Page(s): 1435 - 1445

    Performance results are presented for a parallel execution model for Prolog that supports AND-parallelism, OR-parallelism, and intelligent backtracking. The results show that restricted AND-parallelism is of limited benefit for small programs, but produced speedups of 7 to 10 on two large programs. OR-parallelism was found to be generally not useful for the benchmarks examined if the semantics of Prolog were preserved. Of particular interest is the phenomenon of super-multiplicative behavior, in which the performance improvement obtained when more than one technique is used is greater than the product of the performance improvements due to each technique individually. The implications of the performance results for parallel Prolog systems are discussed, and directions for future work are indicated.

  • Embedding rectangular grids into square grids with dilation two

    Page(s): 1446 - 1455

    A novel technique, the multiple ripple propagation technique, is presented for mapping an h×w grid into a w×h grid such that the dilation cost is 2, i.e. such that any two neighboring nodes in the first grid are mapped onto two nodes in the second grid that are separated by a distance of at most 2. The technique is then used as a basic tool for mapping any rectangular source grid into a square target grid with the dilation two property preserved. The ratio of the number of nodes in the source grid to the number of nodes in the target grid, called the expansion cost, is shown to be always less than 1.2. This is a significant improvement over the previously suggested techniques, where the expansion cost could be bounded by 1.2 only if the dilation cost was allowed to be as high as 18.
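
    The dilation cost defined in this abstract can be checked mechanically for any candidate embedding. The sketch below is not the multiple ripple propagation technique; it only measures dilation, assuming Manhattan distance in the target grid. The names `dilation`, `row_major`, and `snake`, and the toy 1×6 into 2×3 example, are illustrative assumptions.

```python
def dilation(h, w, embed):
    """Largest target-grid distance between images of neighboring source nodes.

    embed(i, j) maps source node (i, j) of an h-by-w grid to a (row, col)
    position in the target grid. Distance is Manhattan distance.
    """
    worst = 0
    for i in range(h):
        for j in range(w):
            a = embed(i, j)
            # Each source edge is visited once via its down and right neighbor.
            for ni, nj in ((i + 1, j), (i, j + 1)):
                if ni < h and nj < w:
                    b = embed(ni, nj)
                    worst = max(worst, abs(a[0] - b[0]) + abs(a[1] - b[1]))
    return worst


def row_major(i, j):
    """Fold a 1x6 line into a 2x3 grid, filling each row left to right."""
    return divmod(j, 3)


def snake(i, j):
    """Fold a 1x6 line into a 2x3 grid, reversing direction on odd rows."""
    r, c = divmod(j, 3)
    return (r, c if r % 2 == 0 else 2 - c)


if __name__ == "__main__":
    print(dilation(1, 6, row_major))
    print(dilation(1, 6, snake))
```

    In this toy case the row-major fold stretches the edge between the third and fourth nodes across the full row, while the snake fold keeps every neighbor pair adjacent.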

  • On addition and multiplication with Hensel codes

    Page(s): 1417 - 1423

    It has been stated by R.N. Gorgui-Naguib and R.A. King (1986) that the operations of addition and multiplication on Hensel codes originally defined by E.V. Krishnamurthy, T.M. Rao, and K. Subramanian (1975) are seriously in error in that it is possible to add/subtract or multiply Hensel codes and not get a valid Hensel code. It is shown that it is the presence of so-called invalid Farey fractions that results in the need to modify the original arithmetic operations. However, this also results in the Hensel codes becoming redundant. The authors show how to include the invalid Farey fractions such that it is possible to compute with their Hensel codings without the need to map back and forth between the rationals and their Hensel codings. This provides an alternative to the method of Gorgui-Naguib and King. Unfortunately, it turns out that Hensel codes of a large size will be needed in practice, even for relatively small problems.

  • Sequential fault occurrence and reconfiguration in system level diagnosis

    Page(s): 1472 - 1475

    In a classical system-level diagnosis model, a complex multiprocessor system is characterized to be uniquely diagnosable under the presence of any arbitrary fault set of size up to t. Fault occurrence, however, is usually a sequential process in real-life systems, i.e. multiple faults occur one after another. Any faulty location is immediately diagnosed and the system is reconfigured before any further fault occurs. Systems which are designed under the assumption of sequential fault occurrences and reconfiguration are discussed and their test interconnection assignment for unique diagnosability is characterized. A theorem is developed for sequential k/t-diagnosability, where the system is allowed to have up to t faults but not more than k of them occur at a time. For most practical cases, k has a value of 1. The t-diagnosability theorem is then a special case of this theorem for k=t. The results of this theorem are more useful in the design of practical systems where the system is reconfigured after every fault is detected and located, and they do not have to satisfy the constraint n ≥ 2t + 1.

  • Fault detection and design complexity in C-testable VLSI arrays

    Page(s): 1477 - 1481

    An extension of a previous approach to fault detection and C-testability of orthogonal iterative arrays is presented. The state transition table of a basic cell is analyzed. Five new states are added to it. It is proved that even though the number of additional states in the proposed approach is greater than in previous approaches (five states compared to four), the required number of test vectors is considerably reduced (by a factor of approximately 4/9). A way to implement the proposed C-testability approach in logic design is also presented. The complexity of this implementation is analyzed.

  • Parallel graph algorithms based upon broadcast communications

    Page(s): 1468 - 1472

    Some common guidelines that can be used to design parallel algorithms under the single-channel broadcast communication model are presented. Several graph problems are solved, including topological ordering, the connected component problem, breadth-first search, and depth-first search. If an ideal conflict resolution scheme is used, all of the algorithms require O(n) time by using n processors. Under such a situation, the algorithms are all optimal. If a realistic conflict resolution is used, the algorithms require O(n log n) time by using n/log n processors. For both cases, all of the algorithms achieve optimal speedups.
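
    As a quick check of the two regimes quoted in this abstract, the processor-time product (cost) works out to the same order in both cases. Reading this as optimal presumes a sequential bound of order n^2, for example with an adjacency-matrix input; that is an assumption, since the abstract does not state the input representation.

```latex
\begin{align*}
\text{ideal conflict resolution:}\quad
  & n \cdot O(n) = O(n^{2}),\\
\text{realistic conflict resolution:}\quad
  & \frac{n}{\log n} \cdot O(n \log n) = O(n^{2}).
\end{align*}
```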

  • A reconfigurable tree architecture with multistage interconnection network

    Page(s): 1481 - 1485

    A novel approach to the design of a reconfigurable tree architecture is presented. The architecture is implemented with an augmented shuffle-exchange multistage interconnection network and is capable of assuming N distinct binary tree configurations, where N is the number of processing elements (PEs) in the system. The novel features of the architecture include fast switching from one configuration to another, simplified hardware in the PEs and the switching network, and simple routing control.

  • On the constant diagnosability of baseline interconnection networks

    Page(s): 1458

    A novel approach for the diagnosis of baseline interconnection networks with a fan-in/fan-out of 2 is presented. The totally exhaustive combinatorial fault model with single fault assumption is used in the analysis. Some new characteristics of baseline interconnection networks are proved. A characterization of the fault location and the fault type of the one-response fault is given. This characterization is used in proving that baseline interconnection networks with fan-in/fan-out of 2 can be diagnosed with a constant number of tests independent of the network size. The maximum number of tests is 12.

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

Meet Our Editors

Editor-in-Chief
Albert Y. Zomaya
School of Information Technologies
Building J12
The University of Sydney
Sydney, NSW 2006, Australia
http://www.cs.usyd.edu.au/~zomaya
albert.zomaya@sydney.edu.au