Computers, IEEE Transactions on

Issue 6 • Date June 1996

Displaying Results 1 - 15 of 15
  • Comments on "Line digraph iterations and connectivity analysis of de Bruijn and Kautz graphs"

    The aim of this note is to present some counterexamples to the results in the paper by Du, Lyuu, and Hsu (see ibid., vol. 42, no. 5, pp. 612-16, May 1993).
  • A conflict sense routing protocol and its performance for hypercubes

    Page(s): 693 - 703

    We propose a new switching format for multiprocessor networks, which we call the conflict sense routing protocol. This switching format is a hybrid of packet and circuit switching, and combines advantages of both. We initially present the protocol in a way applicable to a general topology. We then present an implementation of this protocol for a hypercube computer and a particular routing algorithm. We also analyze the steady-state throughput of the hypercube implementation for random node-to-node communications.
  • A theory of wormhole routing in parallel computers

    Page(s): 704 - 713

    Virtually all theoretical work on message routing in parallel computers has dwelt on packet routing: messages are conveyed as packets, an entire packet can reside at a node of the network, and a packet is sent from the queue of one node to the queue of another node until it reaches its destination. A trend in multicomputer architecture, however, is to use wormhole routing. In wormhole routing a message is transmitted as a contiguous stream of bits, physically occupying a sequence of nodes/edges in the network. Thus, a message resembles a worm burrowing through the network. In this paper we give theoretical analyses of simple wormhole routing algorithms, showing them to be nearly optimal for butterfly and mesh connected networks. Our analysis requires initial random delays in injecting messages to the network. We report simulation results suggesting that the idea of random initial delays may have an impact beyond theoretical analysis.
  • On the effect of defect clustering on test transparency and IC test optimization

    Page(s): 753 - 757

    We recently proposed a wafer-based testing approach which for the first time employs defect clustering information on the wafer to optimize test cost and defect levels in the shipped product. Preliminary analysis of this approach had implicitly assumed that the probability that a test detects a faulty circuit is independent of the number of faults in that circuit. This assumption may be optimistic. In this correspondence, we study the effect of clustering and test transparency on defect distributions in individual dice, and its impact on the fault detection capabilities of a given test set. We show here that significant defect-level improvements in the shipped product can indeed be achieved by exploiting defect clustering in optimization testing.
  • Doing the twist: diagonal meshes are isomorphic to twisted toroidal meshes

    Page(s): 766 - 767

    We show that a k×n diagonal mesh is isomorphic to a (n+k)/2×(n+k)/2-(n-k)/2×(n-k)/2 twisted toroidal mesh, i.e., a network similar to a standard (n+k)/2×(n+k)/2 toroidal mesh, but with opposite handed twists of (n-k)/2 in the two directions, which results in a loss of ((n-k)/2)^2 nodes.
  • Multiple fault detection in fan-out free circuits using minimal single fault test set

    Page(s): 763 - 765

    This paper presents a new algorithm to generate test sets for single stuck-at faults, which also detect all multiple stuck-at faults in fan-out-free circuits. This algorithm derives the test set for each node in a fan-out-free circuit by calculating the output count of the node. The output count indicates the number of test patterns needed to check for all faults in the corresponding subcircuit. The fan-out-free circuit can be any combination of AND, OR, NOT, NAND, and NOR gates.
  • Minimization of memory and network contention for accessing arbitrary data patterns in SIMD systems

    Page(s): 757 - 762

    This paper addresses finding general XOR-schemes that minimize memory and network contention when accessing arrays with arbitrary sets of data templates. A combined XOR-matrix is proposed together with a necessary and sufficient condition for conflict-free access. We present a new characterization of the baseline network. Finding an XOR-matrix for combined templates is shown to be an NP-complete problem. A heuristic is proposed for finding XOR-matrices by determining the constraints of each template-matrix and solving a set of simultaneous equations for each row. Evaluation shows significant reduction of memory and network contention compared to interleaving and to static row-column-diagonals storage.
  • Analysis and implementation of hybrid switching

    Page(s): 684 - 692

    The switching scheme of a point-to-point network determines how packets flow through each node, and is a primary element in determining the network's performance. In this paper, we present and evaluate a new switching scheme called hybrid switching. Hybrid switching dynamically combines both virtual cut-through and wormhole switching to provide higher achievable throughput than wormhole alone, while significantly reducing the buffer space required at intermediate nodes when compared to virtual cut-through. This scheme is motivated by a comparison of virtual cut-through and wormhole switching through cycle-level simulations, and then evaluated using the same methods. To show the feasibility of hybrid switching, as well as to provide a common base for simulating and implementing a variety of routing and switching schemes, we have designed SPIDER, a communication adapter built around a custom ASIC called the programmable routing controller (PRC).
  • Adaptive fault-tolerant deadlock-free routing in meshes and hypercubes

    Page(s): 666 - 683

    We present an adaptive deadlock-free routing algorithm which decomposes a given network into two virtual interconnection networks, VIN1 and VIN2. VIN1 supports deterministic deadlock-free routing, and VIN2 supports fully-adaptive routing. Whenever a channel in VIN1 or VIN2 is available, it can be used to route a message. Each node is identified to be in one of three states: safe, unsafe, and faulty. The unsafe state is used for deadlock-free routing, and an unsafe node can still send and receive messages. When nodes become faulty/unsafe, some channels in VIN2 around the faulty/unsafe nodes are used as the detours of those channels in VIN1 passing through the faulty/unsafe nodes, i.e., the adaptability in VIN2 is transformed to support fault-tolerant deadlock-free routing. Using information on the state of each node's neighbors, we have developed an adaptive fault-tolerant deadlock-free routing scheme for n-dimensional meshes and hypercubes with only two virtual channels per physical link. In an n-dimensional hypercube, any pattern of faulty nodes can be tolerated as long as the number of faulty nodes is no more than [n/2]. The maximum number of faulty nodes that can be tolerated is 2^(n-1), which occurs when all faulty nodes can be encompassed in an (n-1)-cube. In an n-dimensional mesh, we use a more general fault model, called a disconnected rectangular block. Any arbitrary pattern of faulty nodes can be modeled as a rectangular block after finding both unsafe and disabled nodes (which are then treated as faulty nodes). This concept can also be applied to k-ary n-cubes with four virtual channels, two in VIN1 and the other two in VIN2. Finally, we present simulation results for both hypercubes and 2-dimensional meshes by using various workloads and fault patterns.
  • Distributed, deadlock-free routing in faulty, pipelined, direct interconnection networks

    Page(s): 651 - 665

    This paper focuses on designing high-performance pipelined networks that can operate in the presence of dynamic component failures. A general, rigorous framework for deadlock-free communication in faulty, pipelined networks is developed. A mechanism is also proposed for recovering from dynamic link and node failures. The recovery mechanism (1) is fully distributed, (2) does not require timeouts, (3) prevents fault-induced deadlock, and (4) is integrated into the virtual channel flow control mechanisms. This recovery mechanism is used to develop a new pipelined communication mechanism: acknowledged pipelined circuit-switching (APCS). This mechanism supports existing routing protocols that can tolerate a maximal number of static link failures, i.e., one less than the number of ports on a node. An implementation of a novel router architecture is described and the results of detailed flit-level simulations are presented. Finally, the proposed recovery mechanism is shown to be applicable to existing adaptive wormhole routing protocols which are prone to deadlock in the presence of dynamic faults.
  • On bufferless routing of variable length messages in leveled networks

    Page(s): 714 - 729

    We study the most general communication paradigm on a multiprocessor, wherein each processor has a distinct message (of possibly distinct lengths) for each other processor. We study this paradigm, which we call chatting, on multiprocessors that do not allow messages once dispatched ever to be delayed on their routes. By insisting on oblivious routes for messages, we convert the communication problem to a pure scheduling problem. We introduce the notion of a virtual chatting schedule, and we show how efficient chatting schedules can often be produced from efficient virtual chatting schedules. We present a number of strategies for producing efficient virtual chatting schedules on a variety of network topologies.
  • A fault-tolerant tree communication scheme for hypercube systems

    Page(s): 641 - 650

    The tree communication scheme was shown to be very efficient for global operations on data residing in the processors of a hypercube with time complexity of O(log_2 N), where N is the number of processors. This communication scheme is very useful for many parallel algorithms on hypercube multiprocessors. If a problem can be divided into independent subproblems, each subproblem can first be solved by one of the processors. Then, the tree communication scheme is invoked to merge the subresults into the final results. All the algorithms for problems with this property can benefit from the tree communication scheme. We propose a more general and efficient tree communication scheme in this paper. In addition, we also propose fault-tolerant algorithms for the tree communication scheme, by exploiting the unique properties of the tree communication scheme. The computation and communication slowdown is small (<2) under the effect of multiple link and/or node failures.
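
    The merge step described in this abstract can be pictured with a small simulation. The sketch below is not the paper's scheme and omits its fault-tolerant variants; it only shows a standard binomial-tree reduction on a fault-free hypercube, where N partial results are merged in log_2 N steps. The function name hypercube_tree_reduce and its arguments are hypothetical.

```python
def hypercube_tree_reduce(values, combine):
    """Simulate a binomial-tree reduction on a hypercube with N = len(values) nodes.

    Illustrative only: values[i] is the subresult held by node i, and
    `combine` is the (assumed associative) merge operation.
    Returns the merged result at node 0 and the number of parallel steps.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of processors must be a power of two")

    data = list(values)
    steps = 0
    bit = 1
    while bit < n:
        # All transfers in one iteration cross the same hypercube dimension,
        # so they could run in parallel on real hardware.
        for node in range(0, n, 2 * bit):
            partner = node | bit  # neighbour differing in exactly one address bit
            data[node] = combine(data[node], data[partner])
        bit <<= 1
        steps += 1
    return data[0], steps


if __name__ == "__main__":
    total, steps = hypercube_tree_reduce(list(range(8)), lambda a, b: a + b)
    print(total, steps)
```

    With 8 nodes the loop runs once per dimension, which matches the O(log_2 N) bound quoted above.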
  • Parameters for system effectiveness evaluation of distributed systems

    Page(s): 746 - 752

    In a Distributed Computing System (DCS), the failure of one or more system components causes a degradation in its effectiveness to complete a given task, as opposed to a complete network breakdown. This paper addresses the issue of degraded system effectiveness evaluation by introducing two static measures, namely, Distributed Program Performance Index (DPPI) and Distributed System Performance Index (DSPI). These metrics can be used to compare networks with different features for application execution. They can be used to determine if the network with high reliability and low capacity, or low reliability and high capacity, is better for a given program execution. An algorithm is also developed for computing these indices, and it is shown to be not only efficient, but general enough to compute many other existing measures for computer networks, distributed systems, transaction-based systems, etc.
  • Detection of multiple faults in two-dimensional ILAs

    Page(s): 741 - 746

    We provide test sets proportional to the sum of the two dimensions of the array for a large class of cells, which allow us to test rows (or columns) of cells of the array independently. Constant length test sets for array multipliers have been found under the single faulty cell model if the array is modified, and otherwise test sets are proportional to the number of cells. We can verify the full adder array of a combinational n×m multiplier in O(n+m) tests under the Multiple Faulty Cell (MFC) model. The entire multiplier, including the AND gates which generate the summands, can be verified after applying the same modifications which make the multiplier C-testable under the single faulty cell model.
  • Static assignment of stochastic tasks using majorization

    Page(s): 730 - 740

    We consider the problem of statically assigning many tasks to a (smaller) system of homogeneous processors, where a task's structure is modeled as a branching process, all tasks are assumed to have identical behavior, and the tasks may synchronize frequently. We show how the theory of majorization can be used to obtain a partial order among possible task assignments. We show that if the vector of numbers of tasks assigned to each processor under one mapping is majorized by that of another mapping, then the former mapping is better than the latter with respect to a large number of objective functions. In particular, we show how the metrics of finishing time, the space-time product, and reliability are all captured. We also apply majorization to the problem of partitioning a pool of processors for distribution among parallelizable tasks. Limitations of the approach, which include the static nature of the assignment, are also discussed.
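
    The comparison of task assignments in this abstract rests on the standard definition of majorization, which can be checked mechanically. The sketch below uses that textbook definition (equal totals, and every prefix sum of the descending-sorted first vector no larger than the corresponding prefix sum of the second); it is not taken from the paper, and the function name is_majorized_by is hypothetical. Integer task counts are assumed so that the totals can be compared exactly.

```python
from itertools import accumulate


def is_majorized_by(x, y):
    """Return True if vector x is majorized by vector y.

    Illustrative helper: x and y hold the number of tasks assigned to each
    processor under two mappings (integer counts assumed).
    """
    if len(x) != len(y) or sum(x) != sum(y):
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    # Compare running totals of the largest entries of each vector.
    return all(px <= py for px, py in zip(accumulate(xs), accumulate(ys)))


if __name__ == "__main__":
    balanced = [3, 3, 3]  # nine tasks spread evenly over three processors
    skewed = [5, 3, 1]    # the same nine tasks, unevenly spread
    print(is_majorized_by(balanced, skewed))
    print(is_majorized_by(skewed, balanced))
```

    Under the abstract's claim, the balanced mapping in this example would be the preferred one, since its vector is majorized by the skewed one and not the other way round.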

Aims & Scope

The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field.

Meet Our Editors

Editor-in-Chief
Paolo Montuschi
Politecnico di Torino
Dipartimento di Automatica e Informatica
Corso Duca degli Abruzzi 24 
10129 Torino - Italy
e-mail: pmo@computer.org