Parallel and Distributed Systems, IEEE Transactions on

Issue 11 • Date Nov 1995

  • FDDI-M: a scheme to double FDDI's ability of supporting synchronous traffic

    Page(s): 1125 - 1131

    Synchronous messages are usually generated periodically, and each must be transmitted before the next message is generated. Due to an inherent deficiency in its medium access control (MAC) protocol, an FDDI token ring can use at most one half of its ring bandwidth to transmit such synchronous traffic. This deficiency greatly reduces FDDI's capability to support multimedia applications such as real-time voice/video transmission. In this paper, we show how a few simple modifications to FDDI's MAC protocol can remove this deficiency and double a ring's ability to support synchronous traffic. The modified protocol, called FDDI-M, can also achieve a higher throughput for asynchronous traffic than the standard FDDI and FDDI-II, thus making it useful even for networks without heavy synchronous traffic.

  • Exact convergence of a parallel textured algorithm for data network optimal routing problems

    Page(s): 1132 - 1146

    In our earlier paper (1991), a textured-decomposition-based algorithm was developed to solve the optimal routing problem in data networks; a few examples were used to illustrate the speedup advantage and the conditions under which the textured algorithm converges to a global minimum. The speedup advantage was investigated in Huang et al. (1993). However, the theoretical foundation was not provided. In this paper, we provide that foundation. First, we show that for any textured decomposition, the algorithm always converges to a stationary point, which may not be a global minimum. We then prove that if the conditions of the exact convergence theorem are satisfied, the textured algorithm converges to a global minimum.

  • Routing in modular fault-tolerant multiprocessor systems

    Page(s): 1206 - 1220

    In this paper, we consider a class of modular multiprocessor architectures in which spares are added to each module to cover for faulty nodes within that module, thus forming a fault-tolerant basic block (FTBB). In contrast to reconfiguration techniques that preserve the physical adjacency between active nodes in the system, our goal is to preserve the logical adjacency between active nodes by means of a routing algorithm which delivers messages successfully to their destinations. We introduce two-phase routing strategies that route messages first to their destination FTBB, and then to the destination nodes within the destination FTBB. Such a strategy may be applied to a variety of architectures including binary hypercubes and three-dimensional tori. In the presence of f faults in hypercubes and tori, we show that the worst case length of the message route is min {σ+f, (K+1)σ}+c where σ is the shortest path in the absence of faults, K is the number of spare nodes in an FTBB, and c is a small constant. The average routing overhead is much lower than the worst case overhead.

  • Efficient placement of parity and data to tolerate two disk failures in disk array systems

    Page(s): 1177 - 1184

    In this paper, we deal with the data/parity placement problem, which is described as follows: how to place data and parity evenly across disks in order to tolerate two disk failures, given the number of disks N and the redundancy rate p, which represents the amount of disk space used to store parity information. To begin with, we transform the data/parity placement problem into the problem of constructing an N×N matrix that corresponds to a solution to the problem. We propose a method to construct such a matrix and show how it works through several illustrative examples. We also show that any matrix constructed by the proposed method can be mapped into a solution to the placement problem if a certain condition holds between N and p.

  • Performance considerations of shared virtual memory machines

    Page(s): 1185 - 1194

    Generalized speedup is defined as parallel speed over sequential speed. In this paper, the generalized speedup and its relation with other existing performance metrics, such as traditional speedup, efficiency, scalability, etc., are carefully studied. In terms of the introduced asymptotic speed, we show that the difference between the generalized speedup and the traditional speedup lies in the definition of the efficiency of uniprocessor processing, which is a very important issue in shared virtual memory machines. A scientific application has been implemented on a KSR-1 parallel computer. Experimental and theoretical results show that the generalized speedup is distinct from the traditional speedup and provides a more reasonable measurement. In the study of different speedups, an interesting relation between fixed-time and memory-bounded speedup is revealed. Various causes of superlinear speedup are also presented.

  • An adaptive fault-tolerant routing algorithm for hypercube multicomputers

    Page(s): 1147 - 1152

    This paper presents a partially adaptive fault-tolerant routing algorithm for hypercube multicomputers. The algorithm is tolerant to n-1 link and/or node faults for an n-cube. It makes routing decisions adaptively based on local failure information only. It is simple to implement and needs a very small message overhead. A comparison between the algorithm and a popular previous work is given.

  • Annealed embeddings of communication patterns in an interconnection cached network

    Page(s): 1153 - 1167

    The communication needs of many parallel applications exhibit what we call switching locality. In such applications, each computation entity (process, thread, etc.) tends to restrict its communication to a small set of other entities. The physical location or proximity of these entities can be arbitrary, as long as the communication degree is small. The Interconnection Cached Network (ICN) is a reconfigurable network ideally suited for exploiting such locality. The use of fast small crossbar switches (Interconnection Caches) with a larger, but slower, reconfigurable network (optimized for connectivity) lets the ICN adapt to the communication requirements of individual applications, potentially achieving higher performance. Embedding communication patterns efficiently in an ICN requires finding a bounded l-contraction of the underlying communication graph. The problem of identifying whether a graph has a bounded l-contraction for a given integer l is known to be NP-complete for l>2. We describe a heuristic algorithm based on simulated annealing for this problem. We test the effectiveness of our approach by using it to embed graphs, representing regular communication patterns, for which the best solutions are deterministically known. The algorithm does not rely on any structural information of the communication pattern and is therefore applicable to irregular patterns as well. The results of applying our heuristics to embed such irregular graphs are also presented.

  • Hot-potato algorithms for permutation routing

    Page(s): 1168 - 1176

    We develop a methodology for the design of hot-potato algorithms for routing permutations. The basic idea is to convert existing store-and-forward routing algorithms to hot-potato algorithms. Using it, we obtain the following complexity bounds for permutation routing: n×n Mesh: 7n+o(n) steps; 2^n hypercube: O(n^2) steps; n×n Torus: 4n+o(n) steps. The algorithm for the two-dimensional grid is the first to be both deterministic and asymptotically optimal. The algorithm for the 2^n-node Boolean cube is the first deterministic algorithm that achieves a complexity of o(2^n) steps.

  • Multicoloring of grid-structured PDE solvers on shared-memory multiprocessors

    Page(s): 1195 - 1205

    In order to execute a parallel PDE (partial differential equation) solver on a shared-memory multiprocessor, we have to avoid memory conflicts in accessing multidimensional data grids. A new multicoloring technique is proposed for speeding sparse matrix operations. The new technique enables parallel access of grid-structured data elements in the shared memory without causing conflicts. The coloring scheme is formulated as an algebraic mapping which can be easily implemented with low overhead on commercial multiprocessors. The proposed multicoloring scheme has been tested on an Alliant FX/80 multiprocessor for solving 2D and 3D problems using the CGNR method. Compared to the results reported by Saad (1989) on an identical Alliant system, our results show a factor of 30 times higher performance in Mflops. Multicoloring transforms sparse matrices into ones with a diagonal diagonal block (DDB) structure, enabling parallel LU decomposition in solving PDE problems. The multicoloring technique can also be extended to solve other scientific problems characterized by sparse matrices.

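As an illustration for the entry "Routing in modular fault-tolerant multiprocessor systems" listed above, the worst-case route length quoted in its abstract, min {σ+f, (K+1)σ}+c, can be evaluated with a small helper. This is only a sketch of the stated formula, not the authors' routing algorithm. The function name and parameter names are hypothetical, and since the abstract describes c only as "a small constant", the caller must supply a value (the default of 0 is an assumption).

```python
def worst_case_route_length(sigma: int, f: int, k: int, c: int = 0) -> int:
    """Evaluate the bound min{sigma + f, (K + 1) * sigma} + c from the abstract.

    sigma: shortest path length in the absence of faults
    f:     number of faults present
    k:     number of spare nodes in a fault-tolerant basic block (FTBB)
    c:     the "small constant" from the abstract; its value is not given
           there, so it is an assumed, caller-supplied parameter
    """
    if sigma < 0 or f < 0 or k < 0:
        raise ValueError("sigma, f, and k must be non-negative")
    return min(sigma + f, (k + 1) * sigma) + c


if __name__ == "__main__":
    # Few faults: the sigma + f term is the smaller one.
    print(worst_case_route_length(sigma=4, f=3, k=1))
    # Many faults: the (K + 1) * sigma term caps the bound.
    print(worst_case_route_length(sigma=2, f=10, k=2))
```

The two terms show why the bound stays modest: with few faults the route grows by at most f hops, and with many faults it is capped by a multiple of the fault-free distance that depends only on the number of spares per FTBB.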

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology