
IEEE Transactions on Parallel and Distributed Systems

Issue 7 • July 1997

  • Embedding of generalized Fibonacci cubes in hypercubes with faulty nodes

    Publication Year: 1997, Page(s): 727-737
    Cited by: Papers (3)

    The generalized Fibonacci cubes (abbreviated to GFCs) were recently proposed as a class of interconnection topologies that covers a spectrum ranging from regular graphs, such as the hypercube, to semiregular graphs, such as the second-order Fibonacci cube. It has been shown that the kth order GFC of dimension n+k is equivalent to an n-cube for 0⩽n<k, and that it is a proper subgraph of an n-cube for n⩾k. Thus, a kth order GFC of dimension n+k can be obtained from an n-cube, for all n⩾k, by removing certain nodes. This is straightforward when the n-cube contains no faulty nodes, but it becomes complex when some nodes are faulty. In this paper, we first consider the following open problem: how can a maximal (in terms of the number of nodes) generalized Fibonacci cube be distinguished within a faulty hypercube? This can also be viewed as a fault-tolerant embedding in hypercubes. We then show how to directly embed a GFC into a faulty hypercube, and prove that if no more than three faulty nodes exist, an [n/2]th order GFC of dimension n+[n/2] can be directly embedded into an n-cube in the worst case, for n=4 or n⩾6.

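    The abstract takes the GFC node sets as given. As a hedged illustration, assuming the common definition in which the nodes of a kth-order generalized Fibonacci cube are the binary strings containing no k consecutive 1s (so k=2 gives the classic Fibonacci cube), a minimal sketch of enumerating these node sets inside a hypercube:

    ```python
    from itertools import product

    def gfc_nodes(length, k):
        """Binary strings of the given length with no run of k consecutive 1s.

        Assumes the common definition of kth-order GFC node sets; the paper's
        dimension indexing (n+k) may shift 'length' by a constant.
        """
        strings = ("".join(bits) for bits in product("01", repeat=length))
        return [s for s in strings if "1" * k not in s]

    # Node counts follow a k-step (generalized) Fibonacci recurrence;
    # for k = 2: 2, 3, 5, 8, 13, ...
    for n in range(1, 6):
        print(n, len(gfc_nodes(n, 2)), len(gfc_nodes(n, 3)))
    ```

    Embedding a GFC in an n-cube then amounts to choosing which hypercube nodes to drop; the paper's contribution is doing this when some nodes are already forced out by faults.
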
  • Analysis and randomized design of algorithm-based fault tolerant multiprocessor systems under an extended model

    Publication Year: 1997, Page(s): 757-768
    Cited by: Papers (2)

    Reliability of compute-intensive applications can be improved by introducing fault tolerance into the system. Algorithm-based fault tolerance (ABFT) is a low-cost scheme that provides the required fault tolerance through system-level encoding. In this paper, we propose randomized construction techniques, under an extended model, for the design of ABFT systems with the required fault tolerance capability. The model considers failures in the processors performing the checking operations.

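    The randomized construction itself is not detailed in the abstract and is not reproduced here. As background on the system-level encoding that ABFT relies on, a minimal sketch of the classic checksum scheme for matrix multiplication (the textbook Huang-Abraham idea, not the paper's extended-model design): a single erroneous element of the product is located at the intersection of the mismatching row and column checksums.

    ```python
    import numpy as np

    def abft_matmul_check(A, B, fault=None):
        """Multiply checksum-encoded matrices and locate a single bad element."""
        Ac = np.vstack([A, A.sum(axis=0)])                 # add column-checksum row
        Br = np.hstack([B, B.sum(axis=1, keepdims=True)])  # add row-checksum column
        C = Ac @ Br                                        # full-checksum product
        if fault is not None:
            i, j, delta = fault
            C[i, j] += delta                               # inject an error (demo only)
        row_ok = np.isclose(C[:-1, :-1].sum(axis=1), C[:-1, -1])
        col_ok = np.isclose(C[:-1, :-1].sum(axis=0), C[-1, :-1])
        return C, np.flatnonzero(~row_ok), np.flatnonzero(~col_ok)

    rng = np.random.default_rng(0)
    A = rng.integers(0, 5, (3, 3)).astype(float)
    B = rng.integers(0, 5, (3, 3)).astype(float)
    _, bad_rows, bad_cols = abft_matmul_check(A, B, fault=(1, 2, 7.0))
    print("erroneous element located at row", bad_rows, "column", bad_cols)
    ```

    The paper's extended model additionally allows the checking operations themselves to fail, which is what motivates its randomized design of the check assignments.
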
  • Noncontiguous processor allocation algorithms for mesh-connected multicomputers

    Publication Year: 1997, Page(s): 712-726
    Cited by: Papers (34) | Patents (1)

    Current processor allocation techniques for highly parallel systems are typically restricted to contiguous allocation strategies, whose performance suffers significantly from the inherent problem of fragmentation. As a result, message-passing systems have yet to achieve the high utilization levels exhibited by traditional vector supercomputers. We investigate processor allocation algorithms that lift the restriction on contiguity of processors in order to address the problem of fragmentation. Three noncontiguous processor allocation strategies are proposed and studied in this paper: paging allocation, random allocation, and the Multiple Buddy Strategy (MBS). Simulations compare the performance of the noncontiguous strategies with that of several well-known contiguous algorithms. We show that the noncontiguous allocation algorithms perform better overall than the contiguous ones, even when message-passing contention is considered. We also present the results of experiments on a 208-node Intel Paragon XP/S-15 showing that noncontiguous allocation is feasible with current technologies.

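    A minimal sketch of the paging idea from the abstract, with hypothetical names and a trivial free list. Real paging allocation also chooses a page-indexing order (row-major, snake-like, or shuffled), and MBS maintains buddy blocks of several sizes; neither refinement is shown.

    ```python
    class PagingAllocator:
        """Noncontiguous paging allocation on a w x h mesh."""

        def __init__(self, w, h, page_side=2):
            assert w % page_side == 0 and h % page_side == 0
            self.page_nodes = page_side * page_side
            # Free list of page origins; any free page can serve any job,
            # so external fragmentation cannot strand free processors.
            self.free = [(x, y) for x in range(0, w, page_side)
                                for y in range(0, h, page_side)]

        def allocate(self, n_procs):
            pages_needed = -(-n_procs // self.page_nodes)   # ceiling division
            if pages_needed > len(self.free):
                return None                                 # not enough free pages
            grant, self.free = (self.free[:pages_needed],
                                self.free[pages_needed:])
            return grant                                    # possibly noncontiguous

        def release(self, grant):
            self.free.extend(grant)

    alloc = PagingAllocator(8, 8)      # 64-node mesh split into 2x2 pages
    print(alloc.allocate(10))          # three pages (12 nodes), contiguity not required
    ```

    The trade-off the paper measures is that such scattered grants increase message-passing contention, which its simulations show is outweighed by the eliminated fragmentation.
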
  • Recovery analysis of data sharing systems under deferred dirty page propagation policies

    Publication Year: 1997, Page(s): 695-711

    In a multinode data sharing environment, buffer coherency control schemes based on various lock retention mechanisms can defer the propagation (writing) of dirty pages to disk to improve performance during normal operation. Two types of deferred write policy are considered. One policy propagates dirty pages to disk only when they are flushed out of the buffer under LRU buffer replacement. The other policy additionally writes dirty pages when they are transferred across nodes. The choice of dirty page propagation policy can have significant implications for database recovery time. In this paper, we provide an analytical framework for analyzing recovery times under the two deferred write policies, and demonstrate how both policies can be mapped onto a unified model. The main challenge in the analysis is to obtain the distribution of the pending update count, from which the average numbers of log records and data I/Os that must be applied during recovery can be determined. The analysis goes beyond previous work on modeling buffer hit probability in a data sharing system, where only the average buffer composition, not its distribution, needs to be estimated, and beyond recovery analysis in a single-node environment, where the complexities of tracking dirty pages propagating across nodes and of the buffer invalidation effect do not arise.

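    The paper's contribution is an analytical model, not a simulation, but a toy simulation makes the two policies and the pending update count concrete. All names and rates below are illustrative assumptions:

    ```python
    from collections import OrderedDict
    import random

    def simulate(policy, n_pages=200, buf_size=50, refs=10000,
                 p_update=0.3, p_transfer=0.1, seed=1):
        """Disk writes and pending updates under a deferred write policy.

        "lru-only": dirty pages reach disk only on LRU eviction.
        "lru+transfer": additionally written when shipped to another node
        (modelled here as a random transfer event).
        """
        rng = random.Random(seed)
        buf = OrderedDict()        # page id -> pending (unpropagated) updates
        disk_writes = 0
        for _ in range(refs):
            page = rng.randrange(n_pages)
            if page in buf:
                buf.move_to_end(page)                    # LRU touch
            else:
                if len(buf) >= buf_size:
                    _, pending = buf.popitem(last=False) # evict LRU page
                    disk_writes += pending > 0           # write point, both policies
                buf[page] = 0
            if rng.random() < p_update:
                buf[page] += 1                 # update logged, not yet on disk
            if policy == "lru+transfer" and rng.random() < p_transfer:
                disk_writes += buf[page] > 0   # extra write point, second policy
                buf[page] = 0                  # page clean after the write
        return disk_writes, sum(buf.values())  # writes, pending update count

    for policy in ("lru-only", "lru+transfer"):
        writes, pending = simulate(policy)
        print(policy, "disk writes:", writes, "pending at crash:", pending)
    ```

    Recovery work grows with the pending update count at the time of a crash, which is why the paper needs its full distribution rather than just the average buffer composition.
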
  • Scalable global and local hashing strategies for duplicate pruning in parallel A* graph search

    Publication Year: 1997, Page(s): 738-756
    Cited by: Papers (7) | Patents (3)

    For many applications of the A* algorithm, the state space is a graph rather than a tree. The implication for parallel A* algorithms is that different processors may perform significant duplicated work if interprocessor duplicates are not pruned. In this paper, we consider the problem of duplicate pruning in parallel A* graph-search algorithms implemented on distributed-memory machines. A commonly used method for duplicate pruning uses a hash function to associate each distinct node of the search space with a particular processor, to which duplicate nodes arising in different processors are transmitted and thereby pruned. This approach has two major drawbacks. First, load balance is determined solely by the hash function. Second, node transmissions for duplicate pruning are global, which can lead to hot spots and slower message delivery. To overcome these problems, we propose two duplicate pruning strategies: 1) to achieve good load balance, we decouple duplicate pruning from load balancing, using a hash function for the former and a load balancing scheme for the latter; and 2) a novel search-space partitioning scheme that allocates disjoint parts of the search space to disjoint subcubes in a hypercube (or disjoint processor groups in the target architecture), so that duplicate pruning requires only intrasubcube or adjacent intersubcube communication, greatly reducing message latency and hot-spot probability. These duplicate pruning schemes were implemented on an nCUBE2 hypercube multicomputer to solve the Traveling Salesman Problem (TSP). For uniformly distributed intercity costs, our strategies yield a speedup improvement of 13 to 35 percent on 1,024 processors over previous methods that do not prune any duplicates, and 13 to 25 percent over the previous hashing-only scheme. For normally distributed data, the corresponding figures are 135 percent and 10 to 155 percent. Finally, we analyze the scalability of our parallel A* algorithms on k-ary n-cube networks in terms of the isoefficiency metric, and show that they have isoefficiency lower and upper bounds of Θ(P log P) and Θ(Pkn²), respectively.

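    A minimal sketch of the baseline global hashing scheme the paper improves on: every state has a hash-determined home processor, and a node survives at its home only if it improves on the best cost seen for that state. The decoupled load balancer and the subcube partitioning scheme are the paper's contributions and are not shown; names here are illustrative.

    ```python
    import hashlib

    def owner(state, n_procs):
        """Home processor for a search state; duplicates of the same state
        always hash to the same owner and can be pruned there."""
        digest = hashlib.sha256(repr(state).encode()).digest()
        return int.from_bytes(digest[:4], "big") % n_procs

    class DuplicateTable:
        """Per-owner closed list for A* graph search: keep a node only if it
        improves on the best g-cost recorded for its state."""

        def __init__(self):
            self.best_g = {}

        def admit(self, state, g):
            if g < self.best_g.get(state, float("inf")):
                self.best_g[state] = g
                return True        # expand / forward this node
            return False           # duplicate with no improvement: prune

    # Nodes generated on any processor are sent to owner(state, P);
    # only improving copies survive.
    table = DuplicateTable()
    print(owner((0, 1, 3), 1024))
    print(table.admit((0, 1, 3), g=42), table.admit((0, 1, 3), g=50))
    ```

    Because every pruning message may cross the whole machine, this baseline suffers exactly the global-traffic and hash-determined-load problems the abstract lists.
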
  • Linear recursive networks and their applications in distributed systems

    Publication Year: 1997, Page(s): 673-680
    Cited by: Papers (3)

    We present a new class of interconnection topologies called Linear Recursive Networks (LRNs) and examine their possible applications in distributed systems. Each LRN is characterized by a recursive pattern of interconnection that can be specified by simple parameters. Basic properties such as node degree, diameter, and the performance of routing algorithms are then analyzed collectively for all LRNs in terms of these parameters. By choosing appropriate parameter values, a network designer can use our results to select a topology with the required routing performance and interconnection cost. A subclass of LRNs, called Congruent LRNs (CLRNs), is also identified and shown to possess desirable properties for more tightly coupled systems. We show that the CLRNs include existing networks such as the hypercube and the generalized Fibonacci cubes. These results suggest that linear recursive networks have potential applications in interconnecting distributed systems.

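    The abstract does not give the LRN parameterization, so it is not reproduced here. As a hedged illustration of a recursively specified interconnection pattern, the following sketch builds the n-cube, which the abstract names as a member of the CLRN subclass, by the textbook recursion: two copies of the (n-1)-cube joined by a perfect matching.

    ```python
    def hypercube(n):
        """Q_n built recursively from two copies of Q_(n-1)."""
        if n == 0:
            return 1, []                                 # a single node, no edges
        size, edges = hypercube(n - 1)
        copy = [(u + size, v + size) for u, v in edges]  # second copy, shifted ids
        matching = [(u, u + size) for u in range(size)]  # join corresponding nodes
        return 2 * size, edges + copy + matching

    nodes, edges = hypercube(3)
    print(nodes, len(edges))   # 8 nodes, 12 edges; degree and diameter are both n
    ```

    The appeal of such recursive definitions, as the abstract argues, is that degree, diameter, and routing cost can be analyzed once, in terms of the recursion's parameters, for the whole family.
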
  • Toward a more realistic performance evaluation of interconnection networks

    Publication Year: 1997, Page(s): 681-694
    Cited by: Papers (4)

    Interconnection network design plays a central role in the design of parallel systems. Most previous research has evaluated the performance of interconnection networks in isolation. In this study, we investigate the relationship between application program characteristics and interconnection network performance using an execution-driven simulation test bed, the Reconfigurable Architecture Workbench (RAW). We simulate five topological configurations of a k-ary n-cube interconnect and four different network link models for a 4,096-node SIMD machine, and quantify the impact of the network on two application programs. We provide experimental evidence that such “in-context” simulation gives a better view of the impact of network design variables on system performance. We show that recent results indicating that low-dimensional designs provide better interconnection network (ICN) performance ignore application requirements that may favor high-dimensional designs. Furthermore, applications that would appear to favor low-dimensional designs may not, in fact, be significantly impacted by the network's dimensionality. We also experimentally test the results of published performance models, comparing the use of a synthetic load with that of a load generated by a typical application program.

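    A small sketch of the topology arithmetic behind the low- versus high-dimensional debate referenced above: at a fixed machine size N = k^n, raising the dimension n shortens the hop-count diameter of a wraparound k-ary n-cube while raising node degree. The paper's exact five configurations are not listed in the abstract, so the 4,096-node examples below are illustrative; channel widths, link models, and application traffic, which the paper argues must be modeled, are not captured.

    ```python
    def kary_ncube_stats(k, n):
        """Hop-count metrics of a wraparound (torus) k-ary n-cube."""
        degree = n if k == 2 else 2 * n   # k = 2: both ring neighbors coincide
        diameter = n * (k // 2)           # worst case: halfway around each ring
        return k ** n, degree, diameter

    # Ways to build a 4,096-node machine, from hypercube-like to ring-like:
    for k, n in [(2, 12), (4, 6), (8, 4), (16, 3), (64, 2)]:
        nodes, degree, diameter = kary_ncube_stats(k, n)
        print(f"k={k:2d}, n={n:2d}: nodes={nodes}, "
              f"degree={degree}, diameter={diameter}")
    ```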

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.


Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology