Parallel and Distributed Systems, IEEE Transactions on

Issue 10 • Date Oct 2001

  • The power of two choices in randomized load balancing

    Publication Year: 2001 , Page(s): 1094 - 1104
    Cited by:  Papers (74)  |  Patents (2)

    We consider the following natural model: customers arrive as a Poisson stream of rate λn, λ<1, at a collection of n servers. Each customer chooses some constant d servers independently and uniformly at random from the n servers and waits for service at the one with the fewest customers. Customers are served according to the first-in first-out (FIFO) protocol and the service time for a customer is exponentially distributed with mean 1. We call this problem the supermarket model. We wish to know how the system behaves and in particular we are interested in the effect that the parameter d has on the expected time a customer spends in the system in equilibrium. Our approach uses a limiting, deterministic model representing the behavior as n→∞ to approximate the behavior of finite systems. The analysis of the deterministic model is interesting in its own right. Along with a theoretical justification of this approach, we provide simulations that demonstrate that the method accurately predicts system behavior, even for relatively small systems. Our analysis provides surprising implications. Having d=2 choices leads to exponential improvements in the expected time a customer spends in the system over d=1, whereas having d=3 choices is only a constant factor better than d=2. We discuss the possible implications for system design.

  • Matrix multiplication on heterogeneous platforms

    Publication Year: 2001 , Page(s): 1033 - 1051
    Cited by:  Papers (33)  |  Patents (1)

    We address the issue of implementing matrix multiplication on heterogeneous platforms. We target two different classes of heterogeneous computing resources: heterogeneous networks of workstations and collections of heterogeneous clusters. Intuitively, the problem is to load balance the work with different speed resources while minimizing the communication volume. We formally state this problem in a geometric framework and prove its NP-completeness. Next, we introduce a (polynomial) column-based heuristic, which turns out to be very satisfactory: we derive a theoretical performance guarantee for the heuristic and we assess its practical usefulness through MPI experiments.

  • DP: a paradigm for anonymous remote computation and communication for cluster computing

    Publication Year: 2001 , Page(s): 1052 - 1065
    Cited by:  Papers (7)

    This paper explores the transparent programmability of communicating parallel tasks in a Network of Workstations (NOW). Programs which are tied up with specific machines will not be resilient to the changing conditions of a NOW. The Distributed Pipes (DP) model enables location-independent intertask communication among processes across machines. This approach enables migration of communicating parallel tasks according to runtime conditions. A transparent programming model for a parallel solution to Iterative Grid Computations using DP is also proposed. Programs written using the model are resilient to the heterogeneity of nodes and changing conditions in the NOW. They are also devoid of any network-related code. The design of runtime support and function library support are presented. An engineering problem, namely, the Steady State Equilibrium Problem, is studied over the model. The performance analysis shows the speedup due to parallel execution and scaled-down memory requirements. We present a case where the effect of communication overhead can be nullified to achieve a linear to super-linear speedup. The analysis discusses performance resilience of Iterative Grid Computations and characterizes synchronization delay among subtasks and the effect of network overhead and load fluctuations on performance. The performance saturation characteristics of such applications are also studied.

  • A reliable multicast protocol for distributed mobile systems: design and evaluation

    Publication Year: 2001 , Page(s): 1009 - 1022
    Cited by:  Papers (13)

    Reliable multicast is a powerful communication primitive for structuring distributed programs in which multiple processes must closely cooperate. We propose a protocol for supporting reliable multicast in a distributed system that includes mobile hosts and evaluate the performance of our proposal through simulation. We consider a scenario in which mobile hosts communicate with a wired infrastructure by means of wireless technology. Our proposal provides several novel features. The sender of each multicast may select among three increasingly strong delivery ordering guarantees: FIFO, causal, total. Movements do not trigger the transmission of any message in the wired network as no notion of hand-off is used. The set of senders and receivers (group) may be dynamic. The size of data structures at mobile hosts, the size of message headers, and the number of messages in the wired network for each multicast are all independent of the number of group members. The wireless network is assumed to provide only incomplete spatial coverage and message losses could occur even within cells. Movements are not negotiated and a mobile host that leaves a cell may enter any other cell, perhaps after a potentially long disconnection. The simulation results show that the proposed protocol has good performance and good scalability properties.

  • Loop-free hybrid single-path/flooding routing algorithms with guaranteed delivery for wireless networks

    Publication Year: 2001 , Page(s): 1023 - 1032
    Cited by:  Papers (139)

    In a localized routing algorithm, each node makes forwarding decisions solely based on the position of itself, its neighbors, and its destination. In distance, progress, and direction-based approaches (reported in the literature), when node A wants to send or forward message m to destination node D, it forwards m to its neighbor C which is closest to D (has best progress toward D, whose direction is closest to the direction of D, respectively) among all neighbors of A. The same procedure is repeated until D, if possible, is eventually reached. The algorithms are referred to as GEDIR, MFR, and DIR when a common failure criterion is introduced: the algorithm stops if the best choice for the current node is the node from which the message came. We propose 2-hop GEDIR, DIR, and MFR methods in which node A selects the best candidate node C among its 1-hop and 2-hop neighbors according to the corresponding criterion and forwards m to its best 1-hop neighbor among joint neighbors of A and C. We then propose flooding GEDIR and MFR and hybrid single-path/flooding GEDIR and MFR methods which are the first localized algorithms (other than full flooding) to guarantee the message delivery (in a collision-free environment). We show that the directional routing methods are not loop-free, while the GEDIR and MFR-based methods are inherently loop-free. The simulation experiments, with static random graphs, show that GEDIR and MFR have similar success rates, which are low for low-degree graphs and high for high-degree ones. When successful, their hop counts are near the performance of the shortest path algorithm. Hybrid single-path/flooding GEDIR and MFR methods have low communication overheads. The results are also confirmed by experiments with moving nodes and MAC layer.

  • MPI-LAPI: an efficient implementation of MPI for IBM RS/6000 SP systems

    Publication Year: 2001 , Page(s): 1081 - 1093
    Cited by:  Papers (10)  |  Patents (15)

    The IBM RS/6000 SP system is one of the most cost-effective commercially available high performance machines. IBM RS/6000 SP systems support the Message Passing Interface standard (MPI) and LAPI. LAPI is a low-level, reliable, and efficient one-sided communication API library implemented on IBM RS/6000 SP systems. This paper explains how the high performance of the LAPI library has been exploited in order to implement the MPI standard more efficiently than the existing MPI. It describes how to avoid unnecessary data copies at both the sending and receiving sides for such an implementation. The resolution of problems arising from the mismatches between the requirements of the MPI standard and the features of LAPI is discussed. As a result of this exercise, certain enhancements to LAPI are identified to enable an efficient implementation of MPI on LAPI. The performance of the new implementation of MPI is compared with that of the underlying LAPI itself. The latency (in polling and interrupt modes) and bandwidth of our new implementation are compared with those of the native MPI implementation on RS/6000 SP systems. The results indicate that the MPI implementation on LAPI performs comparably to or better than the original MPI implementation in most cases. Improvements of up to 17.3 percent in polling mode latency, 35.8 percent in interrupt mode latency, and 20.9 percent in bandwidth are obtained for certain message sizes. The implementation of MPI on top of LAPI also outperforms the native MPI implementation for the NAS Parallel Benchmarks.

  • Writing programs that run EveryWare on the Computational Grid

    Publication Year: 2001 , Page(s): 1066 - 1080
    Cited by:  Papers (6)

    The Computational Grid has been proposed for the implementation of high-performance applications using widely dispersed computational resources. The goal of a Computational Grid is to aggregate ensembles of shared, heterogeneous, and distributed resources (potentially controlled by separate organizations) to provide computational "power" to an application program. We provide a toolkit for the development of globally deployable Grid applications. The toolkit, called EveryWare, enables an application to draw computational power transparently from the Grid. It consists of a portable set of processes and libraries that can be incorporated into an application so that a wide variety of dynamically changing distributed infrastructures and resources can be used together to achieve supercomputer-like performance. We provide our experiences gained while building the EveryWare toolkit prototype and an explanation of its use in implementing a large-scale Grid application.
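
Illustrative sketch for "The power of two choices in randomized load balancing" (listed above). The supermarket model in that abstract is fully specified (Poisson arrivals of rate λn, d uniform random choices, join the shortest of the chosen queues, exponential service with mean 1), so it can be simulated directly. The sketch below is not the authors' code. The function name `simulate_supermarket`, its parameters, the uniformized event loop, and the use of Little's law to turn the time-averaged number of customers into an expected time in system are all assumptions made for illustration. The run starts from an empty system, so short horizons underestimate the equilibrium value.

```python
import random


def simulate_supermarket(n, lam, d, horizon, seed=0):
    """Estimate the mean time a customer spends in the system.

    n: number of servers, lam: per-server arrival rate (lam < 1),
    d: number of servers each customer samples, horizon: simulated time.
    All names here are hypothetical, chosen for this sketch.
    """
    rng = random.Random(seed)
    queues = [0] * n
    # Uniformization: arrivals occur at rate lam*n; each of the n servers
    # has a potential service completion at rate 1 (ignored if it is idle).
    total_rate = (lam + 1.0) * n
    p_arrival = lam / (lam + 1.0)
    t = 0.0
    customers = 0
    area = 0.0  # integral of the number of customers over time
    while t < horizon:
        dt = rng.expovariate(total_rate)
        area += customers * dt
        t += dt
        if rng.random() < p_arrival:
            # Sample d servers independently and uniformly; join the shortest.
            choices = [rng.randrange(n) for _ in range(d)]
            target = min(choices, key=lambda i: queues[i])
            queues[target] += 1
            customers += 1
        else:
            i = rng.randrange(n)
            if queues[i] > 0:  # an idle server yields a null event
                queues[i] -= 1
                customers -= 1
    mean_customers = area / t
    # Little's law: mean time in system = mean customers / arrival rate.
    return mean_customers / (lam * n)


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, simulate_supermarket(n=100, lam=0.9, d=d, horizon=2000.0))
```

Running it for d = 1, 2, 3 at a fixed λ gives a rough feel for the comparison the abstract draws between one, two, and three choices; the estimates are noisy and depend on the seed and horizon.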

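Illustrative sketch for "Loop-free hybrid single-path/flooding routing algorithms with guaranteed delivery for wireless networks" (listed above). That abstract describes the basic distance-based rule: the current node forwards the message to whichever of its neighbors is closest to the destination, and the algorithm stops if that best neighbor is the node the message just came from. The sketch below covers only this single-path greedy rule, not the 2-hop, flooding, or hybrid variants. The function name `gedir_route`, the dictionary-based graph representation, and the `max_hops` safety guard are assumptions made for illustration, not part of the paper.

```python
import math


def gedir_route(positions, neighbors, source, dest, max_hops=1000):
    """Greedy distance-based forwarding with the 'came from' failure rule.

    positions: node -> (x, y); neighbors: node -> list of adjacent nodes.
    Returns (path, delivered). Names are hypothetical, for this sketch only.
    """
    path = [source]
    prev = None
    current = source
    while current != dest:
        candidates = neighbors[current]
        if not candidates:
            return path, False  # isolated node: nothing to forward to
        # Pick the neighbor geographically closest to the destination.
        best = min(candidates,
                   key=lambda v: math.dist(positions[v], positions[dest]))
        if best == prev:
            # Failure criterion: the best choice is where the message came from.
            return path, False
        path.append(best)
        prev, current = current, best
        if len(path) > max_hops:
            return path, False  # defensive guard added for this sketch
    return path, True


if __name__ == "__main__":
    positions = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0)}
    neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(gedir_route(positions, neighbors, "a", "d"))
```

Each decision uses only the positions of the current node's neighbors and of the destination, which is what makes the rule localized in the sense the abstract uses.
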

Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology