Parallel and Distributed Systems, IEEE Transactions on

Issue 7 • Date Jul 2000

  • A protocol to achieve independence in constant rounds

    Page(s): 636 - 647

    Independence is a fundamental property needed to achieve security in fault-tolerant distributed computing. In practice, distributed communication networks are neither fully synchronous nor fully asynchronous, but rather loosely synchronized. By this, we mean that in a communication protocol, messages at a given round may depend on messages from other players at the same round. These possible dependencies among messages create problems if we need n players to announce independently chosen values. This task is called simultaneous broadcast. In this paper, we present the first constant-round protocol for simultaneous broadcast in a reasonable computation model (which includes a common shared random string among the players). The protocol is provably secure under general cryptographic assumptions. In the process, we develop a new and stronger formal definition for this problem. Previously known protocols for this task required either O(log n) or expected constant rounds to complete (depending on the computation model considered).

  • The partitioned optical passive stars network: simulations and fundamental operations

    Page(s): 739 - 748

    We show how a multiprocessor computer interconnected by a partitioned optical passive stars network (POPS) can simulate hypercube and mesh-connected computers. POPS algorithms for data sum, prefix sum, rank, adjacent sum, consecutive sum, concentrate, distribute, and generalize are also developed. These fundamental operations form the building blocks of parallel algorithms for many applications.

  • High-performance routing in networks of workstations with irregular topology

    Page(s): 699 - 719

    Networks of workstations are rapidly emerging as a cost-effective alternative to parallel computers. Switch-based interconnects with irregular topology allow the wiring flexibility, scalability, and incremental expansion capability required in this environment. However, the irregularity also makes routing and deadlock avoidance on such systems quite complicated. In current proposals, many messages are routed following nonminimal paths, increasing latency and wasting resources. In this paper, we propose two general methodologies for the design of adaptive routing algorithms for networks with irregular topology. Routing algorithms designed according to these methodologies allow messages to follow minimal paths in most cases, reducing message latency and increasing network throughput. As an example of application, we propose two adaptive routing algorithms for AN1 (previously known as Autonet). They can be implemented either by duplicating physical channels or by splitting each physical channel into two virtual channels. In the former case, the implementation does not require a new switch design. It only requires changing the routing tables and adding links in parallel with existing ones, taking advantage of spare switch ports. In the latter case, a new switch design is required, but the network topology is not changed. Evaluation results for several different topologies and message distributions show that the new routing algorithms are able to increase throughput for random traffic by a factor of up to 4 with respect to the original up*/down* algorithm, also reducing latency significantly. For other message distributions, throughput is increased more than seven times. We also show that most of the improvement comes from the use of minimal routing.

  • An opportunity cost approach for job assignment in a scalable computing cluster

    Page(s): 760 - 768

    A new method is presented for job assignment to and reassignment between machines in a computing cluster. Our method is based on a theoretical framework that has been experimentally tested and shown to be useful in practice. This “opportunity cost” method converts the usage of several heterogeneous resources in a machine to a single homogeneous “cost.” Assignment and reassignment are then performed based on that cost. This is in contrast to traditional, ad hoc methods for job assignment and reassignment, which treated each resource as an independent entity with its own constraints, as there was no clean way to balance one resource against another. Our method has been tested by simulations, as well as real executions, and was found to perform well.
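
    The idea of folding several resources into one cost can be sketched as follows. This is only an illustration under stated assumptions, not the method from the paper: the exponential cost function, the base of 2, and the names machine_cost and pick_machine are all hypothetical choices made for the example.

```python
def machine_cost(used, capacity, base=2.0):
    """Collapse per-resource utilization into one number.

    Assumption for illustration: each resource contributes
    base ** (used / capacity), so heavily loaded resources dominate.
    """
    return sum(base ** (u / c) for u, c in zip(used, capacity))


def pick_machine(machines, job, base=2.0):
    """Return the index of the machine whose cost rises least if it takes the job.

    machines: list of (used, capacity) pairs, one entry per resource in each.
    job: demand per resource, in the same order.
    """
    best, best_delta = None, None
    for i, (used, capacity) in enumerate(machines):
        after = [u + d for u, d in zip(used, job)]
        delta = machine_cost(after, capacity, base) - machine_cost(used, capacity, base)
        if best_delta is None or delta < best_delta:
            best, best_delta = i, delta
    return best


# Two machines, two resources each (for example CPU and memory).
machines = [([0.9, 0.1], [1.0, 1.0]), ([0.2, 0.2], [1.0, 1.0])]
choice = pick_machine(machines, job=[0.1, 0.1])
```

    Because the cost is convex in utilization, adding load to a resource that is already nearly full costs more than adding it to a lightly used one, so the sketch prefers the second machine here.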

  • A unified framework for optimizing locality, parallelism, and communication in out-of-core computations

    Page(s): 648 - 668

    This paper presents a unified framework that optimizes out-of-core programs by exploiting locality and parallelism, and reducing communication overhead. For out-of-core problems where the data set sizes far exceed the size of the available in-core memory, it is particularly important to exploit the memory hierarchy by optimizing the I/O accesses. We present algorithms that consider both iteration space (loop) and data space (file layout) transformations in a unified framework. We show that the performance of an out-of-core loop nest containing references to out-of-core arrays can be improved by using a suitable combination of file layout choices and loop restructuring transformations. Our approach considers array references one-by-one and attempts to optimize each reference for parallelism and locality. When there are references for which parallelism optimizations do not work, communication is vectorized so that data transfer can be performed before the innermost loop. Results from hand-compiles on IBM SP-2 and Intel Paragon distributed-memory message-passing architectures show that this approach reduces the execution times and improves the overall speedups. In addition, we extend the base algorithm to work with file layout constraints and show how it is useful for optimizing programs that consist of multiple loop nests.

  • Randomized initialization protocols for ad hoc networks

    Page(s): 749 - 759

    Ad hoc networks are self-organizing entities that are deployed on demand in support of various events including collaborative computing, multimedia classroom, disaster-relief, search-and-rescue, interactive mission planning, and law enforcement operations. One of the fundamental tasks that have to be addressed when setting up an ad hoc network (AHN, for short) is initialization. This involves assigning each of the n stations in the AHN a distinct ID number (e.g., a local IP address) in the range from 1 to n. Our main contribution is to propose efficient randomized initialization protocols for AHNs. We begin by showing that if the number n of stations is known beforehand, an n-station, single-channel AHN can be initialized with probability exceeding $1 - 1/n$, in $en + O(\sqrt{n \log n})$ time slots, regardless of whether the AHN has collision detection capability. We then go on to show that even if n is not known in advance, an n-station, single-channel AHN with collision detection can be initialized with probability exceeding $1 - 1/n$, in $10n/3 + O(\sqrt{n \ln n})$ time slots. Using this protocol as a stepping stone, we then present an initialization protocol for the n-station, k-channel AHN with collision detection that terminates with probability exceeding $1 - 1/n$, in $10n/(3k) + O(\sqrt{n \ln n}/k)$ time slots. Finally, we look at the case where the collision detection capability is not present. Our first result in this direction is to show that the task of electing a leader in an n-station, single-channel AHN can be completed with probability exceeding $1 - 1/n$, in fewer than $11.37(\log n)^2 + 2.39 \log n$ time slots. This leader election protocol allows us to design an initialization protocol for the n-station, single-channel AHN with no collision detection that terminates with probability exceeding $1 - 1/n$, in fewer than $5.67n + O(\sqrt{n \ln n})$ time slots, even if n is not known beforehand. We then discuss an initialization protocol for the n-station, k-channel AHN with no collision detection that terminates with probability exceeding $1 - 1/n$, in fewer than $5.67(n/k) + O(\sqrt{n \ln n}/k)$ time slots, whenever $k \le n/(\log n)^3$.
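
    The initialization task itself can be illustrated with a small simulation. This is a generic sketch, not the protocol from the paper: it assumes n is known, a single channel, and that every station can tell when a slot carried exactly one transmission. The function name initialize and the transmit probability 1/m are choices made for the example.

```python
import random


def initialize(n, rng=None):
    """Simulate ID assignment for n stations sharing one channel.

    Returns (ids, slots): ids maps each station to its ID in 1..n and
    slots is the number of time slots the simulation used.
    """
    rng = rng or random.Random()
    unassigned = list(range(n))
    ids = {}
    slots = 0
    while unassigned:
        slots += 1
        m = len(unassigned)
        # Every unassigned station transmits independently with probability 1/m.
        talkers = [s for s in unassigned if rng.random() < 1.0 / m]
        if len(talkers) == 1:
            # Exactly one transmission: that station takes the next free ID.
            winner = talkers[0]
            ids[winner] = len(ids) + 1
            unassigned.remove(winner)
        # Silence or a collision wastes the slot; everyone simply retries.
    return ids, slots


ids, slots = initialize(50, random.Random(1))
```

    With m stations remaining, a slot succeeds with probability $(1 - 1/m)^{m-1} \ge 1/e$, so this simple scheme needs at most about $en$ slots in expectation.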

  • The odd-even turn model for adaptive routing

    Page(s): 729 - 738

    This paper presents a model for designing adaptive wormhole routing algorithms for meshes without virtual channels. The model restricts the locations where some turns can be taken so that deadlock is avoided. In comparison with previous methods, the degree of routing adaptiveness provided by the model is more even for different source-destination pairs. The mesh network may benefit from this feature in terms of communication efficiency. Simulation results show that the even adaptiveness provided by the odd-even turn model makes message routing less vulnerable to nonuniform factors such as hot spot traffic. In addition, this property results in a smaller fluctuation of the network performance with respect to different traffic patterns.
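
    A location-dependent turn restriction of this kind can be sketched as a lookup on the column parity. The abstract does not list the rules; the sets below follow the odd-even rules as they are commonly stated (no east-to-north or east-to-south turn in an even column, no north-to-west or south-to-west turn in an odd column) and should be read as an assumption. The name turn_allowed is hypothetical.

```python
# A turn is written (current heading, new heading), using N, S, E, W.
FORBIDDEN_IN_EVEN_COLUMN = {("E", "N"), ("E", "S")}
FORBIDDEN_IN_ODD_COLUMN = {("N", "W"), ("S", "W")}


def turn_allowed(column, heading, new_heading):
    """Return True if a packet at the given column may make this turn."""
    if column % 2 == 0:
        forbidden = FORBIDDEN_IN_EVEN_COLUMN
    else:
        forbidden = FORBIDDEN_IN_ODD_COLUMN
    return (heading, new_heading) not in forbidden


# The same turn is legal or illegal depending only on where it happens.
east_to_north = [turn_allowed(col, "E", "N") for col in range(4)]
```

    A routing function built on such a check would filter its candidate output ports through turn_allowed before choosing among them adaptively.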

  • Exploiting fine-grained idle periods in networks of workstations

    Page(s): 683 - 698

    Studies have shown that for a significant fraction of the time, workstations are idle. In this paper, we present a new scheduling policy called Linger-Longer that exploits the fine-grained availability of workstations to run sequential and parallel jobs. We present a two-level workload characterization study and use it to simulate a cluster of workstations running our new policy. We compare two variations of our policy to two previous policies: Immediate-Eviction and Pause-and-Migrate. Our study shows that the Linger-Longer policy can improve the throughput of foreign jobs on a cluster by 60 percent with only a 0.5 percent slowdown of local jobs. For parallel computing, we show that the Linger-Longer policy outperforms reconfiguration strategies when the processor utilization by the local process is 20 percent or less in both synthetic bulk synchronous and real data-parallel applications.

  • Matrix multiplication and data routing using a partitioned optical passive stars network

    Page(s): 720 - 728

    We develop optimal or near-optimal algorithms to multiply matrices and perform commonly occurring data permutations and BPC permutations on multiprocessor computers interconnected by a partitioned optical passive stars network.

  • A framework for the design and implementation of FFT permutation algorithms

    Page(s): 625 - 635

    We propose an algebraic framework for the design and implementation of a large class of data-sorting procedures, including all index-digit permutations used in FFTs. We discuss both old and new algorithms in terms of this framework. We show that the algebraic formulation of the new algorithms can be easily encoded using a functional programming language, and that the resulting code introduces no inefficiencies. We present performance results for implementations of three new algorithms for mixed-radix digit-reversal on a Cray C-90 and on a Sun Sparc 5.
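
    For readers unfamiliar with mixed-radix digit reversal, the permutation itself can be written down directly. This is a plain definition-level sketch, not the algebraic framework or any of the new algorithms from the paper; the name digit_reverse and the convention that radices[0] belongs to the least significant digit are assumptions of the example.

```python
def digit_reverse(index, radices):
    """Reverse the mixed-radix digits of index.

    radices[0] is the radix of the least significant digit of index.
    In the result the digit order is reversed, so that digit becomes
    the most significant one.
    """
    digits = []
    for r in radices:
        index, d = divmod(index, r)  # peel off digits, least significant first
        digits.append(d)
    result = 0
    for d, r in zip(digits, radices):
        result = result * r + d  # rebuild with the digit order reversed
    return result


# With all radices equal to 2 this is the familiar bit reversal.
bit_reversed = [digit_reverse(i, [2, 2, 2]) for i in range(8)]
# A mixed-radix case of length 2 * 3 = 6.
mixed = [digit_reverse(i, [2, 3]) for i in range(6)]
```

    An FFT reordering pass would then move element i of the input to position digit_reverse(i, radices); efficient implementations avoid recomputing the digits for every index.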

  • Distributed multimedia application configuration management

    Page(s): 669 - 682

    Employing distributed multimedia applications (DMA) requires management support for multiple configuration steps including the definition of a desired DMA topology, the specification of a desired quality of service (QoS) and its enforcement through resource reservation. In this paper, we examine the additional aspect of finding an appropriate placement for a DMA within a distributed computer system (DCS). An overall approach is described for interrelating placement functions with existing procedures for topology and QoS specification and resource reservation. Then the problem of assigning a DMA within a DCS is formulated with the goal of finding a DMA placement with minimized computation and communication cost. For solving the assignment problem, an efficient heuristic algorithm, SIGMA, is presented. Unlike other approaches, SIGMA takes into account requirements that are specific to multimedia applications. Based on experiments conducted for randomly generated DMA and DCS graphs, the efficiency and accuracy of SIGMA are shown to be encouraging because, at low execution times, it finds assignments with cost very close to the optimal one.


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology