Parallel and Distributed Systems, IEEE Transactions on

Issue 1 • Date Jan 1997

  • Constant time algorithms for computational geometry on the reconfigurable mesh

    Page(s): 1 - 12

    The reconfigurable mesh consists of an array of processors interconnected by a reconfigurable bus system. The bus system can be used to dynamically obtain various interconnection patterns among the processors. Recently, this model has attracted a lot of attention. The authors show O(1) time solutions to the following computational geometry problems on the reconfigurable mesh: all-pairs nearest neighbors, convex hull, triangulation, two-dimensional maxima, two-set dominance counting, and smallest enclosing box. All these solutions accept N planar points as input and employ an N×N reconfigurable mesh. The basic scheme employed in the implementations is to recursively find an O(1) time solution. The number of recursion levels and the size of the subproblems at each level of recursion are optimized such that the problem decomposition and the solution to the problem can be obtained in constant time. As a result, they have developed some efficient merge techniques to combine the solutions for subproblems on the reconfigurable mesh. These techniques exploit reconfigurability in nontrivial ways, leading to constant time solutions on a mesh of optimal size.
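
To make one of the listed problems concrete, the following is a minimal sequential sketch of two-dimensional maxima: it returns the points that no other point dominates. It is not the paper's constant-time reconfigurable-mesh algorithm; the function name `maxima_2d` and the dominance convention (a point dominates another if it is at least as large in both coordinates) are assumptions made for illustration.

```python
def maxima_2d(points):
    """Return the maximal points of a planar point set.

    Assumed convention: p dominates q if p != q, p.x >= q.x and p.y >= q.y.
    This is a plain sort-and-sweep, not a reconfigurable-mesh algorithm.
    """
    result = []
    best_y = float("-inf")
    # Visit points from largest x to smallest x; ties are broken by larger y
    # first, so a point with equal x and smaller y is seen after its dominator.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        # A point is maximal only if its y exceeds every y seen so far.
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result
```

The sweep costs O(N log N) sequentially because of the sort, which is the baseline the parallel result improves on by using N×N processors.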

  • pp-mess-sim: a flexible and extensible simulator for evaluating multicomputer networks

    Page(s): 25 - 40

    The paper presents pp-mess-sim, an object-oriented discrete-event simulation environment for evaluating interconnection networks in message-passing systems. The simulator provides a toolbox of various network topologies, communication workloads, routing-switching algorithms, and router models. By carefully defining the boundaries between these modules, pp-mess-sim creates a flexible and extensible environment for evaluating different aspects of network design. The simulator models emerging multicomputer networks that can support multiple routing and switching schemes simultaneously; pp-mess-sim achieves this flexibility by associating routing-switching policies, traffic patterns, and performance metrics with collections of packets, instead of the underlying router model. Besides providing a general framework for evaluating router architectures, pp-mess-sim includes a cycle-level model of the PRC, a programmable router for point-to-point distributed systems. The PRC model captures low-level implementation details, while another high-level model facilitates experimentation with general router design issues. Sample simulation experiments capitalize on this flexibility to compare network architectures under various application workloads.

  • Adaptively scheduling parallel loops in distributed shared-memory systems

    Page(s): 70 - 81

    Using runtime information about load distributions and processor affinity, the authors propose an adaptive scheduling algorithm and variations of it that use different control mechanisms. The proposed algorithm applies different degrees of aggressiveness to adjust loop scheduling granularities, aiming to improve the execution performance of parallel loops by making scheduling decisions that match the real workload distributions at runtime. They experimentally compared the performance of the algorithm and its variations with several existing scheduling algorithms on two parallel machines: the KSR-1 and the Convex Exemplar. The kernel application programs used for performance evaluation were carefully selected for different classes of parallel loops. The results show that using runtime information to adaptively adjust scheduling granularity is an effective way to handle loops with a wide range of load distributions when no prior knowledge of the execution can be used. The overhead caused by collecting runtime information is insignificant in comparison with the performance improvement. The experiments show that the adaptive algorithm and its five variations outperformed the existing scheduling algorithms.

  • An extended dominating node approach to broadcast and global combine in multiport wormhole-routed mesh networks

    Page(s): 41 - 58

    A new approach to the design of collective communication operations in wormhole-routed mesh networks is described. The approach extends the concept of dominating sets in graph theory by accounting for the relative distance-insensitivity of the wormhole switching strategy and by taking advantage of a multiport communication architecture, which allows each node to simultaneously transmit messages on different outgoing channels. Collective communication operations are defined in terms of sets of extended dominating nodes (EDNs). The nodes in a set of EDNs can deliver (receive) messages to (from) a different, larger set of nodes in a single message-passing step under dimension-ordered wormhole routing and without channel contention among messages. The EDN model can be applied to different collective operations in 2D and 3D mesh networks. The authors focus on EDN-based broadcast and global combine operations. Performance evaluation results are presented that confirm the advantage of this approach over other methods.

  • Delay-optimal quorum consensus for distributed systems

    Page(s): 59 - 69

    Given a set of nodes S, a coterie is a set of pairwise intersecting subsets of S. Each element in a coterie is called a quorum. Mutual exclusion in a distributed system can be achieved if each request is required to get consensus from a quorum of nodes. This technique of quorum consensus is also used for replicated distributed database systems, and bicoteries and wr-coteries have been defined to capture the requirements of read and write operations in user transactions. The author is interested in finding coteries, bicoteries, and wr-coteries with optimal communication delay. The protocols take into account the network topology. They design delay-optimal quorum consensus protocols for network topologies of trees, rings, and clustered networks.
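
The pairwise-intersection property in the definition above can be checked directly. The sketch below is illustrative only: `is_coterie` and `majority_quorums` are hypothetical helper names, the check covers only the intersection condition stated in the abstract (not any additional minimality requirement), and the majority construction is a standard textbook example rather than the paper's delay-optimal protocol.

```python
from itertools import combinations


def is_coterie(quorums):
    """Return True if the quorums are non-empty and pairwise intersecting."""
    sets = [frozenset(q) for q in quorums]
    if any(len(q) == 0 for q in sets):
        return False
    # Every pair of quorums must share at least one node.
    return all(a & b for a, b in combinations(sets, 2))


def majority_quorums(nodes):
    """Build all subsets of size floor(n/2) + 1 (assumed example construction).

    Any two such subsets overlap, because together they would otherwise
    need more than n distinct nodes.
    """
    nodes = list(nodes)
    k = len(nodes) // 2 + 1
    return [frozenset(c) for c in combinations(nodes, k)]
```

For example, `is_coterie(majority_quorums(range(5)))` holds, while two disjoint sets such as `{1, 2}` and `{3, 4}` fail the check, so two requests could each gather a quorum at the same time and mutual exclusion would be violated.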

  • Time-optimal domain-specific querying on enhanced meshes

    Page(s): 13 - 24

    Query processing is a crucial component of various application domains including information retrieval, database design and management, pattern recognition, robotics, and VLSI. Many of these applications involve data stored in a matrix satisfying a number of properties. One property that occurs time and again specifies that the rows and the columns of the matrix are independently sorted. It is customary to refer to such a matrix as sorted. An instance of the batched searching and ranking problem (BSR) involves a sorted matrix A of items from a totally ordered universe, along with a collection Q of queries. Q is an arbitrary mix of the following query types: for a search query qj, one is interested in an item of A that is closest to qj; for a rank query qj, one is interested in the number of items of A that are strictly smaller than qj. The BSR problem asks for solving all queries in Q. The authors consider the BSR problem in the following context: the matrix A is pretiled, one item per processor, onto an enhanced mesh of size √n×√n; the m queries are stored, one per processor, in the first m/√n columns of the platform. Their main contribution is twofold. First, they show that any algorithm that solves the BSR problem must take at least Ω(max{log n, √m}) time in the worst case. Second, they show that this time lower bound is tight on meshes of size √n×√n enhanced with multiple broadcasting, by exhibiting an algorithm solving the BSR problem in Θ(max{log n, √m}) time on such a platform.
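
The two query types can be illustrated with a small sequential sketch on a sorted matrix. This is not the paper's enhanced-mesh algorithm; it is the familiar staircase walk, and the names `rank_query` and `search_query` are assumptions chosen for illustration. Rows and columns are assumed to be sorted in ascending order and the items are assumed to be numbers.

```python
def rank_query(A, q):
    """Count the items of the sorted matrix A that are strictly smaller than q."""
    count = 0
    col = len(A[0]) - 1 if A else -1
    for row in A:
        # Columns are sorted, so the boundary can only move left as we go down.
        while col >= 0 and row[col] >= q:
            col -= 1
        count += col + 1  # row[0..col] are all strictly smaller than q
    return count


def search_query(A, q):
    """Return an item of the sorted matrix A closest to q (None if A is empty)."""
    best = None
    col = len(A[0]) - 1 if A else -1
    for row in A:
        while col >= 0 and row[col] >= q:
            col -= 1
        # Only the two items straddling the boundary in this row can be closest.
        for c in (col, col + 1):
            if 0 <= c < len(row):
                if best is None or abs(row[c] - q) < abs(best - q):
                    best = row[c]
    return best
```

Each query takes time proportional to the number of rows plus columns here, since the column index never moves right; the paper's contribution concerns how fast a whole batch of m queries can be answered in parallel.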

  • Performance of multistage bus networks for a distributed shared memory multiprocessor

    Page(s): 82 - 95

    A multistage bus network (MBN) is proposed to overcome some of the shortcomings of the conventional multistage interconnection networks (MINs), single bus, and hierarchical bus interconnection networks. The MBN consists of multiple stages of buses connected in a manner similar to the MINs and has the same bandwidth at each stage. A switch in an MBN is similar to a MIN switch except that there is a single bus connection instead of a crossbar. MBNs support bidirectional routing, and there exist a number of paths between any source and destination pair. The authors develop self-routing techniques for the various paths, present an algorithm to route a request along the path with minimum distance, and analyze the probabilities of a packet taking different routes. Further, they derive a performance analysis of a synchronous packet-switched MBN in a distributed shared memory environment and compare the results with those of an equivalent bidirectional MIN (BMIN). Finally, they present the execution time of various applications on the MBN and the BMIN through an execution-driven simulation. They show that the MBN provides similar performance to a BMIN while offering simplicity in hardware and more fault tolerance than a conventional MIN.


Aims & Scope

IEEE Transactions on Parallel and Distributed Systems (TPDS) is published monthly. It publishes a range of papers, comments on previously published papers, and survey articles that deal with the parallel and distributed systems research areas of current importance to our readers.

Meet Our Editors

Editor-in-Chief
David Bader
College of Computing
Georgia Institute of Technology