Parallel Processing, 1999. 13th International and 10th Symposium on Parallel and Distributed Processing, 1999. 1999 IPPS/SPDP. Proceedings

Date 12-16 April 1999

  • Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999

    Publication Year: 1999
  • Infrastructure for building parallel database systems for multi-dimensional data

    Publication Year: 1999, Page(s):582 - 587
    Cited by:  Papers (11)

    Our study of a large set of scientific applications over the past three years indicates that the processing for multidimensional datasets is often highly stylized. The basic processing step usually consists of mapping the individual input items to the output grid and computing output items by aggregating, in some way, all the input items mapped to the corresponding grid point. In this paper we dis...
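The map-then-aggregate step described in the abstract above can be illustrated with a minimal Python sketch. The function name `aggregate_to_grid`, its parameters, and the sample data are hypothetical and chosen only for illustration; this is not the infrastructure presented in the paper.

```python
def aggregate_to_grid(items, to_cell, combine, initial):
    """Map each input item to an output grid cell and fold it into that cell.

    to_cell  -- hypothetical mapping function: item -> grid coordinate
    combine  -- hypothetical aggregation function: (accumulator, item) -> accumulator
    initial  -- starting accumulator value for a cell that has no items yet
    """
    grid = {}
    for item in items:
        cell = to_cell(item)
        grid[cell] = combine(grid.get(cell, initial), item)
    return grid


# Illustrative input: (x, y, value) samples mapped onto a unit-spaced 2-D grid.
samples = [(0.2, 0.7, 5.0), (0.9, 0.1, 3.0), (1.5, 0.4, 2.0)]
result = aggregate_to_grid(
    samples,
    to_cell=lambda s: (int(s[0]), int(s[1])),  # truncate coordinates to a cell index
    combine=lambda acc, s: acc + s[2],         # aggregate by summing the values
    initial=0.0,
)
```

Here the first two samples fall into cell `(0, 0)` and are summed, while the third lands in cell `(1, 0)`. Swapping `combine` for a maximum or a running average changes the aggregation without touching the mapping.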
  • Index of authors

    Publication Year: 1999, Page(s):759 - 761
  • An efficient dynamic load balancing using the dimension exchange method for balancing of quantized loads on hypercube multiprocessors

    Publication Year: 1999, Page(s):708 - 712
    Cited by:  Papers (4)

    Dynamic load balancing on hypercube multiprocessors is considered with emphasis on quantized loads. Quantized loads are divisible only in a fixed size. First, we show that a direct application of the well-known Dimension Exchange Method (DEM) to quantized loads may result in difference in assigned loads to processors as large as log N units after balancing for a hypercube of size N. Then we propos...
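As a rough illustration of the baseline mentioned in the abstract above, the following Python sketch applies the Dimension Exchange Method directly to integer (quantized) loads. The function name, the node numbering, and the rule that the lower-numbered node of each pair keeps the odd unit are assumptions made for this sketch; the improved method proposed in the paper is not shown.

```python
def dimension_exchange(loads):
    """Direct DEM on a hypercube with integer loads.

    loads[i] is the load of node i; the node count must be a power of two.
    In each dimension k, node i pairs with node i XOR 2**k and the pair
    splits its combined load as evenly as integers allow.
    """
    n = len(loads)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of nodes must be a power of two")
    loads = list(loads)
    dims = n.bit_length() - 1
    for k in range(dims):
        for i in range(n):
            j = i ^ (1 << k)
            if i < j:  # handle each pair once
                total = loads[i] + loads[j]
                loads[i] = (total + 1) // 2  # assumption: lower node keeps the odd unit
                loads[j] = total // 2
    return loads


balanced = dimension_exchange([3, 0, 1, 0])
```

For the 4-node hypercube above the result is `[2, 1, 1, 0]`: the total of 4 units could be spread as one unit per node, yet the rounding at each step leaves a difference of 2 = log 4 between the most and least loaded nodes, which matches the gap the abstract describes.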
  • The recursive grid layout scheme for VLSI layout of hierarchical networks

    Publication Year: 1999, Page(s):441 - 445
    Cited by:  Papers (5)

    We propose the recursive grid layout scheme for deriving efficient layouts of a variety of hierarchical networks and computing upper bounds on the VLSI area of general hierarchical networks. In particular we construct optimal VLSI layouts for butterfly networks, generalized hypercubes, and star graphs that have areas within a factor of 1+o(1) from their lower bounds. We also derive efficient layou...
  • A comparison of router architectures for virtual cut-through and wormhole switching in a NOW environment

    Publication Year: 1999, Page(s):240 - 247
    Cited by:  Papers (9)

    Most commercial routers designed for networks of workstations (NOWs) implement wormhole switching. However, wormhole switching is not well suited for NOWs. The long wires required in this environment lead to large buffers to prevent buffer overflow during flow control signaling. Moreover, wire length is limited by buffer size. Virtual cut-through (VCT) achieves a higher throughput than wormhole swi...
  • Shuffle memory system

    Publication Year: 1999, Page(s):268 - 272
    Cited by:  Papers (2)

    This paper proposes a new memory system called shuffle memory. The shuffle memory is a generalization of transposition memory that has been widely used in 2-D Discrete Cosine Transform and Discrete Fourier Transform. The shuffle memory is the first memory system that receives M×N inputs in row-major order while providing M×N outputs in column-major order using only M×N memory space. Sh...
  • Mapping media streams onto a network of servers

    Publication Year: 1999, Page(s):470 - 476

    This paper presents the definition as well as a number of methods for the solution of a new combinatorial optimization problem, called S-MAMP, that has to be solved for the efficient management of a network of media servers. This network of servers can be implemented on top of a closely connected network of workstations as well as on a wide area network of media servers. The problem studied here ad...
  • Fault-tolerant routing algorithms for hypercube networks

    Publication Year: 1999, Page(s):218 - 224

    For hypercube networks which have faulty nodes, a few efficient dynamic routing algorithms have been proposed by allowing each node to hold the status of neighbors. We propose two improved versions of the algorithm of Chiu and Wu by using the notion of full reachability. A fully reachable node means that the node can reach all nonfaulty nodes which have Hamming distance h from the node via a path ...
  • A parallel adaptive version of the block-based Gauss-Jordan algorithm

    Publication Year: 1999, Page(s):350 - 354
    Cited by:  Papers (6)

    This paper presents a parallel adaptive version of the block-based Gauss-Jordan algorithm used in numerical analysis to invert matrices. This version includes a characterization of the workload of processors and a mechanism of its adaptive folding/unfolding. The application is implemented and experimented with MARS in dedicated and non-dedicated environments. The results show that an absolute effi...
  • A graph based framework to detect optimal memory layouts for improving data locality

    Publication Year: 1999, Page(s):738 - 743
    Cited by:  Papers (5)

    In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advances in caches help in better utilization of the memory hierarchy, compiler-directed locality enhancement techniques are also important. In this paper we propose a locality improvement technique that uses data space (array...
  • A new approach to parallel dynamic partitioning for adaptive unstructured meshes

    Publication Year: 1999, Page(s):360 - 364

    Classical mesh partitioning algorithms were designed for rather static situations, and their straightforward application in a dynamical framework may lead to unsatisfactory results, e.g., excessive data migration among processors. Furthermore, special attention should be paid to their amenability to parallelization. In this paper a novel parallel method for the dynamic partitioning of adaptive uns...
  • Reducing I/O complexity by simulating coarse grained parallel algorithms

    Publication Year: 1999, Page(s):14 - 20
    Cited by:  Papers (2)

    Block-wise access to data is a central theme in the design of efficient external memory (EM) algorithms. A second important issue, when more than one disk is present, is fully parallel disk I/O. In this paper we present a deterministic simulation technique which transforms parallel algorithms into (parallel) external memory algorithms. Specifically, we present a deterministic simulation technique w...
  • Compiler analysis to support compiled communication for HPF-like programs

    Publication Year: 1999, Page(s):603 - 608
    Cited by:  Papers (1)

    By managing network resources at compile time, the compiled communication technique greatly improves the communication performance for communication patterns that are known at compile time. In order to support compiled communication, the compiler must estimate the run-time physical connection requirement (physical communication) of a program and partition the program into phases such that the unde...
  • Performance of an infrastructure for worldwide parallel computing

    Publication Year: 1999, Page(s):379 - 386

    The millions of Java-capable computers on the Internet provide the hardware needed for a national and international computing infrastructure, a virtual parallel computer that can be tapped for many uses. To date, however, little is known about the cost and feasibility of building and maintaining such global, large scale structures. In this work, we evaluate the performance of a Java/WWW-based infr...
  • Asynchronous group mutual exclusion in ring networks

    Publication Year: 1999, Page(s):539 - 543

    The design issues for group mutual exclusion have been modeled by Joung as the Congenial Talking Philosophers, and solutions for shared-memory models and complete message-passing networks have been proposed. These solutions, however, cannot be straightforwardly and efficiently converted to ring networks where each philosopher can only communicate directly with its two neighboring philosophers. As r...
  • Dynamic interval routing on asynchronous rings

    Publication Year: 1999, Page(s):225 - 232

    We consider the problem of routing in an asynchronous dynamically changing ring of processors using schemes that minimize the storage space for the routing information. In general, applying static techniques to a dynamic network would require significant re-computation. Moreover, the known dynamic techniques applied to the ring lead to inefficient schemes. In this paper we introduce a new techniqu...
  • Cashmere-VLM: Remote memory paging for software distributed shared memory

    Publication Year: 1999, Page(s):153 - 159
    Cited by:  Papers (8)  |  Patents (5)

    Software distributed shared memory (DSM) systems have successfully provided the illusion of shared memory on distributed memory machines. However, most software DSM systems use the main memory of each machine as a level in a cache hierarchy, replicating copies of shared data in local memory. Since computer memories tend to be much larger than caches, DSM systems have largely ignored memory capacity...
  • A fast multithreaded out-of-core visualization technique

    Publication Year: 1999, Page(s):569 - 575
    Cited by:  Papers (5)

    Out-of-core rendering techniques are necessary for viewing large volume disk-resident data sets produced by many scientific applications or high resolution imaging systems. Traditional visualizers can provide real-time performance but require all of the data being viewed to reside in RAM. We describe a multithreaded implementation of an out-of-core isosurface renderer that does not impose such res...
  • 2.5n-step sorting on n×n meshes in the presence of o(√n) worst-case faults

    Publication Year: 1999, Page(s):436 - 440

    In this paper we propose the robust algorithm-configured emulation (RACE) scheme for the design of simple and efficient robust algorithms that can run on faulty mesh-connected computers. We show that 1-1 sorting (1 key per healthy processor) can be performed in 2.5n+o(n) communication steps and 2n+o(n) comparison steps on an n×n mesh with an arbitrary pattern of o(√n) faults. This runn...
  • Hyperplane partitioning: an approach to global data partitioning for distributed memory machines

    Publication Year: 1999, Page(s):744 - 748

    Automatic global data partitioning for distributed memory machines (DMMs) is a difficult problem. In this work, we present a partitioning strategy called 'hyperplane partitioning' which also works well with loops with non-uniform dependences. Several optimizations and an implementation on IBM-SP2 are described.
  • Reducing system overheads in home-based software DSMs

    Publication Year: 1999, Page(s):167 - 173
    Cited by:  Papers (3)

    Software DSM systems suffer from the high communication and coherence-induced overheads that limit performance. This paper introduces our efforts in reducing system overheads of a home-based software DSM called JIAJIA. Three measures, including eliminating false sharing through avoiding unnecessarily invalidating cached pages, reducing virtual memory page faults with a new write detection scheme, ...
  • IP validation for FPGAs using Hardware Object Technology™

    Publication Year: 1999, Page(s):624 - 629

    Although verification and simulation tools are always improving, the results they provide remain hard to analyze and interpret. On one hand, verification sticks to the functional description of the circuit, with no timing consideration. On the other hand, simulation runs mainly on subsets of the entire input domain. Furthermore, these tools provide results in a format (e.g. state graphs, bit vecto...
  • Performance evaluation of the ServerNet® SAN under self-similar traffic

    Publication Year: 1999, Page(s):143 - 147
    Cited by:  Papers (12)

    Self-similar traffic distributions have been observed in a wide range of networking applications and models such as LANs, WANs, telnet, FTP, WWW, ISDN, SS7 and VBR traffic over ATM. Therefore, it has been suggested that many other theoretical protocols and systems need to be reevaluated under this different type of traffic before practical implementations potentially show their faults. The ServerN...
  • An efficient logging algorithm for incremental replay of message-passing applications

    Publication Year: 1999, Page(s):392 - 398
    Cited by:  Papers (3)

    To support incremental replay of message-passing applications, processes must periodically checkpoint and the content of some messages must be logged, to break dependencies of the current state of the execution on past events. The paper presents a new adaptive logging algorithm that dynamically decides whether to log a message based on dependencies the incoming message introduces on past events of...