Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing

25-28 Oct. 1995

  • Proceedings of Seventh IEEE Symposium on Parallel and Distributed Processing

    Publication Year: 1995
    Freely Available from IEEE
  • Data parallel logic programming in &ACE

    Publication Year: 1995, Page(s):424 - 431
    Cited by:  Papers (2)

    &ACE is a high-performance parallel Prolog system developed at the Laboratory for Logic, Databases, and Advanced Programming that exploits and-parallelism in Prolog programs. &ACE was developed to exploit MIMD parallelism. However, SPMD parallelism also arises naturally in many Prolog programs. In this paper we develop runtime techniques that allow systems that have primarily been designed to ex...

  • Author index

    Publication Year: 1995
    Freely Available from IEEE
  • An evaluation of DELTA, a decoupled pre-fetching virtual shared memory system

    Publication Year: 1995, Page(s):482 - 487
    Cited by:  Patents (2)

    Decoupled pre-fetching is a technique for reducing the page miss overheads in Distributed Shared Memory systems by separating out those instructions responsible for data fetching from the main instruction stream and running them on a separate CPU whose function is to predict store accesses ahead of time. This approach differs from other pre-fetching approaches in that the predictions of data usage...

  • A simulation methodology for evaluating parallel computers

    Publication Year: 1995, Page(s):478 - 481

    In this paper we present a simulation methodology for evaluating and understanding parallel computer architectures. The simulation tool that we have developed can generalize the effect of architectural features while maintaining the complex interactions of architecture and algorithm typical of a real computer system. We will show that our simulator is an accurate tool for predicting the performanc...

  • Fault detection in bitonic sorting networks

    Publication Year: 1995, Page(s):266 - 270

    A new fault detection algorithm for bitonic sorting networks is proposed. A single fault on the comparison elements or links can be detected and diagnosed by inserting O(log2N) sets of testing vectors. The basic testing vectors consist of subsets and combinations of ascending (0,1,2,3,...) and descending (...,3,2,1,0, i.e., reverse) identity vectors.
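
    As background for the abstract above, the following is a minimal sketch of a bitonic sorting network and of how a test vector can expose a faulty comparator. It is not the paper's detection algorithm. The names `bitonic_network` and `apply_network`, and the fault model (a comparator that never swaps), are assumptions made for illustration.

```python
def bitonic_network(n):
    """Yield comparators (i, j, ascending) of a bitonic sorter on n = 2^m lines."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared lines
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # direction alternates between blocks of size k
                    yield i, partner, (i & k) == 0
            j //= 2
        k *= 2


def apply_network(values, comparators, faulty=None):
    """Run values through the comparators.

    `faulty` is the index of one comparator that is assumed to be stuck
    in pass-through mode (hypothetical fault model, not from the paper).
    """
    out = list(values)
    for idx, (i, j, ascending) in enumerate(comparators):
        if idx == faulty:
            continue           # stuck comparator: never swaps
        if (out[i] > out[j]) == ascending:
            out[i], out[j] = out[j], out[i]
    return out


comps = list(bitonic_network(4))
reverse_identity = [3, 2, 1, 0]
print(apply_network(reverse_identity, comps))            # fault-free network
print(apply_network(reverse_identity, comps, faulty=4))  # fault exposed
print(apply_network(reverse_identity, comps, faulty=0))  # fault masked
```

    In this small case the reverse identity vector exposes the stuck comparator at index 4 but not the one at index 0, which suggests why more than one set of test vectors is needed.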

  • The performance of replica control protocols in the presence of site failures

    Publication Year: 1995, Page(s):470 - 477

    Although managing multiple copies of a database has been the subject of intensive research for quite some time now, it has yet to fulfill its promise in practical applications. In the current state of distributed database technology, data replication, if implemented at all, is typically enforced by the read-one-write-all protocol. More complicated but less restrictive replica control protocols, th...

  • An efficient algorithm for k-pairwise node disjoint path problem in hypercubes

    Publication Year: 1995, Page(s):673 - 680
    Cited by:  Papers (1)

    In this paper, we give an efficient algorithm for the following k-pairwise node disjoint path problem in n-dimensional hypercubes H_n: given k = ⌈n/2⌉ pairs of 2k distinct nodes (s_1, t_1), ..., (s_k, t_k) in H_n, n ≥ 4, find k node disjoint paths s_i → t_i, 1 ≤ i ≤ k. Our algorithm finds the k node disjoint pat...

  • A fixed-point parallel convolver without precision loss for the real-time processing of long numerical sequences

    Publication Year: 1995, Page(s):644 - 651
    Cited by:  Papers (4)

    A parallel architecture able to convolve long numerical sequences with long filter functions in real time is presented. Real time here means processing at the same frequency as the data input, with a minimum delay in producing the output, so that the output is immediately available during the input process. We have used a known scheme that assumes one processing element (PE) f...

  • Parallel heap: A practical priority queue for fine-to-medium-grained applications on small multiprocessors

    Publication Year: 1995, Page(s):328 - 335
    Cited by:  Papers (8)

    We present an efficient implementation of the parallel heap data structure on a bus-based Silicon Graphics multiprocessor GTX/4D. Parallel heap is theoretically the first heap-based data structure to have implemented an optimally scalable parallel priority queue on an exclusive-read exclusive-write parallel random access machine. We compared it with Rao and Kumar's concurrent heap and with the con...

  • Performance analysis of output buffers in multistage interconnection networks with multiple paths

    Publication Year: 1995, Page(s):260 - 265
    Cited by:  Papers (1)

    Multistage interconnection networks with multiple paths can support higher bandwidth than non-blocking networks by passing multiple packets to the same destination simultaneously. In multiple-path networks, the performance of the output buffer affects the whole system performance and is closely coupled with the output traffic distribution, i.e., the packet arrival rate at each output li...

  • Optimal fault-tolerant resource allocation in dynamic distributed systems

    Publication Year: 1995, Page(s):460 - 467
    Cited by:  Patents (1)

    This paper presents a fault-tolerant resource allocation algorithm for a dynamic distributed message passing system, where concurrent processes sharing system resources can be created or terminated dynamically. The degree of fault-tolerance is measured by the failure locality, that is, the maximum number of processes whose liveness conditions (e.g., starvation freedom) cannot be satisfied because of ...

  • Solving triangular linear systems in parallel using substitution

    Publication Year: 1995, Page(s):553 - 560
    Cited by:  Papers (1)

    Working within the LogP model, we present parallel triangular solvers which use forward/backward substitution and show that they are optimal. We begin by deriving several lower bounds on execution time for solving triangular linear systems. Specifically, we derive lower bounds in which it is assumed that the number of data items per processor is bounded, a general lower bound, and lower bounds for...
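
    For readers unfamiliar with the substitution method named in the abstract above, here is a minimal sequential sketch of forward substitution for a lower triangular system Lx = b. It only illustrates the data dependence that a parallel solver has to respect (each x[i] needs all earlier x[j]) and is not the paper's LogP algorithm. The function name `forward_substitution` is an assumption.

```python
def forward_substitution(L, b):
    """Solve L x = b for x, where L is a square lower triangular matrix."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        if L[i][i] == 0:
            raise ZeroDivisionError("zero on the diagonal: L is singular")
        # x[i] depends on every previously computed x[j], j < i
        partial = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - partial) / L[i][i]
    return x


L = [[2, 0, 0],
     [1, 3, 0],
     [4, 5, 6]]
b = [2, 7, 32]
print(forward_substitution(L, b))  # [1.0, 2.0, 3.0]
```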

  • Edge-disjoint embedding of large full binary trees into hypercubes

    Publication Year: 1995, Page(s):669 - 672

    We present two methods for embedding the full binary tree into the hypercube when the tree has a greater number of nodes than the hypercube. Both methods map the tree edges onto edge-disjoint paths of the hypercube (each hypercube edge being considered as two anti-parallel directed edges), and distribute the tree nodes of each level evenly among the hypercube nodes. One embedding method with the optimal...

  • Process scheduling using genetic algorithms

    Publication Year: 1995, Page(s):638 - 641
    Cited by:  Papers (5)

    This paper presents a genetic algorithm using a matrix genome encoding to schedule distributed tasks, represented by a directed acyclic graph, on processors in order to minimize the maximum task finishing time. Our experimental results show that this algorithm provides better scheduling results than list scheduling with insertion and dominant sequence clustering heuristics. Our algorithm generate...
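
    The objective named in the abstract above, the maximum task finishing time (makespan), can be sketched as a function that a scheduler could use to score a candidate schedule. This is not the paper's genome encoding or fitness function. The name `makespan`, the dictionary-based inputs, and the assumption of zero communication cost are all illustrative.

```python
def makespan(durations, preds, assignment, order):
    """Return the maximum finishing time of a schedule.

    durations:  task -> execution time
    preds:      task -> iterable of predecessor tasks (the DAG edges)
    assignment: task -> processor id
    order:      tasks in a topological order; tasks on the same processor
                run in this order (communication cost assumed to be zero)
    """
    finish = {}
    proc_free = {}  # processor id -> time at which it becomes idle
    for task in order:
        proc = assignment[task]
        ready = max((finish[p] for p in preds.get(task, ())), default=0)
        start = max(ready, proc_free.get(proc, 0))
        finish[task] = start + durations[task]
        proc_free[proc] = finish[task]
    return max(finish.values())


durations = {"a": 2, "b": 3, "c": 1, "d": 2}
preds = {"c": ["a", "b"], "d": ["c"]}
order = ["a", "b", "c", "d"]
print(makespan(durations, preds, {"a": 0, "b": 1, "c": 0, "d": 1}, order))  # 6
print(makespan(durations, preds, {"a": 0, "b": 0, "c": 0, "d": 0}, order))  # 8
```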

  • Multi-level partitioning and scheduling under local memory constraint

    Publication Year: 1995, Page(s):612 - 619

    Massive uniform nested loops are broadly used in scientific and DSP applications. Due to the large amount of data handled by such applications, optimizing data accesses by fully utilizing the local memory and minimizing communication overhead is important for improving the overall system performance. Most of the traditional partition strategies do not consider the effect of data acce...

  • Resilient distributed objects: Basic results and application to shared tuple spaces

    Publication Year: 1995, Page(s):320 - 327
    Cited by:  Papers (6)

    Given a shared, atomic read-modify-write register r with deterministic operations, Herlihy (1991) has defined an interference condition on the operations of r and shown that this condition must be satisfied for r to support wait-free consensus. We extend this interference condition to general linearizable shared objects with nondeterministic operations. The extension is applicable to the entire se...

  • Automated processor specification and task allocation for embedded multicomputer systems: The packing-based approaches

    Publication Year: 1995, Page(s):44 - 51
    Cited by:  Papers (1)  |  Patents (1)

    This paper considers the coupled design problems of processor specification and task allocation for embedded multicomputer systems. A packing-based representation is proposed that allows the problems to be solved concurrently. An algorithm based on this representation is described that utilizes a new heuristic packing technique coupled with an incremental design advisor. This algorithm, named IDAT...

  • Incremental design of scalable interconnection networks using basic building blocks

    Publication Year: 1995, Page(s):252 - 259
    Cited by:  Papers (2)

    We present an incremental design of scalable interconnection networks using basic building blocks, including both network topologies and routing. We consider wormhole-routed small-scale 2D meshes as basic building blocks. The minimum requirement to expand these networks is a single building block. This implies that the network does not have to maintain the regular 2D mesh topology. We introduce so...

  • Efficient deadlock-free wormhole routing in shuffle based networks

    Publication Year: 1995, Page(s):92 - 99
    Cited by:  Papers (6)

    To provide deadlock-free wormhole routing in simple regular networks, virtual channels have recently been introduced. This paper presents a deadlock-free routing scheme for a class of shuffle-based directed and undirected networks. First, the network graph is partitioned into a predetermined number of subdigraphs such that there are no cycles in each subdigraph. This enables not only a deadlock-fr...

  • Towards developing universal dynamic mapping algorithms

    Publication Year: 1995, Page(s):456 - 459
    Cited by:  Papers (5)

    We investigate the problem of mapping dynamically generated tasks onto the processors of an MIMD system. Our main concern is to construct an algorithm which can be integrated in distributed runtime systems like PVM or MPI. Existing methods are often not adjustable to different architecture and application demands. Even if they are, the adjustment has to be done manually via time-consuming experim...

  • Rebuild options in RAID5 disk arrays

    Publication Year: 1995, Page(s):511 - 518
    Cited by:  Papers (5)  |  Patents (1)

    The response time of disk accesses in RAID5 disk arrays degrades when one of the N+1 disks fails, and there is a further degradation due to the interference caused by rebuild processing. In addition to giving user accesses a higher non-preemptive priority over track reads for rebuild, we consider: (i) the read redirection option; (ii) the split-seek option, i.e., allowing user requests to preempt track rea...
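
    As background for the rebuild processing mentioned in the abstract above: in RAID5 a lost block is the XOR of the corresponding blocks on the surviving disks, which is why a rebuild has to read from every remaining disk. The sketch below shows only that reconstruction step and none of the paper's scheduling options. The helper names `xor_blocks` and `rebuild_block` are assumptions.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving_blocks):
    """Reconstruct the block of the failed disk from the N surviving ones
    (data blocks plus the parity block of the same stripe)."""
    return xor_blocks(surviving_blocks)


# One stripe with N = 3 data blocks and one parity block
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# The disk holding data[1] fails; rebuild its block from the others
rebuilt = rebuild_block([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```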

  • Implementing the hierarchical PRAM on the 2D mesh: analyses and experiments

    Publication Year: 1995, Page(s):587 - 594
    Cited by:  Papers (2)

    We investigate aspects of the performance of the EREW instance of the Hierarchical PRAM (H-PRAM) model, a recursively partitionable PRAM, on the 2D mesh architecture via analysis and simulation experiments. Since one of the ideas behind the H-PRAM is to systematically exploit locality in order to negate the need for expensive communication hardware and thus promote cost-effective scalability, our ...

  • Weighted selection on coarse-grain hypercubes

    Publication Year: 1995, Page(s):544 - 552
    Cited by:  Papers (1)

    Given n weighted records distributed evenly among a p-processor hypercube, p ≤ n, we present efficient parallel algorithms for solving the weighted selection and related problems in the coarse-grain weak-hypercube model. A special case of the weighted selection problem, in which all the weights are equal, is known as the (unweighted) selection or order statistics problem. Our algorithms seek to ...
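
    The relation between weighted and unweighted selection described in the abstract above can be illustrated with a small sequential sketch. The problem statement used here is an assumption, since the abstract does not spell it out: return the smallest key whose cumulative weight, taken over all records with keys up to and including it, reaches a target. The function name `weighted_select` is hypothetical, and the sort-based approach is chosen for clarity, not taken from the paper's hypercube algorithms.

```python
def weighted_select(records, target):
    """records: iterable of (key, weight) pairs with positive weights.

    Returns the smallest key such that the total weight of all records
    with keys up to and including it is at least `target`.
    """
    total = 0
    for key, weight in sorted(records):
        total += weight
        if total >= target:
            return key
    raise ValueError("target exceeds the total weight")


print(weighted_select([(5, 1), (1, 2), (3, 4)], 3))  # 3

# With all weights equal to 1, this reduces to k-th smallest selection
print(weighted_select([(9, 1), (4, 1), (7, 1)], 2))  # 7
```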

  • Stochastic bounds for parallel program execution times with processor constraints

    Publication Year: 1995, Page(s):208 - 213
    Cited by:  Papers (2)

    We develop stochastic lower and upper bounds for parallel program execution times when the number of processors is limited. Such analysis can provide important information for job scheduling and resource allocation. For several typical classes of parallel programs, we derive very accurate closed-form approximations for the bounds. Examples are also given to demonstrate the quality of the bounds.
