Proceedings of 1994 6th IEEE Symposium on Parallel and Distributed Processing

26-29 Oct. 1994

Displaying Results 1 - 25 of 87
  • Proceedings of 1994 6th IEEE Symposium on Parallel and Distributed Processing

    Publication Year: 1994
    Freely Available from IEEE
  • On balancing computational load on rings of processors

    Publication Year: 1994, Page(s):478 - 483

    We consider a simple, deterministic policy for scheduling certain genres of dynamically evolving computations (specifically, computations in which tasks that spawn produce precisely two offspring) on rings of processors. Such computations include, for instance, tree-structured branching computations. We believe that our policy yields good parallel speedup on most computations of the genre, but we ha...
  • Parallelism and locality in priority queues

    Publication Year: 1994, Page(s):490 - 496
    Cited by:  Papers (5)

    We explore two ways of incorporating parallelism into priority queues. The first is to speed up the execution of individual priority operations so that they can be performed one operation per time step, unlike sequential implementations which require O(log N) time steps per operation for an N element heap. We give an optimal parallel implementation that uses a linear array of O(log N) processors. ...
  • Upper and lower bounds for selection on the mesh

    Publication Year: 1994, Page(s):497 - 504

    A distance-optimal algorithm for selection on the mesh has proved to be elusive, although distance-optimal algorithms for the related problems of routing and sorting have recently been discovered. In this paper, we explain, using the notion of adaptiveness, why techniques used in the currently best selection algorithms cannot lead to a distance-optimal algorithm. We also present the first algorith...
  • Performability analysis of non-repairable multicomponent systems using order statistics

    Publication Year: 1994, Page(s):646 - 653
    Cited by:  Papers (1)

    Performability, a composite measure that integrates both performance and reliability, has been deemed to be essential in evaluating systems that are capable of trading off performance for reliability under component failures. For non-repairable systems, the goal is to evaluate the distribution or moments of some accumulated reward (performance) defined on a stochastic process that characterizes th...
  • An optimal hypercube algorithm for the all nearest smaller values problem

    Publication Year: 1994, Page(s):505 - 512
    Cited by:  Papers (1)

    Given a sequence of n elements, the All Nearest Smaller Values (ANSV) problem is to find, for each element in the sequence, the nearest element to the left (right) that is smaller, or to report that no such element exists. Berkman, Schieber, and Vishkin (1993) give an ANSV algorithm that runs in O(lg n) time on an (n/lg n)-processor CREW PRAM. In this paper, we present an O(lg n)-time n-processor ...
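
The ANSV problem defined in the abstract above can be illustrated with a short sequential sketch. This is the well-known stack-based linear-time method, not the hypercube or PRAM algorithm from the paper. The function names are hypothetical, and "smaller" is assumed to mean strictly smaller.

```python
def nearest_smaller_to_left(values):
    """For each element, return the nearest strictly smaller element
    to its left, or None if no such element exists (assumed convention)."""
    result = []
    stack = []  # candidate values, strictly increasing from bottom to top
    for v in values:
        # Discard candidates that are not smaller than v; they can never
        # be the answer for v or for any later element.
        while stack and stack[-1] >= v:
            stack.pop()
        result.append(stack[-1] if stack else None)
        stack.append(v)
    return result


def all_nearest_smaller_values(values):
    """Return (left, right) lists of nearest strictly smaller values."""
    left = nearest_smaller_to_left(values)
    # The right-hand answers are the left-hand answers of the reversed sequence.
    right = nearest_smaller_to_left(values[::-1])[::-1]
    return left, right
```

Each element is pushed and popped at most once per pass, so the sketch runs in O(n) sequential time.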
  • Matrix transpose on meshes with wormhole and XY routing

    Publication Year: 1994, Page(s):656 - 663
    Cited by:  Papers (1)

    We give nearly optimal algorithms for matrix transpose on meshes with wormhole and XY routing and with a 1-port or 2-port communication model. For an N×N mesh, where N=3·2^n and each mesh node has a submatrix of size m to be transposed, our algorithms take Nm/2 time steps for 1-port model, and about Nm/3.27 time steps for 2-port model. The lower bound is Nm/3.414. While ther...
  • Very fast optimal parallel algorithms for heap construction

    Publication Year: 1994, Page(s):514 - 521
    Cited by:  Papers (2)

    We give two algorithms for permuting n items in an array into heap order on a CRCW PRAM. The first is deterministic and runs in O(log log n) time and performs O(n) operations and is time- and work-optimal. The second is randomized and runs in O(log log log n) time with high probability, performing O(n) operations. No PRAM algorithm with o(log n) run-time was previously known for this problem. We a...
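
For reference, "permuting n items in an array into heap order" can be sketched sequentially as follows. This is the standard bottom-up construction (Floyd's method), which performs O(n) operations on one processor. It is not either of the CRCW PRAM algorithms from the paper. Min-heap order and the function names are assumptions made for illustration.

```python
def build_min_heap(items):
    """Return a copy of items permuted into min-heap order
    (parent at index i, children at 2*i + 1 and 2*i + 2)."""
    heap = list(items)
    n = len(heap)
    # Sift down every internal node, starting from the last one.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and heap[child] < heap[smallest]:
                    smallest = child
            if smallest == i:
                break  # heap property holds at this node
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest
    return heap


def is_min_heap(heap):
    """Check that every element is at least as large as its parent."""
    return all(heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap)))
```

A quick check such as `is_min_heap(build_min_heap([9, 4, 7, 1, 8, 2]))` should hold for any input list of comparable items.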
  • A performance comparison of processor allocation and job scheduling algorithms for mesh-connected multiprocessors

    Publication Year: 1994, Page(s):46 - 53
    Cited by:  Papers (15)

    Due to its simplicity, regularity and suitability for VLSI implementation, the mesh topology for multiprocessors has drawn considerable attention. Several processor allocation strategies for mesh-connected multiprocessors have been proposed in recent years. In this paper, we present the results of a performance study of all the proposed strategies known to the authors. Originally each of these allocat...
  • Towards practical permutation routing on meshes

    Publication Year: 1994, Page(s):664 - 671
    Cited by:  Papers (1)

    We consider the permutation routing problem on two-dimensional n×n meshes. To be practical, a routing algorithm is required to ensure very small queue sizes Q, and very low running time T, not only asymptotically but particularly also for the practically important n up to 1000. With a technique inspired by a scheme of Kaklamanis/Krizanc/Rao, we obtain a near-optimal result: T=2·n+𝒪...
  • The efficiency of randomized parallel backtrack search

    Publication Year: 1994, Page(s):522 - 529
    Cited by:  Papers (1)  |  Patents (1)

    We present a refined analysis of the randomized parallel backtrack search algorithm (RPBS) of Karp and Zhang (1993). It is shown that the total number of messages occurring in the execution of RPBS is likely to be O(hp log d) where h and d are the height and degree of the backtrack search tree and p is the number of processors used. As a consequence, under the assumption of unit-time message delive...
  • Modelling accesses to migratory and producer-consumer characterised data in a shared memory multiprocessor

    Publication Year: 1994, Page(s):612 - 619
    Cited by:  Papers (4)  |  Patents (1)

    Directory-based, write-invalidate cache coherence protocols are effective in reducing latencies to the memory but suffer from cache misses due to coherence actions. It is therefore important to understand the nature of data sharing causing misses for this class of protocols. We identify a set of parameters that characterises the accesses to migratory and producer-consumer data in sufficient detail...
  • Application-specific array processors for binary prefix sum computation

    Publication Year: 1994, Page(s):118 - 125
    Cited by:  Papers (1)

    The main contribution of this work is to propose two application-specific bus architectures for computing the prefix sums of a binary sequence. Our architectures feature the following characteristics: all broadcasts occur on buses of length 15 or 63; we use a new technique that we call shift switching which allows switches to cyclically permute an incoming signal, dramatically improving the perfor...
  • Software interleaving

    Publication Year: 1994, Page(s):56 - 65
    Cited by:  Papers (1)

    We investigate the costs and benefits of implementing memory interleaving in software. As our main contribution, we compare software memory interleaving to row-major allocation and logarithmic broadcasting. Our analysis demonstrates the clear superiority of software interleaving over row-major allocation in the presence of memory contention. Our analysis also indicates that the choice between soft...
  • Routing with locality in partitioned-bus meshes

    Publication Year: 1994, Page(s):715 - 721

    We show that adding partitioned buses (as opposed to long buses that span an entire row or column) to ordinary meshes can reduce the routing time by approximately one-third for permutation routing with locality. A matching time lower bound is also proved. The result can be generalized to multi-packet routing.
  • Efficient submesh permutations in wormhole-routed meshes

    Publication Year: 1994, Page(s):672 - 678

    This paper studies how to concurrently permute related logical or physical submeshes in a d-dimensional n×…×n physical mesh via wormhole and dimension-ordered routing. Our objective is to minimize the congestion for realizing the permutations, while maximizing the number and dimensionality of permuted submeshes. We show that for d⩽2α+β, concurrent independent perm...
  • On the convergence of a parallel algorithm for finding polynomial zeros

    Publication Year: 1994, Page(s):530 - 535

    The problem of finding the zeros of a polynomial p(z) of degree n is considered. Some results related to a parallel algorithm given by Bini and Gemignani are improved. The algorithm is a reformulation of Householder's sequential algorithm (1971) that is based on the computation of the polynomial remainder sequence generated by the Euclidean scheme. The approximation to the sought-after zeros (or f...
  • Multicast in extra-stage multistage interconnection networks

    Publication Year: 1994, Page(s):452 - 459
    Cited by:  Papers (4)  |  Patents (1)

    This paper studies the multicast problem in the multistage interconnection network (MIN) topology. A regular MIN is a unique-path network and can provide only a single path choice in routing or multicasting. However, if a few extra stages are added to the MIN, it can offer greater routing flexibility. Design implications of extra-stage MINs are discussed in this paper. An upper bound on the num...
  • Performance analysis of the XDAC disk array system

    Publication Year: 1994, Page(s):620 - 627

    The paper presents an analytical model of a whole disk array architecture, XDAC, which consists of several major subsystems and features: the two-dimensional array structure; IO-bus with split transaction protocol; and cache for processing multiple I/O requests in parallel. Our modelling approach is based on a subsystem access time per request (SATPR) concept, in which we model for each subsystem ...
  • Out-of-order access to vector elements in order to reduce conflicts in vector processors

    Publication Year: 1994, Page(s):126 - 134
    Cited by:  Papers (3)

    The performance of a vector processor accessing vectors is strongly dependent on the conflicts produced in the memory subsystem. These conflicts create holes in the data flow between processor and memory, delaying the task of the functional units. This paper proposes an out-of-order access to vector elements in order to reduce the average memory access time in vector processors. Previous research ...
  • Program dependence analysis for concurrency exploitation in programs composed of abstract data type modules

    Publication Year: 1994, Page(s):66 - 73
    Cited by:  Papers (2)

    In object-based design and implementation, application programs are constructed by layering reusable abstract data type (ADT) module instances. An ADT instance is often used to manage more than one data object. There is contention for getting access to the instance if the multiple data objects need to be accessed concurrently. To resolve the contention, the ADT instance is replicated and the copie...
  • On multicast wormhole routing in multicomputer networks

    Publication Year: 1994, Page(s):722 - 729
    Cited by:  Papers (24)  |  Patents (2)

    We show that deadlocks due to dependencies on consumption channels are a fundamental problem in multicast wormhole routing. This issue of deadlocks has not been addressed in many previously proposed multicast algorithms. We also show that deadlocks on consumption channels can be avoided by using multiple classes of consumption channels and restricting the use of consumption channels by multicast me...
  • Minimal turn restrictions for designing deadlock-free adaptive routing

    Publication Year: 1994, Page(s):680 - 687

    A routing algorithm is basically required to be connected and deadlock-free. We can restrict some directions that messages can turn in a network to avoid deadlock. A deadlock-free adaptive routing with fewer turn restrictions is considered to possess a greater degree of adaptiveness. We present two basic strategies for designing feasible routings on networks which have bidirectional channels. Prim...
  • Good algorithm design style for multiprocessors

    Publication Year: 1994, Page(s):538 - 543
    Cited by:  Papers (9)

    We discuss a style of designing parallel algorithms with the following characteristics for a problem of the best known sequential time T(n): C1. Each processor spends O(T(n)/P) time in computing. C2. Each processor sends and/or receives O(n/P) messages of one-word-size. C3. The number of communication phases is constant, independent of the input size n. We show this is possible to achi...
  • Necklaces and scalability of Kautz digraphs

    Publication Year: 1994, Page(s):409 - 415
    Cited by:  Papers (4)

    In this paper, the following results are reported. The notion of Kautz necklaces, similar to de Bruijn ones, is introduced. A linear-time algorithm for generating Kautz necklaces in lexicographic ordering is described and a formula for enumerating the Kautz necklaces is given. A one-to-one mapping between de Bruijn and Kautz vertices preserving the necklaces is presented. Then the notion of neckla...