Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing

25-28 Oct. 1995

  • Proceedings of Seventh IEEE Symposium on Parallel and Distributed Processing

    Publication Year: 1995
    Freely Available from IEEE
  • Data parallel logic programming in &ACE

    Publication Year: 1995, Page(s):424 - 431
    Cited by:  Papers (2)

    &ACE is a high performance parallel Prolog system developed at the Laboratory for Logic, Databases, and Advanced Programming that exploits and-parallelism from Prolog programs. &ACE was developed to exploit MIMD parallelism. However, SPMD parallelism also arises naturally in many Prolog programs. In this paper we develop runtime techniques that allow systems that have primarily been designed to ex...

  • Author index

    Publication Year: 1995
    Freely Available from IEEE
  • On symbolic scheduling and parallel complexity of loops

    Publication Year: 1995, Page(s):360 - 367
    Cited by:  Papers (3)  |  Patents (3)

    We first consider the symbolic scheduling and performance prediction of a partitioned single loop on message passing architectures with nonzero communication and a sufficient number of processors. The loop body contains a set of coarse grain tasks whose computational weights change during the course of the iterations. Using the macro dataflow task model and software pipelining techniques, we dev...

  • A memory distribution mechanism for object oriented applications

    Publication Year: 1995, Page(s):354 - 357

    Many applications, particularly in Computer-Aided Design (CAD), require large amounts of memory, limiting the size of problems which can be handled. This paper presents a new mechanism which exploits the large amount of memory available in a cluster of workstations for programs that are designed in an object-oriented manner. The memory required for each object may be allocated in other machines o...

  • Optimal load sharing in dynamically heterogeneous systems

    Publication Year: 1995, Page(s):346 - 353
    Cited by:  Papers (3)  |  Patents (6)

    Heterogeneity of processor speed and time availability is introduced to the paradigm of load sharing among a number of autonomous and independently scheduled heterogeneous computers that communicate via a message-passing interconnection system. A divisible job originating at one of the system sites is to be partitioned and executed concurrently on a suite of selected processors, to the extent of the...

  • Geometric approach for optimal routing on mesh with buses

    Publication Year: 1995, Page(s):145 - 152

    Recently, the 'mesh of buses' architecture has become quite popular in parallel computing. Its main advantage is the limited broadcast capability that is used to overcome the main disadvantage of the mesh, namely the relatively large diameter. We show that in such networks buses indeed accelerate the time for the fundamental problem of routing. Furthermore, unlike in the 'store and forward' mod...

  • Fast, efficient mutual and self simulations for shared memory and reconfigurable mesh

    Publication Year: 1995, Page(s):238 - 246
    Cited by:  Papers (4)

    This paper studies relations between the parallel random access machine (PRAM) model and the reconfigurable mesh (RMESH) model by providing mutual simulations between the models. We present an algorithm simulating one step of an (n lg lg n)-processor CRCW PRAM on an n × n RMESH with delay O(lg lg n) with high probability. We use our PRAM simulation to obtain the first efficient self-simulation al...

  • Stochastic bounds for parallel program execution times with processor constraints

    Publication Year: 1995, Page(s):208 - 213
    Cited by:  Papers (2)

    We develop some stochastic lower and upper bounds for parallel program execution times when there are limited processors. Such analysis can provide important information for job scheduling and resource allocation. For several typical classes of parallel programs, we derive very accurate closed form approximations for the bounds. Examples are also given to demonstrate the quality of the bounds.

  • Demand-based document dissemination to reduce traffic and balance load in distributed information systems

    Publication Year: 1995, Page(s):338 - 345
    Cited by:  Papers (26)  |  Patents (35)

    Research on replication techniques to reduce traffic and minimize the latency of information retrieval in a distributed system has concentrated on client-based caching, whereby recently or frequently accessed information is cached at a client (or at a proxy thereof) in anticipation of future accesses. We believe that such myopic solutions, focusing exclusively on a particular client or set of clients...

  • Cache memories in dataflow architecture

    Publication Year: 1995, Page(s):182 - 189

    The recent advance in dataflow processing - to combine the dataflow paradigm with the control flow paradigm - has brought out many new challenging issues. This hybrid organization has made it possible to study familiar control flow concepts within the framework of the dataflow architecture. The concept of cache memory has proven its effectiveness in the von Neumann architecture due to the spatial ...

  • Weighted selection on coarse-grain hypercubes

    Publication Year: 1995, Page(s):544 - 552
    Cited by:  Papers (1)

    Given n weighted records distributed evenly among a p-processor hypercube, p ≤ n, we present efficient parallel algorithms for solving the weighted selection and related problems in the coarse-grain weak-hypercube model. A special case of the weighted selection problem, in which all the weights are equal, is known as the (unweighted) selection or order statistics problem. Our algorithms seek to ...

  • Expressing and detecting control flow properties of distributed computations

    Publication Year: 1995, Page(s):432 - 438
    Cited by:  Papers (2)

    Properties of distributed computations can be either on their global states or on their control flows. This paper addresses control flow properties. It first presents a simple yet powerful logic for expressing general properties on control flows, seen as sequences of local states. Among other properties, we can express invariance, sequential properties (to satisfy such a property a control flow mu...

  • The polling primitive for hypercube networks

    Publication Year: 1995, Page(s):138 - 144

    We describe a distributed computing primitive termed polling that is both a means of synchronization and communication in distributed or concurrent systems. The polling operation involves the collection of messages from nodes in an interconnection network, in response to a query. We define the semantics of polling, and present algorithms for implementing the operation on hypercube networks. Time a...

  • A characterization of one-to-one modular mappings

    Publication Year: 1995, Page(s):382 - 389
    Cited by:  Papers (2)

    We deal with modular mappings as introduced by H.J. Lee and J.A.B. Fortes (1994) and we build upon their results. Our main contribution is a characterization of one-to-one modular mappings that is valid even when the source domain and the target domain of the transformation have the same size but not the same shape. This characterization is constructive, and a procedure to test the injectivity of ...

  • A lower bound for the QRQW PRAM

    Publication Year: 1995, Page(s):231 - 237

    The queue-read, queue-write (QRQW) parallel random access machine (PRAM) model is a shared memory model which allows concurrent reading and writing with a time cost proportional to the contention. This is designed to model currently available parallel machines more accurately than either the CRCW PRAM or EREW PRAM models. Many algorithmic results have been developed for the QRQW PRAM. However, the...

  • Dynamic processor sharing in torus multicomputers

    Publication Year: 1995, Page(s):204 - 207

    This paper proposes a distributed dynamic processor sharing scheme in torus-connected multicomputer systems. It is applicable to database query and on-line transaction processing applications. In such a system, each processor can process small transaction tasks locally and support parallel execution of large transaction tasks in a timesharing fashion. Distributed management of processors is achiev...

  • A structural theory of recursively decomposable parallel processor-networks

    Publication Year: 1995, Page(s):570 - 578
    Cited by:  Papers (2)

    A 'recursively decomposable' network G can be partitioned into a fixed number of subnetworks each of which is recursively decomposable and 'a smaller version' of G. Several notions of such networks emerge depending on the collection of parameters chosen to model a subnetwork as 'a smaller version' of another. Examples of such parameters are permutation time, bandwidth, latency, topology, wires, deg...

  • Dynamic reconfiguration with I/O abstraction

    Publication Year: 1995, Page(s):496 - 501
    Cited by:  Papers (3)

    Dynamic reconfiguration is explored in the context of I/O abstraction, a new programming model that defines the communication structure of a distributed system in terms of connections among well-defined data interfaces of encapsulated modules. We present a new module migration mechanism that avoids the expense and complication of state extraction techniques, minimizes the amount of code required f...

  • Parallel heap: A practical priority queue for fine-to-medium-grained applications on small multiprocessors

    Publication Year: 1995, Page(s):328 - 335
    Cited by:  Papers (8)

    We present an efficient implementation of the parallel heap data structure on a bus-based Silicon Graphics multiprocessor GTX/4D. Parallel heap is theoretically the first heap-based data structure to have implemented an optimally scalable parallel priority queue on an exclusive-read exclusive-write parallel random access machine. We compared it with Rao-and-Kumar's concurrent heap and with the con...

  • A simulation methodology for evaluating parallel computers

    Publication Year: 1995, Page(s):478 - 481

    In this paper we present a simulation methodology for evaluating and understanding parallel computer architectures. The simulation tool that we have developed can generalize the effect of architectural features while maintaining the complex interactions of architecture and algorithm typical of a real computer system. We will show that our simulator is an accurate tool for predicting the performanc...

  • Architecture and technology tradeoffs in the design of next-generation multiprocessor servers

    Publication Year: 1995, Page(s):174 - 181
    Cited by:  Papers (1)

    The design of high performance computing systems requires many design decisions based on performance, cost, power consumption, and possibly other criteria. Decisions made in the early, high-level specification phase are critical to developing a successful product. We describe a methodology which allows the architect to explore alternatives at all design levels for different technology options. We ...

  • Partitioning and scheduling for parallel image processing operations

    Publication Year: 1995, Page(s):86 - 90
    Cited by:  Papers (1)  |  Patents (1)

    Many computer vision and image processing (CVIP) operations can be represented as a sequence of tasks with nested loops, specified by the visual programming language Khoros. This paper addresses the automatic partitioning and scheduling of such operations on distributed memory multiprocessors. The major difficulties in determining the optimal image data distribution for each task are that the numb...

  • Efficient race detection for message-passing programs with nonblocking sends and receives

    Publication Year: 1995, Page(s):534 - 541
    Cited by:  Papers (5)  |  Patents (2)

    This paper presents an algorithm for performing on-the-fly race detection for parallel message-passing programs. The algorithm reads a trace of the communication events in a message-passing parallel program and either finds a specific race condition or reports that the traced program is race-free. It supports a rich message-passing model, including blocking and non-blocking sends and receives, syn...

  • A fast and scalable scheduling algorithm for distributed memory systems

    Publication Year: 1995, Page(s):60 - 63
    Cited by:  Papers (6)

    The inter-processor communication time is a major bottleneck for distributed memory systems (DMSs) and can be reduced by having an efficient task partitioning and scheduling strategy. The paper deals with the scheduling issues and presents an algorithm to schedule tasks onto DMSs. The complexity of this algorithm is O(V^2), where V is the number of nodes of the directed acyclic graph (D...
