Eighth IEEE Symposium on Parallel and Distributed Processing, 1996

23-26 Oct. 1996

  • Eighth IEEE Symposium On Parallel And Distributed Processing

    Publication Year: 1996, Page(s):iii - xi
  • Almost two-state self-stabilizing algorithm for token rings

    Publication Year: 1996, Page(s):52 - 59
    Cited by:  Papers (3)

    A self-stabilizing distributed system is a network of processors, which, regardless of its initial global state, will achieve the desired state in a finite number of steps. There are two main performance issues in the design of a self-stabilizing system: the stabilization time and memory requirements (the number of states required by each process). We first show that the probabilistic two-state al...
  • Last alternative optimization

    Publication Year: 1996, Page(s):538 - 541

    The authors present a new optimization for or-parallel logic programming (Prolog) systems, called last alternative optimization (LAO). The LAO follows from the flattening principle and the principle of duality of or-parallelism and and-parallelism. Originally LAO was conceived as the dual of last parallel call optimization, an optimization developed for and-parallel systems. LAO enables Prolog pro...
  • Proceedings of International Conference on Computer Aided Design

    Publication Year: 1996
  • Author index

    Publication Year: 1996, Page(s):616 - 618
  • A bulk-synchronous parallel library implementation for the BBN butterfly GP1000

    Publication Year: 1996, Page(s):288 - 297
    Cited by:  Papers (1)

    One of the fundamental goals of parallel computing is to develop a framework that will support portable and efficient application programs. The Bulk-Synchronous Parallel (BSP) model was proposed to help achieve this goal. The BSP model is intended to be a “unifying model”: it addresses both software and hardware issues by allowing theoretical analysis to coexist with practical physical ...
  • Generalized parallel selection in sorted matrices

    Publication Year: 1996, Page(s):281 - 285

    This paper presents a parallel algorithm running in O(log m log* m (log log m + log(n/m))) time on an EREW PRAM with O(m/(log m log* m)) processors for the problem of selection in an m×n matrix with sorted rows and columns, m⩽n. Our algorithm generalizes the result of Sarnath and He (1992) for selection in a sorted matrix of equal dimensions, and thus answers the open question they pos...
  • An adaptive loop scheduling algorithm on shared-memory systems

    Publication Year: 1996, Page(s):250 - 257

    Using runtime information of load distributions and processor affinity, we propose an adaptive scheduling algorithm and its variations from different control mechanisms. The proposed algorithm applies different degrees of aggressiveness to adjust loop scheduling granularities, aiming at improving the execution performance of parallel loops by making scheduling decisions that match the real workloa...
  • Concatenated parallelism: a technique for efficient parallel divide and conquer

    Publication Year: 1996, Page(s):488 - 495

    Efficient divide and conquer algorithms can be mapped to a parallel computer using either task parallelism or data parallelism. The former involves significant data movement and the latter can lead to severe load imbalances. A new strategy is proposed, which the authors call concatenated parallelism, for efficient parallel solution of problems resulting in divide and conquer trees. Their strategy ...
  • Fast deterministic sorting on large parallel machines

    Publication Year: 1996, Page(s):273 - 280
    Cited by:  Papers (2)

    Many sorting algorithms that perform well on uniformly distributed data suffer significant performance degradation on non-random data. Unfortunately, many real-world applications require sorting on data that is not uniformly distributed. In this paper we consider distributions of varying entropies. We describe A-Ranksort, a new sorting algorithm for parallel machines, whose behavior on input distri...
  • Efficient broadcast and multicast on multistage interconnection networks using multiport encoding

    Publication Year: 1996, Page(s):36 - 45
    Cited by:  Papers (8)  |  Patents (1)

    This paper proposes a new approach for implementing fast multicast and broadcast in multistage interconnection networks (MINs) with multiport encoded multidestination worms. For a MIN with k×k switches and n stages, such worms use n header flits each. One flit is used for each stage of the network, and it indicates the output ports to which a multicast message must be replicated. A single mult...
  • A loop allocation policy for DOACROSS loops

    Publication Year: 1996, Page(s):240 - 249
    Cited by:  Papers (4)

    The dataflow model of computation, in general, and its recent direction to combine dataflow processing with control-flow processing, in particular, provide attractive alternatives to satisfy the computational demands of new applications, without experiencing the shortcomings of the traditional concurrent systems. This should motivate researchers to analyze the applicability of familiar concepts, s...
  • Linear time approximation schemes for parallel processor scheduling

    Publication Year: 1996, Page(s):482 - 485
    Cited by:  Papers (3)

    The authors present a general framework for approximation schemes on parallel processor scheduling. They propose ε-approximation algorithms for scheduling on identical, uniform and unrelated machines when the number of processors is fixed. For each of the three problems considered, they perform grouping on job processing times in order to produce a transformed scheduling instance where the nu...
  • A unifying methodology for multiple querying on enhanced meshes

    Publication Year: 1996, Page(s):392 - 399
    Cited by:  Papers (1)

    The main contribution of this work is to show that a number of seemingly unrelated problems in database design, pattern recognition, robotics, and image processing can be solved simply and elegantly by formulating them as instances of a general problem, the multiple query (MQ) problem. An arbitrary instance of the multiple query problem consists of a collection A={a1, a2, ...,...
  • Improved algorithms and data structures for solving graph problems in external memory

    Publication Year: 1996, Page(s):169 - 176
    Cited by:  Papers (16)  |  Patents (3)

    Recently, the study of I/O-efficient algorithms has moved beyond fundamental problems of sorting and permuting and into wider areas such as computational geometry and graph algorithms. With this expansion has come a need for new algorithmic techniques and data structures. In this paper, we present I/O-efficient analogues of well-known data structures that we show to be useful for obtaining simpler...
  • Comparison of two storage models in data-driven multithreaded architectures

    Publication Year: 1996, Page(s):122 - 129
    Cited by:  Papers (4)

    Multithreaded execution models attempt to combine some aspects of dataflow-like execution with von Neumann model execution, with the objective of masking the latency of inter-processor communications and remote memory accesses in multiprocessors. An important issue in the analysis and evaluation of multithreaded execution is the design and performance of the storage hierarchy. Because of the seque...
  • The “express channel” concept in hypermeshes and k-ary n-cubes

    Publication Year: 1996, Page(s):566 - 569
    Cited by:  Papers (1)

    Low-dimensional k-ary n-cubes have been popular in recent multicomputers. However, these networks suffer from high switching delays due to their high message distance. To overcome this problem, Dally (1990) has proposed express k-ary n-cubes with express channels that allow non-local messages to partially bypass clusters of nodes within a dimension. The paper argues that hypergraph topologies, tha...
  • Sorting N items using a p-sorter in optimal time

    Publication Year: 1996, Page(s):264 - 272
    Cited by:  Papers (1)

    A sorting device capable of sorting p items in constant time is called a p-sorter. It is known that the task of sorting N items using a p-sorter requires at least Ω((N log N)/(p log p)) applications of the p-sorter. This bound is tight: there exist algorithms that use O((N log N)/(p log p)) calls to the p-sorter to sort N items. However, there is no known implementable algorithm that can sort N it...
  • An intelligent system architecture for urban traffic control applications

    Publication Year: 1996, Page(s):10 - 17
    Cited by:  Papers (1)

    This paper describes an intelligent system architecture for urban traffic control which integrates a neural network and an expert system on silicon. The intelligent decision making system consists of a backpropagation-based neural network for adaptive learning and a rule-based fuzzy expert system for decision making. Both the neural network and the expert system are implemented as linear systolic ...
  • Performance of parallel algorithms for a fingerprint image comparison system

    Publication Year: 1996, Page(s):410 - 413

    This paper addresses the problem of analyzing the performance of parallel algorithms for the training procedure of a neural-network-based fingerprint image comparison (FIC) system. The target architecture is assumed to be a coarse-grain distributed memory parallel architecture. Two types of parallelism, node parallelism and training set parallelism (TSP), are investigated. These algorithms are impl...
  • A heterogeneous hierarchical solution to cost-efficient high performance computing

    Publication Year: 1996, Page(s):138 - 145
    Cited by:  Papers (2)  |  Patents (2)

    Two facts that suggest the desirability of a hierarchical approach to cost-effective high-performance computing are empirically established in this paper. The first fact is the temporal locality of programs with respect to the degree of parallelism. Two temporal (instruction and data) locality principles are identified and empirically established for a set of programs. The impact of this behavior ...
  • Impact of load balancing on unstructured adaptive grid computations for distributed-memory multiprocessors

    Publication Year: 1996, Page(s):26 - 33
    Cited by:  Papers (8)

    The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a new dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view. Whenever the computa...
  • Simulated annealing applied to multicomputer task allocation and processor specification

    Publication Year: 1996, Page(s):232 - 239
    Cited by:  Papers (10)

    This paper considers the design problems of processor specification and task allocation for embedded computer systems. A partitioning-based representation is proposed that allows these problems to be solved concurrently. An algorithm based on this representation is described that utilizes simulated annealing coupled with a heuristic processor specification technique. This algorithm, named SA2, is ...
  • Load-balancing in sparse matrix-vector multiplication

    Publication Year: 1996, Page(s):218 - 225
    Cited by:  Papers (2)

    We consider the load-balanced multiplication of a large sparse matrix with a large sequence of vectors, on parallel computers. Due to the associated computational and inter-node communication challenges, we propose a method that combines fast load-balanced work allocation with efficient message passing implementations. The performance of the proposed method was evaluated on benchmark matrices as w...
  • Measurement and simulation based performance analysis of parallel I/O in a high-performance cluster system

    Publication Year: 1996, Page(s):332 - 339
    Cited by:  Papers (1)

    This paper presents a measurement and simulation based study of parallel I/O in a high-performance cluster system: the Pittsburgh Supercomputing Center (PSC) DEC Alpha Supercluster. The measurements were used to characterize the performance bottlenecks and the throughput limits at the compute and I/O nodes, and to provide realistic input parameters to PioSim, a simulation environment we have devel...