Proceedings of International Conference on Parallel Processing

15-19 April 1996

Displaying Results 1 - 25 of 135
  • Proceedings of International Conference on Parallel Processing

    Publication Year: 1996
    Freely Available from IEEE
  • Ocean circulation on the Intel Paragon: modeling and implementation

    Publication Year: 1996
  • Nested parallel call optimization

    Publication Year: 1996, Page(s):225 - 229
    Cited by:  Papers (1)

    We present a novel optimization called Last Parallel Call Optimization (LPCO) for parallel systems. The last parallel call optimization can be regarded as a parallel extension of last call optimization found in sequential systems. While the LPCO is fairly general, we use and-parallel logic programming systems to illustrate it and to report its performance on multiprocessor systems. The last parall...
  • Panel on "For a Massive Number of Massively Parallel Machines: What are the Target Applications, Who

    Publication Year: 1996
    Freely Available from IEEE
  • Author index

    Publication Year: 1996
    Freely Available from IEEE
  • A parallel algorithm for text inference

    Publication Year: 1996, Page(s):441 - 445
    Cited by:  Papers (3)

    In this paper, we describe a highly parallel method for extracting inferences from text. The method is based on a marker-propagation algorithm that establishes semantic paths between knowledge base concepts. The paper presents the structure of the system, the marker-propagation algorithm, and results that show a large degree of parallelism.
  • Study of scalable declustering algorithms for parallel grid files

    Publication Year: 1996, Page(s):434 - 440
    Cited by:  Papers (14)

    The efficient storage and retrieval of large multidimensional datasets is an important concern for large-scale scientific computations, such as long-running time-dependent simulations which periodically generate snapshots of the state. The main challenge for efficiently handling such datasets is to minimize response time for multidimensional range queries. The grid file is one of the well known ac...
  • Converse: an interoperable framework for parallel programming

    Publication Year: 1996, Page(s):212 - 217
    Cited by:  Papers (24)

    Many different parallel languages and paradigms have been developed, each with its own advantages. To benefit from all of them, it should be possible to link together modules written in different parallel languages in a single application. Since the paradigms sometimes differ in fundamental ways, this is difficult to accomplish. This paper describes a framework, Converse, that supports such multi-...
  • Parallel algorithms for image processing: practical algorithms with experiments

    Publication Year: 1996, Page(s):429 - 433
    Cited by:  Papers (1)

    We design and analyse parallel algorithms with the goal of obtaining exact bounds on their speed-ups on real machines. For this purpose, we employ the BSP* model, which is an extension of Valiant's (1994) BSP (bulk-synchronous parallel) model and rewards blockwise communication. Further, we use Valiant's notion of c-optimality. Intuitively, the speed-up of a c-optimal parallel algorithm for p proc...
  • Implementation of a SliM array processor

    Publication Year: 1996, Page(s):771 - 775
    Cited by:  Papers (5)

    Presents the design and implementation of a SliM (Sliding Memory plane) array processor, which is a mesh-connected SIMD architecture. To build the array processor, we developed a SliM chip consisting of mesh-connected 5×5 processing elements (PEs). Due to the idea of sliding (i.e. overlapping the inter-PE communication with the computation), the SliM chip can greatly reduce the inter-PE comm...
  • Generic methodologies for deadlock-free routing

    Publication Year: 1996, Page(s):638 - 643
    Cited by:  Papers (4)  |  Patents (1)

    This paper introduces a generic graph-partitioning methodology for developing deadlock-free wormhole routing in an arbitrary network. A further extension allows partial cyclic dependencies among virtual channels. A novel fully adaptive nonminimal deadlock-free routing algorithm has been developed for the k-ary n-cube torus network. Since our technique is based on decomposing a network into several subdi...
  • Performance prediction with benchmaps

    Publication Year: 1996, Page(s):479 - 484

    Benchmapping is a performance prediction method for data-parallel programs that is based on modeling the performance of runtime systems. This paper describes a benchmapping system, called BENCHCVL, that predicts the running time of data-parallel programs written in the NESL language on several computer systems. BENCHCVL predicts performance using a set of more than 200 parameterized models. The mo...
  • Constructing the spanners of graphs in parallel

    Publication Year: 1996, Page(s):206 - 210

    Given a connected graph G=(V,E) with n vertices, a subgraph G' is an approximate t-spanner of G if, for every u, v ∈ V, the distance between u and v in G' is at most f(t) times the distance in G, where f(t) is a polynomial function of t and t ≤ f(t) < n. In this paper parallel algorithms for finding approximate t-spanners on both unweighted graphs and weighted graphs with ...
  • Parallel implementation of Borůvka's minimum spanning tree algorithm

    Publication Year: 1996, Page(s):302 - 308
    Cited by:  Papers (13)

    We study parallel algorithms for the minimum spanning tree problem, based on the sequential algorithm of O. Borůvka (1926). The target architectures for our algorithm are asynchronous, distributed-memory machines. Analysis of our parallel algorithm on a simple model that is reminiscent of the LogP model shows that in principle a speedup proportional to the number of processors can be achieved, bu...
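
For readers unfamiliar with the sequential algorithm this entry builds on, the following is a minimal Python sketch of Borůvka's method. It is not the paper's parallel, distributed-memory implementation. The function name `boruvka_mst`, the `(weight, u, v)` edge format, and the union-find helper are assumptions made for illustration only. In each round, every component selects its cheapest outgoing edge, and the selected edges are merged into the growing forest.

```python
def boruvka_mst(num_vertices, edges):
    """Return (total_weight, chosen_edges) for a connected, undirected graph.

    edges: list of (weight, u, v) tuples with 0 <= u, v < num_vertices.
    Ties are broken by comparing the full (weight, u, v) tuple, an
    assumption of this sketch that keeps choices consistent across
    components.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    total = 0
    components = num_vertices
    while components > 1:
        # Record the cheapest edge leaving each component in this round.
        cheapest = {}
        for edge in edges:
            _, u, v = edge
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for root in (ru, rv):
                if root not in cheapest or edge < cheapest[root]:
                    cheapest[root] = edge
        if not cheapest:
            raise ValueError("graph is not connected")
        # Merge components along the selected edges, skipping any edge
        # whose endpoints were already joined earlier in this round.
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                chosen.append((w, u, v))
                total += w
                components -= 1
    return total, chosen
```

Because every round at least halves the number of components, the loop runs O(log n) times; that per-round independence between components is what makes the method a natural candidate for parallelization.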
  • A direct block-five-diagonal system solver for the VLSI parallel model

    Publication Year: 1996, Page(s):886 - 890

    A VLSI algorithm for solving a special block-five-diagonal system of linear algebraic equations is presented. The algorithm is considered for a VLSI parallel computational model where both the time of the algorithm and the area of its design are components of the complexity estimations. The linear system arises from the finite-difference approximation of the first biharmonic boundary value problem...
  • The chessboard distance transform and the medial axis transform are interchangeable

    Publication Year: 1996, Page(s):424 - 428
    Cited by:  Papers (3)

    The distance transform (DT) and the medial axis transform (MAT) are two image computation tools used to extract information about the shape and position of foreground pixels relative to each other. These two transforms are used extensively in computer vision and image processing, for applications such as expanding/shrinking, thinning, and computing the shape factor. There are many di...
  • A new approach to pipeline FFT processor

    Publication Year: 1996, Page(s):766 - 770
    Cited by:  Papers (89)  |  Patents (3)

    A new VLSI architecture for a real-time pipeline FFT processor is proposed. A hardware-oriented radix-2² algorithm is derived by integrating a twiddle factor decomposition technique in the divide-and-conquer approach. The radix-2² algorithm has the same multiplicative complexity as the radix-4 algorithm, but retains the butterfly structure of the radix-2 algorithm. The single...
  • Affine-by-statement transformations of imperfectly nested loops

    Publication Year: 1996, Page(s):34 - 38

    A majority of loop restructuring techniques developed so far assume that loops are perfectly nested. The unimodular approach unifies three individual transformations (loop interchange, skewing, and reversal) but is still limited to perfect loop nests. This paper outlines a framework that enables the use of unimodular transformations to restructure imperfect loop nests. The concepts previously used f...
  • Fault-tolerant multiple bus networks for fan-in algorithms

    Publication Year: 1996, Page(s):674 - 681
    Cited by:  Papers (1)

    We consider a large class of algorithms called “fan-in algorithms” that are useful for problems involving semi-group operations. This paper deals with the design of fault-tolerant multiple bus networks (MBNs) suited to run fan-in algorithms. We present two methods for constructing fan-in MBNs with tolerance to bus faults, that have nearly optimal performance and processor fan-out. We a...
  • Performance of asynchronous linear iterations with random delays

    Publication Year: 1996, Page(s):625 - 629
    Cited by:  Papers (1)

    In this paper we investigate the speedup potential of asynchronous iterative algorithms over their synchronous counterparts for the special case of linear iterations. The space of linear iterations of size two is explored by simulation and analytical methods. We find cases and conditions for high asynchronous speedups. However, averaging asynchronous speedups over the whole set of iteration matric...
  • Toward symbolic performance prediction of parallel programs

    Publication Year: 1996, Page(s):474 - 478
    Cited by:  Papers (3)

    Critical analyses in performance estimators for parallel programs require an algorithm that counts the number of integer solutions to a set of inequalities. Most current performance estimators are restricted to linear inequalities for this analysis. In this paper we describe a symbolic algorithm which can estimate the number of integer solutions to a set of both linear and non-linear inequalities. ...
  • Self-timed resynchronization: a post-optimization for static multiprocessor schedules

    Publication Year: 1996, Page(s):199 - 205
    Cited by:  Papers (1)

    In a shared-memory multiprocessor system, certain synchronization operations may be redundant (that is, their corresponding sequencing requirements are enforced completely by other synchronizations in the system) and can be eliminated without compromising correctness. This paper addresses the problem of adding new synchronization operations in a multiprocessor implementation in such...
  • Practical parallel algorithms for dynamic data redistribution, median finding, and selection

    Publication Year: 1996, Page(s):292 - 301
    Cited by:  Papers (9)  |  Patents (3)

    A common statistical problem is that of finding the median element in a set of data. This paper presents a fast and portable parallel algorithm for finding the median given a set of elements distributed across a parallel machine. In fact, our algorithm solves the general selection problem that requires the determination of the element of rank i, for an arbitrarily given integer i. Practical algori...
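
To make the selection problem named in this entry concrete, here is a minimal sequential sketch in Python using randomized quickselect. This is a swapped-in textbook technique, not the paper's parallel algorithm, and it does not model data distributed across processors. The names `select` and `median`, the 1-based rank convention, and the choice of the lower median for even-sized inputs are assumptions of this sketch.

```python
import random


def select(items, i):
    """Return the element of rank i (1 = smallest) from a non-empty list."""
    if not 1 <= i <= len(items):
        raise ValueError("rank out of range")
    pool = list(items)
    while True:
        pivot = random.choice(pool)
        smaller = [x for x in pool if x < pivot]
        equal = [x for x in pool if x == pivot]
        if i <= len(smaller):
            # The target lies among the elements below the pivot.
            pool = smaller
        elif i <= len(smaller) + len(equal):
            # The target is the pivot itself (or a duplicate of it).
            return pivot
        else:
            # Discard the lower part and adjust the rank accordingly.
            i -= len(smaller) + len(equal)
            pool = [x for x in pool if x > pivot]


def median(items):
    """Lower median: the element of rank ceil(n / 2)."""
    return select(items, (len(items) + 1) // 2)
```

Median finding is then just the special case of selection with i set to the middle rank, which is the relationship the abstract points out.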
  • Random seeking: a general, efficient, and informed randomized scheme for dynamic load balancing

    Publication Year: 1996, Page(s):881 - 885
    Cited by:  Papers (3)

    Proposes a completely general, informed, randomized, dynamic load-balancing method called random seeking (RS), which is suitable for parallel algorithms with characteristics found in many of the search algorithms used in artificial intelligence and operations research and in many divide-and-conquer algorithms. In this method, source processors randomly seek out sink processors for load balancing b...
  • Parallel algorithms for image enhancement and segmentation by region growing with an experimental study

    Publication Year: 1996, Page(s):414 - 423
    Cited by:  Papers (4)  |  Patents (1)

    Presents efficient and portable implementations of a useful image enhancement process, the symmetric neighborhood filter (SNF), and an image segmentation technique which makes use of the SNF and a variant of the conventional connected components algorithm which we call δ-connected components. We use efficient techniques for distributing and coalescing data as well as efficient combinations o...