Proceedings Scalable High Performance Computing Conference SHPCC-92.

26-29 April 1992

  • Proceedings. Scalable High Performance Computing Conference SHPCC-92 (Cat. No.92TH0432-5)

    Publication Year: 1992
    Freely Available from IEEE
  • Incremental mapping for solution-adaptive multigrid hierarchies

    Publication Year: 1992, Page(s):401 - 408

    The full multigrid method uses a hierarchy of successively finer grids. In a solution-adaptive grid hierarchy each grid is obtained by adaptive refinement of the grid on the previous level. On a distributed memory multiprocessor, each grid level must be partitioned and mapped so as to minimize the multigrid cycle execution time. In this report, several grid partitioning and load (re)mapping strate...
  • Adaptive methods and rectangular partitioning problem

    Publication Year: 1992, Page(s):409 - 415

    Partitioning problems for rectangular domains having nonuniform workload for mesh-connected SIMD architectures are discussed. The considered rectangular workloads result from application of adaptive methods to the solution of hyperbolic differential equations on SIMD machines. A new form of the partitioning problem is defined in which sub-meshes of processors are assigned to tasks, each task being...
  • Portable parallel Level-3 BLAS in Linda

    Publication Year: 1992, Page(s):416 - 423
    Cited by:  Papers (1)

    Describes an approach towards providing an efficient Level-3 BLAS library over a variety of parallel architectures using C-Linda. A blocked linear algebra program calling the sequential Level-3 BLAS can now run on both shared and distributed memory environments (which support Linda) by simply replacing each call by a call to the corresponding parallel Linda Level-3 BLAS. The authors summarise some...
  • A parallel scalable approach to short-range molecular dynamics on the CM-5

    Publication Year: 1992, Page(s):240 - 245
    Cited by:  Papers (2)

    Presents a scalable algorithm for short-range molecular dynamics which minimizes interprocessor communications at the expense of a modest computational redundancy. The method combines Verlet neighbor lists with coarse-grained cells. Each processing node is associated with a cubic volume of space and the particles it owns are those initially contained in the volume. Data structures for 'own' and 'v...
  • Complete exchange on a circuit switched mesh

    Publication Year: 1992, Page(s):300 - 306
    Cited by:  Papers (31)  |  Patents (3)

    The complete exchange ('all-to-all personalized') communication pattern is at the heart of numerous important multicomputer algorithms. Recent research has shown how this pattern can efficiently be performed on circuit-switched hypercubes. However, on circuit-switched meshes, this pattern is difficult to perform efficiently because the sparsity of the mesh interconnect leads to severe link content...
  • Debugging mapped parallel programs

    Publication Year: 1992, Page(s):200 - 203

    As more sophisticated tools for parallel programming become available, programmers will inevitably want to use them together. However, some parallel programming tools can interact with each other in ways that make them less useful. In particular, if a mapping tool is used to adapt a parallel program to run on relatively few processors, the information presented by a debugger may become difficult t...
  • HeNCE: graphical development tools for network-based concurrent computing

    Publication Year: 1992, Page(s):129 - 136
    Cited by:  Papers (9)

    HeNCE (heterogeneous network computing environment) is an X Window based graphical parallel programming environment that was created to assist scientists and engineers with the development of parallel programs. HeNCE provides a graphical interface for creating, compiling, executing, and debugging parallel programs, as well as configuring a distributed virtual computer (using PVM). HeNCE programs c...
  • A methodology for visualizing performance of loosely synchronous programs

    Publication Year: 1992, Page(s):424 - 432
    Cited by:  Papers (2)

    Introduces a new set of views for displaying the progress of loosely synchronous computations involving large numbers of processors on large problems. The authors suggest a methodology for employing these views in succession in order to gain progressively more detail concerning program behavior. At each step, focus is refined to include just those program sections or processors which have been det...
  • Scalability of data transport

    Publication Year: 1992, Page(s):1 - 8

    Peak floating point rate is a very limited way to characterize high performance computer systems. A better method is to use the bandwidth and latency of data transport for the major components of a system. Bandwidth scales well with increasing system size, but latency does not. The demands placed by a program on data transport determine how well an architecture will execute it. The article discuss...
  • A runtime data mapping scheme for irregular problems

    Publication Year: 1992, Page(s):216 - 219
    Cited by:  Papers (3)

    In scalable multiprocessor systems, high performance demands that computational load be balanced evenly among processors and that interprocessor communication be limited as much as possible. In this paper, the authors study the problem of automatically choosing data distributions for irregular problems. Irregular problems are programs where the data access pattern cannot be determined during compi...
  • A parallel implementation of the chemically reacting CFD code, SPARK

    Publication Year: 1992, Page(s):342 - 349
    Cited by:  Papers (1)

    Describes a parallel version of the two-dimensional, chemically reacting CFD code, SPARK. The sequential code has been ported to run on the Intel iPSC/860-based parallel computers. Routines have been added to the code which partition the problem based on the global mesh, and then assign the resulting subdomains across the processors. Two subdomain mappings have been considered. The routines which ...
  • Scalable parallel molecular dynamics on MIMD supercomputers

    Publication Year: 1992, Page(s):246 - 251
    Cited by:  Papers (2)

    Presents two parallel algorithms suitable for molecular dynamics simulations over a wide range of sizes, from a few hundred to millions of atoms. One of the algorithms is optimally scalable, offering performance proportional to N/P where N is the number of atoms (or molecules) and P is the number of processors. Their implementation on three MIMD parallel computers (nCUBE2, Intel Gamma, and Intel D...
  • Towards a distributed memory implementation of Sisal

    Publication Year: 1992, Page(s):385 - 392
    Cited by:  Papers (3)

    Sisal is a functional language for scientific applications implemented efficiently on shared memory, vector, and hierarchical memory multiprocessors. The current compiler assumes a flat, shared addressing space, and the runtime system is implemented using locks and shared queues. This paper describes a first implementation of Sisal on the nCUBE 2 distributed memory architecture. Most of the effort...
  • A matrix product algorithm and its comparative performance on hypercubes

    Publication Year: 1992, Page(s):190 - 194
    Cited by:  Papers (7)

    A matrix product algorithm is studied in which one matrix operand is transposed prior to the computation. This algorithm is compared with the Fox-Hey-Otto algorithm on hypercube architectures. The Transpose algorithm simplifies communication for nonsquare matrices and for computations where the number of processors is not a perfect square. The results indicate superior performance for the Transpos...
  • Applications of FORALL-formed computations in large scale stochastic dynamic programming

    Publication Year: 1992, Page(s):182 - 185
    Cited by:  Papers (2)

    Data parallel broadcasting methods have been developed by taking advantage of the properties of stochastic, nonlinear, continuous-time dynamical systems. The stochastic components include both Gaussian and Poisson random white noise. An example of a grand challenge level application is the resource management problem. The purpose of this paper is to demonstrate that broadcasting can be effici...
  • Load information distribution via active interconnection networks

    Publication Year: 1992, Page(s):174 - 177

    Existing multicomputers typically use passive, dedicated network interfaces. By comparison, an active interconnect network can manipulate the data in messages transiting through a node; these might use existing systolic processors as the network interface. Active interconnects will become increasingly common in distributed memory multicomputers because they can be used to implement a variety of r...
  • An object oriented approach to boundary conditions in finite difference fluid dynamics codes

    Publication Year: 1992, Page(s):145 - 148
    Cited by:  Patents (2)

    Parallel computers have been used to solve computational fluid dynamics (CFD) problems for many years; however, while the hardware has greatly improved, the software methods for describing CFD algorithms have remained largely unchanged. From the physics and software engineering points of view, the boundary conditions consume most of the algorithmic development and programming time, but only a smal...
  • Intercube communication for the iPSC/860

    Publication Year: 1992, Page(s):307 - 313
    Cited by:  Papers (5)

    In this paper, new functions that enable efficient intercube communication on the Intel iPSC/860 are introduced. Communication between multiple cubes (power-of-two number of processor nodes) within the Intel iPSC/860 is a desirable feature to facilitate the implementation of interdisciplinary problems such as the grand challenge problems of the High Performance Computing and Communications Project...
  • Using atomic data structures for parallel simulation

    Publication Year: 1992, Page(s):30 - 37
    Cited by:  Patents (2)

    Synchronizing access to shared data structures is a difficult problem for simulation programs. Frequently, synchronizing operations within and between simulation steps substantially curtails parallelism. The paper presents a general technique for performing this synchronization while sustaining parallelism. The technique combines fine-grained, exclusive locks with futures, a write-once data struct...
  • Applications of a parallel pressure-correction algorithm to 3D turbomachinery flows

    Publication Year: 1992, Page(s):153 - 156

    A parallel algorithm for the solution of three-dimensional compressible flows in turbomachinery has been developed and demonstrated on a scalable distributed memory multicomputer. The algorithm solves the compressible form of the Euler or Navier-Stokes equations via a compressible pressure correction formulation. To achieve high accuracy for highly turning blade rows, the computational grid is con...
  • Toward a scalable concurrent architecture for real-time processing of stochastic control and optimization problems

    Publication Year: 1992, Page(s):46 - 50

    Reports on the development of a scalable multiple-instruction multiple-data (MIMD) concurrent architecture which is intended to serve as an effective alternative for solving stochastic differential and optimization systems. This architecture has in turn motivated the application of group theory and invariance analysis to acquire further insights in understanding the original problem. The speed-up ...
  • Parameterized memory/processor optimizing FORTRAN compiler for parallel computers

    Publication Year: 1992, Page(s):204 - 207
    Cited by:  Patents (2)

    A new approach to generating low-conflict parallel instructions for complex applications is introduced in this paper. This method is presented within the context of a FORTRAN compiler. An approximate simulator has been incorporated within a parallel-code/domain-decomposition loop within the compiler. The simulator estimates the performance of candidate instruction segments, and guides the selectio...
  • Phase modeling of a parallel scientific code

    Publication Year: 1992, Page(s):322 - 327
    Cited by:  Papers (1)

    Describes a performance model for a parallel program that solves the nonlinear shallow water equations using the spectral transform method. The model is generated via a phase analysis, and consists of a sequence of simple models whose sum describes the performance of the entire code. This use of a sequence of simple models increases the range of validity of the model as the problem and machine par...
  • Selective monitoring using performance metric predicates

    Publication Year: 1992, Page(s):162 - 165
    Cited by:  Papers (4)  |  Patents (1)

    The field of parallel processing is going through an important evolution in technology characterized by a significant increase in the number of processors within such systems. As the number of processors increases, the conventional techniques for monitoring the performance of parallel systems will produce large amounts of data in the form of event trace files. The authors propose one possible solu...