26-29 April 1994
-
Proceedings of 8th International Parallel Processing Symposium
Publication Year: 1994
-
Parallelization of linearized applications in Fortran D
Publication Year: 1994, Page(s):51 - 60
Cited by: Papers (1)
Fortran D extends Fortran to parallel computers via specification of the distribution of array variables across processors. When multidimensional arrays have been linearized for optimal performance on vector processors, Fortran D cannot produce the best parallelization because it is limited to one-dimensional distribution, which is less efficient due to surface-to-volume effects. We propose Fortra...
-
Latency hiding in message-passing architectures
Publication Year: 1994, Page(s):704 - 709
Cited by: Papers (22) | Patents (1)
The paper demonstrates the advantages of having two processors in the node of a distributed memory architecture, one for computation and one for communication. The architecture of such a dual-processor node is discussed. To exploit fully the potential for parallel execution of computation threads and communication threads, a novel, compiler-optimized IPC mechanism allows for an unbuffered no-wait ...
-
Hybrid resource management algorithms for multicomputer systems
Publication Year: 1994, Page(s):482 - 489
Cited by: Papers (1)
Addresses the issue of resource management in parallel systems. Two new hybrid algorithms for general resource management in distributed memory computers are presented. T-hybrid is a decoupled algorithm that combines a static template allocation scheme with a low-cost local demand-driven dynamic algorithm, while C-hybrid is a coupled algorithm that combines a simple static allocation scheme with th...
-
A clustered reduced communication element by element preconditioned conjugate gradient algorithm for finite element computations
Publication Year: 1994, Page(s):509 - 516
Cited by: Papers (1)
The clustered element by element preconditioned conjugate gradient (EBE-PCG) method can be effectively used to solve problems with symmetric positive definite matrices such as those arising in ANTARES-3D, a metal forming finite element (FE) simulation package. Efficient parallelization of this application on distributed memory multiple instruction multiple data (MIMD) parallel computers requires au...
-
On the parallel implementation of OSI protocol processing systems
Publication Year: 1994, Page(s):815 - 819
In a heterogeneous computing environment, computers have to use a suitable transfer syntax to communicate with each other because of the differences in internal data representations. Transfer syntax conversions take over 90% of the total processing power needed in OSI protocol processing. Application specific architectures in a heterogeneous system may not be efficient in performing the protocol p...
-
Fault detection and recovery in a data-driven real-time multiprocessor
Publication Year: 1994, Page(s):769 - 774
Cited by: Papers (1) | Patents (1)
Introduces the mechanisms required to perform fault detection and recovery in the DART (Data-driven Architecture for Real-Time) multiprocessor architecture. The DART multiprocessor uses prioritized data-driven scheduling to ensure that multiple hard and soft deadlines are met. A data-driven checkpointing scheme has been developed that ensures that these deadlines are met even in the case of proces...
-
Integrating functional and imperative parallel programming: CC++ solutions to the Salishan problems
Publication Year: 1994, Page(s):61 - 67
We investigate the practical integration of functional and imperative parallel programming in the context of a popular sequential object-based language. As the basis of our investigation, we develop solutions to the Salishan problems, a set of problems intended as a standard by which to compare parallel programming notations. The language that we use is CC++, C++ extended with single-assignment va...
-
Efficient embedding K-ary complete trees into hypercubes
Publication Year: 1994, Page(s):710 - 714
Dilated embedding and precise embedding of K-ary complete trees into hypercubes are studied. For dilated embedding, a nearly optimal algorithm is proposed which embeds a K-ary complete tree of height h, T_K(h), into an (h-1)[log K]+[log(K+2)] dimensional hypercube with dilation max(2, φ(K), φ(K+2)). For precise embedding, we show a (K-1)h+1 dimensional hypercube is large enough ...
-
GRAPE-4: a teraFLOPS massively parallel special-purpose computer system for astrophysical N-body simulations
Publication Year: 1994, Page(s):280 - 287
Cited by: Papers (3) | Patents (3)
We are developing a massively parallel special-purpose computer system for astrophysical N-body simulations, GRAPE-4 (Gravity-Pipe 4). The GRAPE-4 system is designed to simulate the dynamics of classical particles which interact with each other gravitationally by using predictor-corrector methods. We have developed two application-specific LSIs, the HARP (Hermite Accelerator Pipe) chip and the PROMETHEUS c...
-
HyperC: portable parallel programming in C
Publication Year: 1994, Page(s):682 - 687
Cited by: Papers (1)
We introduce the HyperC language, a data parallel extension of C intended for portability over a wide range of architectures. We present the main topics of the language: the explicit parallelism through the data, the synchronous semantics and the parallel flow control that allows asynchronous execution, new function qualifiers to emphasize locality properties of code and, finally, new communication t...
-
SIMD algorithms for matrix multiplication on the hypercube
Publication Year: 1994, Page(s):492 - 496
Cited by: Papers (2)
Presents a new algorithm for n×n matrix multiplication on a hypercube of p processors, which outperforms, in terms of time complexity, the best algorithms known in the literature, due to Dekel, Nassimi and Sahni (1981). These authors presented algorithms of O[n^λ/p^(λ-1/2)], with 2⩽λ<3 and 1⩽p⩽n^2, and O[log(p/n^2)+n...
-
Automatic array alignment as a step in hierarchical program transformation
Publication Year: 1994, Page(s):578 - 582
Cited by: Papers (2) | Patents (11)
Presents an original approach to automatic array alignment, the step in the hierarchical transformation system aimed at the efficient execution of shared memory programs on distributed memory machines. The array alignment algorithm deals with a broad set of intra-dimension and inter-dimension alignment preferences, including offsets, strides, permutations, embeddings, and their combinations. The a...
-
An evaluation of multiprocessor cache coherence based on virtual memory support
Publication Year: 1994, Page(s):158 - 164
Cited by: Papers (4)
This paper presents an evaluation of the impact of several architectural parameters on the performance of virtual memory (VM) based cache coherence schemes for shared-memory multiprocessors. The VM-based cache coherence schemes use the traditional VM translation hardware on each processor to detect memory access attempts that might leave caches incoherent, and maintain coherence through VM-level s...
-
Massively parallel algorithms for solution of the Schrodinger equation
Publication Year: 1994, Page(s):517 - 523
Time-parallel algorithms for solution of the Schrodinger equation are developed. By using the Crank-Nicolson method, it is shown that the solution of the problem can be fully parallelized in time, leading to a massive temporal parallelism in the computation with a minimum of communication and synchronization requirements. Our results clearly indicate that the Crank-Nicolson method, in addition to ...
-
Fuzzy communication for guided loop scheduling in multicomputers
Publication Year: 1994, Page(s):439 - 443
We propose the use of guided loop scheduling and fuzzy communications to map shared-variable communications into message passing operations among multicomputers. The mapping mechanism converts scalar message passing operations into multiple broadcast or multiple multicast operations. The proposed method is evaluated by both simulation experiments and theoretical analysis. The performance results, ...
-
A parallel parsing algorithm for natural language using tree adjoining grammar
Publication Year: 1994, Page(s):820 - 828
Cited by: Papers (3)
Tree Adjoining Grammar (TAG) is a powerful grammatical formalism for large-scale natural language processing. However, the computational complexity of parsing algorithms for TAG is high. We introduce a new parallel TAG parsing algorithm for MIMD hypercube multicomputers, using large-granularity grammar partitioning, asynchronous communication, and distributed termination detection. We describe our...
-
PARAM parallel supercomputer: architecture, programming environment, and applications
Publication Year: 1994, Page(s):388 - 389
Cited by: Papers (2)
Recognising parallel processing as a leap-frog path for supercomputing as well as the destiny of future generation supercomputers, C-DAC was launched by the Government of India as a national initiative with a first 3-year mission of designing, and bringing into commercial a state-of-the-art parallel supercomputer with peak performance exceeding 1 GFLOPS, proportionate primary and secondary storage...
-
Parallel bidirectional heuristic search with dynamic process re-direction
Publication Year: 1994, Page(s):242 - 247
Cited by: Papers (1)
Non-wave-shaping parallel bidirectional heuristic search algorithms have been reported to suffer from the bidirectional search anomaly. Although wave-shaping is considered the most natural approach, parallel bidirectional wave-shaping algorithms are extremely scarce. We introduce a wave-shaping algorithm for parallel bidirectional heuristic search in distributed memory environments. The method is...
-
Experience with executing shared memory programs using fine-grain communication and multithreading in EM-4
Publication Year: 1994, Page(s):630 - 636
Cited by: Papers (1)
We present our experience and results obtained from executing shared memory application programs using fine-grain remote memory access communication and multithreading in the EM-4 multiprocessor. The EM-4 is a distributed memory multiprocessor which has a dataflow mechanism. The dataflow mechanism enables a fine-grain communication packet through the network to invoke the thread of control dynamic...
-
Fault-tolerant scheduling on a hard real-time multiprocessor system
Publication Year: 1994, Page(s):775 - 782
Cited by: Papers (31)
Fault-tolerance is an important issue in hard real-time systems due to the critical nature of the supported tasks. One way of providing fault-tolerance is to schedule multiple copies of a task on different processors. If the primary copy of a task cannot be completed due to a fault, the scheduled backup copy is run and the task is completed. In this paper, we propose a new algorithm for fault-tole...
-
Barrier synchronization techniques for distributed process creation
Publication Year: 1994, Page(s):597 - 603
Cited by: Papers (3) | Patents (1)
Synchronization techniques are proposed for algorithms which spawn processes remotely on loosely coupled processors based on run-time characteristics. The performance of the proposed synchronization schemes is measured on the iPSC/2 and SNAP-1 multiprocessors and their implementation cost is discussed. Results show that processes created dynamically throughout a distributed system can be synchron...
-
Accommodating polymorphic data decompositions in explicitly parallel programs
Publication Year: 1994, Page(s):68 - 74
Cited by: Papers (1)
Explicitly parallel programs have the potential for greater performance than their implicitly parallel counterparts. However, this benefit can be accompanied by additional programming difficulties. We address one particular problem that has implications for both scalability and portability: the need for programs to accommodate diverse data decompositions. We explain why programs with explicit comm...
-
A new combinatorial approach to optimal embeddings of rectangles
Publication Year: 1994, Page(s):715 - 722
An important problem in graph embeddings and parallel computing is to embed a rectangular grid into other graphs. We present a novel, general, combinatorial approach to (one-to-one) embedding rectangular grids into their ideal rectangular grids and optimal hypercubes. In contrast to earlier approaches of Aleliunas and Rosenberg, and Ellis (1982), our approach is based on a special kind of doubly s...
-
Building multithreaded architectures with off-the-shelf microprocessors
Publication Year: 1994, Page(s):288 - 294
Cited by: Papers (16) | Patents (3)
Present-day parallel computers often face the problems of large software overheads for process switching and inter-processor communication. These problems are addressed by the Multi-Threaded Architecture (MTA), a multiprocessor model designed for efficient parallel execution of both numerical and non-numerical programs. We begin with a conventional processor, and add the minimal external hardware ...