High Performance Computing on the Information Superhighway, 1997. HPC Asia '97

April 28 - May 2, 1997

  • Proceedings High Performance Computing on the Information Superhighway. HPC Asia '97

    Publication Year: 1997
  • Index of authors

    Publication Year: 1997, Page(s):757 - 760
  • Implementing LOGFLOW on a workstation cluster

    Publication Year: 1997, Page(s):145 - 150

    LOGFLOW is a distributed Prolog implementation running on transputer networks, developed at KFKI-MSZKI. To improve the capabilities and the power of LOGFLOW the system is ported onto workstation-clusters under the name WS-LOGFLOW. The new platform requires modification in the architecture of the system, in token transportation and in work distribution. This paper presents the modified architecture...
  • A comparison of data-parallel collective communication performance and its application

    Publication Year: 1997, Page(s):137 - 144

    Collective communications such as broadcast and reduction are commonly used in data parallel programs. It is important to understand the performance of such primitive communications to characterize parallel systems and analyze the performance of parallel applications running on specific parallel systems. We measured the performance of collective communication operations on several multiprocessor s...
  • Linear least squares solutions by Householder transformations with column pivoting on a parallel machine

    Publication Year: 1997, Page(s):134 - 136

    Using Householder transformations to solve linear least squares problems is very efficient. We have found a computational form especially suitable for BBN Butterfly, a parallel computer. We modified the algorithm for parallel processing using various constructs from the BBN Mach 1000 system. We were able to obtain a speed-up of 3 with 4 processors, and 3.83 with 6 processors for M=1000, and N=200,...
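The speed-up figures quoted in the abstract above can be read as parallel efficiency, that is, speed-up divided by processor count. The following is a minimal sketch of that arithmetic only; the function name `parallel_efficiency` is illustrative and does not come from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speed-up divided by processor count (1.0 would be ideal scaling)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Speed-ups as reported in the abstract: 3 on 4 processors, 3.83 on 6.
reported = {4: 3.0, 6: 3.83}
efficiencies = {p: parallel_efficiency(s, p) for p, s in reported.items()}

for p, e in efficiencies.items():
    print(f"{p} processors: efficiency {e:.3f}")
```

On the reported figures, efficiency falls from 0.75 on 4 processors to roughly 0.64 on 6.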
  • Parallel implementation of the unit commitment problem on NOWs

    Publication Year: 1997, Page(s):128 - 133
    Cited by:  Papers (1)

    This paper proposes the application of parallelization techniques for solving the unit commitment problem (UCP) on NOWs (networks of workstations). A modified parallel dynamic programming method that takes advantage of NOWs is presented. Our algorithm performs better than the general parallel dynamic programming algorithm for UCP. In order to demonstrate the usefulness of our technique, we apply o...
  • The impact of GMS (Geostationary Meteorological Satellite) data in the GDAPS (Global Data Assimilation and Prediction System)

    Publication Year: 1997, Page(s):35 - 40

    Due to the extensive calculations, running a global model requires a high performance computer. The global model and analysis system, called GDAPS, has been set up in a Cray C90 environment. The five-day global forecast was made available twice a day. In order to improve the forecast performance, the assimilation with satellite cloud information is investigated. Since moisture data over the ocean ...
  • Parallel double sort-merge algorithm for object-oriented collection join queries

    Publication Year: 1997, Page(s):122 - 127
    Cited by:  Papers (2)  |  Patents (1)

    In object-oriented databases (OODBs), although path expressions through pointer connections may exist, it is sometimes necessary to perform an explicit join operation between two classes. Since a class may contain collection attributes as well as simple attributes, join queries in OODBs may be based on collections. A need for collection join algorithms arises, since the conventional join algorithm...
  • Hybrid full map directory scheme for distributed shared memory multiprocessors

    Publication Year: 1997, Page(s):30 - 34

    In this paper, a novel full-map directory cache coherence scheme called the “hybrid full-map directory scheme” is proposed. It reduces the growth rate of the directory storage overhead from O(N²) down to O(N√N). Moreover, the performance of the scheme is close to that of previous full-map directory schemes.
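To give a feel for the two growth rates named in the abstract above, the sketch below compares N² against N√N for a few processor counts. It models only the asymptotic terms: constants, block counts, and the actual directory layout are not given in the abstract, and the function names are illustrative.

```python
import math


def full_map_growth(n: int) -> float:
    """Asymptotic term for a conventional full-map directory: N^2."""
    return float(n * n)


def hybrid_growth(n: int) -> float:
    """Asymptotic term quoted for the hybrid scheme: N * sqrt(N)."""
    return n * math.sqrt(n)


def reduction_factor(n: int) -> float:
    """Ratio of the two terms, which simplifies to sqrt(N)."""
    return full_map_growth(n) / hybrid_growth(n)


for n in (16, 64, 256, 1024):
    print(n, full_map_growth(n), hybrid_growth(n), reduction_factor(n))
```

Under this simplified model the saving grows as √N, so it is 4x at 16 processors and 32x at 1024.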
  • Parallel generation of k-ary trees

    Publication Year: 1997, Page(s):117 - 121

    The only published algorithms for generating k-ary trees in parallel are those of Akl and Stojmenović (1996) and Vajnovszki and Phillips (1997). In the first of these two papers, trees are represented by an inversion table and the processor model is a linear array multicomputer. In the second paper, trees are represented by bit-strings and the algorithm executes on a shared-memory multiproc...
  • Performance impacts of caching I-structure data on frame-based multithreaded processing

    Publication Year: 1997, Page(s):24 - 29

    Since long latency due to remote memory access or interprocessor communication could be tolerated in multithreaded processing, caching I-structure memory is expected to have less beneficial effect on the performance than caching ordinary data. The authors suggest an organization and an operation scheme of an I-structure cache in frame-based multithreading, and show quantitatively that caching I-st...
  • Embeddings into the pancake interconnection network

    Publication Year: 1997, Page(s):73 - 78

    The pancake is one of the Cayley graphs that were proposed as alternatives to the hypercube for interconnecting processors in parallel computers. Some good properties of this interconnection network include: vertex symmetry, small degree and diameter, extendability, and high connectivity (robustness). We present constant dilation embeddings of rings, grids, and hypercubes into the pancake. We also...
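As background for the abstract above, the sketch below builds the pancake graph under its standard definition: nodes are permutations of n symbols, and two nodes are adjacent when one is obtained from the other by reversing a prefix of length 2 to n. It illustrates the degree and diameter properties only and does not reproduce the paper's embeddings; the function names are illustrative.

```python
from collections import deque


def pancake_neighbors(perm: tuple) -> list:
    """Return the n-1 neighbors of a node: one prefix reversal per length 2..n."""
    n = len(perm)
    return [perm[:k][::-1] + perm[k:] for k in range(2, n + 1)]


def pancake_diameter(n: int) -> int:
    """Diameter of the n-pancake by breadth-first search from the identity.

    Because the graph is vertex symmetric, the eccentricity of any single
    node equals the diameter, so one BFS is enough.
    """
    start = tuple(range(n))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in pancake_neighbors(node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return max(dist.values())


for n in range(2, 7):
    print(n, "degree", n - 1, "diameter", pancake_diameter(n))
```

The search visits all n! nodes, so this brute-force approach is only practical for small n.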
  • Dynamic load balancing technique for wide area video server

    Publication Year: 1997, Page(s):109 - 116

    Proposes a dynamic load-balancing technique for a video-on-demand (VOD) server system which is responsible for transmitting video stream data to a client's set-top box (STB), and shows the result from a VOD system designed to incorporate our proposed load-balancing scheme. Our load-balancing technique is an application of the extended gradient model, in which we modify and update the control param...
  • A hierarchical memory directory scheme via extending SCI for large-scale multiprocessors

    Publication Year: 1997, Page(s):18 - 23
    Cited by:  Patents (3)

    SCI (scalable coherent interface) is a pointer-based coherent directory scheme for large-scale multiprocessors. Large message latency is one of the problems with SCI because of its linked list structure: the searching latency can grow as a linear order of the number of processors. The authors focus on a hierarchical architecture to propose a new scheme, EST (extending SCI-tree), which may reduce th...
  • Broadcasting on incomplete star graph interconnection networks

    Publication Year: 1997, Page(s):67 - 72

    We propose two one-to-all optimal broadcasting algorithms in incomplete star graphs. An incomplete star graph with N nodes, where (n-1)!<N<n!, is a subgraph of an n-star. Using a routing scheme to transmit a message to each substar composed of the incomplete star, our proposed broadcasting algorithm is optimal in O(n log n) on the single port communication model. While broadcasting m message...
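The size condition in the abstract above, (n-1)! < N < n!, fixes which n-star an incomplete star graph with N nodes sits inside. The sketch below computes that n for a given N. It covers only this counting step, not the broadcasting algorithms, and the function name is illustrative.

```python
from math import factorial


def enclosing_star_dimension(num_nodes: int):
    """Return n with (n-1)! < num_nodes < n!, or None when num_nodes is
    itself a factorial (a complete star, which the strict inequality excludes)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    n = 1
    # Find the smallest n whose factorial is not below num_nodes.
    while factorial(n) < num_nodes:
        n += 1
    if factorial(n) == num_nodes:
        return None
    return n


for size in (7, 24, 100, 500):
    print(size, enclosing_star_dimension(size))
```

For example, 100 nodes lie strictly between 4! = 24 and 5! = 120, so the enclosing graph is the 5-star.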
  • Parallel simulation of fluid flow inside the rotating cylindrical container on the Intel Paragon

    Publication Year: 1997, Page(s):104 - 108

    The parallelization of Fourier spectral and Chebyshev collocation methods to the cylindrical form of the Navier-Stokes equations with one periodic and two nonperiodic directions has been implemented. The matrix diagonalization technique, which has been applied to solve the elliptic equations which occur in the implicit steps of the three-step time-splitting algorithm, enables one to parallelize on...
  • Cache performance and algorithm optimization

    Publication Year: 1997, Page(s):12 - 17

    A technique to enhance the cache performance of some blocked algorithms is proposed. According to the results of number theory, the author presents a principle for array padding so that accesses of array subblocks do not generate conflict misses. The technique is used to calculate LU factorization and matrix multiplication. The principle is tested on a shared memory multiprocessor. The practical r...
  • Shifted Recursive Torus interconnection for high performance computing

    Publication Year: 1997, Page(s):61 - 66
    Cited by:  Papers (3)

    We propose the Shifted Recursive Torus (SRT) interconnection network for high performance computing. By adding multi-level links to the torus network recursively, the SRT can achieve excellent interconnection features such as a smaller diameter, a limited number of links per node, an easy implementation in VLSI, and an expandable hierarchical structure. The paper considers the problem of achieving...
  • Numerical Wind Tunnel (NWT) and CFD research at National Aerospace Laboratory

    Publication Year: 1997, Page(s):99 - 103
    Cited by:  Papers (1)

    The National Aerospace Laboratory (NAL) of Japan is the only national research institute for aerospace engineering and services in Japan. NAL is a pioneer of supercomputer development and utilization in Japan. Since 1993, NAL has developed the Numerical Wind Tunnel (NWT), consisting of 166 parallel-connected Fujitsu vector processors and has been providing substantial computational power resources...
  • An effective full-map directory scheme for the sectored caches

    Publication Year: 1997, Page(s):7 - 11

    In multiprocessor systems, cache misses due to coherence transactions make up a large share of the total cache misses. However, this type of cache miss is strongly dependent on the type of data sharing among processors, especially false sharing. Until now, a small cache block size has been used in multiprocessor systems mainly to avoid false sharing, but the smaller the cache block size, the lower the ...
  • Banyan: a language for scalable parallel programming on loosely coupled distributed systems

    Publication Year: 1997, Page(s):535 - 540

    Parallel programming on loosely coupled distributed systems is becoming a viable approach with the rapid increase in network speeds and availability of large amounts of unused CPU capacity on individual workstations. Parallel programs are often written for a specific configuration of the distributed system such as the number of nodes, their relative speeds and their network connections. These prog...
  • Shape optimization of polymer extrusion die by three-dimensional flow simulation

    Publication Year: 1997, Page(s):601 - 604

    The shape optimization of a polymer extrusion die is achieved in this study. Because of the complex geometry of the extrusion die, the die flow must be analyzed three-dimensionally to obtain a practical flow field. The die flow is simulated using a three-dimensional finite element method. The die geometry is modeled in a new way to get appropriate design variables. For the optimization of die geome...
  • An efficient polling protocol for the management information gathering over TCP/IP

    Publication Year: 1997, Page(s):466 - 471
    Cited by:  Patents (1)

    The polling protocol that gathers network management information is an important feature of network performance. This paper shows the design of a polling protocol that can gather management information from agents in an efficient way. Existing polling protocols (current polling with the same polling intervals, or separated polling with different intervals) have some defects in that the probability of...
  • Performance analysis of STC104 interconnection networks

    Publication Year: 1997, Page(s):56 - 60

    The fast routing chip Inmos STC104 has been designed and is now available on the market. Many research results concerning the performance, design cost and scalability of the packet switch have been presented and are currently under evaluation. There is a great demand for an efficient and reliable router in high performance parallel processing systems or data management networks. The performance a...
  • Simulation of aerodynamics problem on a distributed shared-memory machine

    Publication Year: 1997, Page(s):93 - 98

    A complex compressible Navier-Stokes equation using a time-accurate implicit difference scheme is parallelized using domain decomposition (DD) and loop parallelization methods. The numerical scheme used is a lower-upper alternating direction implicit (LU-ADI) factorization method with a Baldwin-Lomax (1978) turbulence model. This paper gives a preliminary result on the performance of the code on a...