High Performance Computing on the Information Superhighway, 1997. HPC Asia '97

28 April-2 May 1997

  • Proceedings High Performance Computing on the Information Superhighway. HPC Asia '97

    Publication Year: 1997
    Freely Available from IEEE
  • Relaxing cache coherence protocol with QOLB synchronizations

    Publication Year: 1997, Page(s):1 - 6

    Cache memories are widely accepted in shared-memory multiprocessor systems because they make possible the reduction of network traffic and memory latencies. However, they impose substantial overheads of cache coherency maintenance and also engender some inefficiencies of coherence misses. The paper considers that constraints of cache coherency can be relaxed in a region where exclusive accesses ar...
  • An effective full-map directory scheme for the sectored caches

    Publication Year: 1997, Page(s):7 - 11

    In multiprocessor systems, cache misses due to coherence transactions make up a large share of the total cache misses. However, this type of cache miss is strongly dependent on the type of data sharing among processors, especially false sharing. Until now, a small cache block size has been used mainly to avoid false sharing in multiprocessor systems, but the smaller the cache block size, the lower the ...
  • Cache performance and algorithm optimization

    Publication Year: 1997, Page(s):12 - 17

    A technique to enhance the cache performance of some blocked algorithms is proposed. According to the results of number theory, the author presents a principle for array padding so that accesses of array subblocks do not generate conflict misses. The technique is used to calculate LU factorization and matrix multiplication. The principle is tested on a shared memory multiprocessor. The practical r...
  • A hierarchical memory directory scheme via extending SCI for large-scale multiprocessors

    Publication Year: 1997, Page(s):18 - 23
    Cited by:  Patents (3)

    SCI (scalable coherent interface) is a pointer-based coherent directory scheme for large-scale multiprocessors. Large message latency is one of the problems with SCI because of its linked list structure: the searching latency can grow as a linear order of the number of processors. The authors focus on a hierarchical architecture to propose a new scheme-EST (extending SCI-tree), which may reduce th...
  • Performance impacts of caching I-structure data on frame-based multithreaded processing

    Publication Year: 1997, Page(s):24 - 29

    Since long latency due to remote memory access or interprocessor communication could be tolerated in multithreaded processing, caching I-structure memory is expected to have less beneficial effect on the performance than caching ordinary data. The authors suggest an organization and an operation scheme of an I-structure cache in frame-based multithreading, and show quantitatively that caching I-st...
  • Hybrid full map directory scheme for distributed shared memory multiprocessors

    Publication Year: 1997, Page(s):30 - 34

    In this paper, a novel full-map directory cache coherence scheme called the "hybrid full-map directory scheme" is proposed. It reduces the growth rate of the directory storage overhead from O(N^2) down to O(N√N). Moreover, the performance of the scheme is close to those of previous full-map directory schemes.
  • The impact of GMS (Geostationary Meteorological Satellite) data in the GDAPS (Global Data Assimilation and Prediction System)

    Publication Year: 1997, Page(s):35 - 40

    Due to the extensive calculations, running a global model requires a high performance computer. The global model and analysis system, called GDAPS, has been set up in a Cray C90 environment. The five-day global forecast was made available twice a day. In order to improve the forecast performance, the assimilation with satellite cloud information is investigated. Since moisture data over the ocean ...
  • Coupling general circulation model (CCM3) with mesoscale model (MM5) for regional climate studies

    Publication Year: 1997, Page(s):41 - 44

    The NCAR/CCM3 general circulation model (GCM) is coupled with regional weather prediction model PSU/NCAR MM5 for regional climate studies. The main motivation for this effort comes from the observation that the representation of sub-GCM grid scale forcing is critical to accurately simulate the regional distribution of climatic variables such as air temperature and precipitation over meso-scale reg...
  • Model intercomparison study: cloud-radiative forcings and feedbacks

    Publication Year: 1997, Page(s):45 - 49

    Effects of cloud-radiation schemes on cloud forcing and feedback are tested using two different GCMs (general circulation models): the NCAR (National Center for Atmospheric Research) CCM2 (Community Climate Model Version 2) and YONU (Yonsei University GCM). The major differences between the cloud-radiation schemes are in the method of treating cloud water content: in CCM2 the cloud water content i...
  • Geostationary-satellite imagery applications on distributed, high-performance computing

    Publication Year: 1997, Page(s):50 - 55
    Cited by:  Papers (3)

    We discuss applications of high resolution geostationary satellite imagery and distributed high performance computing facilities for the storage, processing and delivery of satellite data products. We describe our system which is built on a distributed high performance computing environment using a number of software infrastructural building blocks and computational resources interconnected by an ...
  • Performance analysis of STC104 interconnection networks

    Publication Year: 1997, Page(s):56 - 60

    The fast routing chip, Inmos STC104, has been designed and is now available on the market. Many research results concerning the performance, design cost and scalability of the packet switch have been presented and are currently under evaluation. There is a great demand for an efficient and reliable router in high performance parallel processing systems or data management networks. The performance a...
  • Shifted Recursive Torus interconnection for high performance computing

    Publication Year: 1997, Page(s):61 - 66
    Cited by:  Papers (4)

    We propose the Shifted Recursive Torus (SRT) interconnection network for high performance computing. By adding multi-level links to the torus network recursively, the SRT can achieve excellent interconnection features such as a smaller diameter, a limited number of links per node, an easy implementation in VLSI, and an expandable hierarchical structure. The paper considers the problem of achieving...
  • Broadcasting on incomplete star graph interconnection networks

    Publication Year: 1997, Page(s):67 - 72

    We propose two one-to-all optimal broadcasting algorithms in incomplete star graphs. An incomplete star graph with N nodes, where (n-1)!<N<n!, is a subgraph of an n-star. Using a routing scheme to transmit a message to each substar composed of the incomplete star, our proposed broadcasting algorithm is optimal in O(n log n) on the single port communication model. While broadcasting m message...
  • Embeddings into the pancake interconnection network

    Publication Year: 1997, Page(s):73 - 78
    Cited by:  Papers (1)

    The pancake is one of the Cayley graphs that were proposed as alternatives to the hypercube for interconnecting processors in parallel computers. Some good properties of this interconnection network include: vertex symmetry, small degree and diameter, extendability, and high connectivity (robustness). We present constant dilation embeddings of rings, grids, and hypercubes into the pancake. We also...
  • Performance analysis of multistage interconnection networks using a multicast algorithm

    Publication Year: 1997, Page(s):79 - 84

    Studies issues of multicasting in multistage interconnection networks (MINs) for large-scale multicomputers. In addition to point-to-point communication among the processing nodes, efficient collective communication is critical to the performance of multicomputers. This paper presents a new approach to support multicast communication, on the basis of a restricted address encoding scheme which cons...
  • Computation of a transonic finite wing flow using Hanbit-1 computer

    Publication Year: 1997, Page(s):85 - 88

    This paper reports the results of a parallel computation of a transonic flow past a 3D wing using the Hanbit-1 computer, a hypercube parallel computer developed at KAIST (Korea Advanced Institute of Science and Technology). Two different numerical schemes, one a lower-upper symmetric Gauss-Seidel (LU-SGS) method and the other a Runge-Kutta time-stepping scheme, were used. The performance of the Ha...
  • Analysis of heat and fluid flow in a PCB channel using KAICUBE/Hanbit-1 parallel computer

    Publication Year: 1997, Page(s):89 - 92

    A 3D analysis code for mixed convection flow through the printed circuit board (PCB) channel on which heat-generating chip modules are mounted has been developed. For uniformly distributed modules, a small domain containing a single module is considered with periodic and symmetric boundary conditions. Calculations have been made for various Grashof numbers to examine the effects of natural convect...
  • Simulation of aerodynamics problem on a distributed shared-memory machine

    Publication Year: 1997, Page(s):93 - 98

    A complex compressible Navier-Stokes equation using a time-accurate implicit difference scheme is parallelized using domain decomposition (DD) and loop parallelization methods. The numerical scheme used is a lower-upper alternating direction implicit (LU-ADI) factorization method with a Baldwin-Lomax (1978) turbulence model. This paper gives a preliminary result on the performance of the code on a...
  • Numerical Wind Tunnel (NWT) and CFD research at National Aerospace Laboratory

    Publication Year: 1997, Page(s):99 - 103
    Cited by:  Papers (1)

    The National Aerospace Laboratory (NAL) of Japan is the only national research institute for aerospace engineering and services in Japan. NAL is a pioneer of supercomputer development and utilization in Japan. Since 1993, NAL has developed the Numerical Wind Tunnel (NWT), consisting of 166 parallel-connected Fujitsu vector processors and has been providing substantial computational power resources...
  • Parallel simulation of fluid flow inside the rotating cylindrical container on the Intel Paragon

    Publication Year: 1997, Page(s):104 - 108

    The parallelization of Fourier spectral and Chebyshev collocation methods to the cylindrical form of the Navier-Stokes equations with one periodic and two nonperiodic directions has been implemented. The matrix diagonalization technique, which has been applied to solve the elliptic equations which occur in the implicit steps of the three-step time-splitting algorithm, enables one to parallelize on...
  • Dynamic load balancing technique for wide area video server

    Publication Year: 1997, Page(s):109 - 116
    Cited by:  Patents (1)

    Proposes a dynamic load-balancing technique for a video-on-demand (VOD) server system which is responsible for transmitting video stream data to a client's set-top box (STB), and shows the result from a VOD system designed to incorporate our proposed load-balancing scheme. Our load-balancing technique is an application of the extended gradient model, in which we modify and update the control param...
  • Parallel generation of k-ary trees

    Publication Year: 1997, Page(s):117 - 121

    The only published algorithms for generating k-ary trees in parallel are those of Akl and Stojmenović (1996) and Vajnovszki and Phillips (1997). In the first of these two papers, trees are represented by an inversion table and the processor model is a linear array multicomputer. In the second paper, trees are represented by bit-strings and the algorithm executes on a shared-memory multi...
  • Parallel double sort-merge algorithm for object-oriented collection join queries

    Publication Year: 1997, Page(s):122 - 127
    Cited by:  Papers (2)  |  Patents (1)

    In object-oriented databases (OODBs), although path expressions through pointer connections may exist, it is sometimes necessary to perform an explicit join operation between two classes. Since a class may contain collection attributes as well as simple attributes, join queries in OODBs may be based on collections. A need for collection join algorithms arises, since the conventional join algorithm...
  • Parallel implementation of the unit commitment problem on NOWs

    Publication Year: 1997, Page(s):128 - 133
    Cited by:  Papers (1)

    This paper proposes the application of parallelization techniques for solving the unit commitment problem (UCP) on NOWs (networks of workstations). A modified parallel dynamic programming method that takes advantage of NOWs is presented. Our algorithm performs better than the general parallel dynamic programming algorithm for UCP. In order to demonstrate the usefulness of our technique, we apply o...
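
Note on the "Hybrid full map directory scheme" entry (pages 30 - 34): the abstract quotes only asymptotic growth rates for directory storage. The sketch below is a rough illustration of how far apart the two bounds are, not the paper's method. It assumes one memory block per node and unit constants, and the function names full_map_bits and sqrt_scaled_bits are hypothetical.

```python
import math


def full_map_bits(n_procs: int, blocks_per_node: int = 1) -> int:
    """Presence bits in a conventional full-map directory.

    Assumption: every node contributes `blocks_per_node` memory blocks and
    each block keeps one presence bit per processor, giving O(N^2) growth.
    """
    total_blocks = n_procs * blocks_per_node
    return total_blocks * n_procs


def sqrt_scaled_bits(n_procs: int, blocks_per_node: int = 1) -> int:
    """Hypothetical directory whose per-block cost grows like sqrt(N).

    This only models the O(N*sqrt(N)) bound quoted in the abstract; it is
    not the paper's actual directory layout.
    """
    total_blocks = n_procs * blocks_per_node
    return total_blocks * math.ceil(math.sqrt(n_procs))


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        ratio = full_map_bits(n) / sqrt_scaled_bits(n)
        print(f"N={n:5d}  full-map={full_map_bits(n):8d}  "
              f"sqrt-scaled={sqrt_scaled_bits(n):7d}  ratio={ratio:.0f}")
```

Under these assumptions the gap between the two bounds itself grows like the square root of N, so the saving matters most on large machines.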