Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)

20 Dec. 1998

  • Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)

    Publication Year: 1998
  • Table of contents

    Publication Year: 1998, Page(s):v - x
  • Data structure distribution and multi-threading of Linux file system for multiprocessors

    Publication Year: 1998, Page(s):97 - 104
    Cited by:  Patents (10)

    The standard Linux design assumes a uniprocessor architecture. Allowing several processors to execute simultaneously in the kernel mode on behalf of different processes can cause consistency problems unless appropriate exclusion mechanisms are used. In addition, if the file system data structures are not distributed, performance can be affected. We discuss a multiprocessor file system design for L...
  • WADE: a Web-based automated parallel CAD environment

    Publication Year: 1998, Page(s):473 - 480
    Cited by:  Papers (1)  |  Patents (3)

    We present a novel framework of a Web-based automated parallel CAD environment. The goal of this project is to make available to the CAD community a growing number of design and test applications that support standard interfaces and execute efficiently in a parallel environment. The design files of a user working on a remote machine are transparently shipped to the local Compute Center, the releva...
  • Extrapolation in distributed adaptive integration

    Publication Year: 1998, Page(s):88 - 95

    The paper addresses the design of distributed methods which incorporate numerical extrapolation into adaptive multivariate integration, in order to increase the functionality of the integration algorithms. When attempting to deal with singularities, adaptive integration algorithms need a very fine subdivision in the proximity of these “hot spots”. This is not practical in higher dimens...
  • PERL: a registerless architecture

    Publication Year: 1998, Page(s):33 - 40
    Cited by:  Papers (1)  |  Patents (1)

    Reducing the processor-memory speed gap is one of the major challenges computer architects face today. Efficient use of CPU registers reduces the number of memory accesses. However, registers incur the extra overhead of load/store operations, register allocation, and saving of register context across procedure calls. Caches, however, have no such overheads, and cache technology has matured to the extent that ...
  • Exploiting image processing locality in cache pre-fetching

    Publication Year: 1998, Page(s):466 - 472
    Cited by:  Papers (4)

    Emerging trends in computer design attempt to include specific solutions for handling images also in general-purpose computers, because of the current spread of multimedia, image processing and computer graphics applications. In this context, we propose hardware pre-fetching techniques specific for caching images: the main issue we state is that most algorithms working on images exhibit a 2D spati...
  • Computation of penetration measures for convex polygons and polyhedra for graphics applications

    Publication Year: 1998, Page(s):81 - 87

    Algorithms to compute measures of penetration between convex polygonal objects in ℜ² and convex polyhedral objects in ℜ³ are presented. The algorithms are analyzed for their asymptotic complexity. Details of implementation on a single processor machine are given. Parallelization of the algorithms is discussed.
  • Data prefetching with co-operative caching

    Publication Year: 1998, Page(s):25 - 32
    Cited by:  Papers (3)

    Recent research in data cache prefetching is found to be selective in nature: achieving high prediction accuracy over a set of selected references such as array access with constant strides. As a result, for applications where the memory latency is mainly due to data accesses in the set of non-selected references of a program, they lose their effectiveness. In fact, their performance might be wors...
  • Dynamic load balancing schemes for computing accessible surface area of protein molecules

    Publication Year: 1998, Page(s):326 - 333
    Cited by:  Papers (2)

    This paper presents an experimental study of dynamic load balancing methods for a parallelized solution to a well-known problem in computational molecular biology: computing the accessible surface areas (ASA) of proteins. The main contribution is a better understanding of how certain techniques for load estimation and redistribution must be combined carefully for effectiveness and how these combin...
  • Piecewise fixed-rate retrieval scheme for variable bit rate video

    Publication Year: 1998, Page(s):459 - 465

    We consider the retrieval of variable bit rate (VBR) video from the distributed video server. Video servers often employ the constant rate retrieval scheme, in which a fixed amount of disk bandwidth is reserved throughout the retrieval to guarantee the continuous playback requirement. In constant rate retrieval, the allocated disk bandwidth is not always fully utilized in order to avoid excessive ...
  • Improving error bounds for multipole-based treecodes

    Publication Year: 1998, Page(s):73 - 80

    Rapid evaluation of potentials in particle systems is an important and time-consuming step in many physical simulations. Over the past decade (1988-98), the development of treecodes such as the Fast Multipole Method (FMM) and the Barnes-Hut method has enabled large scale simulations in domains such as astrophysics, molecular dynamics, and material science. FMM and related methods rely on fixed deg...
  • More on arbitrary boundary packed arithmetic

    Publication Year: 1998, Page(s):19 - 24
    Cited by:  Papers (1)  |  Patents (1)

    Recent microprocessors have been enhanced with media instruction sets for accelerating media algorithms. They exploit the fact that media algorithms have small data types, and widths much less than that of the processor. Current media instruction sets support only 8-, 16- and 32-bit sub-datatypes. This scheme is inefficient in several applications where bit lengths of 9, 12 and so on are used. We ...
  • How to improve local load balancing policies by distorting load information

    Publication Year: 1998, Page(s):318 - 325

    The paper focuses on local load balancing policies for massively parallel architectures and introduces a new scheme for load information exchange between neighbor nodes. The idea is to distort the exchanged load information to let the policy take into account a more global view of the system and overcome the limits of the local scope. The presented scheme has been integrated into two variants of a...
  • Strategies for parallel implementation of a global spectral atmospheric general circulation model

    Publication Year: 1998, Page(s):452 - 458

    We discuss the parallel implementation of a global spectral atmospheric general circulation model on a message passing platform. We also discuss strategies that need to be employed to improve performance on parallel machines which will have multiprocessor nodes sharing an intra-node memory space. A brief discussion of the cause of load imbalances and simple methods to reduce the same are also pres...
  • A parallel skeletonization algorithm and its VLSI architecture

    Publication Year: 1998, Page(s):65 - 72
    Cited by:  Papers (1)

    This paper presents a new algorithm to extract the skeleton and its Euclidean distance values from a binary image. A VLSI implementation of the algorithm in a locally connected cellular array is also given. The algorithm runs in O(n) time for an image of size n×n. The extracted skeleton reconstructs the objects in the image exactly.
  • Precise control of instruction caches

    Publication Year: 1998, Page(s):11 - 18

    Instruction caches are usually designed to fetch the whole block from memory in case of a miss. However, the fetched blocks might contain branch instructions which, if taken, will render the rest of the block useless. A novel approach is introduced, namely the Precise Control, which fetches only the words of a cache block that are likely to be used. The performance of Precise Control is evaluated a...
  • Selection algorithms for parallel disk systems

    Publication Year: 1998, Page(s):343 - 350
    Cited by:  Papers (7)

    With the widening gap between processor speeds and disk access speeds, the I/O bottleneck has become critical. Parallel disk systems (PDS) have been introduced to alleviate this bottleneck. We present deterministic and randomized selection algorithms for parallel disk systems. The algorithms to be presented, in addition to being asymptotically optimal, have small underlying constants in their time...
  • Near optimal algorithms for scheduling independent chains in BSP

    Publication Year: 1998, Page(s):310 - 317

    The aim of this work is to show that scheduling a set of independent chains on a parallel machine under the BSP model is a difficult optimization problem which can be easily approximated in practice. BSP is a machine independent computational model which is becoming more and more popular. Finding the optimal solution when the number of processors is fixed is shown to be hard. Efficient heuristics ...
  • GLB: a low-cost scheduling algorithm for distributed-memory architectures

    Publication Year: 1998, Page(s):294 - 301
    Cited by:  Papers (4)

    This paper proposes a new compile-time scheduling algorithm for distributed-memory systems, called Global Load Balancing (GLB). GLB is intended as the second step in the multi-step class of scheduling algorithms. Experimental results show that compared with known scheduling algorithms of the same low-cost complexity, the proposed algorithm improves schedule lengths by up to 30%. Compared to algorithm...
  • A clustering approach in characterizing interconnection networks

    Publication Year: 1998, Page(s):277 - 284
    Cited by:  Papers (4)

    Networks of workstations (NOW) have gained importance in recent years. The interconnection network of NOW systems often consists of generic switches connected in an irregular topology. Traditionally, interconnection networks are characterized by their topological properties, such as number of nodes, diameter, and bisection width. These parameters are not sufficient in characterizing irregular netwo...
  • On topology and bisection bandwidth of hierarchical-ring networks for shared-memory multiprocessors

    Publication Year: 1998, Page(s):262 - 269
    Cited by:  Papers (7)  |  Patents (1)

    Hierarchical-ring based multiprocessors are interesting alternatives to the more popular two-dimensional direct networks. They allow for simple router designs and wider communication paths than their direct network counterparts. There are several ways hierarchical-ring networks can be configured for a given number of processors. Feasible topologies range from tall, lean networks to short, wide net...
  • Available parallelism with data value prediction

    Publication Year: 1998, Page(s):194 - 201
    Cited by:  Papers (3)  |  Patents (5)

    Data dependences (data flow constraints) present a major hurdle to the amount of instruction-level parallelism that can be exploited from a program. Recent work has focused on the use of data value prediction to overcome the limits imposed by data dependences. That is, when an instruction is fetched, its result can be predicted so that subsequent instructions that depend on the result can execute ...
  • Measurement-based modeling and analysis methodology for characterizing parallel I/O performance

    Publication Year: 1998, Page(s):391 - 398

    A parallel I/O characterization methodology that consists of a hierarchical modeling and measurement analysis environment for investigating I/O performance is presented. The methodology is illustrated via a case study of a video server workload running under the parallel I/O file system (PIOFS) of IBM SP/2. The measurements demonstrate that for video server and read-intensive workloads, spreading ...
  • An efficient implementation of a progressive image transmission system using successive pruning algorithm on a parallel architecture

    Publication Year: 1998, Page(s):445 - 451
    Cited by:  Papers (2)

    Presented in this paper is an implementation using a combination of a successive pruning algorithm and a parallel architecture using a digital signal processor and a general purpose processor for progressive transmission of still images. The adaptive pruning algorithm is used for ensuring a minimum quality of the image while further progressions on the image are computed using a modified successiv...