Proceedings Fourth International Conference on High-Performance Computing

18-21 Dec. 1997

  • Proceedings Fourth International Conference on High-Performance Computing

    Publication Year: 1997
    Freely Available from IEEE
  • Conference Organization

    Publication Year: 1997, Page(s):xviii - xxiii
    Freely Available from IEEE
  • Virtual registers

    Publication Year: 1997, Page(s):364 - 369
    Cited by:  Papers (10)  |  Patents (2)

    The number of physical registers is one of the critical issues of current superscalar out-of-order processors. Conventional architectures allocate, in the decoding stage, a new storage location (e.g. a physical register) for each operation that has a destination register. When an instruction is committed, it frees the physical register allocated to the previous instruction that had the same destin...

  • Author index

    Publication Year: 1997, Page(s):539 - 541
    Freely Available from IEEE
  • ELMO: extending (sequential) languages with migratable objects-compiler support

    Publication Year: 1997, Page(s):180 - 185

    Efficient task migration is an important feature in parallel and distributed programs, in particular to support checkpointing and recovery for fault tolerance. It is also very useful in distributed environments like networks of workstations where external loads are often unpredictable and dynamic in nature. We propose simple language extensions (ELMO) to existing sequential programming languages l...

  • Design of a parallel C language for distributed systems

    Publication Year: 1997, Page(s):174 - 179

    The performance of a distributed system depends upon the efficiency of job distribution among processing nodes, as well as that of its system architecture and operating system. The paper presents an extended C language, ParaC, that supports efficient parallel programming on distributed systems. ParaC is designed to reduce the effort of job distribution on distributed programming environments. Our ...

  • Supporting unbounded process parallelism in the SPC programming model

    Publication Year: 1997, Page(s):168 - 173

    In automatic mapping of parallel programs to target parallel machines, the efficiency of the compile-time cost estimation needed to steer the optimization process is highly dependent on the choice of programming model. Recently a new parallel programming model, called SPC, has been introduced that specifically aims at the efficient computation of reliable cost estimates, paving the way for automati...

  • Interconnection network behavior on a multicomputer in the parallelization of the MPEG coding algorithm. Worm-hole vs. packet-switching routing

    Publication Year: 1997, Page(s):48 - 53

    We propose the implementation of an MPEG encoder developed by the University of California at Berkeley on a multicomputer system. Since this application is in real time, we present a mapping of the video sequence between the EPs of the architecture, where the communication between EPs is minimized. We also propose the necessary load/store process with a simple input/output mechanism, where the glob...

  • Integer sorting algorithms for coarse-grained parallel machines

    Publication Year: 1997, Page(s):159 - 164
    Cited by:  Papers (2)

    Integer sorting is a subclass of the sorting problem where the elements have integer values and the largest element is polynomially bounded in the number of elements to be sorted. It is useful for applications in which the maximum value of the elements to be sorted is bounded. In this paper, we present a new distributed radix-sort algorithm for integer sorting. The structure of our algorith...
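
    For orientation, the underlying technique can be sketched as a plain sequential least-significant-digit radix sort. This is not the distributed algorithm from the paper, whose details are not given in the abstract; the function name radix_sort and the bits_per_pass parameter are assumptions made for illustration only.

```python
def radix_sort(values, bits_per_pass=8):
    """Sort non-negative integers with a least-significant-digit radix sort.

    Illustrative sequential sketch only; the names and the digit width
    are assumptions, not taken from the paper.
    """
    if not values:
        return []
    if min(values) < 0:
        raise ValueError("this sketch handles non-negative integers only")

    radix = 1 << bits_per_pass      # number of buckets per pass
    mask = radix - 1                # extracts one digit
    result = list(values)
    max_value = max(result)
    shift = 0

    # One stable bucketing pass per digit, least significant digit first.
    while (max_value >> shift) > 0:
        buckets = [[] for _ in range(radix)]
        for v in result:
            buckets[(v >> shift) & mask].append(v)
        result = [v for bucket in buckets for v in bucket]
        shift += bits_per_pass
    return result
```

    Because the largest key is bounded, the number of passes is fixed by the key width rather than by the number of elements, which is what makes integer sorting cheaper than comparison sorting in this setting.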

  • Mapping of neural network models onto massively parallel hierarchical computer systems

    Publication Year: 1997, Page(s):42 - 47

    Investigates the proposed implementation of neural networks on massively parallel hierarchical computer systems with hypernet topology. The proposed mapping scheme takes advantage of the inherent structure of hypernets to process multiple copies of the neural network in the different subnets, each executing a portion of the training set. Finally, the weight changes in all the subnets are accumulat...

  • Adaptive multivariate integration using MPI

    Publication Year: 1997, Page(s):152 - 158

    We describe a coarse grain parallel algorithm for multivariate adaptive integration using MPI. The algorithm is asynchronous in nature and allows for load balancing. Timing results show good speedups obtained on a network of workstations for a class of integrals from Bayesian statistics.

  • Load balancing using symmetric broadcast networks: a PVM-based comparative performance study

    Publication Year: 1997, Page(s):244 - 249
    Cited by:  Papers (1)

    In parallel and distributed systems, an important issue in managing a decentralized task queue is load balancing among multiple processors. In this paper, we propose a scheme for this problem by using a symmetric broadcast network (SBN) which provides an efficient and robust communication pattern between processors. We compare the performance of the SBN-based load balancing algorithm with randomizatio...

  • Classification and performance evaluation of simultaneous multithreaded architectures

    Publication Year: 1997, Page(s):34 - 39
    Cited by:  Papers (2)

    In this paper, we classify simultaneous multithreaded architectures based on how they select instructions issued in a single cycle. This classification allows us to study the present trend of technology as well as to explore the new avenues for improvements in simultaneous multithreaded architectures. Based on our classification, we study the impact of various parameters of simultaneous multithrea...

  • Parallel algorithms for vehicle routing problems

    Publication Year: 1997, Page(s):144 - 151
    Cited by:  Papers (1)  |  Patents (9)

    Vehicle routing problems involve the navigation of one or more vehicles through a network of locations. Locations have associated handling times as well as time windows during which they are active. The arcs connecting locations have time costs associated with them. In this paper, we consider two different problems in single vehicle routing. The first is to find least time cost routes between all ...

  • A floating-point validation suite for high-performance shared and distributed memory computing systems

    Publication Year: 1997, Page(s):88 - 93

    A methodology to systematically identify and isolate bugs in floating-point implementations in high-performance multiple-CPU computing systems is formulated. A validation suite is written and tested. Results show improper implementation. Proper implementation guidelines are suggested and prototyped.

  • Characterizing vulnerability of parallelism to resource constraints

    Publication Year: 1997, Page(s):236 - 243

    The theoretically available instruction level parallelism in most benchmarks is very high. Vulnerability is related to the difficulty with which we can extract this parallelism with finite resources. This study characterizes the vulnerability of parallelism to resource constraints by scheduling dynamic dependence graphs (DDGs) from traces of several benchmarks using different scheduling algorithms an...

  • A study of tree-based control flow prediction schemes

    Publication Year: 1997, Page(s):28 - 33
    Cited by:  Papers (3)

    In order to fetch a large number of instructions per cycle from a sequential program, wide-issue superscalar processors have to predict the outcome of multiple branches in a cycle, and fetch instructions from non-contiguous portions of code. Past research has developed schemes that predict the outcome of multiple branches by performing a single prediction. That is, instead of predicting the outcom...

  • A high performance two dimensional scalable parallel algorithm for solving sparse triangular systems

    Publication Year: 1997, Page(s):137 - 143

    Solving a system of equations of the form Tx=y, where T is a sparse triangular matrix, is required after the factorization phase in the direct methods of solving systems of linear equations. A few parallel formulations have been proposed recently. The common belief in parallelizing this problem is that the parallel formulation utilizing a two dimensional distribution of T is unscalable. We propose...
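
    As background for the problem statement, a minimal sequential sketch of solving Tx=y for a sparse lower triangular T by forward substitution is shown here. It is not the paper's two-dimensional parallel formulation, which the abstract does not describe; the row-dictionary storage and the name solve_lower_triangular are assumptions chosen for illustration.

```python
def solve_lower_triangular(rows, y):
    """Solve T x = y by forward substitution for lower triangular T.

    rows[i] is a dict {column: value} with the nonzeros of row i of T
    (columns <= i). This storage layout is an assumption of the sketch.
    """
    n = len(y)
    x = [0.0] * n
    for i in range(n):
        row = rows[i]
        diag = row.get(i, 0.0)
        if diag == 0.0:
            raise ZeroDivisionError(f"zero diagonal entry in row {i}")
        acc = y[i]
        # Subtract contributions of the already computed unknowns x[j], j < i.
        for j, t_ij in row.items():
            if j < i:
                acc -= t_ij * x[j]
        x[i] = acc / diag
    return x


# T = [[2, 0, 0], [1, 3, 0], [0, 4, 5]], stored by nonzeros only.
example_rows = [{0: 2.0}, {0: 1.0, 1: 3.0}, {1: 4.0, 2: 5.0}]
example_x = solve_lower_triangular(example_rows, [2.0, 7.0, 23.0])
```

    Each x[i] depends on earlier entries of x, and this chain of dependences is what makes the triangular solve harder to parallelize than the factorization that precedes it.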

  • Concurrency control of nested cooperative transactions in active DBMS

    Publication Year: 1997, Page(s):4 - 9
    Cited by:  Papers (1)

    Active database management systems (ADBMSs) use event-condition-action (ECA) rules. Each ECA rule specifies what action is to be taken when an event occurs and the specified condition is satisfied. In this paper, we introduce a concurrency control scheme for handling nested cooperative transactions using detached-mode ECA rules of an ADBMS. A state transition model has been proposed to specify dif...

  • A new consistency protocol implemented in the CAliF system

    Publication Year: 1997, Page(s):82 - 87
    Cited by:  Papers (1)  |  Patents (81)

    We propose a new consistency protocol for distributed shared memory (DSM) where different shared objects are replicated at each site. This protocol was developed for the cooperative platform called CAliF: Cooperative Application Framework. This system uses DSM to transparently handle the data sharing. We present an algorithm which uses the token technique. Updates of shared data are carried throug...

  • Efficient algorithm for feature extraction from oceanographic images

    Publication Year: 1997, Page(s):533 - 538

    This paper presents a new computational scheme based on multiresolution decomposition for extracting the features of interest from oceanographic images by suppressing noise. The multiresolution analysis from the median presented by Starck et al. (1994) is used for the noise suppression. A parallel approach is presented for this computationally intensive problem of infrared images.

  • Fast reductions on a network of workstations

    Publication Year: 1997, Page(s):468 - 473
    Cited by:  Papers (1)

    Reduction operations are very useful in parallel and distributed computing, with applications in barrier synchronization, distributed snapshots, termination detection, global virtual time computation, etc. In the context of parallel discrete event simulations, we have previously introduced a class of adaptive synchronization algorithms based on fast reductions. We explore the implementation of fas...

  • Parallel domain decomposition and load balancing using space-filling curves

    Publication Year: 1997, Page(s):230 - 235
    Cited by:  Papers (23)  |  Patents (1)

    Partitioning techniques based on space filling curves have received much recent attention due to their low running time and good load balance characteristics. The basic idea underlying these methods is to order the multidimensional data according to a space filling curve and partition the resulting one dimensional order. However, space filling curves are defined for points that lie on a uniform gr...
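
    The basic idea stated in the abstract (order the points along a space filling curve, then cut the one dimensional order into pieces) can be sketched as follows. The sketch assumes two-dimensional integer grid coordinates and uses the Z-order (Morton) curve as one possible choice of curve; the abstract does not say which curve the paper uses, and the names morton_key and partition_by_curve are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of x and y to get the point's Z-order position.

    Assumes 0 <= x, y < 2**bits; the Z-order curve is an assumed choice.
    """
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (2 * b)        # x bits go to even positions
        key |= ((y >> b) & 1) << (2 * b + 1)    # y bits go to odd positions
    return key


def partition_by_curve(points, parts):
    """Sort points along the curve and cut the order into near-equal chunks."""
    ordered = sorted(points, key=lambda p: morton_key(p[0], p[1]))
    n = len(ordered)
    return [ordered[(i * n) // parts:((i + 1) * n) // parts]
            for i in range(parts)]


# A 4 x 4 grid split among 4 processors: each chunk is one 2 x 2 quadrant.
grid = [(x, y) for x in range(4) for y in range(4)]
chunks = partition_by_curve(grid, 4)
```

    Cutting the sorted order into equal-length runs balances the number of points per part, while the locality of the curve keeps each part spatially compact.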

  • PiSMA: an upgradable fault tolerant approach to parallel processing

    Publication Year: 1997, Page(s):277 - 283
    Cited by:  Papers (1)

    Parallel processors reduce the communication overhead problem with the employment of some form of global communication network. This network, however, imposes restrictions on the scalability and technological evolution of the parallel processor. In this paper a novel architecture called PiSMA (Parallel Virtual Shared Memory Architecture) is proposed, which consists of a basic substrate, without a n...

  • A different approach to high performance computing

    Publication Year: 1997, Page(s):22 - 27
    Cited by:  Papers (2)  |  Patents (1)

    A common approach to enhancing the performance of processors is to increase the number of function units which operate concurrently. We observe this development in all recent superscalar and VLIW (very-long instruction word) processors. VLIWs are more easily extensible to high performance ranges because they lack much of the superscalar hardware required for dependence checking and hardware resource allo...
