Proceedings Fourth International Conference on High-Performance Computing

18-21 Dec. 1997

Displaying Results 1 - 25 of 83
  • Proceedings Fourth International Conference on High-Performance Computing

    Publication Year: 1997
    Freely Available from IEEE
  • Conference Organization

    Publication Year: 1997, Page(s):xviii - xxiii
    Freely Available from IEEE
  • Virtual registers

    Publication Year: 1997, Page(s):364 - 369
    Cited by:  Papers (10)  |  Patents (2)

    The number of physical registers is one of the critical issues of current superscalar out-of-order processors. Conventional architectures allocate, in the decoding stage, a new storage location (e.g. a physical register) for each operation that has a destination register. When an instruction is committed, it frees the physical register allocated to the previous instruction that had the same destin...
  • Author index

    Publication Year: 1997, Page(s):539 - 541
    Freely Available from IEEE
  • Reconfigurable custom computing as a supercomputer replacement

    Publication Year: 1997, Page(s):260 - 269
    Cited by:  Papers (2)

    Reconfigurable computers are a new class of customisable computers constructed using field programmable gate arrays (FPGAs). They offer us the potential for supercomputing performance at high-end workstation costs for a range of niche applications. The performance achievable is a direct consequence of the machine architecture which gives the user direct exposure to the inherent parallelism present...
  • A computer aided architecture design tool aimed at image processing applications

    Publication Year: 1997, Page(s):523 - 526

    SAGAPA (System Architecture Generated from A Parallel Algorithm) is an image processing system design tool based on an original methodology that automatically derives the minimal architecture suited to the application algorithm, subject to a maximum execution time constraint. The obtained architecture is minimal in terms of number of processors, amount of memory, and complexity of interco...
  • On measuring the performance of adaptive wormhole routing

    Publication Year: 1997, Page(s):336 - 341
    Cited by:  Papers (2)

    Adaptive routing is widely regarded as a promising approach to improving interconnection network performance. Many designers of adaptive routing algorithms have used synthetic communication patterns, such as uniform and transpose traffic, to compare the performance of various adaptive routing algorithms with each other and with oblivious routing. These comparisons have shown that the average messa...
  • Probabilistic routing in wavelength-routed multistage, hypercube, and Debruijn networks

    Publication Year: 1997, Page(s):310 - 315
    Cited by:  Papers (2)

    Optical networks based on wavelength division multiplexing (WDM) and wavelength routing are considered to be potential candidates for the next generation of wide area networks. One of the main issues in these networks is the development of efficient routing algorithms which require a minimum number of wavelengths. We focus on the permutation routing problem in multistage WDM networks which we call...
  • Load balancing using symmetric broadcast networks: a PVM-based comparative performance study

    Publication Year: 1997, Page(s):244 - 249
    Cited by:  Papers (1)

    In parallel and distributed systems, an important issue in managing a decentralized task queue is load balancing among multiple processors. In this paper, we propose a scheme for this problem by using a symmetric broadcast network (SBN) which provides an efficient and robust communication pattern between processors. We compare the performance of SBN-based load balancing algorithm with randomizatio...
  • Custom virtual memory policies for an image reconstruction application

    Publication Year: 1997, Page(s):517 - 522

    Some scientific applications have very large memory requirements, and are required to move data between primary and secondary storage during execution. The I/O between processor and disk can be done using standard file interfaces or virtual memory. The use of virtual memory, though simple and straightforward, has been criticized because of poor performance. However, some modern operating systems p...
  • HOLMES: a tool for monitoring heterogeneous architectures

    Publication Year: 1997, Page(s):486 - 491

    Monitoring tools are necessary components in the support of distributed applications and can be used to provide dependability, debugging and testing, to enhance the performance and to make possible the run-time steering of applications. These tools are needed to exploit in the best way all the available high performance computing resources of a heterogeneous environment. The paper describes HOLMES...
  • Improving the efficiency of adaptive routing in networks with irregular topology

    Publication Year: 1997, Page(s):330 - 335
    Cited by:  Papers (33)

    Networks of workstations are emerging as a cost-effective alternative to parallel computers. The interconnection between workstations usually relies on switch-based networks with irregular topologies. This irregularity makes routing and deadlock avoidance quite complicated. Current proposals avoid deadlock by removing cyclic dependencies between channels and therefore, many messages are routed alo...
  • Interconnection network behavior on a multicomputer in the parallelization of the MPEG coding algorithm. Worm-hole vs. packet-switching routing

    Publication Year: 1997, Page(s):48 - 53

    We propose the implementation of an MPEG encoder developed by the University of California at Berkeley on a multicomputer system. Since this application is in real time, we present a mapping of the video sequence between the EPs of the architecture, where the communication between EPs is minimized. We also propose the necessary load/store process with a simple mechanism input/output, where the glob...
  • A study of tree-based control flow prediction schemes

    Publication Year: 1997, Page(s):28 - 33
    Cited by:  Papers (3)

    In order to fetch a large number of instructions per cycle from a sequential program, wide-issue superscalar processors have to predict the outcome of multiple branches in a cycle, and fetch instructions from non-contiguous portions of code. Past research has developed schemes that predict the outcome of multiple branches by performing a single prediction. That is, instead of predicting the outcom...
  • Fast multiplier schemes using large parallel counters and shift switches

    Publication Year: 1997, Page(s):302 - 308
    Cited by:  Papers (1)

    We present novel fast parallel multiplier schemes. In contrast to the full adder binary logic based traditional designs, we use (incomplete) large parallel counters and large shift switch compressors, which are built based on shift switch logic, a logic with shift switches as logic elements performing modulo arithmetic operations on (non-binary) state signals. With the unique feature of shift swit...
  • Characterizing vulnerability of parallelism to resource constraints

    Publication Year: 1997, Page(s):236 - 243

    The theoretically available instruction-level parallelism in most benchmarks is very high. Vulnerability is related to the difficulty with which we can extract this parallelism with finite resources. This study characterizes the vulnerability of parallelism to resource constraints by scheduling dynamic dependence graphs (DDGs) from traces of several benchmarks using different scheduling algorithms an...
  • A new orthogonal multiprocessor and its application to image processing

    Publication Year: 1997, Page(s):511 - 516

    The authors propose a new orthogonal partially shared memory architecture for the design of multiprocessor systems. The architecture allows processors to partially share a 2-D array of memory modules in an orthogonal way with fewer limitations than those imposed by the traditional orthogonal (OMP) architecture. Processors have direct access to large neighborhoods of memory modules which can be use...
  • Parallel real-time systems: formal specification

    Publication Year: 1997, Page(s):186 - 191

    For many real time applications, parallel computers offer a natural computing platform. However, very little attention has been paid to software support for real time embedded systems on parallel machines. The paper addresses the problem of formal software specification for parallel real time systems, and presents some features of a formal specification language, PRETSEL (Parallel REal Time SpEcifi...
  • Parallel algorithms for vehicle routing problems

    Publication Year: 1997, Page(s):144 - 151
    Cited by:  Papers (1)  |  Patents (9)

    Vehicle routing problems involve the navigation of one or more vehicles through a network of locations. Locations have associated handling times as well as time windows during which they are active. The arcs connecting locations have time costs associated with them. In this paper, we consider two different problems in single vehicle routing. The first is to find least time cost routes between all ...
  • Communication cost estimation and global data partitioning for distributed memory machines

    Publication Year: 1997, Page(s):480 - 485
    Cited by:  Papers (1)

    Estimating communication cost involved in executing a program on distributed memory machines is important for evaluating the overheads due to repartitioning. The authors present a scheme which will work with reasonable efficiency for arrays with at most 3 dimensions. The hyperplane partitioning technique given by Prakash and Srikant (1997) is extended to complete programs by estimating the communi...
  • Parallel data cube construction for high performance on-line analytical processing

    Publication Year: 1997, Page(s):10 - 15
    Cited by:  Papers (2)  |  Patents (1)

    Decision support systems use online analytical processing (OLAP) to analyze data by posing complex queries that require different views of data. Traditionally, a relational approach (ROLAP) has been taken to build such systems. More recently, multi-dimensional database techniques (MOLAP) have been applied to decision-support applications. Data is stored in multi-dimensional arrays, which is a natu...
  • Efficient algorithm for feature extraction from oceanographic images

    Publication Year: 1997, Page(s):533 - 538

    This paper presents a new computational scheme based on multiresolution decomposition for extracting the features of interest from oceanographic images by suppressing noise. The multiresolution analysis from the median presented by Starck et al. (1994) is used for the noise suppression. A parallel approach is presented for this computationally intensive problem of infrared images.
  • A flow control mechanism to avoid message deadlock in k-ary n-cube networks

    Publication Year: 1997, Page(s):322 - 329
    Cited by:  Papers (24)

    We propose a flow control algorithm for k-ary n-cube networks which avoids the deadlock problems without using virtual channels. Some basic definitions and theorems are proposed in order to establish the necessary and sufficient conditions to verify that an algorithm is deadlock-free. Our proposal is based on a restriction of the virtual cut-through flow control rather than of the routing algorith...
  • A memory-optimized visualization system for limited-bandwidth multiprocessing environments

    Publication Year: 1997, Page(s):60 - 65

    Object dataflow is a popular approach used in parallel rendering. The data representing the 3D scene is statically distributed among processors and objects are fetched and cached only on demand. Most previous object dataflow methods were implemented on shared memory architectures and exploited spatial coherency to reduce hardware cache misses. We propose an efficient model for object dataflow para...
  • Mapping of neural network models onto massively parallel hierarchical computer systems

    Publication Year: 1997, Page(s):42 - 47

    Investigates the proposed implementation of neural networks on massively parallel hierarchical computer systems with hypernet topology. The proposed mapping scheme takes advantage of the inherent structure of hypernets to process multiple copies of the neural network in the different subnets, each executing a portion of the training set. Finally, the weight changes in all the subnets are accumulat...