Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)

20-20 Dec. 1998

  • Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)

    Publication Year: 1998
    Freely Available from IEEE
  • Table of contents

    Publication Year: 1998, Page(s):v - x
    Freely Available from IEEE
  • GLB: a low-cost scheduling algorithm for distributed-memory architectures

    Publication Year: 1998, Page(s):294 - 301
    Cited by:  Papers (4)

    This paper proposes a new compile time scheduling algorithm for distributed-memory systems, called Global Load Balancing (GLB). GLB is intended as the second step in the multi-step class of scheduling algorithms. Experimental results show that compared with known scheduling algorithms of the same low-cost complexity, the proposed algorithm improves schedule lengths up to 30%. Compared to algorithm...
  • Global reactive congestion control in multicomputer networks

    Publication Year: 1998, Page(s):179 - 186
    Cited by:  Papers (17)  |  Patents (2)

    In this paper we develop a general approach to global reactive congestion control in multicomputer networks. The approach uses a timeout mechanism to detect congestion, and exploits control lines such as those used for handshaking in the flit-level flow control of wormhole routers to distribute information about congestion. It is also based on a mechanism that limits the demands placed by the netw...
  • The Augmented Composite Banyan Network

    Publication Year: 1998, Page(s):285 - 292

    A new multipath multistage interconnection network called the Augmented Composite Banyan Network (ACBN) is proposed. The ACBN is created by adding a link to each SE of the Composite Banyan Network (CBN), which is a multipath network with at least two disjoint paths and was originally proposed in (Seo and Feng, 1995). Therefore, the basic building blocks in the ACBN are 4×4 SEs with log2...
  • Parallel algorithms for vehicle routing problems

    Publication Year: 1998, Page(s):171 - 178

    In a complete directed weighted graph there are jobs located at nodes of the graph. Job i has an associated processing time or handling time h_i, and the job must start within a prespecified time window [r_i, d_i]. A vehicle can move on the arcs of the graph at unit speed and has to execute the jobs within their respective time windows. We consider three differe...
  • Extended collective I/O for efficient retrieval of large objects

    Publication Year: 1998, Page(s):359 - 366
    Cited by:  Patents (4)

    Object-relational database management systems (OR-DBMS) extend the capabilities of the relational databases by allowing definition of new data types and methods to operate on these data types while retaining most of the relational model semantics. In this paper we examine issues related to parallel processing of queries in the object-relational model with respect to efficient storage and retrieval...
  • Message passing support on StarT-Voyager

    Publication Year: 1998, Page(s):228 - 237
    Cited by:  Papers (4)

    No single message passing mechanism can efficiently support all types of communication that commonly occur in most parallel or distributed programs. MIT's StarT-Voyager, a hybrid message passing/shared memory parallel machine, provides four message passing mechanisms to achieve high performance over a wide spectrum of communication types and sizes. Hardware and address translation enforced protect...
  • A clustering approach in characterizing interconnection networks

    Publication Year: 1998, Page(s):277 - 284
    Cited by:  Papers (4)

    Networks of workstations (NOW) have gained importance in recent years. The interconnection network of NOW systems often consists of generic switches connected in an irregular topology. Traditionally, interconnection networks are characterized by their topological properties, such as number of nodes, diameter, and bisection width. These parameters are not sufficient in characterizing irregular netwo...
  • An improved parallel disk scheduling algorithm

    Publication Year: 1998, Page(s):383 - 390
    Cited by:  Papers (2)

    We address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. Read-once reference strings, in which each block is accessed exactly once, arise naturally in applications like databases and video retrieval. Using the standard parallel disk model with D disks and a shared I/O buffer of size M, we present a novel algorithm, red-black prefetching (R...
  • Broadcasting on a budget in the multi-service communication model

    Publication Year: 1998, Page(s):163 - 170
    Cited by:  Papers (1)

    In this paper we introduce the MULTI_SERVICE model of network communication. This model attempts to capture recent communication technology trends, such as aspects of quality-of-service and their relation to the emerging technology of automatic pricing, e.g. for Internet services. The MULTI_SERVICE model differs from related models by taking communication and service activation time ...
  • Hierarchical architecture for parallel query processing on networks of workstations

    Publication Year: 1998, Page(s):351 - 358
    Cited by:  Papers (1)  |  Patents (2)

    Networks of workstations (NOWs) are cost-effective alternatives to multiprocessor systems. Recently, NOWs have been proposed for parallel query processing. Idle CPU cycles of workstations in a NOW-based system can be used to process database query operations. We report on the performance of the hierarchical architecture for parallel query processing on a NOW. We have implemented the hierarchical a...
  • Control flow prediction with unbalanced tree-like subgraphs

    Publication Year: 1998, Page(s):221 - 227
    Cited by:  Papers (1)

    In order to fetch a large number of instructions per cycle from a sequential program, wide-issue superscalar processors have to predict the outcome of multiple branches in a cycle, and fetch instructions from non-contiguous portions of code. Past research has developed schemes that predict the outcome of multiple branches by means of a single prediction. That is, instead of predicting the outcome ...
  • Permutation admissibility in shuffle-exchange networks with arbitrary number of stages

    Publication Year: 1998, Page(s):270 - 276

    The set of input-output permutations that are routable through a multistage interconnection network without any conflict (known as the admissible set) plays an important role in determining the capability of the network. Recent works on the permutation admissibility problem of shuffle-exchange networks (SEN) of size N×N deal with (n+k) stages, where n = log_2 N, and k denotes the nu...
  • Efficient address sequence generation for two-level mappings in High Performance Fortran

    Publication Year: 1998, Page(s):132 - 139

    Data-parallel languages like High Performance Fortran allow users to specify mappings of arrays by first aligning elements to an abstract Cartesian grid called templates and then distributing the templates across processors. Code generation then includes the generation of the sequence of local addresses accessed on a processor. Address sequence generation for non-unit alignment strides, referred t...
  • Efficient retrieval of multidimensional datasets through parallel I/O

    Publication Year: 1998, Page(s):375 - 382
    Cited by:  Papers (7)  |  Patents (1)

    Many scientific and engineering applications process large multidimensional datasets. An important access pattern for these applications is the retrieval of data corresponding to ranges of values in multiple dimensions. Performance is limited by the disks, largely because of high disk latencies. Tiling and distributing the data across multiple disks is an effective technique for improving performance throug...
  • New number representation and conversion techniques on reconfigurable mesh

    Publication Year: 1998, Page(s):2 - 10
    Cited by:  Papers (1)

    Several new number representations based on the residue number system are presented which use the smallest prime numbers as moduli and are suited for parallel computations on a reconfigurable mesh architecture. It is shown how to convert in O(1) time any integer ranging between 0 and n-1, from any commonly used representation to any new representation proposed in the paper (and vice versa) using a...
  • How to improve local load balancing policies by distorting load information

    Publication Year: 1998, Page(s):318 - 325

    The paper focuses on local load balancing policies for massively parallel architectures and introduces a new scheme for load information exchange between neighbor nodes. The idea is to distort the exchanged load information to let the policy take into account a more global view of the system and overcome the limits of the local scope. The presented scheme has been integrated into two variants of a...
  • One to all broadcast in hyper butterfly networks

    Publication Year: 1998, Page(s):155 - 162
    Cited by:  Papers (2)

    The authors further investigate the topological properties of the hyper butterfly networks; they develop algorithms for constructing edge disjoint spanning trees in wrapped butterfly graphs and hyper butterfly networks and they use those results to design asymptotically optimal one-to-all broadcast algorithms in those two classes of networks.
  • Data prefetching with co-operative caching

    Publication Year: 1998, Page(s):25 - 32
    Cited by:  Papers (3)

    Recent research in data cache prefetching is found to be selective in nature: achieving high prediction accuracy over a set of selected references such as array access with constant strides. As a result, for applications where the memory latency is mainly due to data accesses in the set of non selected references of a program, they lose their effectiveness. In fact, their performance might be wors...
  • Selection algorithms for parallel disk systems

    Publication Year: 1998, Page(s):343 - 350
    Cited by:  Papers (7)

    With the widening gap between processor speeds and disk access speeds, the I/O bottleneck has become critical. Parallel disk systems (PDS) have been introduced to alleviate this bottleneck. We present deterministic and randomized selection algorithms for parallel disk systems. The algorithms to be presented, in addition to being asymptotically optimal, have small underlying constants in their time...
  • A comparative study of some network subsystem organizations

    Publication Year: 1998, Page(s):436 - 443
    Cited by:  Papers (1)  |  Patents (13)

    The impact of alternative network subsystem designs for realizing low end-to-end latencies and high network throughput in a switched LAN is studied in detail through simulation. These alternatives include choices in the disposition of the network interface card (NIC), DMA priorities and OS services. Our simulation model captures the delays of OS services/software layers, message copying DMAs and, ...
  • Memory bank disambiguation using modulo unrolling for Raw machines

    Publication Year: 1998, Page(s):212 - 220
    Cited by:  Papers (5)  |  Patents (69)

    We present modulo unrolling, a code transformation technique for enabling array references to be accessed through the fast static network on a Raw machine. A Raw machine comprises a mesh of simple, replicated tiles connected by an interconnect which supports fast, static near-neighbor communication. Like all other resources, memory is distributed across the tiles. Management of the memory can b...
  • On topology and bisection bandwidth of hierarchical-ring networks for shared-memory multiprocessors

    Publication Year: 1998, Page(s):262 - 269
    Cited by:  Papers (8)  |  Patents (1)

    Hierarchical-ring based multiprocessors are interesting alternatives to the more popular two-dimensional direct networks. They allow for simple router designs and wider communication paths than their direct network counterparts. There are several ways hierarchical-ring networks can be configured for a given number of processors. Feasible topologies range from tall, lean networks to short, wide net...
  • Performance-driven design and redesign of high-speed local area networks

    Publication Year: 1998, Page(s):416 - 421
    Cited by:  Papers (2)

    Although distributed computing over a network of computers has become a reality, its success mainly depends on the performance of the underlying network. In this paper, we consider the problem of designing a local area network with specified cost and performance constraints. The cost and performance of a local area network (LAN) are directly related to its topology. Using the a priori knowledge of...