2015 9th International Conference on Partitioned Global Address Space Programming Models

16-18 Sept. 2015

  • [Front cover]

    Publication Year: 2015, Page(s): C4
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2015, Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2015, Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2015, Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2015, Page(s): v - vi
    Freely Available from IEEE
  • Message from the General Chair

    Publication Year: 2015, Page(s): vii
    Freely Available from IEEE
  • Message from the Program Chair

    Publication Year: 2015, Page(s): viii
    Freely Available from IEEE
  • Conference Committee

    Publication Year: 2015, Page(s): ix
    Freely Available from IEEE
  • Keynote Speakers

    Publication Year: 2015, Page(s): x - xiii

    Provides an abstract for each of the keynote presentations and a brief professional biography of each presenter. The complete presentations were not made available for publication as part of the conference proceedings.
  • Invited Speakers

    Publication Year: 2015, Page(s): xiv - xviii

    Provides an abstract for each of the invited presentations and a brief professional biography of each presenter. The complete presentations were not made available for publication as part of the conference proceedings.
  • On the Fence: An Offload Approach to Ordering One-Sided Communication

    Publication Year: 2015, Page(s): 1 - 12

    Partitioned Global Address Space (PGAS) and one-sided communication models allow shared data to be transparently and asynchronously accessed by any process within a parallel computation. In order to ensure that updates are performed in the intended order, the programmer must either use potentially slower ordered communication, or perform operations that order unordered communication, such as a fen...
  • Caching Puts and Gets in a PGAS Language Runtime

    Publication Year: 2015, Page(s): 13 - 24

    We investigated a software cache for PGAS PUT and GET operations. The cache is implemented as a software write-back cache with dirty bits, local memory consistency operations, and programmer-guided prefetch. This cache supports programmer productivity while enabling communication aggregation and overlap. We evaluated an implementation of this cache for remote data within the Chapel programming lan...
  • Impact of Frequency Scaling on One Sided Remote Memory Accesses

    Publication Year: 2015, Page(s): 25 - 37

    CPU frequency scaling is a common approach used for achieving energy savings in parallel applications. A typical approach for achieving power savings is by reducing the frequency of a processor whenever the invested CPU cycles do not contribute to the progress of an application (e.g., polling for events). Many recent research efforts have been directed towards employing this approach within HPC app...
  • Implementing High-Performance Geometric Multigrid Solver with Naturally Grained Messages

    Publication Year: 2015, Page(s): 38 - 46

    Structured grid linear solvers often require manual packing and unpacking of communication data to achieve high performance. Orchestrating this process efficiently is challenging, labor-intensive, and potentially error-prone. In this paper, we explore an alternative approach that communicates the data with naturally grained message sizes without manual packing and unpacking. This approach is the d...
  • An Evaluation of Anticipated Extensions for Fortran Coarrays

    Publication Year: 2015, Page(s): 47 - 58

    A set of parallel features, broadly referred to as Fortran coarrays, was added to the Fortran 2008 standard. It is expected that several new parallel features, designed to complement or augment this feature set, will be added to the next revision of the standard. This includes statements for forming and changing between image teams, as well as statements for performing communication and synchroniz...
  • An Implementation of OFI Libfabric in Support of Multithreaded PGAS Solutions

    Publication Year: 2015, Page(s): 59 - 69
    Cited by: Papers (1)

    In this paper, we present an implementation of the OpenFabrics Interfaces (OFI) libfabric API in support of multithreaded PGAS programming models. Specifically, we describe a libfabric provider implementation for the Cray XC™ system using the Generic Network Interface (GNI) library. OFI libfabric is a new portable network API designed to address the needs of high performance networking software. ...
  • Preliminary Implementation of Coarray Fortran Translator Based on Omni XcalableMP

    Publication Year: 2015, Page(s): 70 - 75

    XcalableMP (XMP) is a PGAS language for distributed memory environments. It employs Coarray Fortran (CAF) features as the local-view programming model. We implemented the main part of CAF in the form of a translator, i.e., a source-to-source compiler, as part of the Omni XMP compiler. The compiler uses GASNet and the Fujitsu RDMA interface to allocate static and allocatable coarrays and to get and p...
  • Using the Parallel Research Kernels to Study PGAS Models

    Publication Year: 2015, Page(s): 76 - 81

    A subset of the Parallel Research Kernels (PRK), simplified parallel application patterns, are used to study the behavior of different runtimes implementing the PGAS programming model. The goal of this paper is to show that such an approach is practical and effective as we approach the exascale era. Our experimental results indicate that for the kernels we selected, MPI with two-sided communications...
  • PHLAME: Hierarchical Locality Exploitation Using the PGAS Model

    Publication Year: 2015, Page(s): 82 - 89
    Cited by: Papers (1)

    Parallel computers are becoming deeply hierarchical. Locality-aware programming models allow programmers to control locality at one level through establishing affinity between data and executing activities. This, however, does not enable locality exploitation at other levels. Therefore, we must conceive an efficient abstraction of hierarchical locality and develop techniques to exploit it. Techniq...
  • A Compiler Transformation to Overlap Communication with Dependent Computation

    Publication Year: 2015, Page(s): 90 - 92
    Cited by: Papers (1)

    Hiding communication latency is essential to achieve scalable performance on current and future parallel systems. In this extended abstract, we present a novel compiler transformation that overlaps communication with computation to hide communication latency. Unlike prior work, we are able to achieve this overlap even in the presence of an overlap-inhibiting data dependence between the communicati...
  • Toward a Data-centric Profiler for PGAS Applications

    Publication Year: 2015, Page(s): 93 - 95

    This paper describes a data-centric profiling tool that provides a way to map performance problems back to data structures in Chapel programs. We describe the tool's implementation, and illustrate its use with two simple test programs.
  • Scaling HabaneroUPC++ on Heterogeneous Supercomputers

    Publication Year: 2015, Page(s): 96 - 98

    Accelerators/co-processors have made their way into supercomputing systems. These modern heterogeneous systems feature multiple layers of memory hierarchies, and produce a high degree of thread-level parallelism. To ensure that current and future applications perform well on these systems, it is important that users be able to cleanly express the various types of parallelism found in their applica...
  • PySHMEM: A High Productivity OpenSHMEM Interface for Python

    Publication Year: 2015, Page(s): 99 - 101

    OpenSHMEM is a well-known, high-performance communication library implementing the Partitioned Global Address Space (PGAS) programming model. It exposes a broad range of one-sided communication semantics which maps well to modern network technologies and can achieve a level of performance that is close to that of network hardware. In this paper, we explore how OpenSHMEM semantics can be integrated w...
  • ISx: A Scalable Integer Sort for Co-design in the Exascale Era

    Publication Year: 2015, Page(s): 102 - 104
    Cited by: Papers (2)

    This paper introduces a new scalable integer sort application inspired by the NAS Parallel Benchmark integer sort. We provide a detailed analysis of the NPB integer sort to motivate the development of ISx, a new integer sort for co-design. ISx is a highly modular application implemented in the OpenSHMEM parallel programming model and supports both strong and weak scaling studies.
  • Author index

    Publication Year: 2015, Page(s): 105
    Freely Available from IEEE