By Topic

Efficient On-Demand Connection Management Mechanisms with PGAS Models over InfiniBand

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)

In the last decade or so, clusters have observed a tremendous rise in popularity due to the excellent price to performance ratio. A variety of Interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged in order to meet the requirements of various applications and their programming models. To support requirements of multiple programming models, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected characteristics. Among them, the reliable connection (RC) semantics is being widely used due to its high performance and support for novel features like Remote Direct Memory Acesss (RDMA), hardware atomics and Network Fault Tolerance. However, the pair wise connection oriented nature of the RC transport semantics limits its scalability and usage at the increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provided shared memory abstraction and one-sided communication semantics, leading to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach with Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI-On Demand Connection Management (ARMCI-ODCM) using various micro benchmarks and benchmarks (LU Factorization, Random-Access and Lennard Jones simulation) and application (Subsurface transport over multiple phases (STOMP)). With the performance evaluation for up to 4096 processors, we are able to have a multi-fold reduction in connection memory with a negligible degradation in performance. Using STOMP at 4096 processors, reduces the overall connection memory by 66 times with no per- - formance degradation. To the best of our knowledge, this is the first design, implementation and evaluation of on-demand connection management with InfiniBand using PGAS models.

Published in:

Cluster, Cloud and Grid Computing (CCGrid), 2010 10th IEEE/ACM International Conference on

Date of Conference:

17-20 May 2010