Implementing High Performance Remote Method Invocation in CCA

6 Author(s)
Jian Yin ; Pacific Northwest Nat. Lab., Richland, WA, USA ; Agarwal, K. ; Krishnan, M. ; Chavarria-Miranda, D.

We report our effort to engineer a high-performance remote method invocation (RMI) mechanism for the Common Component Architecture (CCA). The mechanism is efficient and easy to use for distributed computing in CCA, enabling CCA applications to leverage parallel systems to accelerate computations. This work builds on Babel RMI. Babel is a high-performance language interoperability tool used in CCA that lets scientific application writers share, reuse, and compose applications from software components written in different programming languages. Babel provides a transparent and flexible RMI framework for distributed computing. However, the existing Babel RMI implementation is built on top of TCP and does not provide the level of performance required to distribute fine-grained tasks. We observed that the main reason the TCP-based RMI performs poorly is that it does not make efficient use of the high-performance interconnect hardware on a cluster. We have implemented a high-performance RMI protocol, HPCRMI. HPCRMI achieves low latency by building on a low-level portable communication library, the Aggregate Remote Memory Copy Interface (ARMCI), and by minimizing the communication needed for each RMI call. Our design allows an RMI operation to be completed with only two RDMA operations. We also aggressively optimize the system to reduce copying. In this paper, we discuss the design and experimental evaluation of this protocol. Our experimental results show that the protocol can improve RMI performance by an order of magnitude.
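To make the "two RDMA operations per call" idea concrete, here is a minimal single-process sketch in Python. It is not the HPCRMI implementation and does not use ARMCI or Babel. The names `Node`, `rdma_put`, `client_call`, `server_poll`, and `METHODS`, as well as the request layout, are all hypothetical and chosen only for illustration. The one-sided write is simulated as a copy into a pre-allocated buffer.

```python
import struct

class Node:
    """Hypothetical process with a pre-registered buffer that peers may write into."""
    def __init__(self, size=256):
        self.buf = bytearray(size)
        self.puts_received = 0  # counts simulated one-sided writes into this node

def rdma_put(target, payload):
    """Stand-in for a one-sided remote write: copy payload into the target's buffer."""
    target.buf[:len(payload)] = payload
    target.puts_received += 1

# Hypothetical method table on the server side: method id -> callable.
METHODS = {1: lambda a, b: a + b}

def server_poll(server, client):
    """Server reads the request from its own buffer and writes the reply to the client."""
    method_id, a, b = struct.unpack_from("<idd", server.buf, 0)
    result = METHODS[method_id](a, b)
    # Simulated RDMA operation 2: the reply goes straight into the client's buffer.
    rdma_put(client, struct.pack("<d", result))

def client_call(client, server, method_id, a, b):
    """Issue one remote call using exactly two simulated one-sided writes."""
    # Simulated RDMA operation 1: method id and arguments packed into one message.
    rdma_put(server, struct.pack("<idd", method_id, a, b))
    server_poll(server, client)  # in a real system the server would poll on its own
    (result,) = struct.unpack_from("<d", client.buf, 0)
    return result

client, server = Node(), Node()
print(client_call(client, server, 1, 2.0, 3.0))        # 5.0
print(server.puts_received + client.puts_received)     # 2
```

Packing the method identifier and arguments into a single message is what keeps the request to one write in this sketch. How the actual protocol lays out its buffers and detects message arrival is not described in the abstract.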

Published in:

Cluster Computing (CLUSTER), 2011 IEEE International Conference on

Date of Conference:

26-30 Sept. 2011