Summary form only given. In this paper, we carry out a systematic study of the two main types of approaches to MPI datatype communication, pack/unpack-based approaches and copy-reduced approaches, on the InfiniBand network. In the pack/unpack-based approaches, we focus on overlapping packing, network communication, and unpacking. In the copy-reduced approaches, we use RDMA operations to avoid packing and/or unpacking. Four schemes are proposed: buffer-centric segment pack/unpack, RDMA write gather with unpack, pack with RDMA read scatter, and multiple RDMA writes. Three of them have been implemented and evaluated on top of one MPI implementation over InfiniBand. Results from a vector microbenchmark show that latency improves by a factor of up to 3.4 and bandwidth by a factor of up to 3.6 compared with the current datatype communication implementation. Collective operations such as MPI_Alltoall also benefit: we measured improvements of up to a factor of 2.0 for these collectives on an 8-node system.
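To illustrate the idea behind segment-based pack/unpack, the following is a minimal sketch in plain Python, with no MPI or InfiniBand calls. It is not the implementation evaluated in the paper. It assumes a vector-style layout described by `count`, `blocklen`, and `stride` (all in bytes), and the names `pack_segments`, `unpack_segments`, and `segment_size` are hypothetical. Packing yields fixed-size contiguous segments one at a time, so a sender could in principle hand off one segment while packing the next, and a receiver could unpack each segment as it arrives.

```python
from typing import Iterable, Iterator


def pack_segments(src: bytes, count: int, blocklen: int, stride: int,
                  segment_size: int) -> Iterator[bytes]:
    """Yield contiguous segments packed from a strided (vector) layout.

    Hypothetical helper: block i occupies src[i*stride : i*stride + blocklen].
    """
    segment = bytearray()
    for i in range(count):
        start = i * stride
        block = src[start:start + blocklen]
        pos = 0
        while pos < len(block):
            room = segment_size - len(segment)
            chunk = block[pos:pos + room]
            segment += chunk
            pos += len(chunk)
            if len(segment) == segment_size:
                # A full segment is ready; a real sender could transmit it now.
                yield bytes(segment)
                segment = bytearray()
    if segment:
        # Final, possibly partial, segment.
        yield bytes(segment)


def unpack_segments(segments: Iterable[bytes], dst: bytearray,
                    blocklen: int, stride: int) -> None:
    """Scatter packed segments back into a strided layout in dst."""
    offset = 0  # position within the packed (contiguous) byte stream
    for seg in segments:
        for byte in seg:
            block, within = divmod(offset, blocklen)
            dst[block * stride + within] = byte
            offset += 1


# Example: 3 blocks of 2 bytes, 4 bytes apart, packed into 4-byte segments.
src = bytes(range(12))
segments = list(pack_segments(src, count=3, blocklen=2, stride=4,
                              segment_size=4))
dst = bytearray(12)
unpack_segments(segments, dst, blocklen=2, stride=4)
```

In this example the packed stream is split into one full 4-byte segment and one 2-byte remainder, and the gaps between blocks in `dst` are left untouched. The copy-reduced schemes named in the abstract aim to avoid one or both of these copies by letting RDMA gather or scatter the blocks directly.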