uDAPL is a portable, platform-independent communication library that provides RDMA as well as send/recv operations. Several well-known software packages, such as Open MPI, MVAPICH2, Intel MPI, and Cluster OpenMP, have attempted to take advantage of uDAPL's portability. However, network performance is still the bottleneck for such software. Using a "multirail" network is one way to bypass this bottleneck. In this paper, we design a non-threaded and a threaded approach to improve the performance of uDAPL over multirail-configured clusters. The two approaches are evaluated on different InfiniBand multirail-configured clusters. The results show that the threaded approach improves the uni-directional bandwidth by 33% and 148% on the multi-port and the multi-HCA configured networks, respectively, and that the non-threaded approach improves the uni-directional bandwidth by ~90% on the multi-HCA configured network. Similar improvements are achieved for the bi-directional bandwidth.