The widespread adoption of multi-core processors in the supercomputing arena results in multiple processes on one node competing for the limited resources of the network interface. This is especially true for collective communication in MPI. InfiniBand, a prevailing high-speed network, provides fine-grained Quality of Service (QoS) through its Virtual Lanes (VLs) mechanism. In this paper, we study the possibility of enhancing the performance of MPI collective communication by using multiple Virtual Lanes. The utilization of multiple VLs may equalize the priorities of simultaneous send requests, accelerate the transmission of small messages, and increase the utilization of network and memory bandwidth. These benefits speed up MPI collective communication. Factors that affect the utilization of multiple VLs are discussed as well. Evaluations show that the Alltoall, Reduce, Allreduce and Reduce_scatter operations benefit from our multiple-Virtual-Lanes-aware design, with a performance enhancement of about 10% to 20%. Application evaluations show that our design increases Fast Fourier Transform performance by 11% on a 1024-core cluster.
Published in:
High Performance Computing (HiPC), 2009 International Conference on
Date of Conference: 16-19 Dec. 2009