Optimizing bandwidth limited problems using one-sided communication and overlap
Bell, C.
Bonachea, D.
Nishtala, R.
Yelick, K.
Comput. Sci. Div., California Univ., Berkeley, CA;
Abstract
This paper demonstrates the one-sided communication used in languages like UPC can provide a significant performance advantage for bandwidth-limited applications. This is shown through communication microbenchmarks and a case-study of UPC and MPI implementations of the NAS FT benchmark. Our optimizations rely on aggressively overlapping communication with computation, alleviating bottlenecks that typically occur when communication is isolated in a single phase. The new algorithms send more and smaller messages, yet the one-sided versions achieve > 1.9times speedup over the base Fortran/MPI. Our one-sided versions show an average 15% improvement over the two-sided versions, due to the lower software overhead of onesided communication, whose semantics are fundamentally lighter-weight than message passing. Our UPC results use Berkeley UPC with GASNet and demonstrate the scalability of that system, with performance approaching 0.5 TFlop/s on the FT benchmark with 512 processors
Index
Terms
Available to subscribers and IEEE members.
References
Available to subscribers and IEEE members.
Citing Documents
Available to subscribers and IEEE members.