Increasingly, multi-core processors, multi-processor nodes, and multi-core, multi-processor nodes are finding their way into computer clusters. Clusters built using such nodes are already quite common and will inevitably become more so over time. As with any new technology, however, the potential benefits are seldom as easy to attain as we expect them to be. In this paper, we explore three fundamental issues related to the use of multi-core, multi-processor nodes in compute clusters using MPI: inter-communication (messaging) efficiency, cache effects (in particular, processor affinity), and initial process distribution. Based on initial experiments using a subset of the NAS Parallel Benchmarks running on a small-scale cluster with dual-core, dual-processor nodes, we report results on the impact of these issues. From these results, we attempt to extrapolate some simple guidelines that are likely to be generally applicable for optimizing MPI code running on clusters with multi-core, multi-processor nodes.