Load imbalance is a major impediment on the path towards higher degrees of parallelism, especially for engineering codes with their highly unstructured problem domains. In particular, when load conditions change dynamically, efficient mesh partitioning becomes an indispensable ingredient of scalable design. However, popular graph-based methods such as those used by ParMETIS require global knowledge, which effectively limits the problem size on distributed-memory machines. On such architectures, space-filling curves (SFCs) offer a memory-efficient alternative, and many sophisticated schemes have already been proposed. In this paper, we present a simple strategy based on SFCs that is custom-tailored to the needs of static meshes with dynamically changing computational load. By exploiting the properties of this class of problems, the strategy is not only easy to implement but also substantially reduces memory requirements. Moreover, because it relies exclusively on MPI collective operations, our load-balancing scheme offers portable performance across a broad range of HPC systems. Experimental evaluation shows excellent scaling behavior for up to 16,384 cores on a Nehalem-InfiniBand system and up to 294,912 processes on a Blue Gene/P system.
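
To make the general idea of SFC-based load balancing concrete, the following is a minimal serial sketch and not the scheme evaluated in the paper, whose details the abstract does not give. It assumes that the mesh elements have already been sorted along a space-filling curve (for a static mesh this ordering could presumably be computed once) and that each element carries a scalar load estimate. The function name `partition_along_curve` and the sample weights are hypothetical. The total and the running sum computed here correspond to what a distributed version would typically obtain with reduction and prefix-sum collectives such as `MPI_Allreduce` and `MPI_Scan`.

```python
from itertools import accumulate


def partition_along_curve(weights, num_parts):
    """Cut a curve-ordered list of element loads into contiguous parts.

    weights:   per-element load estimates, already in SFC order (assumption).
    num_parts: number of partitions (for example, MPI processes).
    Returns one partition id per element, non-decreasing along the curve.
    """
    total = sum(weights)                 # distributed analogue: a global sum
    prefix = list(accumulate(weights))   # distributed analogue: a prefix sum
    owners = []
    for w, end in zip(weights, prefix):
        # Place each element by the weighted midpoint of its curve segment.
        mid = end - w / 2.0
        part = min(int(mid * num_parts / total), num_parts - 1)
        owners.append(part)
    return owners


# Hypothetical loads for seven elements in curve order.
weights = [2, 2, 1, 3, 1, 1, 2]
owners = partition_along_curve(weights, 3)

# Sum the load that ends up in each partition.
loads = [0, 0, 0]
for w, p in zip(weights, owners):
    loads[p] += w
```

Because every partition is a contiguous interval of the curve, rebalancing after a load change only requires recomputing the cut positions from the new weights; the curve order itself stays fixed as long as the mesh does not change.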