Skip to Main Content
Computational Fluid Dynamics (CFD) is an increasingly important application domain for computational scientists. In this paper, we propose and analyze optimizations necessary to run CFD simulations consisting of multi-billion-cell mesh models on large processor systems. Our investigation leverages the general industrial Navier-Stokes open-source CFD application, Code_Saturne, developed by Electricité de France (EDF). Our work considers emerging processor features such as many-core, Symmetric Multi-threading (SMT), Single Instruction Multiple Data (SIMD), Transactional Memory, and Thread Level Speculation. Initially, we have targeted per-node performance improvements by reconstructing the code and data layouts to optimally use multiple threads. We present a general loop transformation that will enable the compiler to generate OpenMP threads effectively with minimal impact to overall code structure. A renumbering scheme for mesh faces is proposed to enhance thread-level parallelism and generally improve data locality. Performance results on IBM Blue Gene/P supercomputer and Intel Xeon Westmere cluster are included.