Lattice Boltzmann methods (LBM) are used for the computational simulation of Newtonian fluid dynamics. LBM-based simulations are readily parallelizable; they have been implemented on general-purpose processors, field-programmable gate arrays (FPGAs), and graphics processing units (GPUs). Of these three platforms, GPU implementations have achieved the highest simulation performance per chip. With memory bandwidth of up to 141 GB/s and a theoretical peak floating-point throughput of over 600 GFLOPS, CUDA-ready GPUs from NVIDIA provide an attractive platform for a wide range of scientific simulations, including LBM. This paper improves upon prior single-precision GPU LBM results for the D3Q19 model in two ways: by increasing GPU multiprocessor occupancy, which raises maximum performance by 20%, and by introducing a space-efficient storage method that reduces GPU RAM requirements by 50% at a slight cost in performance. Both GPU implementations are over 28 times faster than a single-precision quad-core CPU version utilizing OpenMP.
Date of Conference: 22-25 Sept. 2009
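As a rough illustration of the memory savings claimed in the abstract (the paper's actual storage scheme is not detailed here), the sketch below compares a conventional double-buffered D3Q19 distribution layout, which keeps separate read and write arrays, against a single-copy layout. The 128³ grid size is an assumption chosen for illustration; the 19 velocities per node and 4-byte single-precision floats follow from the D3Q19 model and the precision stated in the abstract.

```python
# Back-of-the-envelope GPU RAM footprint for a D3Q19 lattice.
# Eliminating the second (write) copy of the distributions halves memory,
# consistent with the 50% reduction reported in the abstract.

Q = 19              # D3Q19: 19 discrete velocity distributions per node
BYTES = 4           # single-precision float
nx = ny = nz = 128  # illustrative grid size (assumption)
nodes = nx * ny * nz

one_copy = nodes * Q * BYTES   # space-efficient, single-copy storage
two_copies = 2 * one_copy      # conventional double-buffered storage

print(f"double-buffered: {two_copies / 2**20:.0f} MiB")    # 304 MiB
print(f"single-copy:     {one_copy / 2**20:.0f} MiB")      # 152 MiB
print(f"savings:         {1 - one_copy / two_copies:.0%}")  # 50%
```

Note that the single-copy scheme requires the streaming and collision steps to update distributions in place, which constrains the kernel's memory access pattern and accounts for the slight performance cost mentioned in the abstract.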