In this work, we present our implementation of density functional theory (DFT) plane wave pseudopotential (PWP) calculations on GPU clusters. The GPU version is based on PEtot, a CPU DFT-PWP code that can handle systems of up to a thousand atoms on thousands of processors. Our test indicates that the GPU version achieves a speed-up of about 10 times over the CPU version. A detailed analysis of the speed-up and of its scaling with the number of CPU/GPU computing units (up to 256) is presented. The speed-up relies on the adoption of a hybrid reciprocal-space and band-index parallelization scheme. As far as we know, this is the first GPU DFT-PWP code scalable to a large number of CPU/GPU computing units. We also outline future work and what is needed to increase the computational speed by a further factor of 10.
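The abstract names the hybrid reciprocal-space and band-index parallelization but does not describe it. As a rough illustration of the general idea only, the sketch below keeps one set of wavefunction coefficients in two distributions: one split over plane-wave (G-vector) indices and one split over band indices, with a redistribution step between them. It is a single-process simulation in plain Python with no MPI or GPU calls. All function names and the block-splitting rule are assumptions made for this example and are not taken from PEtot.

```python
def block_ranges(n, parts):
    """Split range(n) into contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def to_g_layout(coeffs, n_ranks):
    """Reciprocal-space layout (assumed): every simulated rank holds all bands,
    but only its own slice of the G-vector coefficients."""
    n_g = len(coeffs[0])
    return [[row[lo:hi] for row in coeffs] for lo, hi in block_ranges(n_g, n_ranks)]


def g_to_band_layout(g_layout, n_bands):
    """Band-index layout (assumed): every simulated rank holds a slice of bands,
    each with its full set of G-vector coefficients.

    On a real cluster this redistribution would be an all-to-all exchange;
    here it is simulated by concatenating the pieces held by each rank.
    """
    n_ranks = len(g_layout)
    band_layout = []
    for lo, hi in block_ranges(n_bands, n_ranks):
        local_bands = []
        for b in range(lo, hi):
            full_row = []
            for piece in g_layout:
                full_row.extend(piece[b])
            local_bands.append(full_row)
        band_layout.append(local_bands)
    return band_layout


if __name__ == "__main__":
    # Toy data: 4 bands, 6 plane-wave coefficients each, 2 simulated ranks.
    coeffs = [[10 * b + g for g in range(6)] for b in range(4)]
    g_layout = to_g_layout(coeffs, n_ranks=2)
    band_layout = g_to_band_layout(g_layout, n_bands=4)
    print(g_layout[0])
    print(band_layout[1])
```

In a layout of this kind, operations that act on one band at a time (such as an FFT of a single wavefunction) can run without communication in the band-index distribution, while operations that couple all bands at a given G-vector fit the reciprocal-space distribution.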
Date of Conference: 12-18 Nov. 2011