Profiles of refraction and bending angle, computed through the forward model for GPS radio occultation (GPS RO), are extremely important for assimilating GPS RO data into numerical weather prediction (NWP) forecast systems. The daily processing of GPS RO data in the assimilation system takes a considerable amount of time, so there is an urgent need for a new way to reduce the computing time. GPUs are well suited to many data-intensive and computation-intensive tasks and have emerged as inexpensive high-performance co-processors because of their tremendous computing power. In this paper, we demonstrate how the forward model for GPS RO can be accelerated considerably by using a throughput-oriented GPU on a standard PC. Our implementation is based on loop unrolling, CUDA streams, SPMD, and SIMD vector parallel computing. We have successfully implemented the forward model on a single-GPU platform and then developed a simple CPU/GPU parallel cluster. On a single GTX 480 GPU, the results show a speedup of up to 259 over the CPU-based program. Compared with a single node, our three-node cluster achieves a speedup of 2.68. These results demonstrate that the forward model can be parallelized efficiently on a CPU/GPU cluster and indicate that the cluster has good scalability.
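
The scalability claim can be made concrete with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes the usual definition of parallel efficiency (speedup divided by the number of workers), and the function name `parallel_efficiency` is hypothetical. It uses only the cluster figures quoted above (a speedup of 2.68 on three nodes relative to one node).

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return speedup divided by worker count (1.0 means ideal linear scaling).

    Hypothetical helper for illustration; assumes the common definition
    efficiency = speedup / workers.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Figures quoted in the abstract: 2.68 speedup on a three-node cluster
# relative to a single node.
cluster_efficiency = parallel_efficiency(2.68, 3)
print(f"{cluster_efficiency:.3f}")  # prints 0.893
```

Under this definition, the three-node cluster runs at roughly 89% of ideal linear scaling, which is consistent with the statement that the cluster scales well.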