GPS forward model computing study on CPU/GPU co-processing parallel system using CUDA

4 Author(s)
Fukang Yin (School of Computer, National University of Defense Technology, Changsha, China); Fengshun Lu; Xiaoqun Cao; Junqiang Song

Profiles of refraction and bending angle, computed by the forward model for GPS radio occultation (GPS RO), are extremely important for assimilating GPS RO data into numerical weather prediction (NWP) forecast systems. Daily processing of GPS RO data in an assimilation system takes a considerable amount of time, so there is an urgent need to reduce the computing time. GPUs are well suited to many data- and computation-intensive tasks and have emerged as inexpensive high-performance co-processors because of their tremendous computing power. In this paper, we demonstrate how the forward model for GPS can be accelerated considerably by using a throughput-oriented GPU on a standard PC. Our implementation is based on loop unrolling, CUDA streams, SPMD, and SIMD vector parallel computing. We first implemented the forward model on a single-GPU platform and then developed a simple CPU/GPU parallel cluster. On a single GTX 480 GPU, the results show a speedup of up to 259 over the CPU-based program. On our three-node cluster, the speedup relative to a single node is 2.68. These results demonstrate that the forward model can be parallelized efficiently on a CPU/GPU cluster, and they indicate that the cluster scales well.
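
As a rough check on the scalability claim, the reported cluster speedup can be converted into a parallel efficiency. The symbols S (speedup relative to a single node), N (number of nodes), and E (parallel efficiency) are introduced here for illustration and do not appear in the abstract; the calculation assumes the usual definition of efficiency as speedup divided by node count.

\[
E = \frac{S}{N} = \frac{2.68}{3} \approx 0.89
\]

Under that assumption, each of the three nodes contributes about 89% of what ideal linear scaling would give.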

Published in:

2010 IEEE International Conference on Progress in Informatics and Computing (PIC), Volume 1

Date of Conference:

10-12 Dec. 2010