Accelerating Kirchhoff Migration by CPU and GPU Cooperation

12 Author(s)
Jairo Panetta ; Tecnol. Geofisica, Petroleo Brasileiro SA, Rio de Janeiro, Brazil ; Thiago Teixeira ; Paulo R. P. de Souza Filho ; Carlos A. da Cunha Filho

We discuss the performance of the Petrobras production Kirchhoff prestack seismic migration on a cluster of 64 GPUs and 256 CPU cores. Porting and optimizing the application hot spot (98.2% of the execution time on a single CPU core) to a single GPU reduces total execution time by a factor of 36 on a control run. We then argue against the usual practice of porting the next hot spot (1.5% of the single-core execution time) to the GPU. Instead, we show that cooperation between CPU and GPU reduces total execution time by a factor of 59 on the same control run. Remaining GPU idle cycles are eliminated by overloading the GPU with multiple requests originating from distinct CPU cores. However, increasing the number of CPU cores in the computation reduces the gain, because the runs without GPUs benefit from the added parallelism while the GPU saturates in the runs with GPUs. We then obtain close to perfect speed-up on the full cluster over a homogeneous load obtained by replicating the control run data. To cope with the heterogeneous load of real-world data, we present a dynamic load balancing scheme that reduces total execution time by a factor of 20 on runs that use all GPUs and half of the cluster CPU cores, relative to runs that use all CPU cores but no GPU.
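
The percentages and factors quoted in the abstract can be put side by side with a little arithmetic. The sketch below is not taken from the paper: it assumes a plain Amdahl's-law model in which the accelerated part and the rest of the run execute one after the other, and the function names are hypothetical. Under that assumption it computes the speed-up ceiling for porting only the first hot spot, the ceiling for porting both hot spots, and the kernel speed-up implied by the reported overall factor of 36.

```python
# Back-of-the-envelope Amdahl's-law arithmetic for the figures in the abstract.
# Assumption (not from the paper): accelerated and non-accelerated parts run
# strictly one after the other, and the measured fractions stay fixed.

HOT_SPOT = 0.982       # first hot spot, fraction of single-core run time
NEXT_HOT_SPOT = 0.015  # second hot spot, fraction of single-core run time


def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speed-up when `fraction` of the run time is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def amdahl_limit(fraction: float) -> float:
    """Ceiling on overall speed-up when `fraction` becomes infinitely fast."""
    return 1.0 / (1.0 - fraction)


def implied_factor(fraction: float, overall: float) -> float:
    """Speed-up of the accelerated part implied by a measured overall speed-up."""
    return fraction / (1.0 / overall - (1.0 - fraction))


limit_one = amdahl_limit(HOT_SPOT)                   # about 55.6
limit_two = amdahl_limit(HOT_SPOT + NEXT_HOT_SPOT)   # about 333
kernel_factor = implied_factor(HOT_SPOT, 36.0)       # about 100

print(f"ceiling, first hot spot only: {limit_one:.1f}")
print(f"ceiling, both hot spots:      {limit_two:.1f}")
print(f"implied kernel speed-up:      {kernel_factor:.1f}")
```

Under this serial model, porting only the first hot spot cannot exceed a factor of about 55.6. The reported factor of 59 for CPU and GPU cooperation lies slightly above that ceiling, which suggests the serial model does not fully describe the cooperative runs. The abstract does not give enough detail to say more.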

Published in:

Computer Architecture and High Performance Computing, 2009. SBAC-PAD '09. 21st International Symposium on

Date of Conference:

28-31 Oct. 2009