Parallel Fast Gauss Transform

3 Author(s)
Rahul S. Sampath (Oak Ridge Nat. Lab., Oak Ridge, TN, USA); Hari Sundar; Shravan K. Veerapaneni

We present fast adaptive parallel algorithms to compute the sum of N Gaussians at N points. Direct sequential computation of this sum would take O(N^2) time. The parallel time complexity estimates for our algorithms are O(N/n_p) for uniform point distributions and O((N/n_p) log(N/n_p) + n_p log n_p) for nonuniform distributions using n_p CPUs. We incorporate a plane-wave representation of the Gaussian kernel, which permits "diagonal translation". We use parallel octrees and a new scheme for translating the plane waves to efficiently handle nonuniform distributions. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 seconds using 4096 cores on the Jaguar supercomputer at the Oak Ridge National Laboratory. Our implementation is kernel-independent and can handle other "Gaussian-type" kernels even when an explicit analytic expression for the kernel is not known. These algorithms form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.
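
For orientation, the direct O(N^2) evaluation that the abstract uses as its baseline can be sketched as below. This is a minimal illustration and not the authors' algorithm: it assumes the common form of the discrete Gauss transform, G(t_j) = sum_i w_i exp(-|t_j - s_i|^2 / delta), and the function name, argument names, and the bandwidth parameter `delta` are hypothetical, since the abstract does not specify them.

```python
import math


def direct_gauss_transform(sources, weights, targets, delta):
    """Direct evaluation of G(t) = sum_i w_i * exp(-|t - s_i|^2 / delta).

    Assumed formulation (not taken from the abstract): each source s_i
    carries a weight w_i, and delta is the Gaussian bandwidth.
    Cost is O(len(sources) * len(targets)), i.e. O(N^2) when both are N.
    """
    result = []
    for t in targets:
        total = 0.0
        for s, w in zip(sources, weights):
            # Squared Euclidean distance between target t and source s.
            dist2 = sum((a - b) ** 2 for a, b in zip(t, s))
            total += w * math.exp(-dist2 / delta)
        result.append(total)
    return result


if __name__ == "__main__":
    sources = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    weights = [1.0, 0.5]
    targets = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
    print(direct_gauss_transform(sources, weights, targets, delta=1.0))
```

The nested loop makes the quadratic cost explicit: every target visits every source. A fast transform of the kind described in the abstract avoids this pairing by grouping points spatially, which is why a brute-force routine like this one is mainly useful as an accuracy reference on small inputs.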

Published in:

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Date of Conference:

13-19 Nov. 2010