Memory Hierarchy Optimization for Large Tridiagonal System Solvers on GPU

4 Author(s)
Lamas-Rodriguez, J. ; Res. Center on Inf. Technol., Univ. of Santiago de Compostela, Santiago de Compostela, Spain ; Arguello, F. ; Heras, D.B. ; Boo, M.

GPUs are now commodity hardware with hundreds of cores and support for thousands of threads, and they can accelerate a wide range of applications. From a programmer's perspective, GPUs offer a stream processing model that requires new techniques to exploit their capabilities. In this paper we present the application of the split-and-merge technique to two parallel tridiagonal system solvers on the GPU: cyclic reduction and recursive doubling. The split-and-merge technique naturally splits the algorithm flow into parallel paths that can be solved in shared memory and later merged in global memory. In this way, large systems of equations can be solved efficiently by exploiting the memory hierarchy of the GPU. The results show a significant acceleration compared with the direct implementation of the algorithms on the GPU.
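
For readers unfamiliar with cyclic reduction, the following is a minimal serial sketch in Python of the textbook algorithm. It is not the authors' GPU implementation and does not include the split-and-merge step. The function name `cyclic_reduction_solve` and the array layout are assumptions made for illustration, and the sketch assumes a system (for example, a diagonally dominant one) for which no pivoting is needed. Within each level, the iterations of the inner loops are independent of one another, which is the property a GPU implementation would typically map to threads.

```python
def cyclic_reduction_solve(a, b, c, d):
    """Solve a tridiagonal system by cyclic reduction (serial reference sketch).

    a: sub-diagonal   (a[0] is ignored and treated as 0)
    b: main diagonal
    c: super-diagonal (c[-1] is ignored and treated as 0)
    d: right-hand side
    Assumes no pivot b[i] becomes zero (e.g. a diagonally dominant matrix).
    """
    n = len(b)
    # Work on float copies so the caller's lists are not modified.
    a = [float(v) for v in a]
    b = [float(v) for v in b]
    c = [float(v) for v in c]
    d = [float(v) for v in d]
    a[0] = 0.0
    c[n - 1] = 0.0

    # Forward reduction: at stride s, equation i is combined with equations
    # i - s and i + s so that it couples only to unknowns i - 2s and i + 2s.
    s = 1
    while s < n:
        for i in range(2 * s - 1, n, 2 * s):  # iterations are independent
            lo, hi = i - s, i + s
            k1 = a[i] / b[lo]
            new_a = -a[lo] * k1
            b[i] -= c[lo] * k1
            d[i] -= d[lo] * k1
            new_c = 0.0
            if hi < n:
                k2 = c[i] / b[hi]
                b[i] -= a[hi] * k2
                d[i] -= d[hi] * k2
                new_c = -c[hi] * k2
            a[i], c[i] = new_a, new_c
        s *= 2

    # Back substitution: walk the strides in reverse, solving each equation
    # from neighbours that were already computed at a coarser level.
    x = [0.0] * n
    while s >= 1:
        for i in range(s - 1, n, 2 * s):  # iterations are independent
            rhs = d[i]
            if i - s >= 0:
                rhs -= a[i] * x[i - s]
            if i + s < n:
                rhs -= c[i] * x[i + s]
            x[i] = rhs / b[i]
        s //= 2
    return x


if __name__ == "__main__":
    # Small diagonally dominant example whose exact solution is 1, 2, 3, 4.
    print(cyclic_reduction_solve([0, -1, -1, -1], [2, 2, 2, 2],
                                 [-1, -1, -1, 0], [0, 0, 0, 5]))
```

The number of active equations halves at each forward level, so the work per level shrinks as the stride grows.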

Published in:

Parallel and Distributed Processing with Applications (ISPA), 2012 IEEE 10th International Symposium on

Date of Conference:

10-13 July 2012