Parallel multiple precision division by a single precision divisor

Authors:

Emmart, N. (Comput. Sci. Dept., Univ. of Massachusetts, Amherst, MA, USA); Weems, C.

We report an algorithm for division of a multi-precision integer by a single-precision value using a graphics processing unit (GPU). Our algorithm combines a parallel version of Jebelean's exact division algorithm with a left-to-right algorithm for computing the borrow chain, to relax the requirement of exactness. We also employ Takahashi's recently reported cyclic reduction technique [10] for GPU division to further enhance performance. The result is that our algorithm is asymptotically faster, at O(n/p + log p), than Takahashi's algorithm at O(n/p log p). We report results for dividends with precisions of 1024, 2048, and 4096 bits running on an NVIDIA GTX 480, and show that, for non-constant divisors, our algorithm is 20% slower at 1024 bits (due to startup overhead), 40% faster at 2048 bits, and 2.5 times faster at 4096 bits. For division by constants, with precomputed tables, our algorithm is faster at all sizes, with speedups ranging from 2.3 to 6 times.
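
For readers unfamiliar with the exact-division building block named in the abstract, the following is a minimal sequential sketch in Python of right-to-left exact division of a multi-limb integer by a single odd limb, in the spirit of Jebelean's method. It is not the authors' parallel GPU algorithm and does not include the left-to-right borrow-chain step or the cyclic reduction technique. The 32-bit limb width and the helper names (`to_limbs`, `from_limbs`, `inverse_mod_base`, `exact_div_limb`) are assumptions chosen for illustration only.

```python
# Illustrative sketch only: sequential exact division by a single odd limb.
# Limb width and all function names are assumptions, not taken from the paper.
W = 32                 # assumed limb width in bits
BASE = 1 << W
MASK = BASE - 1


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian W-bit limbs."""
    return [(x >> (W * i)) & MASK for i in range(n)]


def from_limbs(limbs):
    """Reassemble a little-endian limb list into an integer."""
    return sum(limb << (W * i) for i, limb in enumerate(limbs))


def inverse_mod_base(d):
    """Inverse of an odd d modulo 2**W, by Newton iteration."""
    inv = d  # d * d == 1 (mod 8) for odd d, so 3 bits are already correct
    for _ in range(5):  # correct bits double each step: 3, 6, 12, 24, 48, 96
        inv = (inv * (2 - d * inv)) & MASK
    return inv


def exact_div_limb(a, d):
    """Divide little-endian limbs `a` by the odd single limb `d`.

    Works from the least significant limb upward. Returns (q, borrow);
    a final borrow of 0 indicates that d divided the input exactly.
    """
    if d & 1 == 0:
        raise ValueError("this sketch assumes an odd divisor")
    d_inv = inverse_mod_base(d)
    q = []
    borrow = 0
    for limb in a:
        t = limb - borrow
        underflow = 1 if t < 0 else 0   # had to borrow from the next limb
        t &= MASK
        qi = (t * d_inv) & MASK         # quotient limb: qi * d == t (mod BASE)
        q.append(qi)
        borrow = ((qi * d) >> W) + underflow  # high word carried to next limb
    return q, borrow
```

Each quotient limb depends on the borrow from the limb before it, which is the sequential dependency that a parallel version has to break up. In this sketch a nonzero final borrow only signals that the input was not an exact multiple of the divisor; it does not recover the quotient and remainder of a general division.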

Published in:

High Performance Computing (HiPC), 2011 18th International Conference on

Date of Conference:

18-21 Dec. 2011