Scalable parallel matrix multiplication on distributed memory parallel computers

1 Author(s)
Keqin Li ; Dept. of Math. & Comput. Sci., State Univ. of New York, New Paltz, NY, USA

Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity O(N^α), where 2 < α ≤ 3. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in O(log N) time by using N^α/log N processors. Such a parallel computation is cost optimal and matches the performance of PRAM. Furthermore, our parallelization on a DMPC can be made fully scalable, that is, for all 1 ≤ p ≤ N^α/log N, multiplying two N×N matrices can be performed by a DMPC with p processors in O(N^α/p) time, i.e., linear speedup and cost optimality can be achieved in the range [1..N^α/log N]. This unifies all known algorithms for matrix multiplication on DMPC, standard or non-standard, sequential or parallel. Extensions of our methods and results to other parallel systems are also presented. The above claims result in significant progress in scalable parallel matrix multiplication (as well as solving many other important problems) on distributed memory systems, both theoretically and practically.
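
The idea of linear speedup can be illustrated with a minimal sketch for the standard algorithm (α = 3). This is not the paper's method: it ignores all communication between processors on a DMPC, only simulates the processors in a single loop, and is useful only for p ≤ N^2 (beyond that some simulated processors stay idle). The function name `partitioned_matmul` and the round-robin assignment are assumptions made for illustration.

```python
def partitioned_matmul(a, b, p):
    """Multiply square matrices a and b, sharing the output entries
    among p simulated processors (hypothetical illustration only).

    Returns (product, ops), where ops[k] is the number of scalar
    multiplications charged to simulated processor k.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    ops = [0] * p
    for idx in range(n * n):
        k = idx % p            # round-robin: output entry idx goes to processor k
        i, j = divmod(idx, n)  # row and column of that output entry
        total = 0
        for t in range(n):
            total += a[i][t] * b[t][j]
        c[i][j] = total
        ops[k] += n            # one inner product costs n multiplications
    return c, ops


product, ops = partitioned_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2)
print(product)  # [[19, 22], [43, 50]]
print(ops)      # [4, 4]
```

In this example the total work is N^3 = 8 multiplications and each of the p = 2 simulated processors performs N^3/p = 4 of them, so the product p × (work per processor) stays at N^3. That is the sense in which a parallel algorithm is cost optimal relative to its sequential counterpart.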

Published in:

Parallel and Distributed Processing Symposium, 2000. IPDPS 2000. Proceedings. 14th International

Date of Conference: