Data distribution and communication schemes for IQMR method on massively distributed memory computers

1 Author(s)
L. T. Yang ; St. Francis Xavier Univ., Antigonish, NS, Canada

We study the parallelization of the IQMR method for the solution of linear systems of equations with unsymmetric coefficient matrices. The IQMR method is an improved version of the quasi-minimal residual (QMR) method that uses the Lanczos process as a major component and combines elements of numerical stability and parallel algorithm design. The algorithm is derived such that all inner products and matrix-vector multiplications of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with computation time. Two important questions are discussed: what is the best possible data distribution, and which communication network topology is most suitable for the IQMR method on massively parallel distributed memory computers? A theoretical model of the data distribution and communication phases, based mainly on (Hoekstra et al., 1991; 1992), is presented; it allows us to give a detailed execution time complexity analysis, and its usefulness is investigated. It is shown that the implementation of IQMR with a row-block decomposition of the coefficient matrix on a ring communication structure is the most efficient choice. Performance tests of the developed parallel IQMR algorithm have been carried out on a massively distributed memory system, and the experimental timing results are compared with the theoretical execution time complexity analysis.
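To make the row-block decomposition on a ring more concrete, the following is a minimal sketch of a matrix-vector product computed that way. It is not the paper's implementation: it simulates the processes sequentially in plain Python, treats the matrix as dense, and the names `split_bounds` and `ring_matvec` are hypothetical, introduced only for illustration. Each simulated process owns a contiguous block of rows of the matrix and the matching block of the vector, and the vector blocks circulate around the ring until every process has seen all of them.

```python
def split_bounds(n, p):
    """Split range(n) into p contiguous blocks; return [(start, stop), ...]."""
    base, extra = divmod(n, p)
    bounds, start = [], 0
    for rank in range(p):
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def ring_matvec(A, x, p):
    """Simulate y = A x with a row-block distribution over a ring of p processes.

    Assumption: A is a dense list of rows; a real solver would use sparse
    storage and actual message passing instead of this sequential simulation.
    """
    bounds = split_bounds(len(x), p)
    # Each simulated process owns a block of rows of A and the matching block of x.
    local_rows = [A[lo:hi] for lo, hi in bounds]
    held = [(rank, x[lo:hi]) for rank, (lo, hi) in enumerate(bounds)]
    local_y = [[0.0] * (hi - lo) for lo, hi in bounds]

    for _ in range(p):
        for rank in range(p):
            owner, x_block = held[rank]
            col_lo, _ = bounds[owner]
            # Multiply only the columns that match the block currently held.
            for i, row in enumerate(local_rows[rank]):
                for j, xj in enumerate(x_block):
                    local_y[rank][i] += row[col_lo + j] * xj
        # Ring shift: each process receives the block held by its left neighbour.
        # After p shifts every block is back with its original owner.
        held = [held[(rank - 1) % p] for rank in range(p)]

    # Concatenate the local result blocks in rank order.
    return [v for block in local_y for v in block]
```

For example, `ring_matvec([[1, 2], [3, 4]], [1, 1], 2)` simulates two processes that each hold one row and exchange their single vector entry once before the result is complete. In a real distributed setting the ring shift would be a neighbour-to-neighbour send and receive, which is the step a communication model such as the one in the abstract would account for.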

Published in:

Parallel Processing, 2000. Proceedings. 2000 International Workshops on

Date of Conference: