A three-parameter fast Givens QR algorithm for superscalar processors

2 Author(s)
Carrig, J.J., Jr. ; Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA ; Meyer, G.G.L.

We present a three-parameter fast Givens QR algorithm that exploits parallelism to improve performance on superscalar processors. We provide a selection of parameter values for which the new algorithm reduces to the standard algorithm, but show that non-standard values minimize the number of cache misses, memory references, and pipeline stalls. Using a tractable model of a superscalar machine architecture, we derive rules for estimating the optimal combination of parameter values. Applying these rules, we observe a speedup over the standard algorithm of 2.4 on the Intel Pentium Pro system, 2.0 on a single thin POWER2 processor of the IBM SP2, 1.6 on a single wide POWER2 processor of the IBM SP2, and 4.2 on a single R8000 processor of the SGI POWER Challenge XL.
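For readers unfamiliar with the baseline, the following is a minimal sketch of a classical Givens-rotation QR factorization. It is not the three-parameter algorithm of the paper, whose parameters are not described in this abstract, and it is not the square-root-free "fast" variant either; the function name `givens_qr` and the list-of-lists matrix layout are assumptions chosen for illustration only.

```python
import math


def givens_qr(a):
    """Factor an m x n matrix (list of rows) as a = q * r.

    Illustrative classical Givens QR; names and layout are hypothetical,
    not taken from the paper.
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]  # working copy, becomes R
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for j in range(n):
        # Zero column j below the diagonal, working from the bottom row up.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Rotate rows i-1 and i of R so that r[i][j] becomes zero.
            for k in range(n):
                t1, t2 = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * t1 + s * t2
                r[i][k] = -s * t1 + c * t2
            # Accumulate the transposed rotation into columns i-1 and i of Q.
            for k in range(m):
                t1, t2 = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * t1 + s * t2
                q[k][i] = -s * t1 + c * t2
    return q, r
```

Calling `givens_qr([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])` returns an orthogonal 3 x 3 `q` and an upper-triangular 3 x 2 `r` whose product reproduces the input up to rounding. Each rotation touches only two rows, and the order and grouping of those row updates is the kind of choice that affects cache misses and memory references.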

Published in:

Proceedings of the 1996 International Conference on Parallel Processing, 1996. Vol. 3: Software (Volume: 2)

Date of Conference:

12-16 Aug 1996