Multi-level parallelism in the block-Jacobi SVD algorithm

2 Author(s)
Oksa, G.; Dept. of Inf., Slovak Acad. of Sci., Bratislava, Slovakia; Vajtersic, M.

We analyse the fine-grained parallelism of the two-sided block-Jacobi algorithm for the singular value decomposition (SVD) of a matrix $A \in \mathbb{R}^{m \times n}$, $m \geq n$. The algorithm involves the class CO of parallel orderings on the two-dimensional toroidal mesh with $p$ processors. The mathematical background is based on the QR decomposition (QRD) of local data matrices and on the triangular Kogbetliantz algorithm (TKA) for local SVDs in the diagonal mesh processors. Subsequent updates of local matrices in the diagonal as well as nondiagonal mesh processors are required. We show that all updates can be realized by orthogonal modified Givens rotations. These rotations can be efficiently pipelined in parallel in the horizontal and vertical rings of $\sqrt{p}$ processors through the toroidal mesh. For one mesh processor, our solution requires $O[(m+n)^2/p]$ systolic processing elements (PEs), $O(m^2/p)$ local memory registers, and $O[(m+n)^2/p]$ additional delay elements. The time complexity of our solution is $O[(m+n^{3/2}/p^{3/4})\Delta]$ time steps per global iteration, where $\Delta$ is the length of the global synchronization time step, given by the evaluation and application of two modified Givens rotations in TKA.
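To make the role of the two rotations per TKA step concrete, the following is a minimal sketch of the textbook two-sided rotation that diagonalizes a single 2×2 matrix, which is the elementary operation behind Kogbetliantz-type methods. It is not the paper's modified Givens rotation or its systolic implementation; it uses plain trigonometric rotations, and the names `rotation`, `matmul2`, `transpose2`, and `kogbetliantz_2x2` are hypothetical helpers introduced only for this illustration.

```python
import math


def matmul2(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(x):
    """Transpose of a 2x2 matrix."""
    return [[x[0][0], x[1][0]], [x[0][1], x[1][1]]]


def rotation(theta):
    """Plane rotation [[c, s], [-s, c]] with c = cos(theta), s = sin(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def kogbetliantz_2x2(m):
    """Return (left, diag, right) with left * m * right = diag (diagonal).

    Assumption: standard (not 'modified') rotations are used. The diagonal
    entries may be negative, so their absolute values are the singular values.
    """
    (a, b), (c, d) = m
    # Step 1: a left rotation that makes the matrix symmetric.
    r1 = rotation(math.atan2(c - b, a + d))
    sym = matmul2(r1, m)
    p, q, r = sym[0][0], sym[0][1], sym[1][1]
    # Step 2: a Jacobi rotation that diagonalizes the symmetric matrix.
    right = rotation(0.5 * math.atan2(2.0 * q, r - p))
    left = matmul2(transpose2(right), r1)
    diag = matmul2(matmul2(left, m), right)
    return left, diag, right


# Usage on an upper triangular 2x2 block, the shape TKA works with.
left, diag, right = kogbetliantz_2x2([[3.0, 1.0], [0.0, 2.0]])
print(diag)
```

Both `left` and `right` are orthogonal, so the Frobenius norm and the absolute value of the determinant of the input are preserved in `diag`. In the block algorithm described above, rotations of this kind would then also have to be applied to the neighbouring local matrices, which is where the pipelining through the mesh rings comes in.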

Published in:

Parallel and Distributed Processing, 2001. Proceedings. Ninth Euromicro Workshop on

Date of Conference: