By Topic

The Effects of Problem Partitioning, Allocation, and Granularity on the Performance of Multiple-Processor Systems

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

1 Author(s)
Cvetanovic, Z. ; Digital Equipment Corporation

In this paper we analyze the effects of the problem decomposition, the allocation of subproblems to processors, and the grain size of subproblems on the performance of a multiple- processor shared-memory architecture. Our results indicate that for algorithms where both the computation and the communication overhead can be fully decomposed among N processors, the speedup is a nondecreasing function of the level of granularity for arbitrary interconnection structure and allocation of subproblems to processors. For these algorithms, the speedup is an increasing function of the level of granularity provided that the interconnection bandwidth is greater than unity. If the bandwidth is equal to unity, then the speedup converges to the value equal to the ratio of processing time to communication time. For algorithms where the computation is decomposable but the communication overhead cannot be decomposed, the speedup is a nondecreasing function of the level of granularity for the best case bandwidth only. If the bandwidth is less than N, the speedup reaches its maximum and then decreases approaching zero as the level of granularity grows. For algorithms where the computation consists of parallel and serial sections of code and the communication overhead is fully decomposable, the speedup converges to a value inversely proportional to the fraction of time spent in the serial code even for the best case interconnection bandwidth.

Published in:

Computers, IEEE Transactions on  (Volume:C-36 ,  Issue: 4 )