Fault tolerance of allocation schemes in massively parallel computers

2 Author(s)
Livingston, M. (Dept. of Comput. Sci., Southern Illinois Univ., Edwardsville, IL, USA); Stout, Quentin F.

The authors examine the problem of locating and allocating large fault-free subsystems in multiuser massively parallel computer systems. Because the allocation schemes used in such large systems cannot allocate all possible subsystems, fault tolerance is reduced. The effects of different allocation methods, including the buddy and Gray-coded buddy schemes, on the allocation of subsystems in the hypercube and in the two-dimensional mesh and torus are analyzed. Both worst-case and expected-case performance are studied. Generalizing the buddy and Gray-coded buddy schemes, the authors introduce a family of allocation schemes that exhibit a significant improvement in fault tolerance over the existing schemes while using relatively few additional resources. For purposes of comparison, the behavior of the various schemes is studied for the allocation of subsystems of 2^18 processors in a hypercube, mesh, and torus of 2^20 processors. The methods combine analytical techniques and simulation.
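The abstract names the buddy and Gray-coded buddy schemes without defining them. The following is a minimal sketch for the hypercube case only, assuming the commonly used definitions: the buddy scheme allocates subcubes formed by aligned blocks of consecutive node addresses, and the Gray-coded buddy scheme allocates pairs of consecutive half-size blocks in binary reflected Gray code order, with wraparound. These definitions, the function names, and the small example (a 16-node hypercube with four faults) are illustrative assumptions and are not taken from the paper.

```python
def gray(i: int) -> int:
    """Binary reflected Gray code of i."""
    return i ^ (i >> 1)


def buddy_subcubes(n: int, k: int) -> list[frozenset[int]]:
    """Allocatable k-dimensional subcubes of an n-cube under the (assumed)
    buddy scheme: aligned blocks of 2^k consecutive addresses."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    size = 1 << k
    return [frozenset(range(j * size, (j + 1) * size))
            for j in range(1 << (n - k))]


def gray_buddy_subcubes(n: int, k: int) -> list[frozenset[int]]:
    """Allocatable k-dimensional subcubes under the (assumed) Gray-coded
    buddy scheme: two consecutive blocks of 2^(k-1) nodes in Gray code
    order, wrapping around at the end of the sequence."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    total = 1 << n
    half = 1 << (k - 1)
    cubes = []
    for j in range(total // half):
        start = j * half
        cubes.append(frozenset(gray((start + t) % total)
                               for t in range(2 * half)))
    # k == n yields the whole cube twice, so drop duplicates (order kept).
    return list(dict.fromkeys(cubes))


def is_subcube(nodes: frozenset[int]) -> bool:
    """A node set is a subcube iff its size is 2^(number of varying bits)."""
    all_or, all_and = 0, -1
    for v in nodes:
        all_or |= v
        all_and &= v
    varying = all_or ^ all_and
    return len(nodes) == 1 << bin(varying).count("1")


def fault_free(subcubes, faults):
    """Allocatable subcubes that contain no faulty node."""
    faulty = set(faults)
    return [s for s in subcubes if s.isdisjoint(faulty)]


# Illustrative example: 16-node hypercube, 4-node subcubes, four faults.
faults = {0, 4, 8, 12}
print(len(fault_free(buddy_subcubes(4, 2), faults)))       # 0
print(len(fault_free(gray_buddy_subcubes(4, 2), faults)))  # 2
```

In this small case the four faults hit every subcube the buddy scheme can allocate, while the Gray-coded variant, which recognizes twice as many subcubes, still finds two fault-free ones. The sketch illustrates only why recognizing more subcubes can improve fault tolerance; it does not reproduce the paper's analysis or its generalized family of schemes.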

Published in:

Frontiers of Massively Parallel Computation, 1988. Proceedings., 2nd Symposium on the Frontiers of

Date of Conference:

10-12 Oct 1988