Creating personal adaptive clusters for managing scientific jobs in a distributed computing environment

4 Author(s)
Walker, E. (Texas Adv. Comput. Center, Texas Univ., Austin, TX); Gardner, J.P.; Litvin, V.; Turner, E.L.

We describe a system for creating personal clusters in user space to support the submission and management of thousands of compute-intensive serial jobs on the network-connected compute resources of the NSF TeraGrid. The system implements a robust infrastructure that submits and manages job proxies across a distributed computing environment. These job proxies contribute resources to personal clusters that are created dynamically, on demand, for each user. The system adapts to the prevailing job load at the distributed sites by migrating job proxies to sites expected to provide resources more quickly. The version of the system described in this paper allows users to build large personal Condor and Sun Grid Engine clusters on the TeraGrid. Users can then submit, monitor, and control their scientific jobs through a single uniform interface, using the feature-rich functionality of these job management environments. Up to 100,000 user jobs have been submitted through the system to date, enabling approximately 900 teraflops of scientific computation.
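The adaptive step mentioned in the abstract, migrating job proxies toward sites expected to provide resources sooner, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Proxy` class, the `rebalance` function, the `expected_wait` estimates, and the `min_gain` threshold are all hypothetical names and inputs chosen for illustration, since the abstract does not say how the system estimates wait times or decides when to migrate.

```python
from dataclasses import dataclass


@dataclass
class Proxy:
    """A job proxy queued or running at one site (hypothetical model)."""
    proxy_id: int
    site: str
    running: bool = False  # a running proxy already contributes resources


def rebalance(proxies, expected_wait, min_gain=60.0):
    """Move queued proxies to the site with the shortest expected wait.

    expected_wait maps a site name to the estimated seconds until a
    queued proxy starts there. This estimate is an assumed input; the
    abstract does not describe how the real system obtains it.
    Returns the ids of the proxies that were moved.
    """
    best_site = min(expected_wait, key=expected_wait.get)
    moved = []
    for proxy in proxies:
        # Leave running proxies alone, and skip those already at the best site.
        if proxy.running or proxy.site == best_site:
            continue
        gain = expected_wait[proxy.site] - expected_wait[best_site]
        # Migrate only when the estimated saving is worth the resubmission.
        if gain >= min_gain:
            proxy.site = best_site
            moved.append(proxy.proxy_id)
    return moved


proxies = [Proxy(1, "siteA"), Proxy(2, "siteB"), Proxy(3, "siteA", running=True)]
waits = {"siteA": 3600.0, "siteB": 300.0}
moved = rebalance(proxies, waits)
```

In this toy run only the queued proxy at the slower site is moved. The running proxy stays put because it is already supplying a resource to the personal cluster.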

Published in:

Challenges of Large Applications in Distributed Environments, 2006 IEEE
