By Topic

Dynamic cluster resource allocations for jobs with known and unknown memory demands

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Li Xiao ; Dept. of Comput. Sci., Coll. of William & Mary, Williamsburg, VA, USA ; Songqing Chen ; Xiaodong Zhang

The cluster system we consider for load sharing is a compute farm which is a pool of networked server nodes providing high-performance computing for CPU-intensive, memory-intensive, and I/O active jobs in a batch mode. Existing resource management systems mainly target at balancing the usage of CPU loads among server nodes. With the rapid advancement of CPU chips, memory and disk access speed improvements significantly lag behind advancement of CPU speed, increasing the penalty for data movement, such as page faults and I/O operations, relative to normal CPU operations. Aiming at reducing the memory resource contention caused by page faults and I/O activities, we have developed and examined load sharing policies by considering effective usage of global memory in addition to CPU load balancing in clusters. We study two types of application workloads: 1) Memory demands are known in advance or are predictable and 2) memory demands are unknown and dynamically changed during execution. Besides using workload traces with known memory demands, we have also made kernel instrumentation to collect different types of workload execution traces to capture dynamic memory access patterns. Conducting different groups of trace-driven simulations, we show that our proposed policies can effectively improve overall job execution performance by well utilizing both CPU and memory resources with known and unknown memory demands

Published in:

Parallel and Distributed Systems, IEEE Transactions on  (Volume:13 ,  Issue: 3 )