Quantifying the performability of cluster-based services

6 Author(s)
Kiran Nagaraja ; Dept. of Comput. Sci., Rutgers Univ., New Brunswick, NJ, USA ; G. Gama ; R. Bianchini ; R. P. Martin

In this paper, we propose a two-phase methodology for systematically evaluating the performability (performance and availability) of cluster-based Internet services. In the first phase, evaluators use a fault-injection infrastructure to characterize the service's behavior in the presence of faults. In the second phase, evaluators use an analytical model to combine an expected fault load with measurements from the first phase to assess the service's performability. Using this model, evaluators can study the service's sensitivity to different design decisions, fault rates, and other environmental factors. To demonstrate our methodology, we study the performability of a multitier Internet service. In particular, we evaluate the performance and availability of three soft state maintenance strategies for an online bookstore service in the presence of seven classes of faults. Among other interesting results, we clearly isolate the effect of different faults, showing that the tier of Web servers is responsible for an often dominant fraction of the service unavailability. Our results also demonstrate that storing the soft state in a database achieves better performability than storing it in main memory (even when the state is efficiently replicated) when we weight performance and availability equally. Based on our results, we conclude that service designers may want an unbalanced system in which they heavily load highly available components and leave more spare capacity for components that are likely to fail more often.
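The second phase described above, combining an expected fault load with per-fault measurements, can be illustrated with a minimal sketch. This is not the paper's analytical model: it assumes faults are rare and never overlap, that each fault class has a single degraded stage, and that offered load equals fault-free throughput. The `FaultClass` type, the `evaluate` function, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FaultClass:
    """One class of fault (hypothetical structure, not from the paper)."""
    name: str
    mttf_hours: float      # expected fault load: mean time between faults
    degraded_hours: float  # phase-1 measurement: time degraded per fault
    throughput: float      # phase-1 measurement: req/s while degraded


def evaluate(normal_throughput, fault_classes):
    """Return (average throughput, availability, per-class unavailability).

    Assumes non-overlapping faults, so the fraction of time spent in
    each fault class is degraded_hours / mttf_hours.
    """
    fractions = {f.name: f.degraded_hours / f.mttf_hours for f in fault_classes}
    degraded = sum(fractions.values())
    if degraded >= 1.0:
        raise ValueError("fault load too high for the non-overlap assumption")

    # Time-weighted average of fault-free and degraded throughput.
    average = (1.0 - degraded) * normal_throughput
    average += sum(fractions[f.name] * f.throughput for f in fault_classes)

    # Availability here is the fraction of offered requests that are served.
    availability = average / normal_throughput

    # Share of unavailability attributable to each fault class.
    unavailability = {
        f.name: fractions[f.name] * (1.0 - f.throughput / normal_throughput)
        for f in fault_classes
    }
    return average, availability, unavailability


# Hypothetical inputs, for illustration only.
faults = [
    FaultClass("node crash", mttf_hours=100.0, degraded_hours=1.0, throughput=50.0),
    FaultClass("disk fault", mttf_hours=200.0, degraded_hours=2.0, throughput=0.0),
]
average, availability, unavailability = evaluate(100.0, faults)
```

Varying `mttf_hours` or the degraded throughput of one class shows how sensitive the result is to that fault, which is the kind of what-if study the abstract describes.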

Published in:

IEEE Transactions on Parallel and Distributed Systems (Volume: 16, Issue: 5)