By Topic

Fault detection and localization in distributed systems using invariant relationships

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

5 Author(s)
Sharma, A.B. ; NEC Labs. America, Princeton, NJ, USA ; Haifeng Chen ; Min Ding ; Yoshihira, K.
more authors

Recent advances in sensing and communication technologies enable us to collect round-the-clock monitoring data from a wide-array of distributed systems including data centers, manufacturing plants, transportation networks, automobiles, etc. Often this data is in the form of time series collected from multiple sensors (hardware as well as software based). Previously, we developed a time-invariant relationships based approach that uses Auto-Regressive models with eXogenous input (ARX) to model this data. A tool based on our approach has been effective for fault detection and capacity planning in distributed systems. In this paper, we first describe our experience in applying this tool in real-world settings. We also discuss the challenges in fault localization that we face when using our tool, and present two approaches - a spatial approach based on invariant graphs and a temporal approach based on expected broken invariant patterns - that we developed to address this problem.

Published in:

Dependable Systems and Networks (DSN), 2013 43rd Annual IEEE/IFIP International Conference on

Date of Conference:

24-27 June 2013