System anomaly detection in distributed systems through MapReduce-Based log analysis


4 Author(s)
Yan Liu ; Ideal Inst. of Inf. & Technol., Northeast Normal Univ., Changchun, China ; Wei Pan ; Ning Cao ; Guangwei Qiao

System anomaly detection is important for the development, maintenance, and performance tuning of large-scale distributed systems. Analyzing the system logs that these systems produce is an effective way to troubleshoot and diagnose problems. However, as distributed systems grow in scale and complexity, their logs become very large, and common methods that analyze logs on a single node become inefficient. There is therefore a strong demand for distributed log-analysis methods for anomaly detection. In this paper, a MapReduce-based framework is implemented to analyze distributed logs and detect anomalies. The framework is built on top of Hadoop, an open source distributed file system and MapReduce implementation. We first use random access files to incrementally aggregate system logs from each node of the monitored cluster and collect them on the analysis cluster. Then, we apply the K-means clustering algorithm to integrate the collected logs. After that, we implement a MapReduce-based algorithm to parse the clustered log files. Furthermore, to make the best use of the collected data, we provide a flexible and powerful way to display the monitoring and analysis results. In this way, we can monitor the status of a large distributed cluster and detect its anomalies.
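
The abstract does not include the implementation, so the following is only a minimal sketch of two of the ideas it names: incremental log collection by remembering a file offset, and a map/reduce-style pass over the collected lines. It is written in plain Python rather than on Hadoop. The function names (`read_new_lines`, `map_line`, `reduce_counts`, `demo`) and the `LEVEL message` log format are assumptions made for illustration, not the authors' design. The K-means step is not covered here.

```python
import os
import tempfile
from collections import Counter
from itertools import chain


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)  # jump straight to the unread part of the file
        data = f.read()
    # Consume only complete lines; a partial trailing line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def map_line(line):
    """Map step: emit (level, 1) for a line in the assumed 'LEVEL message' form."""
    parts = line.split(None, 1)
    return [(parts[0], 1)] if parts else []


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def demo():
    """Write a log in two bursts, collect it incrementally, then count levels."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "node1.log")
        with open(path, "wb") as f:
            f.write(b"INFO started\nERROR disk full\nWARN partial")
        first, offset = read_new_lines(path, 0)
        with open(path, "ab") as f:
            f.write(b" line done\nERROR timeout\n")
        second, offset = read_new_lines(path, offset)
        pairs = chain.from_iterable(map_line(l) for l in first + second)
        return first, second, reduce_counts(pairs)
```

Calling `demo()` shows that the second poll picks up only what was appended after the first, including the line that was incomplete at the first poll. In a Hadoop deployment the map and reduce functions would instead run as distributed tasks over files stored in the cluster.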

Published in:

Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on (Volume: 6)

Date of Conference:

20-22 Aug. 2010