Abstract:
Cloud systems are becoming increasingly complex to accommodate the growth of cloud services, especially in private cloud environments. In a mixed environment in which containerized applications run on virtual machines hosted on the physical machines of the cloud infrastructure, a single failure may trigger multiple alarms simultaneously across the cloud system. Root cause localization therefore remains a daunting task. In this paper, we propose an automated, real-time root cause localization system named A-RCL, which takes a multi-layer approach to monitoring and localizing system incidents. We present a mechanism that locates the root cause by combining machine-learning-based predictive methods, which discover incidents in the system early and automatically perform root cause identification. We implement and evaluate A-RCL on a comprehensive real private cloud testbed. The evaluation demonstrates that A-RCL achieves high accuracy of 93.99% and 98.12% in incident prediction and root cause localization, respectively.
Date of Conference: 08-12 May 2023
Date Added to IEEE Xplore: 21 June 2023
- IEEE Keywords
- Index Terms
- Local System
- Cloud Environment
- Machine Learning
- Virtual Machines
- Cloud System
- Cloud Infrastructure
- Single Failure
- Multi-layered Approach
- Physical Machines
- Private Cloud
- Real Cloud
- Data Sources
- Important Characteristics
- Performance Metrics
- Feature Space
- Unsupervised Learning
- Experimental Evaluation
- Localization Performance
- Types Of Defects
- Log Score
- Cause Of Incidence
- Fault Scenarios
- Feature Scores
- Number Of Time Steps
- Anomaly Score
- Performance Evaluation Results
- Average Execution Time
- Precision Rate