Few-Shot Log Anomaly Detection Based on Matching Networks


Abstract:

To address log anomaly detection in scenarios where labeled log data are scarce, this paper proposes Log-MatchNet, a novel few-shot log anomaly detection method. To handle the unstructured nature, diversity, and evolution of log data over time, we apply structured processing and log parsing to convert log content and template IDs into vectors, and extract features with the BERT model. Additionally, by integrating multiple datasets and post-training the BERT model for domain adaptation, we obtain BERT_{Post}, a module with universal feature extraction capabilities in the log domain. Compared to BERT_{base} and CyBERT, our method demonstrates superior performance in log anomaly detection, especially when labeled data are limited. With only 2 annotated normal logs and 2 annotated abnormal logs, BERT_{Post} achieves a 16.14% increase in F1-score. To address imbalanced data, we introduce a matching network that learns similarity scores between input and prototype vectors and shows strong generalization, with an average accuracy of 99.6%. In few-shot scenarios, Log-MatchNet outperforms traditional methods and the Proto-Siamese network in terms of F1-score. In an unstable log evolution environment, our method is robust to noisy data, achieving an F1-score of 81.2% even with 20% injected noise. Compared to LogAnMeta, our approach yields a 31.71% increase in F1-score. Experimental results demonstrate the effectiveness of Log-MatchNet in detecting anomalies with limited labeled log data and its robust performance in log evolution scenarios.
Published in: IEEE Transactions on Network and Service Management (Volume: 21, Issue: 3, June 2024)
Page(s): 2909 - 2925
Date of Publication: 08 February 2024
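
The prototype-matching idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract says the matching network learns its similarity scores, whereas the sketch below substitutes a fixed cosine similarity, and the short toy vectors stand in for BERT-derived log embeddings. All function names and values here are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_prototypes(support: Dict[str, List[Vector]]) -> Dict[str, List[float]]:
    """One prototype per class: the mean of that class's support embeddings."""
    return {label: mean_vector(vectors) for label, vectors in support.items()}


def classify(query: Vector, prototypes: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Score the query against each prototype and softmax the scores."""
    scores = {label: cosine_similarity(query, proto) for label, proto in prototypes.items()}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(score - max_score) for label, score in scores.items()}
    total = sum(exps.values())
    probs = {label: value / total for label, value in exps.items()}
    return max(probs, key=probs.get), probs


if __name__ == "__main__":
    # Assumed toy support set: 2 normal and 2 abnormal "embeddings".
    support = {
        "normal": [[1.0, 0.1], [0.9, 0.0]],
        "abnormal": [[0.0, 1.0], [0.1, 0.9]],
    }
    label, probs = classify([0.95, 0.05], build_prototypes(support))
    print(label, probs)
```

With only two support examples per class, each prototype is simply the mean of two vectors, which mirrors the 2-normal, 2-abnormal setting reported in the abstract. A learned similarity function would replace `cosine_similarity` in a trained model.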


I. Introduction

Logs record detailed information about the operation of IT systems, including system status and application behaviors [1], and are commonly used to locate errors. Operations personnel read logs to understand the behavior and status of a system. Because logs come from abundant sources and are easy to access, they are used in many operation and maintenance tasks and play an indispensable role in intelligent operation and maintenance. Existing general-purpose logging tools provide basic functions such as log collection, retrieval, matching, and visualization, which play an important role in improving monitoring system performance and enhancing system reliability. At the same time, the development of logging technology faces challenges arising from the large scale, varied types, complex structure, and uneven quality of logs.
