Exploring the Adaptability of Word Embeddings to Log Message Classification | IEEE Conference Publication | IEEE Xplore

Abstract:

Minimizing the resolution time of service-impacting incidents is a fundamental objective of IT operations. Enriching the metadata of the events and logs ingested by such systems using AI-based classifiers greatly increases the efficacy of features such as root cause analysis and workflow automation, and hence reduces incident remediation time. The use of word embeddings in text classification tasks is well established; however, the general English corpora used to generate off-the-shelf embeddings lack the domain-specific lexicon required for accurate classification of event and log data. In the current contribution, we investigate multiple ways in which this deficiency can be addressed. In addition to augmenting the training corpus with a domain-specific lexicon, we increase the granularity of our embedding using character n-gram decompositions and sub-word level representations. All implementations improved classification accuracy over the base case. Further, we explore the performance of a sequence classifier with embeddings of varying domain specificity. We observe that the performance of high-specificity models degrades as the volume of previously unseen words in the test data increases. We conclude that for a multi-input use case, and by leveraging sub-word level information, a high-specificity model can be outperformed by a model trained on a low-specificity corpus.
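
The abstract does not spell out how the character n-gram decomposition is carried out. As a rough illustration only, the sketch below follows the general fastText-style idea: wrap a token in boundary markers, split it into character n-grams, and average the n-gram vectors so that a previously unseen word still receives a representation. The function names, the n-gram range (3 to 5), the vector dimension, and the hash-seeded stand-in vectors are all assumptions made for this example, not details taken from the paper.

```python
import hashlib
import random


def char_ngrams(token, n_min=3, n_max=5):
    """Return the character n-grams of a token wrapped in '<' and '>' markers."""
    marked = f"<{token}>"
    grams = [
        marked[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(marked) - n + 1)
    ]
    # Very short tokens may yield no n-grams; fall back to the marked token itself.
    return grams or [marked]


def ngram_vector(ngram, dim=8):
    """Hypothetical stand-in for a learned n-gram embedding.

    A trained model would look this vector up in a table; here a vector is
    derived deterministically from a hash so the example stays self-contained.
    """
    seed = int.from_bytes(hashlib.md5(ngram.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def embed_token(token, dim=8):
    """Embed a token as the mean of its n-gram vectors (works for unseen words)."""
    vectors = [ngram_vector(g, dim) for g in char_ngrams(token)]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


if __name__ == "__main__":
    print(char_ngrams("log", 3, 3))  # ['<lo', 'log', 'og>']
    # Related log tokens share many n-grams, so their vectors share components.
    shared = set(char_ngrams("timeout")) & set(char_ngrams("timeouts"))
    print(len(shared), "shared n-grams")
    print(embed_token("timeouts"))
```

Because the representation is built from sub-word units, a token absent from the training vocabulary is still mapped to a vector, which is the property the abstract relies on when test data contains many previously unseen words.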
Date of Conference: 17-21 May 2021
Date Added to IEEE Xplore: 30 June 2021
ISBN Information:
Print on Demand (PoD) ISSN: 1573-0077
Conference Location: Bordeaux, France
