Landscape of Automated Log Analysis: A Systematic Literature Review and Mapping Study

Logging is a common practice in software engineering to provide insights into working systems. The main uses of log files have always been failure identification and root cause analysis. In recent years, novel applications of logging have emerged that benefit from automated analysis of log files, for example, real-time monitoring of system health, understanding users' behavior, and extracting domain knowledge. Although nearly every software system produces log files, the biggest challenge in log analysis is the lack of a common standard for both the content and format of log data. This paper provides a systematic review of recent literature (covering the period between 2000 and June 2021, concentrating primarily on the last five years of this period) related to automated log analysis. Our contribution is three-fold: we present an overview of the various research areas in the field; we identify the different types of log files used in research; and we systematize the content of log files. We believe that this paper serves as a valuable starting point for new researchers in the field, as well as an interesting overview for those looking for other ways of utilizing log information.


I. INTRODUCTION
Tracking a system's behavior during its operation has been a common need since the beginning of software engineering. Traditionally, the main area of focus was failure diagnosis, and the most common form was the recording of actions taken by a system in log files. Studies such as [1] and [2] show that logging is a commonly used practice in the industry. With the rise of cloud computing, new challenges to logging practices have emerged: the distribution of log files among multiple services, a significant increase in log volumes, and a multitude of log formats. At the same time, new opportunities have arisen regarding the potential of the information contained in logs.
One of the rapidly evolving disciplines that explores this potential is log analysis, which strives to discover knowledge from log files (see Fig. 1). The type of knowledge that researchers hope to extract is very broad: from an
understanding of system behavior during its operation to drawing conclusions about users' behavior. Log analysis also extends the possibilities in traditional areas of the application of logging data -failure diagnosis and root cause analysis. With a continually growing volume of logs and increasing dispersion of log files across services (especially in cloud environments), conducting a manual analysis becomes very challenging. Commonly used technical solutions for log centralization and aggregation, such as Splunk [3] or LogStash [4], supported by automated log analysis, can help address these challenges.
The main purpose of this paper is to present an overview of the automated log analysis domain that can serve as a starting point for researchers new to this field. This study is positioned between a systematic mapping study of the domain and a systematic literature review. Based on a systematic review of the recent literature, we identified the most common areas of interest as well as interesting niches. We split the domain into subfields, focusing on the various types of knowledge that log analysis is capable of extracting. This allows the information potential that lies in log files to be appropriately presented. Additionally, to help newcomers get started in the domain, we provide an overview of different log files and their usage in various applications. Lastly, we collect information about the content that is commonly found, or expected to be present, in log files, which helps researchers orient themselves in the domain and validate whether log analysis has the potential to extract the type of knowledge they are particularly interested in. Our review was performed in the context of our research interest in deriving information about a system's structure and behavior during operation using log analysis; therefore, this area received particular attention in our work.

To sum up, our contribution to the field is three-fold: 1) we present an overview of various research areas in the field, 2) we identify different types of log files that are used in research, and 3) we systematize the content of log files.

The remainder of this paper is organized as follows. In Section II, we discuss the related work. Section III presents the method we chose to perform the study. Section IV describes the basic assumptions and protocol of the literature review. In Sections V and VI, we present the results of the study, followed by the final conclusions in Section VII. In Appendix 1, we describe in detail the execution of the review according to the defined protocol. The References section contains three types of references: papers mentioned in the article text (references [1] to [7]), papers that were selected for the review after filtering (references [8] to [125]), and papers that were filtered out of the initial set (references [126] to [299]).

II. RELATED WORK
Recently, several reviews related to log analysis have been conducted. [127], [144], and [176] focus on log abstraction: automated methods for generalizing log entries into templates for further analysis. The outcome of log abstraction is a set of log templates, which serve as instructions for log parsers on how to extract meaningful information from a log. Apart from log abstraction, [272] provides a review of research in other log analysis areas, such as failure/anomaly detection and log quality enhancements. The anomaly detection part of [272] (also in the scope of our review) covers the period until 2016, which complements our work. All of the abovementioned papers focus on the technical aspects of logging.
[136] maps the field of failure prediction, which correlates with the Operations/Monitoring category in our work. The authors identify the different types and sources of log files used in this field, as well as the limitations and challenges for future research. They point to log formatting and quality issues, log consistency, and the scale, volume, and complexity of logs as the biggest problems. Our work extends this result by providing a content profile for different types of logs.
[129] is another systematic mapping study, focusing mostly on the field of log-based software monitoring, which, according to the authors' definition, corresponds to our Operations and Design areas. In addition to identifying different subfields in this area, the authors also investigate the logging infrastructure and logging practices used by developers. The resulting map of the field is presented from the perspective of the lifecycle of a log. As far as paper selection is concerned, the authors use automated paper filtering in the last stage, based on the CORE ranking of conference venues (we perform manual paper filtering based on paper abstracts and/or full texts). Because of the different methodologies, focus (log lifecycle vs. knowledge extraction), and date ranges of the analyzed papers, that study selects a different set of papers for review than ours; still, we find that the two works complement each other.
[134] is a recent work reviewing log analysis-related papers with a focus on security (Operations/Intrusion detection in our work). The authors take the perspective of research topics (paper keywords). [135] provides a mapping study of methods for linking log entries with the source code that generated them. It summarizes techniques that benefit from log-to-source linkage, as well as classes of problems that are addressed by this approach.

III. STUDY METHOD
We performed our study following a systematic literature review as defined by Kitchenham [5] and [6]. Our process consists of the following phases, which are further elaborated in the following sections: 1) Definition of research questions and a review protocol, 2) Paper search execution and data extraction, 3) Data analysis and providing answers to the research questions. In the first phase, we defined the research questions that we want to answer. We also described the scope of the study, the inclusion and exclusion criteria for papers, the data source for the research, and the query string used to collect the data. The outcomes of this phase are presented in Section IV. In the second phase, we executed the paper search and filtered the results according to the defined protocol. We also extracted, analyzed, and synthesized the data obtained from the search query. The details of this process are presented in Appendix 1 and the results are presented in Table 6. Finally, in the third phase, we used the collected data to answer the initially defined research questions and present them in Sections V and VI.

IV. REVIEW PROTOCOL

A. RESEARCH QUESTIONS
To provide an overview of the log analysis domain and some principal information for new researchers in this field, we want our review to answer the following research questions:
RQ1. What are the different goals of automated log analysis?
RQ2. What common types of log files are used to conduct log analysis?
RQ3. What data attributes can be commonly found in log files?
In the context of our primary research interest (deriving information about the system's structure and behavior from logs), answers to these questions allow us to confirm whether it is a niche worth exploring. They also provide us with a baseline that we can use for performing benchmarks as well as a general overview of data that can be extracted from log files, which we hope will help us in driving our research.

B. INCLUSION AND EXCLUSION CRITERIA
The main driver for our review is research question RQ1, which focuses on the expected outcome of the log analysis process. Because of this perspective, we include only papers that clearly describe the effect of log analysis: some valuable information extracted from log files. At the same time, we exclude papers that focus on the internal mechanics of the process, such as log parsing, or on improving the performance of algorithms or tools that support the process.
We focus only on automated log analysis, which means that a paper needs to present a consistent, repeatable method for extracting certain information from log files for a particular purpose. We exclude publications that describe manual, ad-hoc analysis that is not repeatable in a different context: approaches whose goal is a one-off retrieval of information to understand a particular phenomenon (e.g., data science papers). In addition, we exclude visual analysis, in which tools visualize log files to support an analysis that ultimately relies on the user's expertise.
We limit our review to the analysis of structured log data. We exclude the analysis of audio/video logs, for example, logs of audio calls in a call center or recordings of video surveillance systems.
Finally, we limit the scope of our review to primary studies written in English from the period between 2000 and the first half of 2021. The date range covers the period of greatest interest in log analysis (see Fig. 1). To keep the number of reviewed papers manageable, we focused primarily on the last five years of research. From the 2000-2015 period, we selected only the most cited papers (see Section IV.C for details of this selection).
A summary of the exclusion criteria is presented in Table 1.

C. DATA SOURCE AND SEARCH QUERY
We use Scopus [7], which is considered the largest database of abstracts and citations, as the source of papers for our review. When constructing a query, we encountered a number of challenges stemming from the fact that log is a root word in both Latin and Greek (logos). Moreover, it is also a mathematical term, which means that it appears in multiple contexts across multiple fields of science and consequently returns huge result sets for publication queries. We also realized that providing a query that precisely applies the earlier defined inclusion/exclusion criteria is nearly impossible: the query would have to be broader and the result set manually filtered. Therefore, we introduced the following criteria when constructing the query: 1. The process of log file analysis is an important aspect for the paper's authors, 2. We focus only on the computer science research area, 3. The result set needs to be manageable within the assumed time and human resources, considering the need for manual filtering (no more than 300 papers returned), 4. The fact of information extraction from log files must be explicitly highlighted by the paper's authors. The first criterion was met by expecting the article to contain the term ''log'' in its title and the phrase ''log analysis'' in either the title, abstract, or keywords. The resulting Scopus phrase was TITLE (''log'') AND TITLE-ABS-KEY (''log analysis''). It needs to be pointed out that the keywords covered by this phrase include not only those given by the articles' authors but also keywords automatically indexed by Scopus.
The second criterion was achieved by selecting the computer science subject area in the query: LIMIT-TO (SUBJAREA, ''COMP'').
The third criterion was achieved by analyzing the number of publications over time (see Fig. 1) returned by our query. We decided that limiting the scope of our review to the last five years both matched the defined criteria and covered the period of the biggest interest in log analysis.
In order to meet the last criterion, we referred to the keywords given by the articles' authors, assuming that they have the greatest potential to highlight the attributes of a paper as seen by its authors. We used the following keywords that indicate information extraction relevant for software systems: analysis, retrieval, recovery, mining, reverse engineering, and detection. The final query combined the title/keyword phrase above with this keyword restriction and was further limited with AND (LIMIT-TO (SUBJAREA, ''COMP'')) AND (LIMIT-TO (LANGUAGE, ''English'')). To include prominent papers from 2000 to 2015, we applied the same Scopus query for that period and limited the results to papers with at least 20 citations. The threshold for the number of citations may seem arbitrary, but our detailed literature analyses showed that it is a suitable criterion for selecting notable papers that are at least five years old.
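For illustration, the date and citation selection policy can be reproduced mechanically over an exported result list. The following is a minimal sketch in Python, assuming a CSV export from Scopus with ''Year'' and ''Cited by'' columns (the file name is hypothetical):

import csv

def select_papers(rows, recent_from=2016, min_citations=20):
    """Apply the review's selection policy: keep every paper from
    recent_from onwards, and only papers with at least min_citations
    from the earlier (2000 to recent_from - 1) period."""
    selected = []
    for row in rows:
        year = int(row["Year"])
        citations = int(row["Cited by"] or 0)
        if year >= recent_from or citations >= min_citations:
            selected.append(row)
    return selected

with open("scopus_export.csv", newline="", encoding="utf-8") as f:
    papers = select_papers(csv.DictReader(f))
print(len(papers), "papers qualify for manual filtering")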
It is important to note that the abovementioned queries precisely apply only the EX1 and EX2 exclusion criteria. The rest of the criteria were applied only roughly and further refined during the manual process described in Appendix 1.

D. THREATS TO VALIDITY
We have identified two major threats to the validity of this study: 1. The scope of papers selected for the review does not cover all of the relevant papers, 2. The process of manual paper filtering is subject to misinterpretation, which can result in the incorrect classification of papers. Since our research questions are rather broad, with the intent of providing an overview rather than a precise answer, it is the size and representativeness of the paper sample that determine the quality of the answers. Therefore, our mitigation action was to include a broader set of publications covering the most intensive research period on the subject, even at the cost of manually filtering papers. The last five-year period was when log analysis was most intensively explored; thus, this scope should provide a solid base for representative answers to our research questions.
The second threat was mitigated by multiple iterations of the manual classification. For each paper excluded in the manual process, a concrete exclusion criterion was attached together with an argument. Table 7 presents the results of this process.

V. LANDSCAPE OF AUTOMATED LOG ANALYSIS
We provide the answer to RQ1 by presenting the selected papers from the perspective of the goal of log analysis. As all log analysis efforts strive to gain some knowledge, we focus on the different types of knowledge extracted from log files. We identified three types of knowledge, described in detail in Section B of Appendix 1: knowledge related to the domain, to system design, and to system operations. Fig. 2 presents the distribution of the selected papers across these categories. Table 2 summarizes the different application areas that utilize automated log analysis for knowledge extraction. It can be seen that the broadest usage of log analysis takes place in Software Engineering and Cyber-security. The Generic category refers to articles that describe general-purpose log analysis techniques that can be used in multiple areas; usually, these papers are related to anomaly detection, which is an abstract and generic concept. Two other notable application areas are Business Process Management and E-learning. It can also be noticed that although automated log analysis is currently applied mostly in software engineering, the number of fields trying to benefit from such an approach is quite broad, revealing several interesting niches for future research.
We further divided the three main types of knowledge into research areas describing the different goals of utilizing the extracted information. Fig. 3 presents this categorization, which we refer to as the landscape of automated log analysis. The most prominent research areas and some interesting niches are further described in the subsequent subsections. We also introduce the most cited papers (according to Scopus) in each area.

A. OPERATIONS
This type of knowledge relates to information about the running system and constitutes the mainstream of research involving automated log analysis. We further decompose the relevant papers into three research areas: Monitoring, Intrusion detection, and Root cause analysis.
Monitoring refers to activities aimed at watching a running system and detecting situations when it starts to behave unexpectedly. This is an automation of the typical work of system administrators, which focuses on detecting anomalies in observed logs. [76], [32], and [79] present supervised, neural-network-based approaches to anomaly detection, where logs are encoded into sequences and a sequence machine-learning model is applied. [46] additionally addresses the problem of the instability of log statements (due to log statement evolution over time or noise introduced by log processing), and [83] focuses on the real-time aspect of anomaly detection. [81] leverages the observation that log statements are in fact not unstructured, as their structure is defined by the source code that outputs them; the authors constructed a control flow using the source code and then matched it with a log file for anomaly detection. Finally, some researchers have focused on anomaly detection specifically in cloud environments. [109] and [61] focus on detecting anomalies within so-called cloud operations, for example, rolling deployments of services into a cloud. [105] touches on the problem of interleaved logs, typical for cloud environments, where multiple task executions create log statements in parallel and log statements need to be automatically mapped to task executions.
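The sequence-based detectors mentioned above share a common skeleton: abstract each log line into a template ID, view an execution as a sequence of such IDs, and flag sequences containing transitions that the model considers improbable. The following Python sketch substitutes a simple bigram frequency model for the neural networks used in the cited papers; the template IDs and the probability threshold are illustrative assumptions.

from collections import Counter, defaultdict

def train_bigram_model(sequences):
    # Count template-ID transitions observed in normal executions.
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions

def anomalous(seq, transitions, min_prob=0.01):
    # Flag a sequence if any of its transitions is rarer than min_prob.
    for current, nxt in zip(seq, seq[1:]):
        total = sum(transitions[current].values())
        prob = transitions[current][nxt] / total if total else 0.0
        if prob < min_prob:
            return True
    return False

normal_runs = [["open", "read", "close"], ["open", "read", "read", "close"]]
model = train_bigram_model(normal_runs)
print(anomalous(["open", "close"], model))  # True: unseen transition

The cited approaches replace this counting model with learned sequence models, but the overall encode-then-score structure is similar.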
Of the earlier (pre-2016) papers, two are notable. [116] is by far the most cited paper in this area. Apart from proposing a method for problem detection using console logs, the authors provide valuable insights related to log processing in general, which makes it an especially valuable work for any log analysis task. The proposed approach combines source code analysis to determine log patterns with unsupervised machine learning to detect anomalies. [120] focuses on critical infrastructures in which SCADA systems are deployed. The authors propose a method for the automated extraction of non-frequent patterns that potentially represent malicious actions.
Intrusion detection is the second most common research area in automated log analysis. It is also related to anomaly detection, but with an explicit focus on the system's security, where each anomaly is treated as a potential threat. Detection of intrusions varies from identification of the fact that the system is under attack to understanding a particular type of attack taking place. [74] used the access log of a web server to distinguish between regular user behavior and malicious scans performed by bots or web crawlers. [39] utilizes attack trees that describe typical sequences of actions for different attack types and matches that information with the content of the log file. [94] dynamically creates anomaly profiles in the form of rules that are further used for attack identification. Some researchers in this field also focus on the detection of particular types of attacks - [103] detects SQL injections, [108] identifies denial of service, and [72] explores the detection of insider threats (those coming from the inside of the protected network).
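As a concrete illustration of attack-specific detection, consider the SQL injection focus of [103]: in its simplest form, detection reduces to matching decoded request parameters from an access log against known payload signatures. The patterns and the log line below are illustrative assumptions, not the cited technique.

import re
from urllib.parse import unquote

# Illustrative signatures of common SQL injection payloads.
SQLI_PATTERNS = [
    re.compile(r"union\s+select", re.IGNORECASE),
    re.compile(r"or\s+1\s*=\s*1", re.IGNORECASE),
    re.compile(r"';\s*drop\s+table", re.IGNORECASE),
]

def suspicious_requests(access_log_lines):
    # Yield access-log lines whose URL-decoded form matches a signature.
    for line in access_log_lines:
        decoded = unquote(line)
        if any(p.search(decoded) for p in SQLI_PATTERNS):
            yield line

log = ['10.0.0.1 - - [01/Jan/2021] "GET /items?id=1%20OR%201=1 HTTP/1.1" 200']
print(list(suspicious_requests(log)))  # flags the encoded "OR 1=1"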
Some earlier work (pre-2016) needs to be noted. [121] is an interesting approach to intrusion detection in the online gaming domain. The authors detect bot activity by analyzing the individual and collaborative behaviors of players based on game logs. [123] focuses on detecting threats caused by people inside an organization, as opposed to traditionally perceived threats coming from the outside. It uses a probabilistic approach to detect insiders which strives to maintain a low false alarm rate. [125] explores the area of digital forensics. The authors propose a log model that is later used for the formal analysis and verification of forensic hypotheses based on system logs. They also discuss a real-life example of the usage of their method.
Root cause analysis is a part of bug fixing, the goal of which is to find the core reason for a system failure or malfunction. [102] describes an integrated environment for failure detection and root cause analysis based on log files, in which correlation analysis is used to identify the root problem. [63] matches system messages stored in a log file with a resource usage log to detect problems related to a lack of resources (e.g., CPU saturation or lack of memory). [20] applies process mining techniques to first reconstruct the process model of the system from its logs and then identify deviations from such a model during process execution. [33] focuses on the analysis of exception logs, mapping them to tasks executed in a cloud environment and matching them with historical executions of these tasks. [90], [91], and [31] try to identify problems related to specific environments: cloud and big data (Spark) platforms, respectively.
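A recurring building block in these works is correlating failure events with other telemetry by time. The sketch below mirrors, in a much simplified form, the idea of matching system messages with a resource usage log as in [63]; the column names, the 60-second tolerance, and the saturation threshold are assumptions.

import pandas as pd

errors = pd.DataFrame({
    "time": pd.to_datetime(["2021-06-01 10:00:05", "2021-06-01 10:07:30"]),
    "message": ["OutOfMemoryError", "Request timeout"],
})
usage = pd.DataFrame({
    "time": pd.to_datetime(["2021-06-01 10:00:00", "2021-06-01 10:07:00"]),
    "mem_used_pct": [98.5, 60.0],
})

# Attach the most recent resource sample (within 60 s) to each error.
joined = pd.merge_asof(errors.sort_values("time"), usage.sort_values("time"),
                       on="time", tolerance=pd.Timedelta("60s"))
# Flag errors that coincide with memory saturation.
print(joined[joined["mem_used_pct"] > 95])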

B. DOMAIN
This category is related to the extraction of business knowledge from logs of the software that supports a given domain. The most common research areas in this field are User profiling, Domain model extraction, and Business process model extraction.
User profiling aims to extract knowledge about the structural or behavioral characteristics of users, which supports driving further system evolution to better fit users' needs. [89] uses a Hidden Markov Model to extract user intent from actions recorded in logs. [60] explores user intent in a cyber-physical context: it matches user actions in cyberspace (by analyzing web query logs) with the user's physical location (WiFi access point logs) to understand and predict their behavior in the physical world. [101] captures an exploration of user behavior in the higher-level concept of usage tactics, which, according to the authors, allows for better interpretability and comparability between systems. [95] extracts the structural profile of users to provide product recommendations; it focuses on new (previously unknown) users without any shopping history, for whom it utilizes an access log to derive the user's interests and suggest suitable products.
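In the spirit of [95], a structural profile for a previously unknown user can be bootstrapped from nothing more than the pages they visit. The Python sketch below is a deliberately naive illustration of that idea rather than the cited method; the URL-to-category catalog and the input format are assumptions.

from collections import Counter

def interest_profile(access_entries, catalog):
    # Derive a user's interest profile from the product pages they visited;
    # catalog maps URL paths to product categories.
    visits = Counter(catalog[path] for _, path in access_entries
                     if path in catalog)
    total = sum(visits.values())
    return {category: count / total for category, count in visits.items()}

catalog = {"/books/sci-fi": "sci-fi", "/books/history": "history"}
clicks = [("alice", "/books/sci-fi"), ("alice", "/books/sci-fi"),
          ("alice", "/books/history")]
print(interest_profile(clicks, catalog))  # sci-fi: ~0.67, history: ~0.33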
Of the earlier studies in this area, two are notable. [119] seeks to discover the actual user intent (a subtask that the user wants to fulfil) by analyzing the query entered in a search engine together with the links that were clicked afterwards and the additional refining keywords entered in subsequent searches. The authors of [116] use client and server logs capturing users' interactions with a website to build a user profile. The intention is to use such profiles to personalize the user interface of web applications for specific users.
Domain model extraction refers to understanding some real-life (domain) phenomena using information from log files. [107] uses an anonymized web search query log to identify adverse drug reactions. [70] and [68] explore the educational domain: [70] aims to understand the correlation of students' performance with their behavior and their tutor's teaching style, while [68] is a boundary paper between domain model extraction and user profiling that models students' behavior using the Hidden Behavior Traits Model. The authors of [64] learn expert knowledge on applying security rules from computers secured by professionals and apply this knowledge to previously unseen systems of non-experts; this paper treats the security log as a carrier of hidden domain knowledge. [35] uses process mining techniques to discover an ontology of the computer science domain.
Apart from the abovementioned work, there is also some prominent research available from the earlier period that explores the concept that observation of how people search through the Internet allows us to discover their goals or to better understand the topic they are searching for. The earliest paper in this area is [113] which utilizes search engine logs for the categorization of search query terms into a predefined taxonomy. [114], the most cited paper in this field, uses both search engine logs and actual user clicks that follow the search to explain the semantic relationships between search queries. The results are presented in the form of query graphs.
Business process model extraction aims to understand business processes from the log of system actions. [66] uses a frequent itemset mining approach to extract knowledge about the business process from an event log. [67] considers how the level of abstraction of a business process extracted from logs influences conformance with the actual process, which is crucial for balancing process abstraction and accuracy. [9] focuses on the detection of anomalies in the event log using a model-agnostic approach, where no reference process model is available; it aims to provide a method for cleaning the event log, which would result in increased accuracy of the derived process model. [117] is a notable earlier work that uses workflow logs to recreate the actual business process realized by an application and to compare it with the anticipated process. According to the authors, such an approach allows for optimizing business processes, especially by applying error handling more precisely, which should lower the process-modeling cost.
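To give a feel for this research area, the snippet below discovers a process model from a business event log using the open-source pm4py library, one of several process mining toolkits (not necessarily the tooling used in the cited papers). The file name is hypothetical; the calls follow the pm4py 2.x API.

import pm4py

log = pm4py.read_xes("orders.xes")                 # business event log
net, initial, final = pm4py.discover_petri_net_inductive(log)
pm4py.view_petri_net(net, initial, final)          # show the derived model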

C. DESIGN
The design category relates to extracting knowledge about the internal workings of a system (e.g., system structure), software building process, or attributes related to its design (e.g., quality or security). We split this category into four research areas: Quality analysis, Workflow discovery, Component dependency inference, and Security analysis.
The Quality analysis research area groups papers that refer to the assessment of system quality. [29] uses information from the log files of a running system to reconstruct production-like workloads for further use during system testing. Additionally, the authors analyzed the representativeness of the recovered workloads based on the varying levels of granularity of user actions considered in the recovery process. [93] applies a similar approach of exploring typical user interactions with a system to construct a test that assesses the reliability of the system. The authors used the mean time between failures as a measure of the system's reliability and validated their approach against a real-life system. [54] and [16] focus on the quality of SQL queries in the analyzed software: they analyze the log of SQL queries executed by a system and detect anti-patterns.
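A minimal sketch of query-log-based anti-pattern detection in the spirit of [54] and [16]: scan each logged SQL statement against a small rule set. The two rules shown are common textbook anti-patterns chosen for illustration, not the catalogs used in the cited papers.

import re

ANTIPATTERNS = {
    "implicit columns": re.compile(r"select\s+\*", re.IGNORECASE),
    "leading wildcard": re.compile(r"like\s+'%", re.IGNORECASE),
}

def audit_query_log(queries):
    # Return (rule, query) pairs for every anti-pattern hit in the log.
    return [(name, q) for q in queries
            for name, rule in ANTIPATTERNS.items() if rule.search(q)]

print(audit_query_log(["SELECT * FROM users",
                       "SELECT name FROM users WHERE name LIKE '%son'"]))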
[124] is a notable earlier work (pre-2016) that attempts to explain the usability characteristics of an application by analyzing search queries entered by users in a web browser regarding that application. Such an approach allows gathering user feedback on both the existing and the desired functionality of an application.
The Workflow discovery research area is related to the discovery of internal software processes. [65] describes a process mining approach that can discover recursive processes from event logs. [92] reconstructs workflows (series of interactions between services) in a cloud environment with a focus on failed workflows. Additionally, [122] is a widely cited work from 2014 that recovers a Communicating Finite State Machine model of concurrent system behavior. The approach presented by the authors is capable of utilizing any log file but requires users to provide a set of regular expressions to extract the expected pieces of information from log lines.
Component dependency inference captures papers that aim to recover the internal dependencies between software components (services). [98] uses service logs to identify the composition and substitution relationships between services constituting a software system. [67] uses predictive and statistical analyses of web service invocations from service logs to identify the relationships between services. The authors also propose a classification of the types of dependencies between services. Out of the pre-2016 papers, [118] is the most cited in this area. It uses Bayesian Decision Theory to infer dependencies between components in a distributed system and validates this approach against the Hadoop MapReduce framework.
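One simple way to approach dependency inference, assuming services tag their log entries with a shared request identifier, is to treat services that repeatedly co-occur on the same requests as related. The sketch below implements this naive co-occurrence heuristic; it is far simpler than the statistical inference of [118].

from collections import defaultdict
from itertools import combinations

def infer_dependencies(entries):
    # entries: (service, request_id) pairs parsed from service logs.
    # Services handling the same request are assumed to be related.
    services_per_request = defaultdict(set)
    for service, request_id in entries:
        services_per_request[request_id].add(service)
    edges = defaultdict(int)
    for services in services_per_request.values():
        for a, b in combinations(sorted(services), 2):
            edges[(a, b)] += 1  # edge weight = number of shared requests
    return dict(edges)

entries = [("gateway", "r1"), ("orders", "r1"),
           ("gateway", "r2"), ("orders", "r2")]
print(infer_dependencies(entries))  # {('gateway', 'orders'): 2}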
[42] focuses on Security compliance and explores whether the assumed security rules are consistent with their actual effect. The authors propose a method for the automated analysis of access logs to detect conflicting security rules.

VI. TYPES OF LOGS USED IN RESEARCH AND THEIR CONTENT
Research questions RQ2 and RQ3 are related to the classification and content of log files commonly used in research.
In Section C of Appendix 1, we define classes of log files, and Table 3 lists their occurrence in various areas of research. It can be seen that the three most commonly used types of log files are Generic, Proprietary, and Network. The strongest correlation can be observed between the Generic log class and Operations research, and more specifically, the Anomaly detection category, which abstracts from the concrete log format.
It can also be observed that, setting aside the Generic log type, Proprietary logs are by far the most used for analysis in research. This suggests a lack of standardization of log files and shows the need to explore the common properties of these logs. Table 4 presents a statistical summary of the contents of the various types of logs. The green-colored columns present the number of occurrences of each attribute class in the papers reporting the usage of a given log type. The color intensity visualizes how common each attribute class is within a given log type.
The last column summarizes the ubiquity factor of the log attribute classes, which is defined in detail in Section D of Appendix 1. The ubiquity value is normalized to [0, 1] and represents how common a given attribute class is across all log types reported in the selected papers. Table 4 allows the creation of a statistical profile for each log type. The statistics are gathered based on the log attributes reported in the selected papers, which, depending on the paper, are a mixture of full log contents and only those attributes that the authors found useful for their log analysis. This means that the values presented in Table 4 embed both the availability and usefulness factors for each log attribute class.
Access, Event, and Query logs are either well-defined log types (access log) or strongly embedded in a particular field or method (event log - process mining, query log - SQL analysis). Therefore, their profiles represent the actual log format specification or the requirements of the technique used. The Generic, Network, Platform, and Proprietary log types are non-standardized, which makes their profiles more interesting. The Platform log exhibits Resource use information as the most commonly used attribute class, while Event is the most frequent in the others. The Network log focuses on the Source, Destination, and Data size classes, which relate to the network traffic it tracks. All non-standardized log types include Timing as important information.
If we take the attribute class perspective, the ubiquity factor column in Table 4 presents an average statistical profile of a log across all log types. In general, it can be seen as the chance of finding a given attribute class in a log. The average log profile consists of (in order of decreasing ubiquity): Event, Timing information, Action, Destination, Object, and User information.

VII. CONCLUSION
We performed a systematic literature review and mapping study of the automated log analysis research area from 2000 until mid-2021, with the main focus on the last five years. We mapped the area into subfields from the perspective of the type of knowledge that can be extracted from different log files and the goal of such an analysis. We presented the results in the form of a landscape of automated log analysis, characterizing each subfield and introducing the most prominent recent research. Additionally, we performed an in-depth analysis of log files and summarized the different types of logs commonly used in research, together with their content. We provided a statistical profile of each log type, which allows researchers to better understand what type of information can be expected in various logs. Finally, we made all source information that formed the basis for our analysis available in the form of appendices.
We hope that our work will be valuable to researchers and practitioners who aim to explore the challenging idea of extracting knowledge about complex, sometimes hard-to-manage computer systems from their logs.
In our future work, we will focus our research on Component dependency inference, which seems to be a fairly unexplored area. Our main interest lies in assessing the capabilities of log analysis to extract knowledge about software components and the processes that govern them.

APPENDIX 1 - REVIEW PROCESS EXECUTION
We executed the review according to the defined protocol in three phases. First, we executed the defined query and applied the exclusion criteria. The outcome of this phase was a list of relevant papers that were used in the subsequent steps. In the second phase, we extracted features to support answering the research questions, while in the third phase, we synthesized these features. The subsequent sections describe each phase in more detail. For clarity, the feature extraction and synthesis phases are discussed separately for each research question.

A. PAPER FILTERING
Execution of the final query in the Scopus database on 30.06.2021 returned 292 papers. The exclusion criteria EX1 and EX2 were already embedded in the query. For each paper from the result dataset, we applied the following multistep process: 1. Filter out irrelevant papers based on their abstracts, 2. Apply the exclusion criteria using the paper's abstract, 3. If the paper cannot be clearly excluded based on its abstract, apply the exclusion criteria using the full text, 4. Remove duplicates. The first step is necessary because of the assumed strategy for paper selection; as the query is not precise enough, it can retrieve papers that are not relevant to log analysis. We were able to identify all such papers using only their abstracts.
We used the second step to reduce the workload of applying the exclusion criteria. At this stage, we applied only exclusion criteria EX5, EX6, EX7, EX8, and EX3. To avoid falsely excluding papers, we took a defensive approach and omitted the application of EX4. In cases where a paper's abstract did not provide enough evidence to exclude it, we qualified the paper for the next step.
After the initial filtering based on abstracts, for each paper that was not excluded, we applied the exclusion criteria based on the paper's full text. We focused on exclusion criteria EX4, EX5, EX7, and EX8 and searched for evidence justifying their application. After completing this process, as part of exclusion criterion EX3, we removed duplicate papers. The set of selected papers after the filtering process consisted of 118 publications. Table 5 summarizes the excluded papers. The main reason for excluding articles was their technical focus: not covering direct methods for extracting knowledge from logs, but focusing rather on tools and algorithms supporting this process (e.g., log parsing, template generation, or log visualization). Another commonly excluded category comprised papers describing manual log analysis. Although our work focuses on automated approaches, the excluded papers often present interesting ideas on utilizing logs for gathering domain knowledge; these approaches have the potential for automation, which could bring them under the scope of automated log analysis in the future. The third most common exclusion criterion was a lack of clarity. We used this category if the paper's abstract was not clear enough on the outcome of the log analysis and the full text was not available. We also used it to mark preliminary work or experience reports that did not describe a concrete result. A summary of the excluded papers, together with the exclusion criteria applied and the justification, is presented in Table 7.

B. RQ1 -FEATURE EXTRACTION AND SYNTHESIS
We collected the following information from each paper:
• Goal of the log analysis,
• Business area/application domain.
Such a choice of attributes allows the presentation of the various research areas within log analysis from both technical and business perspectives. For each paper, we extracted the data by looking into the paper's title and authors' keywords, and by finding additional evidence supporting this selection in the full text of the paper. We further classified the papers according to the type of extracted knowledge, which was then subdivided into research areas. We define the following types of knowledge:
• Domain - knowledge about a business domain, for example, improved understanding of business processes or of user behavior,
• Design - knowledge related to a software system and the process of its design, for example, understanding the relationships between components or detecting system quality issues,
• Operations - knowledge related to the running system during operation, for example, detecting anomalies in the system's behavior or predicting the system's failure.
Detailed data on the classification of each paper are presented in Table 6.

C. RQ2 -FEATURE EXTRACTION AND SYNTHESIS
The type of log file was extracted from the full text of the publications. We searched for named types of logs or for information that a proprietary log file was used in the research. In some cases, the study used a generic model of a log. We synthesized the various log types used in the papers into the following classes:
• Access log - server log recording HTTP requests,
• CD log - log of continuous engineering tools (continuous integration/continuous deployment),
• Event log - log of business events, used by process mining techniques,
• Generic - the log format is automatically detected using the technique described in the paper, or the paper assumes some log model,
• Network log - log of a network device or service (e.g., SSH, proxy, firewall),
• Platform log - log of a specific software platform (e.g., Spark, Hadoop, Android),
• Proprietary - log of a particular software system, in a custom format that cannot be classified into the other classes,
• Query log - log of SQL queries executed by a system,
• Search engine log - log of a search engine, consisting of search queries entered by users.
A detailed classification of the log types for each included paper is presented in Table 6.
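To make the Access log class concrete, the sketch below parses one line in the Apache Common Log Format and names the captured groups after the attribute classes used for RQ3; the mapping from log fields to attribute classes is our own illustration.

import re

# Apache Common Log Format, a well-defined instance of the Access log class.
CLF = re.compile(
    r'(?P<source>\S+) \S+ (?P<user>\S+) \[(?P<timing>[^\]]+)\] '
    r'"(?P<action>[^"]*)" (?P<status>\d{3}) (?P<data_size>\S+)')

line = '10.0.0.1 - alice [01/Jan/2021:10:00:00 +0000] "GET /cart HTTP/1.1" 200 512'
entry = CLF.match(line).groupdict()
print(entry["user"], entry["timing"], entry["action"])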

D. RQ3 -FEATURE EXTRACTION AND SYNTHESIS
To extract the various data attributes that can be found in log files, we again referred to the full text of the article, searching either for a named type of log file or for an enumerated list of attributes used in that particular research. Named types of logs often represent a well-established log standard in a given area that is publicly described. In such cases, we derived the data attributes from the formal definition of the log file. We classified the identified attributes into the following classes, which represent the different types of information carried by each attribute:
• Action - information related to a recorded user/client action,
• Authentication information - information related to a user's/client's credentials,
• Communication channel - information related to the channel on which a communication recorded as a log entry was established,
• Component - information about a software component/module that the log entry is related to,
• Data size - information related to the size of data processed/transferred as a result of executing an action,
• Destination - target (host/system/component) of a recorded communication event,
• Event - details of a recorded event (usually a message text),
• Log file information - information about the file in which the log entry was created,
• Object - information about the destination system's business object that is the subject of a recorded event,
• Resource use information - information related to the utilization of a system's resources,
• Severity - information about the importance of a recorded event,
• Source - source (host/system/component) of a recorded communication event,
• Timing information - information related to the time at which a recorded event took place and its duration,
• User information - information related to the user that a recorded event concerns.
Details of the classification of each attribute identified in the selected papers are presented in Table 8.
For each attribute class, we calculated a ubiquity factor u_c, which describes how often attribute class c is used in logs. We used the following formula:

$$u_c = \frac{n_c}{maxl_c} \cdot \frac{l_c}{L}$$

where:
• n_c − number of occurrences of attribute class c in the selected papers,
• l_c − number of distinct log types in which attribute class c is reported in the selected papers,
• L − total number of log types identified in the selected papers,
• maxl_c − maximum number of attribute occurrences over all attribute classes identified in the selected papers.
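A minimal Python sketch of this computation, assuming the observations extracted from the papers are available as (log type, attribute class) pairs:

from collections import Counter, defaultdict

def ubiquity(observations):
    # observations: (log_type, attribute_class) pairs from the papers.
    n = Counter(attr for _, attr in observations)          # n_c
    log_types_per_attr = defaultdict(set)
    all_log_types = set()
    for log_type, attr in observations:
        log_types_per_attr[attr].add(log_type)             # builds l_c
        all_log_types.add(log_type)
    max_n, L = max(n.values()), len(all_log_types)         # maxl_c, L
    return {attr: (n[attr] / max_n) * (len(log_types_per_attr[attr]) / L)
            for attr in n}

obs = [("Access", "Timing"), ("Network", "Timing"), ("Access", "User")]
print(ubiquity(obs))  # Timing: 1.0, User: 0.25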