Performance Evaluation of Open-Source Endpoint Detection and Response Combining Google Rapid Response and Osquery for Threat Detection

Detecting the latest advanced persistent threats (APTs) using conventional information protection systems is a challenging task. Although various systems have been employed to detect such attacks, they are limited by their respective operating systems. Furthermore, they are developed as closed platforms and cannot be customized to meet user environments. To overcome these limitations, open-source endpoint detection and response (EDR) techniques are needed. In this study, we construct an open-source EDR system that integrates two open-source security frameworks, GRR (Google Rapid Response) and osquery. A threat-detection case study is conducted to validate the feasibility of the proposed open-source EDR system. Additionally, APT coverage for the proposed EDR system is analyzed using MITRE’s Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) model. The assessment shows that, with non-customized osquery configurations, attack techniques with a high response level account for 28.5 % of the total, which is lower than the shares at the other response levels. The performance of open-source EDR can be increased by customizing osquery for specific purposes and environments. Open-source EDR combining GRR and osquery has the potential to provide a threat detection system with efficient detection coverage and has the advantage of flexible integration with other applications; it can also be developed for evolving system environments such as cloud and internet of things.


I. INTRODUCTION
Cyber-attack techniques constantly improve, and advanced persistent threats (APTs) cause serious security problems for companies and organizations [1]. It is difficult for existing information protection systems to detect the latest APTs because they attack the targets persistently for a prolonged period using intelligent advanced hacking techniques in a high-density, high-capacity, and high-speed network environment [2]. The most representative method of responding to APTs is the cyber kill chain, first devised as a military security concept in 2011 by Lockheed Martin. It defines cyberattacks in multiple stages, identifies threats to organizational processes in advance, and analyzes, detects, and prevents cyberattacks and intrusions [3]. Its attack stages consist of reconnaissance, weaponization, delivery, exploitation, installation, command and control (C2), and actions-on-objective. Additionally, other cyber kill-chain models (e.g., the Mandiant attack lifecycle and the Bryant kill chain) have been developed [4]. Bahrami et al. [5] proposed a taxonomy of APT attacks based on the cyber kill-chain model, according to which tactics, techniques, and procedures for detecting APT attacks were identified. Although the cyber kill chain has emerged as a new framework for responding to APTs, it has a limitation of only presenting the threats in a single attack stage, and it relies on security solutions, such as intrusion prevention systems, firewalls, and security information and event management tools for responding.
The associate editor coordinating the review of this manuscript and approving it for publication was Sabu M. Thampi.
VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To respond effectively to APTs, behavior-based detection techniques must be applied alongside the kill-chain model [6]. A representative example of signature-based detection is the conventional anti-virus (AV) suite, which identifies malware by matching previously analyzed signatures against the suspected sample. However, signature-based detection has the disadvantages of vulnerability to zero-day attacks, a high false-alarm rate, and difficulty responding to attacks that bypass signature detection [7]. By contrast, behavior-based detection techniques analyze the behaviors of malware using artificial intelligence, big data, visualization, and cloud technologies. Refs. [8] and [9] proposed behavior-based malware detection systems for Android. In particular, the system proposed by [8] was based on the fact that new malware is usually a variant of existing malware, whose signatures can be used to detect the new variants based on their behaviors and to counter future obfuscation and transformation.
However, existing behavior-based detection methods are limited by the operating system (OS) and are developed as closed platforms, making it difficult to customize them for the user environment. Furthermore, they cannot respond effectively to quickly evolving cyberattacks. The larger the network and the more diverse the systems, the more severe the vulnerabilities. Endpoint detection and response (EDR) methods are designed to overcome the disadvantages of behavior-based detection techniques. Because EDRs support actions in various OSs and are developed as open-source software, they allow experts worldwide to collaborate and prepare responses faster than attacks evolve.
Gartner [10] classified threat detection technologies based on the detection point in the network, the endpoint, and the user and classified products, as shown in Table 1. Accordingly, EDRs are being developed into next-generation endpoint threat detection tools by integrating them with endpoint protection platforms. Furthermore, it was forecasted that the total endpoint security market will grow at a compounded annual growth rate of 7.6 % from USD 12.8B in 2019 to USD 18.4B in 2024 [11]. These trends of endpoint and EDR markets highlight that the demand and market size for EDR solutions are expected to grow steadily.
EDRs should be used in conjunction with multiple solutions because they must detect and respond to all events occurring at network endpoints. Therefore, EDR solutions typically incorporate endpoint anomaly detection and response technologies alongside existing AV solutions and techniques. Unlike traditional security solutions, recent security solutions are built as open-source software. A representative solution is GRR Rapid Response (GRR), a remote live forensics tool for responding to enterprise intrusions. It also provides digital evidence collection for forensics and incident-response applications. Open-source security solutions, including GRR, have the advantage that their capabilities can be extended through other open-source solutions and application programming interfaces (APIs), thus enabling flexible utilization. There are several studies related to such open-source security tools, but they only analyze the functionalities of the tools and the feasibility of utilizing them in specific circumstances. In this study, we propose next-generation endpoint security-threat detection and response methods and evaluate the detection coverage of the suggested EDR system for the first time.
The contributions of this study are as follows:
• An open-source EDR system combining GRR and osquery is evaluated for threat detection, and incident detection experiments are conducted.
• Using MITRE ATT&CK, all available attacks are organized, and the detection coverage of the EDR environment is identified.
The remainder of this paper is organized as follows. Section II describes previous studies on open-source EDRs. Section III describes the structure and functions of open-source EDR systems implemented using GRR and osquery. Section IV conducts attack and detection experiments. Section V defines the detection criteria of open-source EDRs and analyzes detection coverage. Section VI presents the discussion and future research directions. Finally, Section VII concludes the study.

II. RELATED WORKS
In this section, we summarize the literature related to digital evidence collection and incident response using open-source-based digital forensics and incident response (DFIR) tools and security solutions. Like this study, several others have attempted to detect, respond to, and analyze cyber-security limitations using GRR and osquery. Table 2 describes the related works.
Reference [12] created memory corruption and persistent attacks using Metasploit on GRR clients and responded to them using GRR's hunt functionality (i.e., GRRScanMemoryHunt, NetworkStatusHunt, GRRRegistryFinderHunt). Using GRR's cronjob, hunt tasks were scheduled, and memory, network status, and Windows registry-key analysis were performed. The study documented the limitation of slow detection caused by the long hunt cycle (∼5-10 min), during which attackers had enough time to complete their attacks and cover their tracks. Furthermore, [13] described the system structure and functionality of GRR, osquery, and Mozilla InvestiGator, comparing their performances by establishing representative features that were successfully handled. Although both studies point out the limitations of EDR, neither proposes a specific solution. In addition, although attempts have been made to identify the core functions of EDR, there is insufficient research on how to respond to cyber-attacks as a whole.
References [14]-[17] applied GRR to several domains to collect data and attempted forensic analysis. Ref. [14] used GRR to respond to cyber threats in a healthcare system, displaying the benefits of operating the system simultaneously with remote forensics. Furthermore, they rendered the tool compatible with multiple frameworks, owing to its open-source nature. However, an additional security mechanism is needed for GRR to safely handle sensitive information in the medical system. Containers have also been frequently used to increase the convenience and efficiency of software engineers. Refs. [15]-[17] highlighted the need for digital forensics for web servers operating in containers (e.g., Docker and Kubernetes) and used some to respond to distributed denial-of-service (DDoS) attacks. In these studies, a DDoS attack was recognized by checking the web server log and IP addresses. However, this does not differ from existing network security techniques because the remote tool is used only to collect the log, and attempts to automate attack detection using the advantages of remote live forensics have been lacking.
Studies that attempted to integrate GRR with other analytical tools include [18] and [19]. Reference [18] integrated Zeek (formerly Bro), an open-source network traffic analyzer, and conducted network forensics. Ref. [19] conducted network forensics using the open-source project DroidWatch, which is designed to collect and monitor data from Android devices, alongside GRR. Although this work showed that data could be collected from Android devices, which GRR did not support, limitations persisted in that some log data failed to be collected, and routing was used instead of normal methods to access logs.
Furthermore, [20] analyzed and resolved the database limitations of GRR. Log data were stored in a database for incident response and forensic evidence collection, but the method had limitations of increased data with larger-scale services, resulting in resource shortages. Ref. [20] proposed a distributed data repository to overcome the limitations of GRR storage systems, improving scalability, processing speeds, and efficiency. However, it is necessary to study not only the data collection stage, but also the performance of processing, transmitting, and utilizing the collected data. When using the collected log data, additional studies such as performance comparison between the existing method and the distributed storage method can be conducted. In this paper, in conjunction with EDR and Kafka, we propose a way to stream the collected log data and utilize it.
Most of the existing studies dealt with only a few attacks in specific situations and conditions and were limited to basic explanations of the main functions and limitations of each EDR tool. However, to respond to APT attacks that are becoming more intelligent day by day, it is necessary to study EDR systems that analyze attacks at each stage of an APT scenario, as in a cyber kill chain, and respond to them systematically. In this work, we further seek to construct a responsive EDR system with endpoint anomaly detection by processing the collected endpoint data used in extant studies, wherein GRR and osquery are used for DFIR. Also, the performance of the proposed EDR system is evaluated by analyzing the detection coverage for all APT stages.

III. OPEN-SOURCE EDR SYSTEMS OVERVIEW
This section describes the GRR and osquery systems and introduces the EDR environment that combines them.

A. GRR
GRR is an incident-response framework for remote live forensics and includes a client and a server [21]. GRR clients are deployed on the systems to be monitored, and they periodically poll the server for work in the form of client actions, which include file finder, memory, network, OS, and osquery actions. The GRR client and server communicate via the hypertext transfer protocol (HTTP) by default, and each communication is encrypted using AES (Advanced Encryption Standard) 256.
The GRR server comprises a front-end server, a worker, and a user interface, and its core functions include flows and hunts. The GRR server uses flows, a type of state machine, to solve resource problems. A flow is a core entity of the server, and it calls client actions. When the GRR server first enters a flow state, it requests a client action for that state. While waiting for a response from the client, the server initializes all resources. When a response arrives, it fetches the appropriate resource and runs the flow state. This way, GRR can resolve resource-hogging issues. A flow can be run on thousands of client machines at once, which is called a hunt. A hunt specifies the flow and the set of machines on which to run it.
GRR stores forensic data collected from clients and abstracts and represents them using a virtual file system. Hence, analyzers can identify the client file system. The GRR artifact is a function that collects and manages data generated while using the OS or applications. It can be conveniently used in conjunction with other tools because the information collected from the host is stored in an external storage system in YAML (YAML: Ain't Markup Language) format, a human-friendly data serialization standard for all programming languages. GRR-specific or private artifacts are stored locally. For other functions, GRR supports the Python library (i.e., grr_api_client) for automation and uses PowerShell for automation and scripting. Furthermore, it provides a function that interfaces osquery during the flow of the GRR and obtains client information in a structured query language (SQL) format for more effective analysis.

B. OSQUERY
Osquery is an operating system instrumentation framework that abstracts and displays information for analysis and monitoring as SQL tables and can be used with macOS, FreeBSD, Linux, and Windows. As of October 2020, it supported 264 basic schemas and provided system information (e.g., processes, CPU, disk, device, memory, kernel, files, Wi-Fi, and network information) as well as information related to YARA malware research and detection, Docker containers, and Azure cloud services. New schemas can be defined directly by users as needed.
Osquery supports a shell/console mode, osqueryi, and a host-monitoring daemon, osqueryd. In particular, osqueryd performs scheduling via the osquery configuration. As shown in Fig. 1, the interval for periodic monitoring can be set. Fig. 1 also shows sample code for scheduling and running "SELECT * FROM file_events;" every 300 s. Osquery also supports logger plugins for various interfaces.
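The scheduling described above can be illustrated with a minimal sketch that builds an osquery-style JSON configuration whose `schedule` section runs the `file_events` query every 300 s. The entry name `file_events_watch` and the helper `build_schedule_config` are illustrative assumptions and are not taken from Fig. 1.

```python
import json


def build_schedule_config(name, query, interval_s):
    """Return an osquery-style config dict holding one scheduled query."""
    if interval_s <= 0:
        raise ValueError("interval must be a positive number of seconds")
    return {"schedule": {name: {"query": query, "interval": interval_s}}}


# "file_events_watch" is a hypothetical entry name chosen for this sketch.
config = build_schedule_config(
    "file_events_watch", "SELECT * FROM file_events;", 300
)
config_text = json.dumps(config, indent=2)
print(config_text)
```

The printed text could be saved as the configuration file read by osqueryd; other scheduled queries would be added as further keys under `schedule`.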
In this study, the log results are transmitted through the kafka_producer logger plugin. Kafka is a distributed streaming platform. When a producer sends a message to a Kafka topic, a consumer reads it. This has the advantage of enabling a reliable real-time data pipeline between systems and applications, with real-time streaming that reacts to and transforms data [22]. Thus, the detection data generated by GRR and osquery can be streamed using Kafka, and they can be used for real-time responses. Schemas can be managed and queried through Kafka using the Confluent Schema Registry. When a producer registers a schema in the registry, serialized Avro/Protobuf/JSON data are sent to Kafka together with the schema ID. Consumers can read the schema ID, look up the corresponding schema in the schema registry, and use it to deserialize the data [23].
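The schema-ID mechanism can be sketched without a running Kafka cluster. The example below assumes the Confluent framing of a zero magic byte followed by a 4-byte big-endian schema ID, uses JSON as the payload format, and replaces the schema registry with a hypothetical in-memory dictionary (`REGISTRY`); the field names are illustrative only.

```python
import json
import struct

MAGIC_BYTE = 0  # leading byte of the assumed Confluent framing

# Hypothetical in-memory stand-in for a schema registry: schema ID -> fields.
REGISTRY = {1: ["name", "action", "target_path"]}


def encode(schema_id, record):
    """Producer side: prefix the JSON payload with magic byte and schema ID."""
    missing = [f for f in REGISTRY[schema_id] if f not in record]
    if missing:
        raise ValueError(f"record lacks fields: {missing}")
    header = struct.pack(">bI", MAGIC_BYTE, schema_id)
    return header + json.dumps(record).encode("utf-8")


def decode(message):
    """Consumer side: read the schema ID, look up the schema, deserialize."""
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError("unknown message framing")
    record = json.loads(message[5:].decode("utf-8"))
    return schema_id, {f: record[f] for f in REGISTRY[schema_id]}


msg = encode(1, {"name": "file_events", "action": "CREATED",
                 "target_path": "/tmp/sample.demon"})
print(decode(msg))
```

In a real deployment the dictionary lookup would be a request to the registry service, and the payload would typically be Avro or Protobuf rather than JSON.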

C. EDR SYSTEM ENVIRONMENT
The experimental environment of open-source EDR was set up by building a GRR server and client as virtual machines using VMware. First, we verified whether the detected data were properly transmitted to the EDR in an experimental environment consisting of the GRR server, client, and Kafka. The GRR server does not provide an osquery agent installer, and a process is required to install osquery's official source in the client machine. When osquery is installed on the client machine and placed on a path that can be accessed by GRR, GRR and osquery can be used together. Fig. 2 shows the EDR experimental setup comprising GRR, osquery, and Kafka. The GRR server is connected to the GRR client and uses the HTTP protocol for communication. The GRR client in which osquery has been installed sends query results to the GRR server according to its request. MySQL, the database of the GRR server, is linked to Kafka and can stream the sent detection results.
The hunt function of GRR can degrade the performance of running clients and of the GRR system itself because it places an extensive load on the system [24]. This problem worsens as a hunt expands its coverage, making it difficult to use for live forensics. To solve this problem, limits on resource usage should be set when utilizing the hunt function. To do so, data can be accumulated in external systems, and forensics can be performed separately. Kafka can be used to transmit data from internal to external systems and to store, manage, and investigate them externally [25]. Kafka is a distributed streaming platform with proven big-data processing performance. In this paper, to resolve the abovementioned performance and scalability problems, a Kafka-based open-source EDR is studied.
... an organizational level. It provides organizational security awareness, helps identify gaps in defense, and prioritizes risks [26]. The enterprise ATT&CK framework used in this study is based on the general attack procedure, which refers to actions performed from the intrusion stage to the goal achievement stage. This framework sequentially describes 12 attack stages (i.e., initial access, execution, persistence, privilege escalation, defense evasion, credential access, discovery, lateral movement, collection, C2, exfiltration, and impact), which include a total of 184 techniques.

IV. ATTACK AND DETECTION EXPERIMENTS
The attack scenario for penetration testing consists of the initial access, execution, C2, and impact stages of ATT&CK. For penetration testing, we used Metasploit, an exploitation and vulnerability validation tool that assists penetration testing, and the open-source ransomware tool RAASNet. Wine was also used to run the Windows .exe file on the Linux OS. The attack scenario was set based on an APT. First, a Metasploit payload disguised as a normal file is generated and moved to a universal serial bus (USB) storage device, which is then connected to the victim's personal computer (PC). When the malicious payload is stored and executed on the victim's PC via USB, the attacker's C2 server and the victim's PC become connected. Hence, abnormal symptoms detected in the network traffic can be verified and responded to. Furthermore, the victim's PC is infected by the RAASNet ransomware, which can be responded to by detecting its characteristics: files are encrypted and given a specific extension (demon), and their size and time properties are changed. Fig. 3 shows the attack-and-response environment of the experiment, which comprises a GRR server, a client, and an attacker. It shows the attack stages of initial access, execution, and C2 corresponding to the attack scenario, and the flow of the attack. The GRR server and client respond to attacks while communicating via HTTP. Table 3 shows the conditions of the attack-and-response experiment. Fig. 4 shows a flowchart of the attack-and-response experiment. The procedure and method of the three-stage attack-and-response are as follows.

1) INITIAL ACCESS
The initial access-stage attack consists of techniques related to the attacker's network access attempts. The attacker can first access the victim's PC by inserting malicious exploit code or an attachment file into a web browser or e-mail, or by acquiring network access permission. The attacker can also attempt malicious actions by copying malware to a removable storage device and inserting it into the system. In this experiment, malware was stored on a USB storage device and inserted into the victim's system. "SELECT * FROM usb_devices;" was used as the query statement to respond to the initial access stage. The query returns information regarding the USB devices connected to the PC (class, model (model name), model_id, protocol, removable, serial, subclass, usb_address, usb_port, vendor, vendor_id, version), so traces of the USB connection at the time of the initial attack can be proven. In particular, the ID and serial number in the USB device information can be used as critical information to detect unauthorized USBs, as distinct from those registered by the organization. Furthermore, the files in the specified path inside the USB can be identified using the query statement "SELECT * FROM file WHERE path LIKE '/media/account/USBname/%';"
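The use of device ID and serial number to spot unauthorized USBs can be sketched as follows. Because osquery exposes its tables through SQLite-style SQL, the sketch simulates a reduced `usb_devices` table in an in-memory SQLite database; the sample rows, the allowlist `REGISTERED`, and the helper `unauthorized_usb` are hypothetical and only illustrate the comparison step.

```python
import sqlite3

# Hypothetical allowlist of registered devices: (vendor_id, model_id, serial).
REGISTERED = {("1111", "2222", "AA0001")}


def unauthorized_usb(conn, registered):
    """Return usb_devices rows whose ID/serial triple is not registered."""
    rows = conn.execute(
        "SELECT vendor_id, model_id, serial, model FROM usb_devices;"
    ).fetchall()
    return [r for r in rows if (r[0], r[1], r[2]) not in registered]


# Simulated subset of the usb_devices columns, with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usb_devices (vendor TEXT, vendor_id TEXT, model TEXT, "
    "model_id TEXT, serial TEXT)"
)
conn.executemany(
    "INSERT INTO usb_devices VALUES (?, ?, ?, ?, ?)",
    [
        ("VendorA", "1111", "Office Drive", "2222", "AA0001"),
        ("VendorB", "3333", "Flash Disk", "4444", "ZZ9999"),
    ],
)
flagged = unauthorized_usb(conn, REGISTERED)
print(flagged)
```

On a real endpoint the rows would come from the osquery result returned through GRR rather than from a simulated table.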

2) EXECUTION
The Metasploit payload injected via the USB in this experiment was a malicious execution file disguised as a normal file, and it bypassed security and AV programs. Among the detailed attack techniques of malware, user execution occurs when the user opens the file. When the payload is executed, a session is then formed between the C2 server and the victim's PC, and network traffic is generated. Because the experimental environment of the victim's PC was Ubuntu OS, Wine was used to run the exe file.

3) C2
A Trojan attack induces the internal system to voluntarily open a port to enable communication with external systems. In this case, it opened a specific port for C2-server communications. When such network traffic is detected, it can be judged as an abnormal symptom, and hacking can be suspected. In this experiment, the connection initiated between the attacker and the victim's system was verified via a query designed to detect suspicious outbound network activities: "SELECT s.pid, p.name, local_address, remote_address, family, protocol, local_port, remote_port FROM process_open_sockets s JOIN processes p ON s.pid = p.pid WHERE remote_port NOT IN (80, 443) AND family = 2;"
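The behavior of this query can be checked on simulated data. The sketch below recreates reduced versions of the `process_open_sockets` and `processes` tables in an in-memory SQLite database and runs the query from the text against them; the process names, addresses, and ports are made-up sample values, not results from the experiment.

```python
import sqlite3

QUERY = (
    "SELECT s.pid, p.name, local_address, remote_address, family, protocol, "
    "local_port, remote_port FROM process_open_sockets s "
    "JOIN processes p ON s.pid = p.pid "
    "WHERE remote_port NOT IN (80, 443) AND family = 2;"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processes (pid INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE process_open_sockets (pid INTEGER, family INTEGER, "
    "protocol INTEGER, local_address TEXT, remote_address TEXT, "
    "local_port INTEGER, remote_port INTEGER)"
)
conn.executemany(
    "INSERT INTO processes VALUES (?, ?)",
    [(101, "browser"), (202, "payload.exe"), (303, "updater")],
)
conn.executemany(
    "INSERT INTO process_open_sockets VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (101, 2, 6, "10.0.0.5", "198.51.100.10", 40000, 443),  # HTTPS, ignored
        (202, 2, 6, "10.0.0.5", "203.0.113.7", 49152, 4444),   # unusual port
        (303, 10, 6, "::1", "::1", 50000, 5555),               # not family 2
    ],
)
suspicious = conn.execute(QUERY).fetchall()
print(suspicious)
```

Only the IPv4 socket (family 2) with a remote port other than 80 or 443 is returned; a production rule would likely need further filters, since legitimate services also use other ports.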

4) IMPACT
This step is the final step of an APT attack, and it interferes with the availability of systems, services, and resources, or it damages integrity. In this experiment, the victim's PC was infected with ransomware, thereby damaging the availability and integrity of the target folder. The characteristics that can be used to check the behaviors of the ransomware are file extension, timestamp (time information), and change in file size. The properties of the folder, file extension and size, and timestamp can be checked using the query "SELECT * FROM file WHERE path LIKE '/home/user_name/ransomware-infected folder/%';" Additionally, we can use a query that only shows information regarding a specific extension, "SELECT * FROM file WHERE path LIKE '/home/user_name/ransomware-infected folder/%.extension';" or one that determines changes in time values caused by ransomware infection by checking atime, mtime, and ctime via "SELECT filename, atime, mtime, ctime FROM file WHERE path LIKE '/home/user_name/ransomware-infected folder/%';" Furthermore, the information in file_events, a table related to file-integrity monitoring, can be output using the query "SELECT * FROM file_events;" to display the states of files as created, deleted, or updated. Table 4 summarizes the attack and detection experiment methods. It reflects the stages of MITRE ATT&CK, the attack techniques, and the osquery queries for all attacks. Queries that show the most representative characteristics of each attack were selected for this experiment, and the attacks were detected using them.
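How the extension and timestamp characteristics might be turned into a check is sketched below. The function takes (path, mtime) pairs such as those returned by the queries on the file table and reports files carrying the ransomware extension together with the largest burst of modifications inside a time window. The window length, burst threshold, and sample rows are assumptions made for illustration, not values from the experiment.

```python
from pathlib import PurePosixPath


def ransomware_indicators(rows, extension=".demon", window_s=60, min_burst=3):
    """rows: iterable of (path, mtime) pairs from a query on the file table."""
    rows = list(rows)
    renamed = [p for p, _ in rows if PurePosixPath(p).suffix == extension]
    mtimes = sorted(m for _, m in rows)
    # Sliding window: largest number of files modified within window_s seconds.
    burst, start = 0, 0
    for end in range(len(mtimes)):
        while mtimes[end] - mtimes[start] > window_s:
            start += 1
        burst = max(burst, end - start + 1)
    return {
        "renamed": renamed,
        "burst": burst,
        "suspicious": bool(renamed) and burst >= min_burst,
    }


# Made-up sample rows: three encrypted files changed within 20 s, one old file.
sample = [
    ("/home/user_name/docs/a.txt.demon", 1000),
    ("/home/user_name/docs/b.pdf.demon", 1010),
    ("/home/user_name/docs/c.jpg.demon", 1020),
    ("/home/user_name/docs/notes.txt", 10),
]
report = ransomware_indicators(sample)
print(report)
```

Combining both signals reduces the chance that a single renamed file or an ordinary batch save is reported on its own.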

B. RESULT OF DETECTION SCENARIO
As a result of conducting the attack-and-response experiment assuming an APT attack scenario based on MITRE ATT&CK, various attacks, including the use of an unauthorized USB, a Trojan horse, and ransomware, were detected using osquery input query statements. GRR provided a client API that enabled Python scripting and automation, and osquery provided the osqueryd daemon, which recorded logs when events were detected. This was used to set response schedules to activate the EDR system according to the established period. Therefore, the occurrence of events in a specific path or file can be detected in real time, and any abnormal symptoms can be responded to immediately by scheduling osqueryd. Through this process, automated detection and response tools that detect changes via specific events and perform investigations of corresponding artifacts are made feasible. Furthermore, the detected data can be transmitted via Kafka, one of osquery's logger plugins, to be utilized for analysis and response. The results of this detection experiment show that several open-source security tools can be interlocked to form an EDR environment to detect, analyze, and respond to endpoint security incidents.

V. PERFORMANCE EVALUATION
This section defines the detection criteria of open-source EDR, tests its detection coverage, and analyzes the results. Additionally, a developmental direction for improving detection performance is presented.

A. DEFINITION OF DETECTION CRITERIA
In our research environment, the security-incident detection criteria were established by organizing the information and query statements required to detect MITRE ATT&CK techniques. For example, for the drive-by-compromise technique in the initial access tactic, because the attack can be detected through website-access and script execution logs, the query statement to verify those logs becomes a requirement for detection. Additionally, the valid-accounts technique can be detected from user account information, log-change permissions, etc., which requires query statements to verify whether new unauthorized local accounts have been created. It also requires queries to check for unusual changes in privilege escalation, alongside checks of all appliances and applications for default credentials and secure-shell protocol keys.

B. ANALYSIS OF DETECTION COVERAGE
For detection coverage analysis, we investigated and developed query statements to collect a total of 38 queries from the osquery GitHub repository opened by Facebook, osquery query packs, the osquery-configuration GitHub repository, the osquery documentation, and other related research and publication data websites. A total of 691 were collected and developed, and 381 remained after merging and deleting duplicate query statements (by collection route: 160 from osquery GitHub, 136 from the osquery query packs, 48 from websites, 31 from related research data, four from the osquery documentation, and two custom produced). The queries used in this experiment are all available on GitHub. Based on the ATT&CK framework, the distribution of detection data was generally proportional to the number of techniques at each stage.
To define detection coverage, the 381 query statements were analyzed to see whether they satisfied the requirements of detection. Then, the detection level of the GRR was identified by converting the ratio of the number of query statements that satisfied the requirements to a percentage.
The response levels for the attack techniques are outlined in Table 5. Depending on the percentage of requirements satisfied, techniques below 40 % were classified as low, those at 40 % or higher and less than 70 % as medium, and those at 70 % or higher as high. In addition, the importance of each technique was calculated alongside the classification criteria for the response levels given in Table 5, and a high-level analysis of detection coverage was performed. To this end, weights were calculated based on the Teach model, which defines the difficulty of attack for each technique opened by ATT&CK. We assumed that the lower the attack difficulty, the more frequent the attack and the resulting damage; hence, a higher weight was given to such techniques, as shown in Table 6. The detection coverage was calculated according to the response level and attack difficulty and converted to a percentage using Eq. (1).
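The response-level thresholds can be expressed as a short sketch. It converts the share of satisfied detection requirements into a percentage and applies the 40 % and 70 % boundaries stated above; the function names and the example counts are illustrative assumptions, and the difficulty weights of Table 6 and Eq. (1) are deliberately not reproduced here.

```python
def response_level(satisfied, required):
    """Classify a technique by the share of detection requirements met."""
    if required <= 0:
        raise ValueError("required must be positive")
    pct = 100.0 * satisfied / required
    if pct < 40:
        return pct, "low"
    if pct < 70:
        return pct, "medium"
    return pct, "high"


def level_shares(levels):
    """Percentage of techniques at each response level."""
    total = len(levels)
    return {
        name: round(100.0 * levels.count(name) / total, 1)
        for name in ("high", "medium", "low")
    }


# Made-up requirement counts for four hypothetical techniques.
levels = [response_level(s, r)[1] for s, r in [(1, 4), (2, 4), (7, 10), (0, 3)]]
print(levels, level_shares(levels))
```

The boundary cases follow the wording of the text: exactly 40 % counts as medium and exactly 70 % counts as high.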

C. ANALYSIS RESULTS
The analysis result of the detection coverage based on ATT&CK is shown in Fig. 5. In the open-source EDR environment of GRR, the ratio of techniques having high response levels was found to be 28.5 %, the ratio of techniques having medium response levels was 35.1 %, and the ratio of techniques having low response levels was 36.4 %. Thus, the ratio of the high response level was lower than that of the others. The reason for the low detection coverage is the lack of query statements for attack detection; coverage and performance can be improved by using query statements custom-made for the environment and purpose rather than basic query statements. Because osquery is based on relational databases, it can obtain the necessary information flexibly by joining multiple schemas. For example, in Table 4, the query statement for detecting Trojan attacks was generated by joining the process_open_sockets schema with the processes schema. To maximize the utilization of the GRR and osquery-based open-source EDR, such customized settings for user environments are required.
Furthermore, we examined in depth the detection level and the ratio of detection data at each attack stage, identified the stages with low detection levels, and summarized the performance and development directions of the open-source EDR. Fig. 6 shows the possibility of responding to the attack techniques in each of the 12 attack stages of MITRE ATT&CK, which represents the detection level of each attack stage. The average number of query statements for threat detection per attack stage was 52.25. The four stages with the lowest response levels, in which the number of queries was below the average, were initial access, execution, credential access, and lateral movement, with 17, 26, 29, and 17 queries, respectively. Fig. 7 lists the ratios of detection data of each attack stage in ascending order, where the ratio is the number of query statements for threat detection in a stage divided by the total number of query statements. The average ratio of the detection data was 5 %. The ratios were equal to or lower than the average in the four stages with insufficient response: 3 % for initial access and lateral movement, 4 % for execution, and 5 % for credential access. Table 7 lists the techniques having high importance and low detection levels among the total of 184 techniques. High importance was defined as a weight of four or higher according to Table 6. Among all techniques, 64 had low detection levels, and 29 of these had high importance. This accounts for 45 % of the low-detection techniques, or approximately half, and corresponds to approximately 15.7 % of all techniques. In other words, these attack techniques belong to stages with a higher possibility of attack than other techniques, owing to the lower difficulty of attack, yet their detection level was low. In particular, the attack techniques corresponding to the four stages having the lowest detection level in Fig. 7 (i.e., initial access, lateral movement, execution, and credential access) had the lowest relative detection levels among all attacks. Therefore, to improve the performance and coverage of the open-source EDR system, additional research and development of technology that can respond to the techniques in Table 7 and to detailed attacks in the above four stages are required.
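The figures quoted above can be checked with a short worked calculation. The sketch below uses only the numbers stated in the text. The total number of query statements is not given, so it is assumed here to equal the per-stage average multiplied by the 12 stages; under that assumption the rounded per-stage ratios agree with the 3 %, 4 %, and 5 % reported for the four weakest stages.

```python
# Figures quoted in the text.
TOTAL_TECHNIQUES = 184
LOW_DETECTION = 64
LOW_DETECTION_HIGH_IMPORTANCE = 29
NUM_STAGES = 12
AVG_QUERIES_PER_STAGE = 52.25

# Query counts of the four stages reported as below average.
low_stage_queries = {
    "initial access": 17,
    "execution": 26,
    "credential access": 29,
    "lateral movement": 17,
}

# Share of high-importance techniques among the low-detection techniques,
# and among all techniques.
share_of_low = 100 * LOW_DETECTION_HIGH_IMPORTANCE / LOW_DETECTION
share_of_all = 100 * LOW_DETECTION_HIGH_IMPORTANCE / TOTAL_TECHNIQUES

# ASSUMPTION: the total number of query statements is the per-stage
# average times the number of stages; the text does not state it directly.
total_queries = AVG_QUERIES_PER_STAGE * NUM_STAGES

# Ratio of detection data per stage, rounded to a whole percent.
stage_ratio = {
    stage: round(100 * count / total_queries)
    for stage, count in low_stage_queries.items()
}

print(f"high importance among low detection: {share_of_low:.1f} %")
print(f"high importance among all techniques: {share_of_all:.2f} %")
print(f"assumed total query statements: {total_queries:.0f}")
for stage, ratio in stage_ratio.items():
    print(f"{stage}: {low_stage_queries[stage]} queries, about {ratio} %")
```

The second share evaluates to roughly 15.76 %, which the text reports as approximately 15.7 %.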

VI. DISCUSSION AND FUTURE WORK
The open-source EDR is convenient and effective for managing multiple devices, and its source code can be freely modified and improved to support various applications and OSs. Therefore, it is superior to conventional AV and forensic tools in terms of scalability, efficiency, and cost. In this paper, the detection coverage of the proposed EDR system combining GRR and osquery was evaluated; combining it with other effective tools could further improve detection performance.
Remote monitoring can be effectively performed using an open-source EDR in the cloud, across the internet of things (IoT), and in large-scale service environments. Additionally, the collected detection data can be used to respond to attacks or to support digital forensics.
However, current digital forensics procedures require data to be collected while all network connections to the system are blocked, so that evidence is collected with its integrity preserved. Because system data can be collected remotely using EDRs, and their integrity can be verified using hash functions, digital forensics procedures and regulations must be updated in accordance with attack trends. Furthermore, privacy and critical information leakage can occur because real-time logs are collected while confidential information is processed [14]. Hence, information protection at the management server is needed for processing the detection data, responding to attacks, and conducting remote forensics using the open-source EDR.
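Hash-based integrity verification of remotely collected data can be sketched as follows. This is a minimal illustration and not the mechanism used by GRR: it assumes SHA-256 as the hash function, and the helper names (sha256_hex, verify_integrity) and the sample evidence bytes are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_integrity(data: bytes, expected_digest: str) -> bool:
    """Check that the data still matches the digest recorded at collection.

    hmac.compare_digest performs the comparison in constant time.
    """
    return hmac.compare_digest(sha256_hex(data), expected_digest)


# Hypothetical evidence collected from an endpoint (illustration only).
collected = b"pid=202 name=updater remote=203.0.113.7:4444\n"

# Digest recorded at collection time, before the data leaves the endpoint.
digest_at_collection = sha256_hex(collected)

# Later, at the management server: unchanged data verifies, altered data fails.
intact = verify_integrity(collected, digest_at_collection)
tampered = verify_integrity(collected + b"x", digest_at_collection)

print("intact copy verifies:", intact)
print("altered copy verifies:", tampered)
```

A digest alone shows only that the data has not changed since the digest was computed; the digest itself must be stored or transmitted in a way that an attacker cannot alter together with the data.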
Furthermore, cloud and IoT usage is increasing with the acceleration of digital transformation and the development of wireless communication and network technology. Cloud and virtual system environments require data integrity and digital evidence analysis solutions. IoT devices suffer from security vulnerabilities because they prioritize low cost, light weight, and low performance overhead. Open-source EDR must evolve to meet these needs. In future work, the proposed framework will be applied to and evaluated in cloud environments and in environments with large numbers of IoT devices.

VII. CONCLUSION
The open-source EDR is a cost-effective security tool with high expected value in terms of flexibility, utilization, and scalability, and it can be used in next-generation digital platforms that are becoming hyper-connected, hyper-intelligent, and globally scaled. In this study, attack detection and coverage analysis across all APT attack stages defined by MITRE ATT&CK were performed with an open-source EDR for the first time.
A few stages showed a low detection rate because the query settings were insufficient to detect the detailed attacks of each stage. An in-depth performance evaluation was conducted using Teach model, and attacks with high importance and low detection levels were analyzed. To ensure the coverage of open-source EDRs based on GRR and osquery, appropriate query statements tailored to the required environment and conditions must be created, especially for attacks with low detection levels and high importance. In addition, other effective open-source tools can be used to increase the performance of the EDR system. In future work, other open-source tools can be compared for their performance using the evaluation framework of this study.
He is currently a Professor with the Department of Convergence Security Engineering, Sungshin Women's University (SWU), Seoul. Before joining SWU in March 2017, he was with the Electronics and Telecommunications Research Institute (ETRI) as a Senior Researcher from 2005 to 2017. He served as a Principal Architect and a Project Leader with Newratek, South Korea; and Newracom, USA; from 2014 to 2017. He has authored/coauthored more than 90 technical articles in the areas of information security, wireless networks, and communications, and holds about 160 patents. His current research interests include wireless/mobile networks with an emphasis on information security, networks, and wireless circuit and systems.
Dr. Lee is also an Active Participant of and a Contributor to the IEEE 802.11 WLAN Standardization Committee.