TEE-PA: TEE Is a Cornerstone for Remote Provenance Auditing on Edge Devices With Semi-TCB

AI&IoT edge devices run complex applications and are under the threat of stealthy attacks that are not easily detected by traditional security systems. Provenance auditing is a promising technique for determining the ramification of an attack from event logs. However, provenance auditing was originally designed for personal computers and is unsuitable for edge devices. Introducing provenance auditing on edge devices therefore raises the following three problems. (1) Current edge devices have relatively powerful CPUs, but they are not powerful enough for provenance auditing. (2) Most provenance auditing tools are developed as normal applications, and the log data is exposed to an untrusted area. (3) Most edge devices are used outdoors without an administrator and must be managed by secure M2M (Machine to Machine) mechanisms. To solve these problems, we propose TEE-PA, which securely collects system call logs on an edge device using a TEE (Trusted Execution Environment) and sends them to a remote provenance auditing service on a powerful server. The system call logs are transferred directly from the kernel to the TEE and are exposed neither to administrators nor to attackers. Although the kernel runs in an untrusted world and has a semantic gap from the TEE, TEE-PA offers a semi-TCB (Trusted Computing Base) that measures the kernel integrity check mechanism from the TEE at boot time and partially trusts the kernel. Operational correctness is periodically confirmed by unpredictable heartbeat messages sent from the remote provenance auditing server. If the correctness is not confirmed in the logs on the server, the next heartbeat message is not sent, which triggers an autonomous recovery through a system reset by the watchdog timer protected by the TEE. We implemented a prototype of TEE-PA on the Arm TrustZone of a Raspberry Pi3, with SPADE as the remote provenance auditing tool and LKRG (Linux Kernel Runtime Guard) as the kernel integrity check. We demonstrate that TEE-PA can determine the ramifications of stealthy attacks (fileless malware and shell command attacks) with acceptable performance. The performance evaluation estimates that TEE-PA is 19 times faster than on-board provenance auditing.


I. INTRODUCTION
In recent years, many AI&IoT edge devices have been geographically distributed without an administrator, e.g., in smart cities. These edge devices run several intelligent applications because they not only gather sensor data but also process them to reduce network traffic and server load. Microsoft Azure IoT Edge [1] and Amazon AWS IoT Core [2] offer SDKs (software development kits) based on Linux as well as embedded OSes to implement AI&IoT edge applications. Both vendors recognize the importance of securing these devices and provide M2M (Machine to Machine) management services.
The associate editor coordinating the review of this manuscript and approving it for publication was Jiafeng Xie.
Unfortunately, AI&IoT edge devices are under threat of stealthy attacks, such as fileless malware [3], [4] and shell command attacks [5], [6], [7]. In addition, recent attacks attempt to disable firewalls, delete logs, and modify log timestamps to evade detection [8]. To detect stealthy attacks, provenance auditing [9], [10], [11] (alternatively known as attack causality analysis [12], [13], [14]) is a promising technique for determining the ramification of an attack. Provenance auditing uses event log data and presents the causal relationships of events as a DAG (Directed Acyclic Graph). The DAG shows the events hidden by stealthy attacks, allowing analysts to determine the ramifications of the attacks.
Adapting provenance auditing to AI&IoT edge devices is desirable, but there are three problems. First, edge devices cannot provide the computational resources required by provenance auditing [15]. Provenance auditing requires a powerful CPU and storage large enough to hold the logs generated by the system over a long period. Since provenance auditing was originally designed for personal computers, edge devices do not have suitable computing resources. Second, most provenance auditing tools are developed as normal applications, and the log data are exposed to the untrusted area. In such an implementation, we generally cannot guarantee that attackers will not tamper with the logs generated on edge devices. Microsoft Azure IoT Edge and Amazon AWS IoT Core also collect event log data from remote edge devices to cloud servers. However, their log collection mechanisms depend on normal applications and cannot be fully trusted. Third, when an abnormality occurs in an AI&IoT edge device, an administrator would ideally be dispatched to restore the device, but this is costly. Some devices have a remote management function, such as IPMI (Intelligent Platform Management Interface), but it also creates an attack surface, as pointed out by US-CISA [16]. For edge devices managed by M2M, we should add a mechanism to detect abnormalities and recover autonomously.
We propose TEE-PA for provenance auditing on AI&IoT edge devices, which solves the problems mentioned above based on TEE (Trusted Execution Environment). TEE is a function of current CPUs that creates an execution environment isolated from the normal OS. Some implementations of TEE set up a trusted OS before the normal OS and work as a TCB. TEE-PA utilizes TEE for secure logging and for sending the log to a remote provenance auditing server. Secure logging from the TEE [17], [18], [19], [20] generally needs the help of the Linux kernel, because the kernel runs on the untrusted REE (Rich Execution Environment) and has a semantic gap from the TEE. To increase the trust in the Linux kernel, even if only slightly, TEE-PA offers a semi-TCB (Trusted Computing Base), which measures the kernel integrity check mechanism from the TEE at boot time. TEE-PA also implements autonomous recovery using a watchdog timer protected by the TEE, which is desired for M2M management.
This Paper's Contributions and Challenges:
1) Secure system call logging from kernel to TEE: The kernel communicates with the TEE, and the system call log is transferred directly; namely, the system call log is not exposed to user space. However, the kernel runs in the untrusted REE, so we develop a semi-TCB mechanism to treat the kernel as a trustable part.
2) Introduction of semi-TCB: TEE-PA cannot move the whole system call logging mechanism to the TEE (i.e., TCB), so TEE-PA introduces a semi-TCB to increase the trust in the Linux kernel. The idea is based on the measurement architecture of TPM (Trusted Platform Module), which passes the trust on to the measured software [11], [21], [22] and is referred to as a "chain of hash" or "chain of trust". However, the Linux kernel runs in another world (i.e., the untrusted world) and has self-modification mechanisms (e.g., kprobe, eBPF) that change its integrity dynamically. To solve this issue, TEE-PA utilizes a kernel integrity check mechanism (i.e., LKRG: Linux Kernel Runtime Guard), which can handle runtime modifications. The semi-TCB measures the integrity (i.e., hash) of LKRG from the TEE at boot time and sends the hash result to the remote provenance auditing server to confirm the mechanism, which works as a remote attestation. The semi-TCB cannot extend a true TCB to the kernel on REE, but it can increase the trust.
3) Confirming operational correctness with unpredictable heartbeat and autonomous recovery: An attacker may interfere with or stop the logging. The provenance auditing server periodically sends a heartbeat message to the AI&IoT edge device and confirms that logging is correct. The heartbeat message includes a random string, making it difficult for attackers to predict. If correct logging is not confirmed, the watchdog timer protected by the TEE is not reset, which causes a reboot to recover the device.
4) Implementation: The proposed TEE-PA is implemented on a Raspberry Pi3 with OP-TEE [23], a trusted OS on Arm TrustZone. For remote provenance auditing, we use SPADE [9] on a powerful server. Benchmarks showed adequate performance, and known attacks were detected, including stealthy attacks (fileless malware and shell command attacks).
The implementation shows that the first problem is solved by a powerful remote server, which is 19 times faster than on-board provenance auditing. The second problem is solved by sending system call logs from the Linux kernel to the remote provenance auditing server via the TEE. The implementation does not expose the logs to the untrusted user space. Comparisons with other studies are listed in Table 1 in Section IV. The third problem is solved by the semi-TCB and autonomous recovery. The semi-TCB extends the TEE's TCB to the untrusted kernel in the manner of TPM. The autonomous recovery is ensured by the reboot caused by the watchdog timer protected by the TEE and by periodic heartbeat extensions. The source code is available as open source.
The remainder of this paper is organized as follows. Section II describes the threat model and assumptions. Two fundamental techniques used by TEE-PA, provenance auditing and TEE, are introduced in Section III. Section IV introduces related work. Section V presents the design of TEE-PA, and Section VI describes the implementation details. Section VII shows the measured performance and security features on a real AI&IoT edge device. Section VIII discusses several topics, and Section IX concludes this paper.

II. THREAT MODEL AND ASSUMPTION
Our threat model is similar to those of [15] and [24], which are studies on detecting attacks against IoT devices. Previous works have treated falsification attacks as out of scope. However, on real IoT devices, the user space and even the kernel can be compromised, and an attacker may delete logs or modify the logging system. In TEE-PA, we assume that an attacker tries to compromise the logging system. This means that TEE-PA must maintain the correctness of the logging system. Even if the correctness is broken, TEE-PA detects the incorrectness and recovers the system autonomously.
TEE-PA targets AI&IoT edge devices that run complex applications, such as edge computers for federated learning. The prototype of TEE-PA is implemented on a Raspberry Pi3 (Arm Cortex-A type TrustZone) with OP-TEE [23] as the trusted OS. We assume there are no vulnerabilities in the TEE application or in the hardware TEE itself. Namely, known vulnerabilities (such as [25], [26]) are assumed to be handled, and known protections (such as [27], [28]) are considered to be applied. However, the current prototype has some hardware and software restrictions. TEE-PA assumes that the TEE protects a watchdog timer, but the Raspberry Pi3 allows access to it from both TEE and REE. The Raspberry Pi3 has no RoT (Root of Trust) hardware, such as an SE (Secure Element), and cannot offer valid remote attestation and secure storage. Although the Raspberry Pi3 has no secure boot, TEE-PA assumes the boot procedures are secure. Even if secure boot is not enabled, OP-TEE boots before Linux on REE, and it can act as a TCB after Linux boots.
The semi-TCB utilizes LKRG (Linux Kernel Runtime Guard) [30] on the REE side. LKRG protects the integrity of the Linux kernel, but it is not perfect. The LKRG developers document its limitations [31]. The semi-TCB accepts them.
We assume that communication between the TEE and a provenance auditing server is protected using TLS (Transport Layer Security), because an attacker could otherwise modify the system call logs to hide evidence of an exploit. We also assume that servers in the cloud are operated in a trusted environment and are outside the scope of the attack.
Provenance auditing is a security technique used for attack investigation [10], [12], [13], [32]. Provenance auditing represents the information flow of events executed in the system as a DAG (Directed Acyclic Graph), where nodes represent objects (e.g., process, file, socket) and edges represent relationships between objects (e.g., system calls such as open() and close()) [33]. To detect stealthy attacks, we must collect the system call log [14]. System call logs record events related to processes, files, and networks. To collect the system call log, we use the mechanism provided by the OS. For instance, Linux supports Linux Audit (or LA) [34], and Windows IoT Core supports ETW (Event Tracing for Windows) [35]. In this study, we collect system call logs by using LA.
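The graph model described above can be illustrated with a minimal sketch. The event tuples, node names, and the helper functions build_graph and ramification are hypothetical and chosen only for illustration; a real provenance system such as SPADE uses a much richer schema.

```python
from collections import defaultdict, deque

# Hypothetical, simplified events: (subject, system call, object).
events = [
    ("bash", "fork", "curl"),
    ("curl", "connect", "socket:203.0.113.5:80"),
    ("curl", "write", "/tmp/payload"),
    ("bash", "execve", "/tmp/payload"),
    ("sshd", "read", "/etc/passwd"),
]

def build_graph(events):
    """Map each node to its outgoing (system call, target) edges."""
    graph = defaultdict(list)
    for subject, syscall, obj in events:
        graph[subject].append((syscall, obj))
    return graph

def ramification(graph, start):
    """Return every node reachable from `start` using breadth-first search."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for _syscall, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen - {start}

graph = build_graph(events)
affected = ramification(graph, "curl")
```

Starting the traversal from a suspicious node yields the objects it may have influenced, which is the sense in which the DAG exposes the ramification of an attack.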
Methods for safely capturing system call logs, such as PASS [36], SPADE [9], Linux Provenance Module [11], and CamFlow [37], have been proposed. PROVDETECTOR [14] offers a method for detecting stealthy attacks on traditional computers using provenance auditing. However, these methods aim to collect system call logs on a traditional computer. They cannot be directly applied to AI&IoT edge devices because such devices do not have suitable computing resources (e.g., storage and memory). LA is known to generate a large number of logs on a daily basis [38], and thus it is difficult to store these logs on edge devices with limited storage capacity.
We use SPADE [9] to provide remote provenance auditing in a cloud environment. SPADE provides system call log collection as well as provenance auditing analysis and storage. SPADE is a cross-platform system that can track and analyze the origin of the collected system call log. For the collection of the system call log, SPADE provides reporters for each platform. Each reporter collects the system call logs and converts them into provenance semantics. Examples of reporters are the LA reporter for Linux and the ProcMon reporter for Windows. The SPADE server integrates the provenance semantics sent from these reporters. SPADE's query engine can query the stored provenance semantics and retrieve the graph's nodes and edges with values matching the query. In addition, the system administrator can search for child nodes and parent nodes of the specified graph nodes. SPADE's query engine can output the matched DAG as a file in DOT format. The administrator can determine the ramifications of the attack from the DAG.
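To give a feel for what a DAG in DOT format looks like, the following sketch serializes a few labeled edges as a Graphviz digraph. The function to_dot and the edge list are hypothetical; the attributes and layout that SPADE itself emits are not reproduced here.

```python
def to_dot(edges, name="provenance"):
    """Serialize (source, label, target) edges as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for src, label, dst in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical edges standing in for a query result.
matched_edges = [
    ("bash", "fork", "curl"),
    ("curl", "write", "/tmp/payload"),
]
dot_text = to_dot(matched_edges)
```

The resulting text can be rendered with any Graphviz-compatible viewer.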

B. TRUSTED EXECUTION ENVIRONMENT
TEE is an isolated execution environment provided by current CPU hardware. TEE has become a popular function, and some commercial CPUs are equipped with it (e.g., Arm TrustZone [28], Intel SGX [39], and AMD SEV [40]), as are research CPUs (e.g., RISC-V Keystone [41], [42], Sanctum [43], and TIMBER-V [44]). We use Arm TrustZone because Arm Cortex-A is a popular CPU for AI&IoT edge devices and TrustZone has a long track record in key management and digital rights management on Android smartphones [28], [45], [46]. TrustZone divides a CPU into two independent execution environments, the normal and secure worlds. The normal world is also called REE (Rich Execution Environment). A general-purpose OS such as Linux is executed in the normal world, and a trusted OS is executed in the secure world. One example of a trusted OS for Arm TrustZone is OP-TEE [23], developed and maintained by Linaro. Figure 1 shows the structure of OP-TEE. OP-TEE is based on GlobalPlatform's TEE Internal Core API [47] and TEE Client API [48]. The communication between OP-TEE and Linux is managed by the Secure Monitor, which runs at the highest exception level, EL3 (monitor mode). An application on OP-TEE, which is called a TA (Trusted Application), is generally launched and terminated by a CA (Client Application) on Linux using the TEE Client API, which is offered as the shared library libteec.so. (Caution: the TA launch style is revised by TEE-PA, which allows launching a TA from the Linux kernel for security. See the details in Section V-B.)
The launched TA executes an intended process using GlobalPlatform's TEE Internal API, which is offered as the static library libutee.a. In addition, a TA can write files and communicate over the network. These functions use Linux system calls. To run the system calls on Linux, OP-TEE utilizes a proxy process named TEE-Supplicant. A normal TA runs in user mode (EL0) on OP-TEE, which is not sufficient for a special TA that requires privileged mode (EL1) (e.g., to access memory on REE). OP-TEE offers a programming method for privileged mode, which is named PTA (Pseudo Trusted Application). A PTA is an interface of the OP-TEE kernel and is called from a TA. TEE-PA's semi-TCB is implemented using a PTA to get a kernel memory image on REE.
Because the CPU can execute only one world at a time, switching between the two worlds requires a special instruction. Arm Cortex-A performs world switching with the SMC (Secure Monitor Call) instruction. World switching causes a context switch, much like a system call between an application and the kernel, and frequent world switching harms system performance. (This is relevant to the optimization of TEE-PA because a world switch for each system call incurs high overhead. See the buffering optimization between the kernel and the TEE described in Section V-B.)

IV. RELATED WORK
This section introduces related work on secure logging and on introspection from TEE.

A. SECURE LOGGING
Many studies have been conducted to protect logs, and some of them utilize TEE. SGX-Log [17] proposes a logging system that uses Intel SGX to guarantee the integrity and confidentiality of logs. CUSTOS [18] is a practical framework that supports tamper-proof and distributed logging using Intel SGX. EmLog [19] proposes the first tamper-resistant logging method applied to devices with small computing resources, such as AI&IoT edge devices. EmLog uses GlobalPlatform's TEE and Arm TrustZone to store logs in secure storage. T-Log [20] proposes a method to extend the existing logging system to a distributed logging system. T-Log uses KP-ABE (Key-Policy Attribute-Based Encryption) [49] to control the access permissions to the logs.
Table 1 summarizes the features of logging systems that utilize TEE. The existing methods use TEE to protect the logs managed by user space processes (i.e., rsyslog, auditd, and syslogd), which are not the system call logging framework in the kernel itself. In general, SGX runs in ring 3 (user space) only and cannot protect the system call logging framework in the kernel. In contrast to existing methods, TEE-PA, based on Arm Cortex-A, enables kauditd in the Linux kernel to communicate with the TEE directly and transfer the system call logs; namely, the logs are not exposed to the user space. kauditd is the Linux kernel component that collects system call logs, and it is used by TEE-PA as well as by traditional provenance auditing. However, it runs in the kernel on REE, so TEE-PA has a semi-TCB to verify the kernel integrity from TEE and REE.

B. INTROSPECTION FROM TEE
Introspection is a popular technique to monitor system calls, and many hypervisors [54], [55], [56], [57] offer VMI (Virtual Machine Introspection). CPUs that have hardware support for implementing a hypervisor provide a mechanism to trap privileged instructions. This mechanism makes it easy to implement introspection because system calls are trapped with privileged instructions by the virtualization mechanism. Unfortunately, TEE cannot implement introspection easily because TEE cannot trap privileged instructions. However, some TEEs (e.g., Arm TrustZone) can access the memory region allocated for a kernel and analyze the kernel. Therefore, some researchers have tried to implement introspection on TEE. SPROBES [50], TZ-RKP [51], and SeCReT [52] offer active monitoring of kernel integrity from TrustZone, but they require customization of the kernel. KShot [53] provides a unique form of introspection using Intel SGX to implement a live kernel patch. SGX cannot access kernel memory, so KShot enables communication from SGX to Intel SMM (System Management Mode), which can access any memory, and thereby achieves indirect introspection. The SMM applies the kernel patch.
Table 2 summarizes the features of TEE introspections. The existing methods access the memory in REE and perform introspection from TEE. They require kernel customization to introspect the integrity. On the other hand, TEE-PA does not require kernel customization for introspection, although it requires kernel customization for sending system call logs to the TEE. The reason is that TEE-PA measures the kernel memory integrity at boot time and delegates the integrity responsibility to LKRG (Linux Kernel Runtime Guard) on REE. This implementation is named semi-TCB, and its idea is based on the measurement architecture of TPM (Trusted Platform Module) [11], [21], [22]. In addition, the semi-TCB tries to use the Linux kernel runtime modification mechanisms (e.g., kprobe, eBPF) because traditional integrity checks from TEE cannot handle runtime modification. The details are described in Section V-C.

V. DESIGN
To determine the ramifications of stealthy attacks on AI&IoT edge devices, we design TEE-PA, which enables remote provenance auditing using secure logging with TEE. TEE-PA also has an autonomous recovery mechanism protected by TEE. As shown in Figure 2(a), in the traditional style, kauditd in the Linux kernel and the provenance auditing in the user space run on the same computer. kauditd captures the system calls executed in the kernel based on Linux Audit (or LA) rules and records them as system call logs. Since the OS kernel and LA are treated as part of the TCB, it is assumed that LA itself will not be attacked. However, real attackers can edit the system call logs to confuse the person analyzing the attack [18]. If the system call log is tampered with or deleted by an attacker, the DAG generated by provenance auditing cannot be trusted.
Figure 3 shows the process flow of the components of LA, from an application to an LA log. When the application calls a system call, control moves to the kernel, and the syscall handler receives the request. The syscall handler passes the system call to the LA filter before calling the kernel service. If a system call matches an LA rule, the LA filter generates a system call log and adds it to the Kauditd Buffer (or KB). The KB is used because generating the system call log and writing it out synchronously would significantly increase latency [58].
The system call log contains information such as the type of LA record, timestamp, executed system call, and PID. The kernel thread (i.e., kauditd) picks up the system call log from the KB and sends it to auditd running in user space via Netlink. auditd writes the system call log to the file /var/log/audit/audit.log. The kernel then processes the system call and returns the results to the application. The system administrator can use the auditctl command to set the LA rules.
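As a rough illustration of the fields carried by such a record, the sketch below parses one line in the usual key=value layout of LA records. The sample line is made up, and the helper parse_record is a hypothetical name; real records contain many more fields and some hex-encoded values that this sketch does not decode.

```python
import re

# Made-up sample line in the key=value layout used by Linux Audit records.
SAMPLE = ('type=SYSCALL msg=audit(1700000000.123:456): arch=c00000b7 '
          'syscall=56 success=yes exit=3 pid=1234 uid=0 '
          'comm="cat" exe="/bin/cat"')

HEADER = re.compile(
    r'^type=(?P<type>\S+) msg=audit\((?P<ts>\d+\.\d+):(?P<serial>\d+)\):')
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')

def parse_record(line):
    """Split one audit record into its type, timestamp, serial, and fields."""
    header = HEADER.match(line)
    if header is None:
        raise ValueError("not an audit record")
    record = {
        "type": header.group("type"),
        "timestamp": float(header.group("ts")),
        "serial": int(header.group("serial")),
    }
    # Remaining key=value pairs; quoted values have their quotes stripped.
    for key, value in FIELD.findall(line[header.end():]):
        record[key] = value.strip('"')
    return record

record = parse_record(SAMPLE)
```

The record type, timestamp, system call number, and PID mentioned in the text correspond to the type, timestamp, syscall, and pid keys of the parsed dictionary.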
There are three challenges in applying provenance auditing to AI&IoT edge devices with TEE-PA. First, it is difficult for edge devices to perform provenance auditing on the device because they do not have enough computational resources. Second, it is not easy to assume that the OS kernel and LA are part of the TCB in current edge devices, because they run on the untrusted REE and tampering with or deleting the system call logs is not easy to detect. Third, since many edge devices are deployed in geographically distributed environments, they do not have system administrators and must be managed by autonomous M2M.
As shown in Figure 2(b), TEE-PA uses TEE to securely collect system call logs on AI&IoT edge devices and send them to a remote provenance auditing server to solve these problems. (1) To solve the first problem, we perform provenance auditing on a cloud server that is rich in computational resources. The system call logs are sent to the server via the kernel and the TEE with secure communication. This means the logs are not exposed to the user space, and hence not to attackers. (2) To solve the second problem, we developed a semi-TCB, which verifies the integrity of the kernel from TEE and REE at boot time. The measured Linux kernel includes LKRG (Linux Kernel Runtime Guard), which provides periodic self-protection. When LKRG detects an exploit, it causes a kernel panic. The semi-TCB follows a TPM-style chain of trust and trusts LKRG after the server confirms the measurement at boot time. (3) To solve the third problem, a remote server sends a heartbeat message to set a certain time limit on the TEE-protected watchdog timer (as in CIDER [59] and RO-IoT [60]). If the following heartbeat message is not sent within the set time, the watchdog timer reaches its time limit and causes a system reset. The heartbeat message includes a random string, making it difficult for attackers to predict. The random string is used by the log receiver to check that the IoT device is functioning correctly. If the heartbeat message is not included in the system call log, the remote server stops sending heartbeat messages. For example, if an attacker deletes or tampers with the log, the heartbeat message will no longer be included in the system call log.

B. LOG PROTECTION FROM KERNEL TO SERVER USING TEE
Figure 2(b) describes how to securely send the system call logs from kauditd in the kernel to a server in the cloud using TEE. There are three special components: TA-Collect in the TEE, TEE-Supplicant in the REE, and the log receiver on the cloud server. TA-Collect is launched from the kernel thread (i.e., kauditd), which is more secure than a TA launched from an application because the kernel is covered by the semi-TCB described in Section V-C. TA-Collect receives system call logs generated by the LA filter in the kernel and sends them to the log receiver via TEE-Supplicant. (TEE-Supplicant is a daemon for OP-TEE [23], the trusted OS in the TEE. It manages I/O handling in Linux user space.)
First, the LA filter adds the generated system call log to the Ring Buffer (or RB) in the kernel. The RB size is 15,000 bytes, and we assume that the RB is large enough to avoid loss of system call logs. If the LA filter passed the system call log to TA-Collect synchronously, the kernel would invoke the TA while the system call is being processed, which would increase the overhead. The RB is prepared separately from the KB to avoid conflicts with the existing KB and to prohibit resizing of the buffer. If kauditd is not scheduled on the CPU, the KB fills up with logs, and new log information may be lost [61]. TEE-PA needs to prepare buffers of the largest size to prevent the loss of logs as much as possible. In addition, the size of the KB can be easily changed using a user space command. To prevent an attacker from resizing the buffer and intentionally filling it completely, we prepare an RB that cannot be resized, separately from the KB. Next, when a certain number of system call logs are stored in the RB, kauditd pulls the system call logs out of the RB and creates a shared memory region accessible from both REE and TEE. Then, kauditd copies the system call logs from the RB to the shared memory.
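The buffering idea can be sketched as follows: a buffer whose capacity is fixed at construction collects logs, and a batch is handed over only when a threshold is reached, so that one world switch serves several logs. This is a user-space Python model, not kernel code; the class name BoundedLogBuffer and the threshold of four logs are assumptions, and only the 15,000-byte capacity comes from the text.

```python
class BoundedLogBuffer:
    """Fixed-capacity log buffer; a simplified stand-in for the RB."""

    def __init__(self, capacity_bytes=15000, flush_threshold=4):
        self._capacity = capacity_bytes    # fixed: no method changes it later
        self._threshold = flush_threshold  # hypothetical batch size
        self._entries = []
        self._used = 0
        self.dropped = 0

    def push(self, log: bytes) -> bool:
        """Store one log; count it as dropped if the buffer is full."""
        if self._used + len(log) > self._capacity:
            self.dropped += 1
            return False
        self._entries.append(log)
        self._used += len(log)
        return True

    def ready(self) -> bool:
        return len(self._entries) >= self._threshold

    def drain(self):
        """Hand over all buffered logs and empty the buffer."""
        batch, self._entries, self._used = self._entries, [], 0
        return batch

rb = BoundedLogBuffer(flush_threshold=4)
world_switches = 0
for i in range(10):
    rb.push(f"log-{i}".encode())
    if rb.ready():
        shared_memory = b"\n".join(rb.drain())  # stands in for REE/TEE shared memory
        world_switches += 1                      # one TA invocation per batch
```

With these assumed numbers, ten logs cause two batch transfers instead of ten individual invocations, and two logs remain buffered for the next batch.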
Next, kauditd invokes a TA-Collect process in the TEE and switches to the TEE. TA-Collect reads the system call log from the shared memory and sends it to the log receiver over TCP by calling the Socket API. OP-TEE realizes the network communication of a TA by borrowing the Linux network stack running in REE [62]. Specifically, OP-TEE makes an RPC (Remote Procedure Call) to TEE-Supplicant to execute the specified command on Linux. TEE-Supplicant works as a proxy and provides the TA with network communication functions. When TA-Collect completes sending the system call log, it switches back to REE, and kauditd verifies the return value, which indicates whether TA-Collect executed correctly.
Skilled attackers may stop the LA component to prevent the system call logs from being sent to the log receiver. They may also delete or tamper with the system call log before kauditd passes it to TA-Collect to delay the detection of the attack. We explain how to address these attacks in Section V-E.

C. SEMI-TCB
TEE-PA cannot transfer the whole system call logging mechanism from REE to TEE (i.e., TCB), so TEE-PA introduces a semi-TCB to increase the trust in the Linux kernel. It hands over the integrity check from TEE to REE. Transferring functions from REE to TEE resembles the TCB minimization of RT-TEE [63].
Figure 3 shows the architecture of LA. The dotted parts can move to the TEE because they are portable applications. However, the gray LA components cannot move because they are integrated components in the OS kernel running in REE. An attacker could compromise the LA components and delete or tamper with the logs before kauditd passes them to the TEE. Therefore, TEE-PA offers a semi-TCB mechanism to confirm the integrity of the Linux kernel from TEE and REE.
Figure 4 shows the architecture of the semi-TCB in TEE-PA, which has two parts: semi-TCB from TEE and semi-TCB from REE. As the semi-TCB from TEE, the hash of the Linux kernel's code region is taken by TA-Collect in the TEE and sent to the log receiver, which decides whether the kernel image is genuine. This mechanism works as a remote attestation. The measured kernel requires LKRG (Linux Kernel Runtime Guard) [30], which works as the semi-TCB from REE. However, the semi-TCB hands over the integrity check to the REE side.
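The measure-and-compare step can be sketched in a few lines. The text does not name the hash algorithm, so SHA-256 is an assumption here, as are the function names measure and verify and the placeholder bytes that stand in for the kernel code region.

```python
import hashlib
import hmac

def measure(code_region: bytes) -> str:
    """Device side: hash the kernel code region (SHA-256 is an assumption)."""
    return hashlib.sha256(code_region).hexdigest()

def verify(reported: str, reference: str) -> bool:
    """Server side: compare the reported hash with a known-good reference."""
    return hmac.compare_digest(reported, reference)

# Placeholder bytes standing in for a genuine and a modified kernel image.
genuine_image = b"\x90" * 64
tampered_image = genuine_image[:-1] + b"\xcc"

reference_hash = measure(genuine_image)  # recorded in advance by the server
genuine_ok = verify(measure(genuine_image), reference_hash)
tampered_ok = verify(measure(tampered_image), reference_hash)
```

A single changed byte in the measured region produces a different hash, so the server-side comparison fails for the modified image.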
LKRG performs periodic integrity checks on the running Linux kernel. In addition, LKRG offers rootkit detection and handles runtime kernel modifications (e.g., kprobe, eBPF), which are not easy to handle from the TEE because the TEE side cannot easily obtain information about runtime modifications. The security of kernel data (e.g., data in the shared memory between TEE and REE) is also assumed to be protected because LKRG handles kernel exploits. When LKRG detects an exploit, it causes a kernel panic and terminates processing. Junnila [64] evaluated the effectiveness of rootkit detection with LKRG.
The idea of transferring the integrity check from TEE to REE resembles TPM's measurement [11], [21], [22], because both take a binary hash and trust the behavior after execution. The measurement starts at power-on and maintains a chain of hashes, namely, a chain of trust. The semi-TCB also measures the kernel integrity, which includes LKRG, from the TEE at boot time. After that, the semi-TCB hands over the trust from TEE to REE in the style of TPM, which means that the semi-TCB trusts a self-protected Linux kernel with LKRG.
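The chain of hashes mentioned above can be illustrated with the TPM-style extend operation, in which a register is updated as the hash of its old value concatenated with the digest of the next component. The sketch below uses SHA-256 and hypothetical component names; it illustrates the general idea only and is not the measurement code of TEE-PA.

```python
import hashlib

def extend(register: bytes, component: bytes) -> bytes:
    """TPM-style extend: new = H(old || H(component))."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(register + digest).digest()

def measure_chain(components) -> bytes:
    """Fold the boot components, in order, into one register value."""
    register = bytes(32)  # the register starts as all zeros
    for component in components:
        register = extend(register, component)
    return register

# Hypothetical boot components, measured in boot order.
boot_order = [b"trusted-os", b"linux-kernel", b"lkrg-module"]
expected = measure_chain(boot_order)
modified = measure_chain([b"trusted-os", b"linux-kernel", b"lkrg-module-patched"])
reordered = measure_chain([b"linux-kernel", b"trusted-os", b"lkrg-module"])
```

Because each step depends on the previous register value, changing any component or the order of measurement changes the final value.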

D. LOG ANALYSIS IN A CLOUD
The log receiver passes the received logs to SPADE provenance auditing in the cloud. SPADE provides a function called a reporter as a mechanism to collect logs. In TEE-PA, logs must be collected on the edge device, where SPADE is not running, but the existing reporters cannot process logs from external devices in real time. We therefore need to modify an existing reporter to pass the system call logs collected by the log receiver to SPADE in real time. SPADE's query engine analyzes the collected provenance semantics. The query engine generates a DAG, which the system administrator can use to determine the ramification of the attack.

VOLUME 12, 2024
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

E. CONFIRMING OPERATIONAL CORRECTNESS WITH UNPREDICTABLE HEARTBEAT AND AUTONOMOUS RECOVERY
The original remote provenance auditing is a passive function and does not have a method to confirm the correctness of LA on the edge device. Therefore, the log receiver sends a heartbeat message to check the correctness. The heartbeat message also includes a mechanism to prevent abuse of the watchdog timer in the TEE. This mechanism resembles CIDER [59] and RO-IoT [60].
For each heartbeat message, the log receiver executes two types of tasks on the edge device via SSH every 15 seconds. The short time interval follows RO-IoT [60] and allows an immediate reaction to an attack. One task takes a random string as an argument, which is recorded in a log and is difficult for an attacker to predict. The other task extends the watchdog timer. If LA works correctly, the task is logged, and the system call log is sent back to SPADE. If the log receiver confirms the task execution in the log, the following heartbeat message is sent. If not, the following heartbeat message is not sent. In that case, the watchdog timer period expires, and the watchdog timer issues a system reset to reboot the device automatically.
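The interplay between the heartbeat and the watchdog timer can be modeled with a small simulation. Only the 15-second interval comes from the text; the 40-second watchdog timeout, the class Watchdog, the function simulate, and the parameter logging_ok_until are assumptions made for this sketch, and SSH and TEE access are replaced by plain function calls.

```python
import secrets

class Watchdog:
    """Countdown timer that requests a reset when it reaches zero."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.remaining = timeout
        self.reset_fired = False

    def extend(self):
        self.remaining = self.timeout

    def tick(self, seconds):
        self.remaining -= seconds
        if self.remaining <= 0:
            self.reset_fired = True

def simulate(rounds, logging_ok_until, interval=15, timeout=40):
    """Return the round in which the reset fires, or None if it never does."""
    watchdog = Watchdog(timeout)
    send_next = True
    for current_round in range(rounds):
        if send_next:
            nonce = secrets.token_hex(16)  # unpredictable random string
            watchdog.extend()              # the extension task
            # The device logs the nonce only while logging still works.
            log = f"heartbeat {nonce}" if current_round < logging_ok_until else ""
            send_next = nonce in log       # the server checks the received log
        watchdog.tick(interval)
        if watchdog.reset_fired:
            return current_round
    return None

healthy = simulate(rounds=10, logging_ok_until=10)
attacked = simulate(rounds=10, logging_ok_until=3)
```

In the attacked run, the nonce is missing from the log in round 3, no further heartbeat is sent, and the timer expires two rounds later, which triggers the reset.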
An attacker can interfere with the services running on the edge device by intentionally stopping the extension task and causing a system reset. However, given the increasing amount of malware that devises ways to operate persistently on edge devices [8], [65], we expect that attackers would not want their attacks interrupted by events such as reboots. For example, Mirai disables the watchdog timer to prevent restarts [66].
Fortunately, the extension task is special: the watchdog timer is protected by the TEE, so falsifying the extension task is not easy. When an attacker stops the extension task, the task is not logged, and the server can tell that the watchdog timer was not extended legitimately.
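The interaction between heartbeat confirmation and the watchdog timer can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the TEE-PA implementation: the names Watchdog and first_reset_time are hypothetical, time is simulated rather than measured, and each heartbeat is assumed to carry both the random-string task and the extension task.

```python
from typing import Iterable, Optional


class Watchdog:
    """Toy model of a countdown timer that resets the device when it expires."""

    def __init__(self, period_s: float, now: float = 0.0) -> None:
        self.period_s = period_s
        self.deadline = now + period_s

    def extend(self, now: float) -> None:
        # The extension task pushes the deadline one period into the future.
        self.deadline = now + self.period_s

    def expired(self, now: float) -> bool:
        return now > self.deadline


def first_reset_time(confirmed: Iterable[bool],
                     interval_s: float = 15.0,
                     period_s: float = 15.0) -> Optional[float]:
    """Return the simulated time at which the watchdog fires, or None.

    confirmed[i] tells whether the server found the random string of
    heartbeat i in the received logs. An entry is ignored when the
    corresponding heartbeat was never sent.
    """
    wdt = Watchdog(period_s)
    send_next = True  # the first heartbeat is always sent
    now = 0.0
    for ok in confirmed:
        if wdt.expired(now):
            return wdt.deadline
        if send_next:
            wdt.extend(now)   # extension task carried by the heartbeat
            send_next = ok    # the next heartbeat is sent only if confirmed
        now += interval_s
    return wdt.deadline if wdt.expired(now) else None
```

With every heartbeat confirmed, the function returns None over the simulated horizon. Once one confirmation fails, no further extension arrives and the function returns the deadline set by the last heartbeat that was sent.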

VI. IMPLEMENTATION
We have implemented TEE-PA on the Arm TrustZone of a Raspberry Pi3 Model B (Arm Cortex-A53, four cores, 1.2 GHz, 1 GB memory, 16 GB SD card, Linux OS). TEE-PA uses LA to record events executed in the REE on edge devices. TEE-PA also uses OP-TEE [23] in the TEE, i.e., Arm TrustZone, to collect the system call logs and send them to a server in the cloud. TEE-PA uses SPADE for provenance auditing in the cloud. We verified the functionality of TEE-PA on QEMU and evaluated the performance on the Raspberry Pi3.

A. COLLECT AND SEND LOGS
To send system call logs from kauditd to TA-Collect, we modify the audit_log_end() function, which adds the system call log generated by the kernel to the Kauditd Buffer (KB). Specifically, we added a mechanism to audit_log_end() that adds the system call log to the Ring Buffer (RB) and invokes TA-Collect to send the log to the TEE. The kauditd extracts the system call log from the RB and creates a shared memory region using tee_shm_alloc(). The kauditd then copies the system call log to the shared memory and invokes TA-Collect using tee_client_invoke_func(). If there is no session between kauditd and TA-Collect, the kauditd uses tee_client_open_session() to launch a TA-Collect process. The session between kauditd and TA-Collect is kept open until the OS reboots, which avoids the cost of re-creating the session every time the kauditd invokes TA-Collect.
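The control flow described above (buffer the record, open a session once, then invoke the TA for each record) can be modeled outside the kernel. The following Python model is only an illustration and is not kernel code: FakeTA and LogForwarder are hypothetical names, the bytearray stands in for the shared memory obtained with tee_shm_alloc(), and the drop-oldest behavior of the bounded deque is an assumption, since the overflow policy of the RB is not stated here.

```python
from collections import deque


class FakeTA:
    """Stand-in for TA-Collect: counts sessions and records delivered logs."""

    def __init__(self) -> None:
        self.sessions_opened = 0
        self.received = []

    def open_session(self) -> int:
        self.sessions_opened += 1
        return self.sessions_opened  # opaque session identifier

    def invoke(self, session: int, shared_memory: bytearray) -> None:
        self.received.append(bytes(shared_memory))


class LogForwarder:
    """Models the modified audit path: ring buffer plus a reused session."""

    def __init__(self, ta: FakeTA, capacity: int = 1024) -> None:
        self.ta = ta
        # Assumption: when the buffer is full, the oldest record is dropped.
        self.ring = deque(maxlen=capacity)
        self.session = None

    def audit_log_end(self, record: bytes) -> None:
        # Corresponds to appending the system call log to the RB.
        self.ring.append(record)
        self.drain()

    def drain(self) -> None:
        while self.ring:
            record = self.ring.popleft()
            if self.session is None:
                # Opened lazily and kept, mirroring the session reuse above.
                self.session = self.ta.open_session()
            shm = bytearray(record)  # copy into the "shared memory"
            self.ta.invoke(self.session, shm)
```

Forwarding several records through one LogForwarder opens exactly one session, which is the cost saving that keeping the session until reboot is meant to achieve.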
We developed TA-Collect using OP-TEE's TA-devkit. The communication between TA-Collect and a server is established via TCP. TA-Collect uses the Socket API v1.0 to perform socket communication. The URL of the server is embedded only in TA-Collect. When TLS is enabled, authentication and encryption are established between the server and TA-Collect.
We implemented the log receiver using Python 3.10. The log receiver uses multi-threading to handle the socket communication that receives system call logs from multiple edge devices.
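A multi-threaded receiver of this kind can be sketched with Python's standard socketserver module. This is a hedged sketch rather than the authors' code: the class names LogHandler and LogReceiver, the on_log callback, and the newline-delimited framing are all assumptions made for illustration.

```python
import socketserver
import threading


class LogHandler(socketserver.StreamRequestHandler):
    """Handles one edge device connection; runs in its own thread."""

    def handle(self) -> None:
        # Assumption: each log record arrives as one newline-terminated line.
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace").rstrip("\n")
            if line:
                self.server.on_log(self.client_address, line)


class LogReceiver(socketserver.ThreadingTCPServer):
    """TCP server that spawns one thread per connected edge device."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, on_log) -> None:
        super().__init__(address, LogHandler)
        self.on_log = on_log  # callback(client_address, line)


def start_receiver(address, on_log) -> LogReceiver:
    """Start the receiver in a background thread and return it."""
    server = LogReceiver(address, on_log)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A caller would run something like start_receiver(("0.0.0.0", 5000), handler), where the port number is arbitrary, and stop the server with shutdown() followed by server_close().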

B. SEMI-TCB
The semi-TCB is composed of Linux kernel integrity checks from the TEE and from the REE. The integrity check from the TEE is implemented as a PTA (Pseudo Trusted Application) of OP-TEE with the mbedTLS library. A PTA is part of the OP-TEE kernel and can access the Linux kernel memory in the REE. For security, OP-TEE does not allow an ordinary TA to access REE memory.
The PTA is called by TA-Collect and computes a SHA256 value of the Linux kernel memory image. The SHA256 value is sent to the log receiver as a remote attestation. When the kernel is genuine, the provenance auditing server accepts the event logs that follow.
LKRG (Linux Kernel Runtime Guard) is the self-integrity check of the Linux kernel in the REE. The TEE confirms the Linux kernel, which includes LKRG, at boot time. After that, the integrity check is handed over to LKRG, which works as the semi-TCB in the REE.
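On the server side, accepting logs only when the reported SHA256 value matches a known-good kernel measurement could look as follows. This is a minimal sketch under stated assumptions: the function names measure and kernel_is_genuine are hypothetical, the digest is assumed to arrive as a hexadecimal string, and how the reference values are provisioned is outside the scope of this sketch.

```python
import hashlib
import hmac
from typing import Iterable


def measure(image: bytes) -> str:
    """Return the SHA256 digest of a memory image as a hex string."""
    return hashlib.sha256(image).hexdigest()


def kernel_is_genuine(reported_hex: str, known_good: Iterable[str]) -> bool:
    """Check a reported digest against reference digests.

    hmac.compare_digest is used so the comparison time does not depend
    on where the strings differ.
    """
    reported = reported_hex.lower()
    return any(hmac.compare_digest(reported, ref.lower()) for ref in known_good)
```

The log receiver would call kernel_is_genuine() once per boot of a device and discard subsequent event logs from that device if the check fails.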

C. LOG ANALYSIS
We made two changes to SPADE to use it for TEE-PA. First, we implemented a UNIX socket server in the log receiver to pass the logs to the SPADE LA reporter, because the SPADE LA reporter uses UNIX domain sockets to collect logs. The log receiver opens a socket at /var/run/iot_events, and the SPADE LA reporter reads logs from this socket.
Second, the current SPADE reporter and query engine support only the system calls of the Intel x86 architecture. Since TEE-PA uses Arm CPUs, we added support for 64-bit Arm system calls to SPADE's reporter and query engine.
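The first change, serving received logs over a UNIX domain socket, can be sketched with the standard socket module. The sketch below rests on assumptions: open_log_socket and forward_logs are hypothetical names, a single reader is served, and records are written one per line; the actual wire format expected by the SPADE reporter is not specified here.

```python
import os
import socket
from typing import Iterable


def open_log_socket(path: str) -> socket.socket:
    """Create a listening UNIX domain socket at the given path."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file from a previous run
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(1)
    return srv


def forward_logs(srv: socket.socket, lines: Iterable[str]) -> None:
    """Accept one reader and stream the log lines to it."""
    conn, _ = srv.accept()
    with conn:
        for line in lines:
            conn.sendall(line.encode("utf-8") + b"\n")
```

In the setting described above, the path would be /var/run/iot_events, and the lines would come from the TCP side of the log receiver.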

D. RECOVERY WITH TEE PROTECTED WATCHDOG TIMER AND HEARTBEAT
To use the watchdog timer from the TEE, we must create a PTA that can access the hardware directly. We made PTA-WDT (Pseudo Trusted Application for WatchDog Timer), which sets the time on the watchdog timer protected by the TEE. We also made a CA in the REE, named Reset-CA, which invokes PTA-WDT. We use the Raspberry Pi3 watchdog timer. However, OP-TEE does not implement a driver for the Raspberry Pi3 watchdog timer, so the timer cannot be used from within the TEE as is. In addition, the specification of the Raspberry Pi3 watchdog timer has not been published. Therefore, we implemented PTA-WDT by referring to the Linux watchdog timer driver for Raspberry Pi1. To access the physical address of the hardware from the PTA, the OP-TEE kernel must map that address into memory with register_phys_mem() at system startup. We therefore added a function to the OP-TEE driver that maps the physical address of the watchdog timer with register_phys_mem() when the Trusted OS boots. When Reset-CA invokes PTA-WDT, PTA-WDT uses io_write32() to set the deadline on the watchdog timer and start its countdown. To prevent attackers from accessing the watchdog timer from the REE, we removed the watchdog timer driver from Linux.
An attacker who obtains root privileges may use Reset-CA to extend the watchdog timer. To prevent the attacker from abusing Reset-CA, we considered restricting the users who can execute Reset-CA and requiring password authentication to execute it. We also considered introducing MAC (Mandatory Access Control) using SELinux (Security-Enhanced Linux) to separate the users who can run Reset-CA.
We implemented the heartbeat message functionality in the log receiver. We use Paramiko [67] to send heartbeat messages from the log receiver to edge devices. The log receiver uses Python's secrets.token_hex(16) function to generate strings from 16 random bytes (32 hexadecimal characters), which makes them difficult for an attacker to predict and thus makes it difficult for the attacker to control rebooting. The random string is used as the file name argument of the touch command. The touch command issues the execve system call with the random string, and the log receiver confirms the resulting log. In the current implementation, the maximum time that can be set for the hardware watchdog timer is 15 seconds, which is the hardware limit of the Raspberry Pi3 and is adequate to react immediately to an attack.
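The token generation and log confirmation steps can be sketched as follows. This is an illustrative sketch, not the authors' log receiver: make_heartbeat, confirmed_in_logs, and heartbeat_round are hypothetical names, the /tmp directory is an assumption, and the SSH transport is passed in as a callable (for example, a thin wrapper around Paramiko's SSHClient.exec_command) so that the logic stays independent of the network. The watchdog extension command is omitted because its command line is not given here.

```python
import secrets
import shlex
from typing import Callable, Iterable, Tuple


def make_heartbeat(directory: str = "/tmp") -> Tuple[str, str]:
    """Return (token, shell command) for one heartbeat.

    secrets.token_hex(16) draws 16 random bytes and returns them as
    32 hexadecimal characters.
    """
    token = secrets.token_hex(16)
    command = "touch " + shlex.quote(f"{directory}/{token}")
    return token, command


def confirmed_in_logs(token: str, log_lines: Iterable[str]) -> bool:
    """True if the random token appears in any received log line."""
    return any(token in line for line in log_lines)


def heartbeat_round(run_remote: Callable[[str], None],
                    fetch_logs: Callable[[], Iterable[str]]) -> bool:
    """Run one heartbeat and report whether the device logged it.

    run_remote executes a command on the edge device; fetch_logs returns
    the system call logs received since the command was sent. Both are
    supplied by the caller.
    """
    token, command = make_heartbeat()
    run_remote(command)
    return confirmed_in_logs(token, fetch_logs())
```

The caller sends the next heartbeat only when heartbeat_round() returns True; otherwise it stops, and the watchdog timer on the device is left to expire.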

VII. EVALUATION
We evaluated the performance and security features of TEE-PA on a Raspberry Pi3. The server machine for the SPADE provenance auditing is an AMD Ryzen 2700X at 3.70 GHz with 20 GB of memory. SPADE runs on Ubuntu 18.04 on Hyper-V on the server. The evaluations were carried out alongside other processes, so the logs include some extra entries.

A. SEMI-TCB
We measured the time to take a SHA256 value of the Linux kernel memory image from the TEE (namely, from the PTA). The performance is compared with the time to take a SHA256 value of the same memory size (4 MB) in a Linux application with the mbedTLS library. The difference shows the overhead of memory access from the TEE to the REE in the PTA.
The results were 520 milliseconds in the PTA and 24 milliseconds in the application, so the PTA is about 21 times slower. However, the kernel memory is measured only once at boot time, so the measurement does not affect the total overhead.
We also confirmed that LKRG caused a kernel panic when the kernel memory image was changed from the TEE. This is evidence of the runtime integrity check in the REE.

B. SYSTEM PERFORMANCE
We used UnixBench to measure the impact of TEE-PA on CPU performance. UnixBench ran in three different settings. First, Vanilla is an unmodified Linux with all Linux security modules turned off, which corresponds to the baseline. Second, LA is configured with LA rules to audit and log specific system calls. Third, TEE-PA audits the same specific system calls as LA, but with the TEE, and sends them to a server in the cloud.
Table 3 presents the benchmark results. The first three columns show the INDEX score of UnixBench for each setting; the higher, the better. The fourth column indicates the number of system calls per second made by UnixBench on TEE-PA. The fifth and sixth columns show the rates of performance degradation of TEE-PA relative to Vanilla and LA, respectively. The result lines are ordered by the fourth column, the number of system calls per second on TEE-PA. In general, the fifth column results (TEE-PA degradation from Vanilla) get worse as the number of system calls per second increases. Some lines do not follow this tendency because the processing times of the system calls differ.
TEE-PA is based on LA, and most of the performance degradation is caused by LA, which is an original Linux function. The fifth column shows the evidence. When the number of system calls per second is small, the performance of TEE-PA is almost the same as that of LA. However, when the number exceeds 1,000, the overhead of TEE-PA becomes visible. In the worst case, the performance of TEE-PA is about half that of LA. However, these are stress tests, and the performance degradation is acceptable under standard usage.

C. EFFECTS ON TYPICAL COMMAND
We measured the effects of TEE-PA on typical commands. The commands issue different system calls at different rates, so the impact of TEE-PA also differs between them.
Table 4 shows the performance results, each of which is the average of 100 executions. The first two columns show the execution time on Vanilla and TEE-PA. The following three columns show the number of system calls, the log size, and the overhead of TEE-PA relative to Vanilla. The result lines are ordered by the first column, the execution time on Vanilla. The results show that the log size is, in general, proportional to the number of system calls. However, the log size also depends on the message size of each system call, which produces some irregular results. Among them, the log of the ps command stands out because ps opens the /proc/PID files. Its overhead therefore also stands out (the highest overhead is 414.1). The overhead depends on the rate of system calls during the execution time. A command with a short execution time is easily affected by TEE-PA and shows a high overhead. Conversely, a low rate of system calls per execution time yields a low overhead, and the lowest overhead is 2.3 for the dd command.
An attacker may notice the difference in execution time and detect the presence of a security mechanism. However, the attacker cannot modify or stop TEE-PA because it is protected by the TEE.

D. EVALUATION OF ATTACK TRACKING
We recreated four attacks [4], [5], [68] on edge devices and confirmed that TEE-PA could determine the ramifications of these attacks. Table 5 shows the details of the attacks: two widespread AI&IoT edge attacks (Lightaidra [69] and Mirai [70]) and two stealthy IoT attacks (fileless malware [3] and shell command attacks [5]). We checked whether SPADE on the server could build the DAG and determine the ramifications of the attacks. In addition, we measured the resources used for each detection.
Table 6 shows the results. The second column shows that SPADE determines all ramifications of the attacks, even when the attacks are stealthy. The following columns show the system call log size, the memory used, and the elapsed time of SPADE on the AMD Ryzen 2700X at 3.70 GHz.
In terms of memory, the log size is relatively small, although it depends on the attack scenario (i.e., Mirai). However, SPADE must build the DAG from the log, which uses plenty of memory. Table 6 shows that more than 1.3 GB of memory is used even when the log size is less than 1 MB. This memory size is heavy for edge devices, so the remote provenance auditing of TEE-PA is appropriate for the AI&IoT edge.
In terms of CPU power, the remote provenance auditing of TEE-PA is also required. For example, SPADE takes 15 minutes on the server to determine the ramifications of Mirai. If a Raspberry Pi3 (Arm Cortex-A53, 1.2 GHz, UnixBench System INDEX Score 63.3) ran this process with enough memory, it is estimated to take about 5 hours, because the UnixBench results show that it has 1/19 of the performance of the server (AMD Ryzen 2700X at 3.70 GHz, UnixBench System INDEX Score 1,215.7).
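The estimate follows from simple scaling, which can be checked with a few lines of Python. The sketch assumes, as the text does, that processing time scales inversely with the UnixBench System INDEX Score; the function names are hypothetical.

```python
def speed_ratio(server_score: float, device_score: float) -> float:
    """How many times faster the server is, by UnixBench INDEX score."""
    return server_score / device_score


def estimated_device_hours(server_minutes: float, ratio: float) -> float:
    """Scale the server processing time to the device, in hours."""
    return server_minutes * ratio / 60.0


# Scores and processing time quoted in the text above.
ratio = speed_ratio(1215.7, 63.3)
hours = estimated_device_hours(15.0, ratio)
print(f"ratio = {ratio:.1f}, estimated time = {hours:.1f} hours")
```

The ratio is roughly 19 and the estimate is slightly under 5 hours, which matches the figures quoted above.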

E. NETWORK OVERHEAD
Heartbeat messages are sent to the client from the log receiver every 15 seconds using SSH. A message includes a touch command with a random string generated from 16 random bytes and a watchdog extension command. The overhead is considered negligible.
System call logs are sent to the log receiver from TA-Collect via the TEE Supplicant. The URL of the server is embedded in TA-Collect, and the logs are sent by HTTP POST using the socket API of OP-TEE. The content volume depends on the cases in Tables 4 and 6. TLS will be added to the protocol for security; measuring the total overhead is future work.

F. SECURITY ANALYSIS FOR TEE-PA
To analyze the vulnerabilities of TEE-PA, we performed a security analysis based on the STRIDE model [71], [72].
(i) Spoofing: TEE-PA has no target to masquerade as and does not address spoofing.
(ii) Tampering: The logs are the target of modification. Therefore, TEE-PA is designed so that the logs stay within the TCB or the semi-TCB, namely, the Linux kernel. The communications are assumed to be secure.
(iii) Repudiation: The functioning of TEE-PA is confirmed by heartbeat messages.
(iv) Information disclosure: TEE-PA does not expose the logs to user space. The communication is assumed to be encrypted and does not disclose the logs.
(v) Denial of Service (DoS): TEE-PA is designed to analyze the provenance of an attack on an IoT device and cannot protect against DoS attacks. If TEE-PA suffers a DoS attack, the CPU load exceeds the limit, and a heartbeat message cannot be handled normally, so the device will cause a system reset. The reset mechanism is useful here because the device does not hang from CPU overloading. It reboots and works correctly again, although TEE-PA has no countermeasure against DoS attacks.
(vi) Elevation of privilege: TEE-PA is implemented in the TEE and the Linux kernel, which are privileged modes protected by the TCB and the semi-TCB. Unfortunately, the current implementation on the Raspberry Pi does not support secure boot, so this paper's threat model assumes that the boot process is secure.

A. REMOTE ATTESTATION OF SEMI-TCB
The semi-TCB takes a SHA256 value of the Linux kernel memory image and reports it to the provenance auditing server to check the correctness. It works as a remote attestation that guarantees the authenticity of the AI&IoT edge device. However, it is not based on a trust anchor protected by a hardware RoT (Root of Trust) and is therefore not a valid remote attestation.
The original Arm Cortex-A TrustZone does not support remote attestation, although other TEE implementations have the feature (e.g., Intel SGX and AMD SEV). Fortunately, the next security design based on Arm's PSA (Platform Security Architecture) has a remote attestation mechanism, and a general remote attestation protocol is being discussed at the IETF [73]. If the idea of the semi-TCB can be integrated into remote attestation, the scope of the TCB extends to the kernel. We hope that many edge devices will be equipped with this mechanism and become more secure.

B. APPLYING TEE-PA TO TINY IOT DEVICES
TEE-PA is mainly designed for AI&IoT edge devices and assumes that the CPU can run Linux. As the next step, we plan to apply TEE-PA to tiny IoT devices with limited computing resources (e.g., devices with Arm Cortex-M and an embedded OS). Even if IoT devices are tiny, the need to detect stealthy attacks does not decrease. Fortunately, Arm Cortex-M has TrustZone, but its architecture differs from that of Cortex-A. We will need to customize TEE-PA for Cortex-M's TrustZone.
In addition, CPU power consumption and log size depend on the categories of system calls captured by the auditing tool. If we reduce the number of system calls to be audited, the logs may not contain the information needed for detection. We need to decide which types of system calls to audit according to the acceptable performance degradation of IoT devices. This topic is our future work.

C. RESEARCH FOR LIMITING SYSTEM CALLS
TEE-PA detects stealthy attacks based on system call logs. On the other hand, there are studies that limit system calls to reduce the attack surface [74], [75], [76]. The Linux kernel has the SECCOMP mechanism to limit the system calls used by an application, and chestnut [77], [78] automatically generates a SECCOMP filter for an application. TEE-PA (i.e., provenance auditing) is a retroactive security measure and does not need to change the applications. If TEE-PA can be combined with these studies on limiting system calls, anomaly detection will be much easier. Furthermore, the purpose of an AI&IoT edge device is generally limited, so the number of applications will be small. The combination of system call reduction and application reduction will reduce the logging data and increase TEE-PA performance. Our next research target is the combination of TEE-PA and these studies.

IX. CONCLUSION
We propose TEE-PA to detect stealthy attacks on AI&IoT edge devices. TEE-PA is a system that enables remote provenance auditing using a TEE (Trusted Execution Environment). TEE-PA passes system call logs from the kernel to the TEE directly and securely. The security is guaranteed by the semi-TCB, which combines integrity checks from the TEE and the REE. In addition, the log is sent from the TEE to the remote provenance auditing on a powerful server. TEE-PA uses periodic and unpredictable heartbeat messages to detect problems in edge devices. If a problem is detected, TEE-PA causes a system reset with a watchdog timer protected by the TEE, which works as autonomous recovery. We evaluated TEE-PA and showed that the DAG generated by provenance auditing could determine the ramifications of stealthy attacks. We also measured the overheads and showed that remote provenance auditing in the style of TEE-PA can utilize a high-performance server (the evaluation estimated a 19-fold difference), which is appropriate for AI&IoT edge devices.

FIGURE 1. The structure of OP-TEE on Arm TrustZone.

FIGURE 2. The difference between (a) traditional provenance auditing for a personal computer and (b) TEE-PA for an AI&IoT edge device and cloud. The intensity of the gray color indicates the strength of the security. The yellow and orange indicate the essential functions (kauditd and SPADE) that are not changed. KB: Kauditd Buffer; RB: Ring Buffer.
Figure 2(a) shows the traditional provenance auditing (i.e., SPADE) on a personal computer, and Figure 2(b) shows a preliminary design of TEE-PA on an AI&IoT edge device and cloud. The darkness of the gray parts indicates the level of security. The common and essential functions are colored yellow and orange (i.e., kauditd and SPADE [9]).

FIGURE 3. The architecture of Linux Audit (LA) [34]. The part in the dotted rectangle moves to the TEE or the server. The gray parts cannot move and are treated as the semi-TCB. LA: Linux Audit; KB: Kauditd Buffer.

FIGURE 4. The architecture of semi-TCB from TEE and REE.

TABLE 2. A comparison of existing introspection methods that use TEE. (* TEE-PA requires kernel customization for secure logging but does not require it for introspection.)

TABLE 3. UnixBench results on Vanilla, Linux Audit (LA), and TEE-PA. The table shows the UnixBench score, the number of system calls per second on TEE-PA, and the rate of performance degradation from Vanilla and LA.

TABLE 4. Command execution time on Vanilla and TEE-PA and the impacts of TEE-PA. The impacts are measured by the number of system calls, the log size taken by TEE-PA, and the rate of overhead of TEE-PA relative to Vanilla execution.

TABLE 6. Detection of attacks and used resources on SPADE. The used resources are the log data size sent by TEE-PA and the memory and processing time used by SPADE on a server.