Code Layering for the Detection of Network Covert Channels in Agentless Systems

The growing interest in agentless and serverless environments for the implementation of virtual/container network functions makes monitoring and inspection of network services challenging tasks. A major requirement concerns the agility of deploying security agents at runtime, especially to effectively address emerging and advanced attack patterns. This work investigates a framework leveraging the extended Berkeley Packet Filter to create ad-hoc security layers in virtualized architectures without the need of embedding additional agents. To prove the effectiveness of the approach, we focus on the detection of network covert channels, i.e., hidden/parasitic network conversations difficult to spot with legacy mechanisms. Experimental results demonstrate that different types of covert channels can be revealed with a good accuracy while using limited resources compared to existing cybersecurity tools (i.e., Zeek and libpcap).

is easier than containerizing the software (e.g., due to the lack of kernel acceleration). Monitoring and inspection for security purposes is more difficult as well, especially for immutable software images that cannot be modified at runtime.
Cloud-native cybersecurity platforms usually provide proactive controls at deployment time on the integrity and safety of the software. Yet, monitoring, inspection, and tracing remain three crucial requirements for the telco-grade transition to Platform-as-a-Service (PaaS), especially to detect and mitigate attacks at the network boundary [3]. To this aim, in this paper we explore the concept of code layering to instrument VNF/CNF entities with monitoring and inspection capabilities. We leverage the extended Berkeley Packet Filter (eBPF), a framework that allows the run-time injection of code in the Linux kernel. Though it was originally conceived for monitoring system performance, eBPF has been increasingly adopted to build network functions [4] and gain network insights. The framework has also been ported to Windows, and it is currently supported by Facebook, Google, Isovalent, Microsoft, and Netflix. To meet the typical demand for safe, immutable, and certified software images for telco-grade services, we propose a framework for the management of a broad class of eBPF programs. Our approach goes in the direction of agentless systems in order to guarantee the ability to address challenging and emerging security threats. As a paradigmatic example, we investigate the detection of network covert channels, i.e., parasitic communications cloaked in innocent-looking network activities [5], [6]. For instance, covert channels can be used to exfiltrate personal information, orchestrate nodes of a botnet, or implement multi-stage loading architectures to extend malware functionalities [7]. Since modern intrusion detection systems have major drawbacks when handling IPv6 traffic and can seldom detect covert channels out of the box [13], [14], assessing such a class of threats is of prime importance. Besides, the widespread adoption of IoT and industrial control systems requires flexible mechanisms against timing channels [15].
Unfortunately, embedding detection capabilities in resource-constrained devices is extremely challenging, which suggests addressing such threats within VNFs.
There are virtually unlimited opportunities to implement covert channels by altering protocol headers or packet timings, thus making their detection an open research question [5], [7]-[10]. Specifically, a comprehensive and general solution to address covert channels would require continuously adapting inspection processes to new protocols and hiding patterns, which is almost unfeasible with static agents in a conventional security framework. Our framework allows running a rich set of eBPF programs for gathering condensed statistics on header fields and timings, which can be further processed and combined with additional data to spot the presence of covert channels.
In this perspective, the contributions of this work are:
• a framework for the inspection of virtualized systems without the need of instrumenting VNF/CNF images or deploying additional sidecar containers;
• a scalable and privacy-preserving method to spot covert communications in protocol headers, with specific focus on IPv6;
• an analysis of code-layering schemes to detect timing channels via well-known techniques [12];
• an extensive vis-à-vis comparison among the proposed code layering approach and de-facto standard tools, i.e., Zeek and libpcap.
We also point out that our investigation utilizes real traffic traces, differently from other works only focusing on theoretical analysis or data obtained in experimental setups (see, e.g., [14] and [16]).
Compared to the preliminary work [11], this paper has the following improvements: a broader scope, which also includes timing channels in addition to storage channels; extensive sensitivity and performance analyses to evaluate the detection, the resource consumption and the impact on packet processing; comparisons with de-facto standard tools for network monitoring; an architecture for monitoring services exploiting network virtualization in PaaS/serverless environments.
The rest of the paper is structured as follows. Section II showcases the reference architecture, Section III introduces the threat model and covert channels, while Section IV describes the experimental setup. Section V discusses the detection of storage covert channels, whereas Section VI considers timing channels. Section VII evaluates the performance of our approach compared to other tools and Section VIII reviews the related literature. Lastly, Section IX concludes the work.

II. REFERENCE ARCHITECTURE
Code layering is a technique that stratifies the software into a number of functional layers, which can be modified in an independent manner. This makes it possible to perform changes without having to re-build and re-deploy the whole software infrastructure. Such a property is highly desirable, since the disruption of a running service is an unacceptable practice for telco-grade operations. To this aim, our approach exploits the eBPF technology to implement low-level inspection and tracing operations at run-time in both conventional and PaaS/serverless environments, with negligible impact on service continuity. Figure 1 depicts the reference layered architecture of the proposed framework for monitoring and detection purposes.
Figure 1: Reference layered architecture for the agentless monitoring and detection of various threats.
The Inspection Layer is located in kernel space and contains various eBPF programs implementing simple monitoring and inspection tasks. It is explicitly designed to run multiple eBPF programs without the need of changing the guest OS. The Inspection Layer offers functionalities for parsing protocol headers, recording inter-arrival times, as well as for creating custom statistics. In general, an eBPF program should be simple and with a reduced footprint, especially in terms of maximum number of instructions. Moreover, it should be "safe", e.g., it must be loop-free and must not access memory out of bounds. In fact, when inspecting network traffic, an eBPF program is triggered at the reception of each packet, thus resource-intensive behaviors could lead to hangs or scalability issues. Therefore, eBPF programs are preliminarily verified and then executed via a virtual machine implemented as a part of the eBPF Runtime. Interaction with eBPF programs (including management operations and data exchange) is possible through a specific Kernel API.
The Management Layer runs in user space and represents a sort of middleware entity responsible for loading/unloading eBPF programs and collecting their data. To support the broadest range of inspection and monitoring tasks without having to perform changes, it should be loosely coupled with the data structures used by eBPF programs to collect and store information. Indeed, the Management Layer is the most critical block for building an agentless system, because it is expected to collect generic data without any a-priori knowledge of their structure. For instance, tools using eBPF such as Cilium and Suricata put tight constraints on data structures, hence jeopardizing the possibility to shape the inspection tasks to evolving threats and attacks. Nevertheless, there are also some examples of monitoring services that allow the collection and creation of custom metrics from generic eBPF programs, see, e.g., the dynamic network monitoring service of Polycube. Such a design choice allows including this layer in closed-source, verified, and certified software images of VNFs/CNFs or of the hosting infrastructure without precluding the possibility to collect additional or different measures at run-time. As said, interactions with eBPF programs can be carried out through the Kernel API. However, it is also possible to exploit higher-layer eBPF libraries, which can include bindings for many languages, e.g., C, Python, Go, and Lua.
Finally, the Detection Layer entails specific algorithms running in user space to reveal and mitigate various threats and attacks. Algorithms implemented in this layer are not strictly part of standard security agents, since most security information and event management architectures deploy them in a remote centralized location. The Detection Layer can be used to engineer a wide range of security tasks.
As possible examples of services using eBPF, we mention: tracking traffic with a per-flow granularity with a reduced footprint [18], identification of processes or nodes contacting malicious servers without degrading the performance of the inspected traffic/processes [19], and support of deep packet inspection operations [3]. For the case of hidden communications, this layer can be used to detect network covert channels as well as processes or threads locally leaking data [17].

III. THREAT MODEL
The threat model considered in this work deals with two endpoints trying to remotely communicate in a cloaked manner. This template is usually exploited by an attacker wanting to exchange data from the host/device of the victim towards a remote Command & Control (C&C) server while avoiding detection or blockages. The attacker can deliver malware (e.g., via phishing) and then use the covert channel to exfiltrate sensitive information, orchestrate nodes of a botnet, implement multi-stage loading architectures to extend offensive functionalities at runtime, or bypass firewalls or filtering rules [7], [20]. To this aim, the covert sender hides information by altering an overt traffic flow and creates a cloaked communication path. By pre-sharing a hiding mechanism, the covert receiver can then extract the secret data. The overt traffic flow could be generated directly by the attacker co-located within the end node(s) or altered in a Man-in-the-Middle fashion. The process of hiding data should not disrupt the overt traffic or cause (too many) visible alterations, otherwise the hidden communication attempts would be spotted. The properties of a covert channel are usually tightly coupled. For instance, the higher the throughput of the covert communication, the higher the chance of revealing its presence due to alterations [9]. Two major classes of network covert channels exist, as depicted in Figure 2 (see [8] for a fine-grained taxonomy).
The first group consists of storage channels, which are created by directly hiding information in header fields, altering the structure of the packet, overwriting padding bits, or by re-arranging optional fields, just to mention the most popular techniques. Literature abounds in works exploring how to inject secret data in the TCP/IP suite [5], [8]. However, the use of IPv6 has been partially neglected and, with its increasing diffusion, it is expected to become a major target for covert communications in the future [17]. Therefore, in this work we consider the most effective storage network covert channels exploiting IPv6 traffic, especially those targeting the Traffic Class, Flow Label, and Hop Limit [16]. For the case of Traffic Class and Flow Label, we consider an attacker directly writing data within such fields. Instead, for the case of the Hop Limit, the secret is encoded by introducing a pre-shared offset between two consecutive values to encode '1' or '0'. In Section V we will mostly concentrate on revealing channels in the Flow Label since it offers more space to embed secrets (i.e., 20 bits compared to the 8 bits of the Traffic Class and 1 bit of the Hop Limit value modulation). Moreover, Quality of Service is often enforced in border routers causing the disruption of the secret hidden in the Traffic Class as well as its detection owing to the presence of anomalous values. Similar considerations can be drawn for the case of the Hop Limit, especially for modern networks engineered via fewer but longer links, thus reducing the range of values for the field and making the presence of arbitrary values easier to spot. Therefore, the Traffic Class and Hop Limit will be briefly addressed in Section V-C.
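To make the targeted fields concrete, the following minimal Python sketch extracts the Traffic Class, Flow Label, and Hop Limit from a raw IPv6 fixed header. The field layout follows the standard IPv6 header format; the function name `parse_ipv6_fields` and the sample values are hypothetical and only serve as an illustration, i.e., this is not the parsing code used by the eBPF programs of the framework.

```python
import struct


def parse_ipv6_fields(header: bytes) -> dict:
    """Extract the fields abused by the storage channels from a raw IPv6 header."""
    if len(header) < 40:
        raise ValueError("an IPv6 fixed header is 40 bytes long")
    # First 32-bit word: Version (4 bits) | Traffic Class (8 bits) | Flow Label (20 bits)
    (first_word,) = struct.unpack("!I", header[:4])
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    return {
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "hop_limit": header[7],  # eighth byte of the fixed header
    }


# Hypothetical header: Traffic Class 0x2E, Flow Label 0x00151, Hop Limit 64.
# Payload length 0 and next header 6 (TCP) are arbitrary; addresses are zeroed.
first_word = (6 << 28) | (0x2E << 20) | 0x00151
sample = struct.pack("!IHBB", first_word, 0, 6, 64) + bytes(32)
fields = parse_ipv6_fields(sample)
print(fields)
```

The sketch also shows why the Flow Label is the most capacious target: it occupies 20 of the first 32 bits of the header, whereas the Traffic Class and the Hop Limit are 8-bit fields.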
The second group of covert channels consists of timing channels, which are created by encoding secret data through suitable alterations of the temporal evolution of network traffic. Possible encoding schemes are based upon the alteration of the throughput, the introduction of statistical signatures in the jitter, or the manipulation of the inter-packet time. Usually, timing channels are protocol-agnostic and mainly implemented at the network layer or by altering the error rates characterizing the data link [12], [21]. Since we are interested in covert channels with an Internet-wide scope, in Section VI we will address timing channels exploiting the alteration of the time gap between consecutive datagrams. Compared to storage channels, the detection of timing channels has been investigated more extensively and consistently [5], [21]. Thus, we will resort to a known approach instead of proposing novel mechanisms.

IV. EXPERIMENTAL SETUP
For the sake of evaluating code layering for the detection of network covert channels, we developed the reference implementation depicted in Figure 3, which is composed as follows:
• Inspection Layer: it contains a set of eBPF programs that can create statistics on the usage of header fields and packet inter-arrival times for both IPv4/v6 traffic. Programs collecting data to address storage channels are based on a modified version of bccstego, i.e., a suite of tools able to generate filters for inspecting network and higher-level protocols like TCP/UDP [22]. Instead, to address timing channels, we created a novel eBPF program collecting time information and implementing the approach presented in [12] within the kernel;
• Management Layer: to load and unload eBPF programs as well as to collect measures, we implemented ad-hoc scripts and user-land utilities taking advantage of the BPF Compiler Collection (BCC) library. This layer also includes functionalities for setting runtime parameters;
• Detection Layer: to spot storage covert channels, we developed a method based on "condensed" statistical indicators, e.g., the frequency/number of values for a specific field provided by the Inspection Layer. Instead, for the case of timing channels, we simply consider regularity metrics presented in [12]. Details on the detection methodology will be provided in Section V and Section VI, respectively.
Concerning the threat model, we considered malicious endpoints communicating through several types of network covert channels in different scenarios targeting large traffic aggregates. To run tests, the communicating peers have been implemented via two virtual machines running Debian GNU/Linux 10 (kernel 4.20.9), with 1 virtual core and 4 GB of RAM. A third virtual machine with the same characteristics has been deployed to route and inspect traffic as well as to implement the code layering approach depicted in Figure 3.
In our trials, the various eBPF programs have been attached to the output queue, thus inspecting the egress traffic. However, this does not lead to a loss of generality, since our implementation can also handle programs attached to the input queue without any meaningful difference in terms of performance. For the sake of comparison, the intermediate node has also been used to run a modified version of Zeek and a pure user-space tool for gathering data with libpcap. To run the virtual machines, a host with a 3.60 GHz Intel i9-9900KF CPU, 32 GB of RAM and Ubuntu 20.04 (Linux kernel 5.8.0) has been used. In all trials, to quantify the footprints in terms of CPU and memory, we used pidstat, which is part of the sysstat collection. Apart from the eBPF programs, which are written in ANSI C, we used Python to implement loading functionalities, the various user-space daemons, as well as supporting tools for gathering and analyzing the obtained data.
To conduct tests in realistic network conditions, we used traffic collected on an OC192 link in different conditions/periods made available by the Center for Applied Internet Data Analysis (CAIDA).⁸ Without loss of generality and to prevent burdening our trials, we removed packets with a Flow Label value equal to 0, ICMPv6 traffic, and single-datagram UDP conversations. In our experiments, we used the slice captured on March 15, 2018 from 14:00 to 15:00 CET between Sao Paulo and New York. After processing, we obtained a 30-minute long dataset composed of ∼15,000 TCP and UDP conversations. To implement storage covert channels, we directly injected various secret messages in the dumps provided by CAIDA [23]. Instead, for the case of timing channels, we used iPerf3⁹ to generate ad-hoc flows. The approach in [23] has been used again to modulate inter-packet times and encode the secret information. Traffic generated via iPerf3 has also been used to compare the performance of the proposed agentless approach against Zeek and libpcap.
As will be detailed later, the detection of storage covert channels can also take advantage of other network monitoring tools. To this aim, in our trials we adopted nProbe Enterprise M v. 9.5.210715¹⁰ to inspect the traffic in real time and compute the number of active IPv6 flows. According to preliminary tests, the number of active flows reported by nProbe is insensitive to the presence of IPv6 covert channels. This further supports the need of teaming up with a specific solution when such channels have to be detected.

V. DETECTION OF STORAGE COVERT CHANNELS
This section showcases the detection of storage covert channels targeting IPv6 conversations. As a paradigmatic example, we will discuss the case of the Flow Label, since it requires handling a 20-bit space leading to a significantly higher bandwidth compared to other fields. Thus, for the Hop Limit and the Traffic Class we limit ourselves to a simpler analysis. We point out that the proposed mechanisms could be further extended to tackle channels targeting other fields/protocols.

A. Detection of Channels Targeting the Flow Label
The detection of storage channels targeting the Flow Label is based on the coarse-grained estimation of the number of IPv6 conversations. Since each IPv6 conversation is identified via a fixed, unique Flow Label value generated according to a uniform distribution [24], this can provide a rough estimation of the number of flows. The resulting metric can be then compared against measurements collected by network monitoring tools or used to "reinforce" indicators provided by standard firewalls or intrusion detection systems. Without loss of generality, we assume to have periodical measurements on the number of active IPv6 flows in the network, denoted as F, which is commonly provided by tools for network monitoring.
⁸ The CAIDA Anonymized Internet Traces Dataset (April 2008 - January 2019). Available online: https://www.caida.org/data/monitors/passiveequinix-nyc.xml
⁹ https://iperf.fr/
¹⁰ https://www.ntop.org/products/netflow/nprobe/
To compute such an estimation, we engineered a lightweight packet inspection mechanism suitable for being implemented with eBPF. In essence, the kernel has been extended to count the occurrences of Flow Label values by setting a hook point in the tc queue management. To guarantee privacy and scalability requirements as well as to prevent performance degradation for large traffic volumes, the 20-bit space of possible values is mapped into a bin-based data structure composed of B equally sized bins. Accordingly, each bin covers $2^{20}/B$ values. The mapping is based on the first $\log_2 B$ bits of the Flow Label, which are used to index the array of bins. Data is periodically collected by a user-space utility every Δt seconds, and the bin-based structure is emptied to avoid saturation: this is ruled via a time window with a duration of T seconds. Parameters Δt and T allow adjusting the proposed approach to "follow" the dynamic of birth/death of covert communications and to match measurements/feedback information provided by external tools with different timings, respectively. Therefore, the number of "dirty" bins, i.e., bins with a non-zero value, provides an estimate of the number of IPv6 conversations, denoted in the following with N. This is only an approximation: if different Flow Label values share the same bin, a collision occurs and the conversations are counted once. Greater values of B reduce the collision probability and improve the precision, but at the price of a higher memory burden. As an example, let us consider the case of $B = 2^{12}$ bins, each covering $2^{8}$ values. If a packet with a Flow Label value equal to 337 (i.e., 0x00151) is observed, the second bin is flagged since it is the one containing values in the 256-511 range (indexed by the 0x001 prefix). Accordingly, if that bin was previously empty, N is incremented by 1.
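The bin-based mapping described above can be sketched in a few lines of user-space Python. This is only an illustrative model under stated assumptions: the class `FlowLabelBins` and its method names are hypothetical, and the actual counters of the framework live in an eBPF map inside the kernel rather than in a Python list.

```python
FLOW_LABEL_BITS = 20  # width of the IPv6 Flow Label


def bin_index(flow_label: int, log2_bins: int) -> int:
    """Index a bin with the first (most significant) log2(B) bits of the Flow Label."""
    if not 0 <= flow_label < (1 << FLOW_LABEL_BITS):
        raise ValueError("the Flow Label is a 20-bit value")
    return flow_label >> (FLOW_LABEL_BITS - log2_bins)


class FlowLabelBins:
    """User-space model of the B-bin counter structure (hypothetical helper)."""

    def __init__(self, log2_bins: int):
        self.log2_bins = log2_bins
        self.counters = [0] * (1 << log2_bins)

    def observe(self, flow_label: int) -> None:
        self.counters[bin_index(flow_label, self.log2_bins)] += 1

    def dirty_bins(self) -> int:
        """N: number of bins with a non-zero counter."""
        return sum(1 for count in self.counters if count > 0)

    def reset(self) -> None:
        """Empty the structure at the end of a time window of T seconds."""
        self.counters = [0] * len(self.counters)


bins = FlowLabelBins(log2_bins=12)  # B = 2^12 bins of 2^8 values each
bins.observe(337)       # 0x00151 falls in bin 1, i.e., the 256-511 range
bins.observe(400)       # same bin: a collision, N does not grow
bins.observe(0x12345)   # falls in bin 0x123
print(bin_index(337, 12), bins.dirty_bins())
```

Note that only the bin index is retained, not the Flow Label itself, which is what makes the statistic "condensed": with B = 2^12, the low 8 bits of each observed value are discarded.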
The presence of a covert communication could be revealed by comparing N and F, e.g., by checking whether the relation N > F holds. However, this could be inaccurate, especially due to the saturation of a bin and the coalescing of entries caused by a limited value of B. For this reason, we introduced a scale factor denoted with α to balance the flow/bin proportion. The resulting detection relationship is then αN > F. Unfortunately, using only a threshold could lead to an unstable behavior, that is, the detector over/under reacts in the presence of minimal fluctuations in the number of flows. For this reason, we added a hysteresis parameter ξ.
For the sake of illustrating the proposed detection mechanism, Figure 4 showcases an example considering the exfiltration of 21.25 kbytes of data. In more detail, Figure 4(a) depicts the outcome of the detection for different values of B when α = 0.9 and T = 30 seconds. As shown, smaller bins (i.e., when B increases) allow to better spot the covert channel but at the price of more false positives. On the contrary, coarse-grained bins (e.g., for $B = 2^{12}$) tend to underestimate the presence of a hidden communication. A possible workaround could exploit a trade-off between the number of bins and the "frequency" of measures. Figure 4(b) reports the results for T = 15 seconds. As shown, smaller timeframes cause a more frequent "reset" of the bin-based scheme, leading to an underestimation of the number of active IPv6 conversations. Thus, it is not possible to directly compare N with the measurement F provided by nProbe. The parameter α can correct this mismatch by "magnifying" the obtained values but at the price of errors leading to false positives. In general, the "optimal" matching between the observed traffic and the number of bins is critical since it influences both the "stability" and the performance of the detection. Thus, Section V-B discusses in detail the design of the various parameters.
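A minimal sketch of the comparator with hysteresis is reported below. It assumes that ξ is expressed as a fraction of the current value of F and that the alarm is raised when αN exceeds F(1 + ξ) and cleared when αN drops below F(1 − ξ); the class name `HysteresisDetector` is hypothetical, and the exact switching logic of the framework may differ.

```python
class HysteresisDetector:
    """Schmitt-trigger-like comparator for the rule alpha * N > F (illustrative)."""

    def __init__(self, alpha: float, xi: float):
        self.alpha = alpha  # scale factor balancing the flow/bin proportion
        self.xi = xi        # hysteresis, as a fraction of F (assumption)
        self.alarm = False  # current outcome of the detection

    def update(self, dirty_bins: int, active_flows: float) -> bool:
        estimate = self.alpha * dirty_bins
        upper = active_flows * (1 + self.xi)
        lower = active_flows * (1 - self.xi)
        if estimate > upper:
            self.alarm = True    # covert channel suspected
        elif estimate < lower:
            self.alarm = False   # back to normal
        # Inside the [lower, upper] band the previous outcome is retained.
        return self.alarm


detector = HysteresisDetector(alpha=0.9, xi=0.05)
# F = 100 active flows reported by the monitoring tool, N varying over time.
outcomes = [detector.update(n, 100) for n in (110, 120, 110, 100)]
print(outcomes)
```

With ξ = 0, the detector reduces to the plain threshold rule αN > F, which flips its outcome at every small fluctuation around F.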

B. Sensitivity Analysis
Detecting storage covert channels is subject to many tradeoffs. For the case of the Flow Label, there is the need of balancing the granularity of the gathering phase (i.e., Δt, T and B), the quality of the estimation (i.e., N and α), as well as the resources required to run additional logic. Therefore, this round of tests aims at performing a sensitivity analysis of the framework.
For the sake of considering a wide range of use cases, we designed three different attack scenarios. Specifically,   [20]. To this aim, we used three different covert channels activating in a timeframe of 15 minutes to exchange data requiring 2,500, 6,000, and 7,500 IPv6 packets (i.e., 6.25, 15, and 18.75 kbytes, respectively) within overt flows of 12 kbit/s, 1,600 kbit/s, and 100 kbit/s, respectively. Lastly, Scenario 3 considers an APT targeting a datacenter or a subnetwork, thus producing multiple covert channels towards a C&C server. In this case, we used 10 concurrent covert communications, each one requiring 800 IPv6 packets (i.e., 2 kbytes). After 10 minutes the number of connections is halved, for instance, due to reboots or crashes/shutdowns of compromised nodes.
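The relation between the number of packets and the volume of hidden data follows from the 20 bits carried by each Flow Label. The short Python sketch below reproduces this arithmetic for the values quoted above; the function name `covert_payload_bytes` is hypothetical, and 1 kbyte is assumed to be 1,000 bytes.

```python
def covert_payload_bytes(packets: int, bits_per_packet: int = 20) -> float:
    """Bytes that can be hidden when every packet carries bits_per_packet secret bits."""
    return packets * bits_per_packet / 8


for packets in (2_500, 6_000, 7_500, 800):
    kbytes = covert_payload_bytes(packets) / 1_000
    print(f"{packets} packets -> {kbytes} kbytes")
```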
As a first step, we evaluated the impact of the number of bins B and of the sampling time Δt ruling the kernel-to-userspace copy of collected values, in order to elaborate on the constraints on the granularity of the detection process. To this aim, we replayed the considered traffic trace towards the node running the eBPF framework. The related CPU and memory usage have been collected with a granularity of 10 samples per minute and average values have been computed. Table I shows the obtained results. To avoid burdening the table, we report values for $B = 2^{8}$ (as they represent the case of measuring the Hop Limit and Traffic Class), $B = 2^{12}$ for an intermediate reference, and for $B > 2^{16}$. As shown, the footprint of the user-space program collecting results increases with the "precision" of the data gathering (i.e., B and Δt). Despite the absence of configurations leading to an unbounded utilization of resources, a major bottleneck is caused by the operations needed to copy data from the kernel space to userland. This is especially true for $B = 2^{20}$: in fact, the copy requires ∼14 seconds, thus causing a "misalignment" between the real Flow Label values and those collected in the meantime. Indeed, the granularity of Δt is also subject to careful design choices. Even if a precise tracking of the abused flow is desirable, this could be impeded by difficulties in gathering data in a fine-grained manner. For instance, a typical timeframe for computing analytics of large-scale links/networks is in the range of 30-90 s, thus relaxing tight constraints on Δt (see, e.g., [25] for timing constraints for scalable classification). Therefore, for the sake of brevity, in the rest of the paper we will limit our analysis to Δt = T = 30 s.
Concerning the possible tradeoff between B and the ability of spotting hidden communications within the bulk of traffic, Figure 5(d) provides a comprehensive overview of the impact of B on the accuracy. In general, as shown in Figure 5(a), best results are achieved for Scenario 1, mainly owing to the presence of a unique covert communication leading to a non-negligible volume of artificial Flow Label values. Instead, in the presence of hidden transfers characterized by ON/OFF or "fading" behaviors, the accuracy decreases accordingly, as reported in Figures 5(b) and 5(c). Even if higher values of B typically lead to a better accuracy, the proposed approach is able to capture the presence of storage covert channels also with a reduced number of bins (see Figure 5).
The accuracy may not be sufficient to capture the performance of the proposed approach in terms of false/true positive/negative events. Therefore, Table II reports the true positive rate (TPR) and the true negative rate (TNR) collected when using various values of B for α* = 0.9, i.e., the "optimal" α leading to the best performance. Moreover, as depicted in Figure 4, the presence of a threshold-based rule may lead to an unstable behavior of the detection. To mitigate such an issue, we also investigate the impact of ξ implementing a sort of hysteresis for the comparator rule αN > F, i.e., the outcome of the detection changes according to +ξ and −ξ switching thresholds à-la Schmitt. Specifically, the lower and upper switching bounds are F − ξ and F + ξ, with ξ equal to 1%, 5%, and 10% of the current value of F. For the sake of brevity, we limit our analysis to $B > 2^{15}$.
As shown, for the case of Scenario 1, the parameter ξ improves the overall detection, especially in terms of TNR. However, greater values of ξ may cause a decay of the accuracy, as they make it harder to switch the outcome of the detector, which thus remains in a "wrong" state. For the case of Scenario 2, the poor performance of the TPR affects the accuracy, regardless of the values of B and ξ. This can be ascribed to the presence of a low-throughput channel reducing the effectiveness of the detection mechanism, i.e., the TPR remains in the 45.45-68.18 range. A similar behavior characterizes Scenario 3: again, ξ improves the TNR. Yet, the presence of many covert channels halving their activity influences the TPR and the accuracy, mainly due to the reduced volume of altered Flow Label values. Similarly to the case of Scenario 1, higher values of ξ prevent switching from positive to negative (and vice versa) when the throughput of covert data changes in time.
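For reference, the TPR, TNR, and accuracy discussed above can be computed from the per-window outcomes of the detector and the ground truth as in the following Python sketch. The function name `detection_rates` and the sample sequences are hypothetical and do not correspond to the measurements reported in Table II.

```python
def detection_rates(predicted, actual) -> dict:
    """TPR, TNR and accuracy from per-window detector outcomes and ground truth."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # channel present and detected
    tn = sum(1 for p, a in pairs if not p and not a)  # channel absent, no alarm
    fp = sum(1 for p, a in pairs if p and not a)      # false alarm
    fn = sum(1 for p, a in pairs if not p and a)      # missed channel
    nan = float("nan")
    return {
        "TPR": tp / (tp + fn) if tp + fn else nan,
        "TNR": tn / (tn + fp) if tn + fp else nan,
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
    }


# Hypothetical outcomes over six time windows of T seconds each.
predicted = [True, True, False, False, True, False]
actual = [True, False, False, True, True, False]
rates = detection_rates(predicted, actual)
print(rates)
```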

C. Channels Targeting Other IPv6 Fields
When handling less capacious fields, the bin-based approach can still be used to implement simpler yet effective counters to reveal hidden communications. Specifically, for the case of the Traffic Class and Hop Limit, by creating a structure with $B = 2^{8}$, it is possible to perform a one-to-one mapping between observed values and "dirty" bins. To this aim, we performed an experimental campaign considering the same background traffic used for trials in Section V-B. For the Traffic Class, we considered a threat embedding a malicious command of 512 bytes, as used by the Silence Trojan to download and execute a PowerShell script, within a flow with an average throughput of 750 bytes/s. Instead, for the Hop Limit, we assumed an attacker wanting to deliver a stage of the Emotet malware. Figure 6(a) deals with a covert channel built by embedding secrets in the Traffic Class. As shown, the number of non-empty bins varies according to the different Traffic Class values within the bulk of traffic. When the targeted IPv6 conversation is active, the number of bins is higher due to the presence of the secret, leading to a sort of "signature". On the contrary, for the case of Hop Limit (see Figure 6(b)), this is less evident, especially because the used hiding strategy does not directly store the secret. Hence, a more sophisticated approach is needed: this is part of our ongoing research.

VI. DETECTION OF TIMING CHANNELS
This section investigates how the proposed agentless approach can be used to reveal the presence of timing covert channels. Specifically, we are interested in understanding whether the detection logic can be partially embedded within eBPF, mainly to avoid the need of further moving and processing data in user space. To this aim, we implemented a de-facto standard algorithm borrowed from the literature. Originally introduced in [12], the idea is to compute a measure of regularity for a set of variances built by grouping packets to make pattern-like behaviors emerge. Patterns can then be used to reveal the presence of hidden information causing "anomalous" inter-packet times. In essence, for each window composed of W packets, the algorithm in [12] calculates the standard deviation σ of the related inter-packet time values. Then, it computes the pairwise differences between $\sigma_i$ and $\sigma_j$, for each pair i, j. The final regularity measure is given by computing the overall standard deviation for all the pairwise differences. Unlike the original version, our implementation checks the regularity metric on-line, i.e., a flow is evaluated on a semi-continuous basis. Unfortunately, due to eBPF limitations in terms of stack size and number of instructions, the regularity measure has been approximated (e.g., sqrt() and other mathematical operations are not available and had to be replaced with approximate counterparts). Moreover, to tame memory consumption, the regularity indicator is periodically reported to prevent the need of "unrolling" too many operations. In the following, we define such a "control" parameter as Q, i.e., the number of values for σ considered for each computation of the regularity metric.
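The regularity measure described above can be sketched off-line in Python as follows. The sketch follows the steps as stated in the text (per-window standard deviation, pairwise absolute differences, overall standard deviation) and uses the population standard deviation, which is an assumption; it does not reproduce the approximations of the in-kernel eBPF version, and the function name `regularity` and the sample traces are hypothetical.

```python
import statistics
from itertools import combinations


def regularity(inter_packet_times, window: int) -> float:
    """Standard deviation of the pairwise differences among per-window deviations."""
    # One sigma per non-overlapping window of W inter-packet times.
    sigmas = [
        statistics.pstdev(inter_packet_times[start:start + window])
        for start in range(0, len(inter_packet_times) - window + 1, window)
    ]
    if len(sigmas) < 2:
        raise ValueError("at least two complete windows are required")
    # |sigma_i - sigma_j| for each pair i < j.
    differences = [abs(a - b) for a, b in combinations(sigmas, 2)]
    return statistics.pstdev(differences)


# Hypothetical traces (seconds): the first has a constant per-window deviation,
# the second alternates unaltered windows with windows mixing 0.01 s and 0.07 s gaps.
steady = [0.01] * 400
mixed = [0.01] * 100 + [0.01, 0.07] * 50 + [0.01] * 100 + [0.01, 0.07, 0.07, 0.07] * 25
print(regularity(steady, 100), regularity(mixed, 100))
```

In an on-line setting, the list of deviations would be capped at Q entries, so that at most Q(Q − 1)/2 pairwise differences have to be handled for each evaluation.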

A. Numerical Results
To evaluate our code layering mechanism when used to detect timing channels, we performed trials with hidden conversations nested within the inter-packet times of a flow of ∼7,000 datagrams. To make our investigation more comprehensive, we present results obtained with IPv4 traffic: similar results have been obtained with IPv6. To test the covert channel, we sent a malicious command of 304 bytes used by the GZipDe malware. Following [12], to encode the value 1, we inflated the inter-packet time between two adjacent datagrams by 0.06 seconds, which ensures a character accuracy of 98%. Instead, the value 0 is encoded by maintaining the original timing of the overt traffic. To evaluate the impact of the in-kernel detection algorithm, we measured the CPU and memory usage, as well as the packet loss and the jitter of the processed traffic. For each trial, we considered flows with various bitrates, i.e., 10 kbit/s, 100 kbit/s, and 1 Mbit/s. We also evaluated the impact of the number of packets W used to compute the standard deviation, which somewhat constitutes the granularity of the approach. Specifically, we considered W = 100 and W = 250, as suggested in [12]. Results indicate that our agentless approach does not introduce further delay or packet loss on the inspected traffic. Indeed, the low bitrate characterizing timing channels plays a major role, since it does not impose tight computational constraints. This is further supported by the CPU and memory consumption, which are limited to ∼0% and ∼114 Mbytes, respectively, throughout all the trials.
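The encoding rule used in the trials can be illustrated with a short sketch. It assumes one secret bit per inter-packet gap and a most-significant-bit-first ordering; both are illustrative choices not stated in the text, and the function names are hypothetical.

```python
DELAY = 0.06  # seconds added to an inter-packet time to encode a 1


def to_bits(payload):
    """Expand bytes into a list of bits (most significant bit first)."""
    return [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]


def embed(ipts, payload, delay=DELAY):
    """Return a copy of ipts with every gap carrying a 1 inflated by delay."""
    bits = to_bits(payload)
    if len(bits) > len(ipts):
        raise ValueError("overt flow too short for the payload")
    out = list(ipts)
    for i, bit in enumerate(bits):
        if bit:
            out[i] += delay  # a 0 leaves the overt timing untouched
    return out
```

Under these assumptions, a 304-byte command corresponds to 2,432 bits, which fits within a flow of ∼7,000 datagrams.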
To understand the ability of the eBPF-based code layering approach to handle large traffic volumes, we performed an additional round of tests considering different packet sizes and higher traffic rates. Specifically, we considered datagrams ranging from 16 to 65,507 bytes, in order to cover both the worst and best cases in terms of packet processing. Although the fragility of timing channels limits the allotted throughput, the proposed approach could be deployed to monitor Internet-scale deployments (e.g., a datacenter). In this perspective, we also investigated the impact of W and Q to assess the scalability of the proposed agentless implementation. Table III contains the CPU utilization. In more detail, the code layering mechanism does not cause major overheads. Concerning the overall memory consumption, it is always bounded to ∼110 Mbytes and almost constant. This is due to the limited amount of memory needed to store the Q standard deviations and the W inter-packet times. Lastly, we also measured the delay and jitter of the traffic processed with the eBPF code. Overheads caused by the inspection are almost negligible for all the considered configurations.

VII. PERFORMANCE COMPARISON
A main goal of our framework is to run agentless detection processes without relevant performance degradation. Hence, this section presents a comparative analysis with well-known monitoring tools and technologies for network inspection. Specifically, we compared our agentless approach against implementations of the bin-based technique with Zeek and ANSI-C/libpcap. For the sake of comparison, we also considered a reference scenario, denoted as "baseline" in the following, where no traffic inspection is performed (i.e., no tools were running) in order to have a lower bound. Indeed, monitoring network traffic could interfere with the overall Quality of Service/Experience (especially by impacting the packet loss, bitrate, and latency of delay-sensitive applications) or require non-negligible computing and storage resources. Therefore, we considered the impact of the packet size and transmission rate for UDP flows as well as the Maximum Segment Size (MSS) for TCP streams. For the packet size, we considered four different values: 16 bytes modeling tiny and fragmented traffic, 1,470 bytes modeling a full utilization of the Ethernet frame, 8,192 bytes modeling IPv6 jumbo frames, and 65,507 bytes modeling the maximum size allowed by UDP and representing the "best" condition for forwarding. For the MSS, we selected four different values as well, following considerations similar to those made for UDP, i.e., we used 88 bytes, 536 bytes, which is the minimum value that should be used on IP links, 1,460 bytes for the full Ethernet utilization, and 9,216 bytes.
Concerning the transmission rates, we considered traffic loads ranging from 10 kbit/s to 10 Gbit/s. However, it turned out that our testbed was not able to sustain rates higher than 3 Gbit/s. This has to be ascribed to a limitation of our softwarized implementation but does not represent a constraint. In fact, production-quality deployments usually rely upon some form of acceleration that can sustain more than 10 Gbit/s of traffic [26].
For the sake of brevity and to avoid burdening results, in the following we only report and discuss the case of gathering information for the Flow Label when B = 2^12 and for loads up to 1 Gbit/s. Yet, similar results have been observed for the case of the Traffic Class and Hop Limit. Figure 7 investigates how the inspection process behaves in the presence of different packet sizes and bitrates. In more detail, Figure 7(a) shows that the proposed method has a very limited impact on the transmission rate, for the whole range of relevant parameters. Specifically, libpcap-based tools duplicate packets via raw sockets, hence decoupling additional processing from forwarding operations (i.e., inspection is done on a copy of the packet). However, even if eBPF programs act on the forwarding path, the impact is limited, thus the resulting behavior does not deviate from the considered baseline condition. Figure 7(b) depicts results for the packet loss, which is affected by the bitrate, as expected. In general, the causes of the losses are due to tiny packets causing a major overhead, and limitations of our setup to handle rates in the 1 Gbit/s range. For the sake of brevity, we omit results concerning the jitter. The measured variation for the inter-packet delay is ∼0.1 ms for all the considered tools, thus making our approach feasible also to search for covert channels in multimedia or time-sensitive flows.

A. Impact on Packet Transmission
Finally, Figure 8 showcases the performance in terms of rates achieved when using TCP/IPv6 traffic. Coherently, higher bitrates are possible with a larger MSS, especially due to a beneficial impact on the TCP flow control mechanism. Again, our approach performs similarly to Zeek and libpcap. Thus, our eBPF-based mechanism does not affect packet transmission in a significant way.

B. CPU and Memory Usage
CPU and memory utilization are important to understand the footprint of the various frameworks used for the detection of network covert channels. Figure 9 reports a detailed breakdown of the used CPU. Our eBPF-based approach accounts for a small overhead with respect to the baseline. Both the libpcap-based tool and Zeek require more CPU at higher bitrates, whereas our framework has a more "stable" demand. Similar considerations hold for the case of TCP, as reported in Figure 10. Even if the Python language is not the best option in terms of processing speed, our agentless mechanism performs better than the other tools and limits the used CPU compared to the baseline.
Concerning the used memory, in our trials we investigated the overall memory utilization including the Virtual Memory Size (VMS), the Resident Set Size (RSS) representing the size of physical memory including shared libraries, the Proportional Set Size (PSS) capturing the size of physical memory with proportional attribution to shared libraries, and the Anonymous utilization (Anon) containing the stack and other allocations. Figure 11 depicts the obtained results. As shown, Zeek has a larger memory footprint, but only a minimal part is allocated to the RAM. The memory allocated for our eBPF-based approach is larger because of the many libraries needed by the Python runtime. Instead, the ANSI-C implementation based on libpcap has a negligible memory requirement owing to the efficient nature of the library and the use of a very minimal fraction of other calls, mainly for I/O operations.
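As an illustration of how such indicators can be gathered on Linux, the following sketch sums the per-mapping fields exposed in the text format of /proc/<pid>/smaps. It is only an assumption about a possible collection method, since the tooling used in the trials is not detailed here. The helper name and the sample text are hypothetical, whereas the field names Rss, Pss, and Anonymous are those reported by the kernel.

```python
def parse_kb_fields(text, wanted):
    """Sum the kB values of the wanted keys in 'Key:   N kB' formatted text."""
    totals = {key: 0 for key in wanted}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in totals:
            parts = rest.split()
            if parts and parts[0].isdigit():
                totals[key] += int(parts[0])
    return totals


# Hypothetical excerpt with two mappings, mimicking the smaps layout.
SAMPLE = """\
Rss:                1200 kB
Pss:                 800 kB
Anonymous:           300 kB
Rss:                 400 kB
Pss:                 100 kB
Anonymous:            50 kB
"""

usage = parse_kb_fields(SAMPLE, ["Rss", "Pss", "Anonymous"])
```

On a Linux host, the same function could be fed with the content of /proc/<pid>/smaps, while the virtual memory size is available as the VmSize field of /proc/<pid>/status.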

VIII. RELATED WORK
Even if eBPF is a relatively novel technology, it has already been considered for a variety of security-related tasks or to improve various software components. As an example, [19] investigates how to extend the ntopng network monitoring tool with events generated by libebpfflow, which allows enriching network-layer data with system metadata (e.g., source/destination IP addresses are matched against source and destination processes and system users). The goal is to support the definition of custom policies to drop unwanted connections. Besides, eBPF can be used to break up the conventional packet filtering model in Linux. This can be achieved by moving the inspection process to the eXpress Data Path, where ingress traffic can be processed before the allocation of kernel data structures, thus leading to performance benefits [27]. This paradigm can be used to provide a "first line of defense" against unwanted traffic such as flows with spoofed addresses or DoS/DDoS attacks [28].
In the context of network tracing, [29] proposes a framework where a master node translates user inputs into configuration files to feed eBPF agents for monitoring network packets of specific connections at given tracepoints (e.g., virtual network interfaces). Obtained measurements are then collected and analyzed in a centralized manner. In [30], the authors propose an eBPF-based implementation for monitoring the traffic exchanged between virtual machines without the need of specific hardware appliances. Results indicated that duplicating packets with an eBPF program attached to a hook in the tc achieves better throughput than native port mirroring of Open vSwitch, especially for large data units. The idea closest to our approach is presented in [31] showcasing a system for deploying eBPF programs and collecting their measurements in containerized user-space applications. To this aim, the framework exploits tools like Prometheus, Performance Co-Pilot, and Vector, as well as specific eBPF programs and various userland counterparts. However, differently from our work, [31] does not consider covert communications or manipulations of network artifacts. Rather, it focuses on monitoring the garbage collector, identifying HTTP traffic, and implementing IP whitelisting.
In general, detecting a network covert channel requires developing protocol- and method-dependent metrics [5]. Therefore, a vast part of the literature focuses on specific injection mechanisms or protocols. Owing to its ubiquitous availability and multiple exploitable behaviors, many works address the problem of revealing hidden communications nested within TCP. For instance, [32] proposes a statistical model to spot data injected in the Initial Sequence Number field used to synchronize peers. For similar reasons, the mitigation of channels exploiting the DNS has been largely studied as well, especially due to its wide adoption in data exfiltration campaigns or botnet orchestration (see [33] and the references therein for a discussion on the real-time detection of end-to-end DNS covert channels). Even if there is no evidence of real-world attacks using Internet telephony to covertly exfiltrate data, a relevant amount of works investigated how to mitigate covert channels targeting VoIP conversations. As a possible example, [34] deals with the Session Initiation Protocol when exploited to move secret data and highlights the need of performing protocol-dependent parsing to spot the anomalous information. For the case of voice traffic, the work [35] reviews a plethora of mechanisms to tame covert communications within VoIP conversations, e.g., analyzing audio artifacts to spot data embedded in voice samples, searching for anomalous traffic features to reveal encoding schemes using packet losses or manipulations of the delays, as well as deploying nodes buffering and padding traffic to disrupt parasitic information in a blind manner. Recent surveys [7], [10] highlighted the need of shifting towards more general indicators or exploiting features that can bring together different protocols, such as anomalous energy usage or signatures in the execution of processes running in end nodes.
For the specific case of detecting IPv6 covert channels, the literature already offers some previous attempts. In more detail, [36] proposes a machine-learning technique: unfortunately, this requires suitable datasets for training the detector. The work [13] deals with the leakage of hidden information through the manipulation of v4/v6 transitional mechanisms, which is outside the scope of our work. Additionally, [17] and [14] provide a preliminary assessment of mechanisms to detect IPv6 covert channels and also emphasize the inadequacy of standard network intrusion detection tools to handle such threats out of the box.
Lastly, the protocol-agnostic nature of timing channels leads to a more coherent and homogeneous literature. Despite the wide array of methodologies used (e.g., AI-capable frameworks or ad-hoc metrics), the problem of revealing hidden or parasitic conversations within timing features has been better investigated compared to other types of channels, see [21] for a comprehensive survey on the topic.
Compared to previous approaches, our idea allows considering different covert channels within a unique framework. Owing to the flexibility of eBPF in handling various traffic features, the inspection process can be extended or adapted to consider different types of storage covert channels. Differently from past works available in the literature, which only address a single protocol or pursue generalization via AI and data-intensive approaches, our bin-based data structures avoid storing and processing sensitive details, even with a per-flow granularity. This design allows guaranteeing privacy requirements, while taming the computational burden. For what concerns the detection, revealing a class of channels only requires the creation of a simple detection rule, which can also take advantage of measurements already provided by network monitoring tools commonly deployed in medium- and large-sized scenarios.

IX. CONCLUSION AND FUTURE WORK
In this paper, we have presented a code layering framework for the detection of storage and timing covert channels. Specifically, we engineered an agentless monitoring architecture and developed various eBPF programs to gather data, map obtained values in suitable data structures, and implement a reference detection mechanism. Collected results indicate that code layering can be effectively and efficiently used to implement monitoring mechanisms in PaaS/serverless environments, as well as to implement a complete detection "pipeline" for covert channels. Moreover, the required resources make the use of eBPF a convenient choice, especially if compared with tools like Zeek or libpcap.
Future works aim at using the proposed approach to consider other channels and different threats (e.g., DDoS or cryptojacking campaigns). A part of our ongoing research concerns the utilization of eBPF for actively manipulating traffic, e.g., to sanitize flows and disrupt the channels by overwriting fields or restoring them to a standard value. To remove some limitations of the detection scheme, we are investigating the use of AI to face more sophisticated threats (e.g., modulation of values in the Hop Limit), also to reduce the dependence on parameters requiring a suitable design (e.g., α and ξ). Concerning the technological viewpoint, future developments aim at refining the engineering and implementation of the agentless framework, especially for its deployment in production-quality environments. The obtained performance suggests the possibility of rewriting part of the framework in ANSI-C and taking advantage of libbpf to prevent possible performance bottlenecks.