Sniffing Detection Based on Network Traffic Probing and Machine Learning

Cyber attacks are on the rise and each day cyber criminals are developing more and more sophisticated methods to compromise the security of their targets. Sniffing is one of the most important techniques that enables the attacker to collect information on the vulnerabilities of the devices, protocols and applications that can be exploited within the targeted network. It relies mainly on passively analyzing the traffic exchanged within the network, and due to its nature, such an activity is difficult to discover. That is why, in this article, we first revisit existing techniques and tools that can be used to perform sniffing as well as the corresponding mitigation methods. Based on this background, we propose a novel measurement-based detection method that infers whether the sniffing software is active on the suspected machine by network traffic probing and machine learning techniques. The presented experimental results prove that the proposed solution is effective.


I. INTRODUCTION
Currently, whole societies are becoming even more dependent on open networks. The Internet changes various aspects of everyday life, like commercial activities, business transactions and government services being offered online. As a result, this has led to the fast development of new cyber threats and numerous information security issues, which are exploited by cyber criminals. Moreover, currently the attackers are devising increasingly sophisticated methods to compromise the security of the devices that the users utilize.
Sniffing is a typical part of the reconnaissance phase of a network attack. It can be executed by an attacker who is able to access the targeted network infrastructure. Moreover, the attacker can analyze the passing network traffic using, e.g., an insecure WiFi network, or as a result of successful insider threat actions. The main aim of such activities is to identify the machines, protocols and applications that are running within the targeted network. After that, the attacker can determine the potential vulnerabilities that can be used to compromise the network. In most cases these actions precede the actual attack. Therefore, it must be noted that an early detection of sniffing is of great importance as it allows the defender to prepare for further attack phases. Moreover, if the security system introduced within the network is able to detect sniffing devices, it can take countermeasure actions and thus reduce the attack probability. Sniffing is typically executed with the help of dedicated software, called sniffers, of which the most notable examples include TcpDump and Wireshark. They rely on passive analysis of the traffic transmitted within the network. Due to this characteristic, such activities are typically difficult to detect.
The associate editor coordinating the review of this manuscript and approving it for publication was Ana Lucila Sandoval Orozco.
Several approaches for sniffing detection have been proposed in the existing literature [10]-[13]. They mostly focus on trying to determine whether a wired or wireless Network Interface Controller (NIC) is set to the promiscuous mode, which typically is used to diagnose network connectivity issues. In this mode, the NIC forwards all frames, allowing the machine to analyze them even if they are intended for other devices in the network. It must be noted that in a non-promiscuous mode a frame is dropped unless it is addressed to that specific NIC's MAC address (or is a broadcast or a multicast frame).
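The filtering rule described above can be illustrated with a minimal sketch. The function names `is_group_address` and `nic_accepts` are hypothetical and introduced only for illustration, and the model is simplified: a real NIC usually accepts only the multicast groups it has subscribed to, whereas this sketch accepts every group address.

```python
def is_group_address(mac: str) -> bool:
    """Return True for multicast/broadcast destinations.

    In Ethernet, the least significant bit of the first octet marks
    a group (multicast or broadcast) address.
    """
    return int(mac.split(":")[0], 16) & 1 == 1


def nic_accepts(dst_mac: str, own_mac: str, promiscuous: bool) -> bool:
    """Simplified model of the NIC receive filter (an assumption, not a driver)."""
    if promiscuous:
        # Promiscuous mode: every frame is passed up to the OS.
        return True
    dst = dst_mac.lower()
    # Normal mode: only frames for this NIC or for a group address pass.
    return dst == own_mac.lower() or is_group_address(dst)
```

For example, a unicast frame addressed to some other station is dropped in the normal mode but passed to the OS in the promiscuous mode, which is exactly the difference that the detection approaches try to expose.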
However, as we reveal in this article, existing sniffing detection approaches are mostly outdated and the majority are not effective any more. That is why further research is needed to establish which defensive measures can be utilized to identify such activities. Additionally, novel approaches are needed to effectively deal with such threats.
We decided to address a sniffing attack within the security framework of the Internet of Radio-Light (IoRL) system which is being created within the Horizon 2020 project. It aims at developing an architecture for smart buildings [1], supermarkets, museums or even train stations [2] using a 5G Radio-Light multi-component carrier, Frequency Division Duplex (FDD) broadband system consisting of a Visible Light Communication (VLC) downlink channel in the unlicensed THz spectrum and mmWave up/downlink channels in the unlicensed 30-300 GHz spectrum. IoRL allows wireless communication networks to be deployed in buildings that can provide data rates greater than 10 Gbit/s, latencies of less than 1 ms and location accuracy of less than 10 cm, whilst reducing the EMF levels and interference, lowering the energy consumption at the transmitter/receiver and increasing the User Equipment (UE) energy battery lifetime.
However, apart from the obvious benefits that such a system can offer to the users, some challenges and issues must be addressed first. As IoRL integrates various networking technologies, i.e., VLC, mmWave, SDN, WLAN and eNB/HeNB, and each of them is characterized by a specific set of features, there are potential security threats and vulnerabilities that are often not completely resolved and still need addressing.
That is why, for the IoRL system a dedicated Integrated Security Framework (ISF) has been designed and is continuously developed within the project to mitigate various threats and provide an adequate level of security [3], [4]. While designing the ISF we have utilized experiences and analyzed solutions from existing finished or ongoing 5G security projects and standardization activities. A key, fundamental technology used within the ISF is Software-Defined Networking (SDN) that can easily realize important security features like security monitoring and management. Based on the SDN controller, a centralized system for security monitoring and manageability, providing near real-time awareness of network incidents status and effective enforcement of security policy can be designed and developed. For this purpose, security-related Virtual Network Functions (VNFs) can be created that will perform various security functions. Such a solution will also work effectively by enabling correlation, aggregation and analysis of the security-related data originating from different sources to provide a complete network-wide view of the security posture (security analytics).
In our previous work [5] we presented initial results for the developed detection method that relies on inflicting artificial load on the investigated machine and measuring its Round-Trip Time (RTT) with and without the load. The proposed solution turned out to be effective, especially for Windows-based machines. This article can be treated as an extended version of [5]; however, here we make the following novel contributions:
• we develop a novel measurement-based sniffing detection method that relies on network traffic probing and additionally uses machine learning techniques,
• we perform a systematic and detailed experimental evaluation of the proposed solution,
• we use not only ICMP, but also an application layer protocol (HTTP) for traffic probing,
• we also propose how to detect sniffing on Linux hosts, which was not addressed in the initial work [5] and
• in the experimental part we also present a performance evaluation using features like CPU load and variable 'flooding'/'idle' period lengths.
The rest of this article is structured as follows. Section II provides a review of the existing approaches for the sniffing detection. Next, in Section III, we verify whether the previously proposed methods and tools can be still used to discover such threats. In Section IV we present the design and implementation of our own approach for sniffing detection. Section V contains the obtained experimental results and a description of the fine-tuning of the proposed solution to improve its efficiency. Finally, Section VI concludes our work and provides insights for potential future work in this direction.

II. RELATED WORK
In this section we first review the existing research work related to sniffing detection, and then we present the tools that are publicly available and can be used to defend networks against such a threat.
To the best of the authors' knowledge, currently there are no papers or work related solely to network sniffing detection that utilizes Machine Learning (ML)/Artificial Intelligence (AI). However, both ML and AI are commonly used for many cybersecurity related tasks. For example, a survey [6] was published by Ghaffarian and Shahriari about software vulnerability analysis and discovery using machine learning and data mining techniques. Likewise, another survey [7] was published recently by Xin et al. about using machine learning for efficient cybersecurity intrusion detection systems. The security of AI itself is also of interest to researchers. In [8] the authors present how to prepare malicious input for AI processes, which greatly decreases their effectiveness. Finally, in [9] the authors presented the SecureML system for scalable privacy-preserving machine learning, which addresses the privacy concerns raised by massive data collection. This indicates that the utilization of ML/AI is a promising research direction in cybersecurity. In the remainder of this section we will focus on reviewing the existing research work and tools related to sniffing detection.

A. EXISTING RESEARCH WORK ON SNIFFING DETECTION
The existing literature related to sniffing detection is typically divided into [10] (i) detection at a local host or (ii) at Local Area Network (LAN) segment levels. An example of a local host detection is presented in [14]. The authors detect a sniffer by running a special, centrally managed agent on every device connected to the network. This agent compares the destination address of every captured Ethernet frame with the local address. If they are not equal, a sniffer might be running on such a device. Unfortunately, this solution assumes that every device is under administrator control, and that such an agent can be deployed. In this scenario, a more sophisticated solution like a Host-based Intrusion Detection System (HIDS) should be used. There is also the possibility that the attacker compromises such a device and disables or reconfigures the agent so that it always reports "no sniffer". Our solution is more general: we assume that each device is a black-box that cannot be controlled from the inside (as in every public, dynamic environment). In the rest of this article we focus only on network-based solutions, thus we omit a description of host-based approaches.
Existing network-based sniffer detection methods can be divided into two main subgroups:
• Challenge-based, where the defender tries to provoke the sniffing machine into responding by sending specifically crafted network traffic. Typically, packets with forged MAC addresses or Domain Name System (DNS) messages are utilized for this purpose, or
• Measurement-based, where the suspicious machine is subjected to temporary, intentional traffic load and then, based on its response, it is determined whether the sniffer is active or not.

1) CHALLENGE-BASED APPROACHES
As previously mentioned, the aim of such methods is to stimulate the sniffing machine into responding to the intentionally crafted network traffic. They exploit the fact that sniffers are often not fully passive and tend to respond to certain types of traffic. Typically, packets with forged MAC addresses are used for this purpose. In a normal situation, such packets would be rejected by the NIC and therefore never reach the operating system (OS) for processing. However, when the NIC is in the promiscuous mode, some OSs treat such traffic as legitimate and respond accordingly, betraying that the machine is sniffing. Previously proposed methods of this kind include the utilization of specially crafted packets. For this purpose, ARP requests with intentionally incorrect group MAC addresses can be sent to the suspected machine [11]. Another approach uses an ICMP-based method in which the messages of this protocol are sent to an investigated target with the correct destination IP address, but with a forged destination MAC address [10]. Another flavor of the same approach is related to the utilization of DNS messages to provoke the sniffing machine to respond [12]. It must be noted that by default most sniffers perform a reverse DNS lookup on the traffic that is sniffed. Often, specially crafted packets with unused IP addresses are utilized for this purpose; when they are encountered by a sniffer, a DNS lookup request can be generated and discovered by the defender. A similar approach presented in [15] is based on a decoy and a honeypot. Specially crafted traffic containing, for instance, an FTP login session with unique and fake credentials (login and password) is sent to the host suspected of running a sniffer. If the attacker is actively monitoring the captured traffic, or looking for specific signatures like a login session, they might try to use the bait. On the FTP server side, a login attempt with such unique credentials will prove that the sniffer captured the crafted traffic. Unfortunately, this solution assumes that the attacker will try to use the captured information, but this might never happen as the attacker could be looking for a different type of data or simply not use it at all.
Obviously, a sophisticated attacker running a sniffer is able to thwart replies to any network traffic while sniffing is in progress, thus, making the above mentioned techniques of limited usefulness.

2) MEASUREMENT-BASED APPROACHES
These methods rely on inflicting intentional, artificial load on the suspicious machine by sending specially crafted types of network traffic and observing its performance degradation in response to this traffic, typically by measuring the RTT. The rationale for such solutions is that when the NIC operates in the promiscuous mode, traffic not directed to this particular device is no longer dropped, so the OS becomes overloaded, which causes a noticeable performance decrease. Such approaches have been proposed, for example, in [11] and [12].
There are several issues related to the existing solutions of this kind. First of all, the proposed methods can be influenced if there is heavy traffic within the monitored LAN. Second, if not handled with care, as this technique assumes flooding the target machine with potentially heavy traffic, it can lead to unintentional Denial of Service (DoS) attacks.

B. EXISTING ANTISNIFFING TOOLS
While conducting this research we analyzed the tools that aim to allow sniffer detection. Below we present their descriptions, and then in subsection III-A the results verifying whether they are still effective when used in the current communication networks. The analyzed tools include Promqry, Sniffdet, Anti-sniffer and Nmap sniffer-detect NSE.
Promqry is a Microsoft tool from 2012 that can be used to detect network interfaces that are running in promiscuous mode. Promqry can determine if a Windows 2000 or later based device has network interfaces in a promiscuous mode, which can be a sign that a network sniffer is running on the system. This tool functions by listing the interfaces in a promiscuous mode on the local machine by requesting information on the interface status through WinAPI. To analyze external machines, an IP address range can be specified, then these addresses are checked using a ping query, and if the host is running/online, the tool attempts to connect to the host with a Remote Procedure Call (RPC) and check the interface status to discover if the promiscuous mode is used.
Sniffdet is an open source tool under the GPL license that aims at combining a set of tests for remote sniffer detection in TCP/IP network environments. These tests utilize DNS, ICMP, ARP traffic or latency measurements to determine whether the machine's NIC is running in the promiscuous mode. The latest version is from 2006. The tool itself is not well documented; therefore, it is hard to describe exactly how it works. Based on executing it and viewing its configuration files, our assumptions on the specific tests are:
• DNS-based: sends spoofed packets with fake MAC and IP addresses to the examined machine. If the tested host has an interface in the promiscuous mode it may perform a revDNS query for the fake IP address,
• ICMP-based: sends an ICMP echo request to the examined machine with a fake MAC and correct IP address. Again, if the tested host has an interface in the promiscuous mode it may reveal itself by replying to the ping,
• ARP-based: is similar to the ICMP-based test, but instead of a ping the ARP protocol is utilized and
• Latency-based: using a ping tool it measures the RTT when flooding the target with custom packets.
Anti-sniffer is a tool similar to Sniffdet and it was intended for operating systems like Windows 95, 98, NT, NetBSD and Linux. It performs various kinds of tests, such as OS-specific tests, DNS tests and network latency tests.
Nmap sniffer-detect NSE is part of the well-known Nmap scanner and works by exploiting incomplete packet examination by various operating systems. It sends ARP packets to the sniffing host with the destination MAC address crafted in such a way that it resembles a broadcast or a multicast address. If the host is not sniffing and its network card is in the normal mode of operation, the packet will be dropped by the NIC. However, if the host is sniffing, the packet will reach the network stack of the operating system. If the network stack of the target system checks the destination MAC only partially, which is the case for a couple of the OSes for which the tool has built-in support, such a crafted ARP packet will receive a response.
Compared to the existing mechanisms and tools that are presented above, it must be emphasized that the novel approach, which we propose in this article, is also measurement-based and relies on inflicting an artificial load on the investigated machine and monitoring the changes in its responses. However, the experimental evaluation of the existing approaches showed that they are no longer successful (see Section III). Therefore, to detect sniffing we propose a different technique that is based on the macof tool, which temporarily floods the targeted device with TCP packets containing randomized MAC addresses, while ping is used to measure the RTT and the curl tool to measure the download data rate in the corresponding time periods. Then, based on the responses received during the 'flooding' periods as well as during the 'idle' ones, our solution can determine with high probability whether the sniffer is active or not.

III. EVALUATION OF FEASIBILITY OF EXISTING TOOLS & METHODS FOR SNIFFER DETECTION
In this section we verify the existing tools and methods for sniffing detection described in subsection II-B. We want to evaluate if they are still applicable and effective in today's networks.

A. EVALUATION OF EXISTING ANTISNIFFING TOOLS
Promqry is able to detect a promiscuous mode in Windows (and only for this OS), however, to be successful it requires an active participation from the sniffing host (due to the RPC usage), and thus we consider such a behavior as out of scope for this article as we do not assume that the defender has any control over the inspected host.
Despite the fact that the last version of the Sniffdet tool is from 2006, we were able to compile it on a Supermicro X10DRi machine with 2 x Xeon E5-2637 v4 @ 3.50GHz, 64GB RAM and Fedora 29 Linux installed. However, only three (ICMP-based, ARP-based, and latency-based) out of the four detection methods could be tested, as we experienced an application crash while executing the DNS-based test. During our experiments with this tool two physical servers were used, one with Fedora 29 Linux acting as a testing host and the other with Windows 8.1 Enterprise N x64 acting as a sniffing host. They were directly connected using an Ethernet cable. Sniffdet was executed on the Linux host with all three working tests against the Windows server, with and without the Wireshark sniffer running. As stated before, the DNS-based test causes the application to crash without any useful output. Both the ICMP and latency tests were not able to properly detect the usage of Wireshark on the tested machine: Sniffdet reported negative results every time it was executed. Finally, the ARP-based test successfully reported the sniffer existence on the examined host when Wireshark was indeed running. On the other hand, when sniffing was disabled, Sniffdet froze for many hours without any useful output. Due to this fact, the tool could not be examined properly.
L0pht AntiSniff v1.02.1 could only be installed on Windows XP due to its installer being a 16-bit application. Unfortunately, after a successful installation on Windows XP, the application itself failed to recognize the network adapter present in the system, which prevented us from testing its effectiveness.
Nmap sniffer-detect NSE was tested in three scenarios:
1) A physical machine probing another physical machine (our server probing Raspberry Pi gen.1, Raspberry Pi gen.3 and the Tower PC), directly connected via an Ethernet cable,
2) A physical machine probing a virtual machine (our server probing its own guest VMs) and
3) A virtual machine probing a virtual machine (guest VMs on our server probing each other).
In the first scenario, nmap reported a negative result for all three probed machines, regardless of whether the sniffer was active or not. In the second and third scenarios we determined that the result depends on the type of the virtual network card attached to the virtual machine. If the virtio card was installed, the tool detected sniffing regardless of whether it was actually ongoing. On the other hand, if the e1000 card was installed the sniffing was correctly detected, but the application crashed when the machine was not eavesdropping.
To summarize, our experiments have shown that publicly available tools and described methods are no longer useful. It is not possible to use them to clearly determine whether a sniffer is running on a given host or not.

B. EXISTING DETECTION METHODS
In the following subsections we describe different possibilities for sniffing detection.

1) DNS-BASED APPROACH
As previously mentioned, sniffers, in their default configuration, try to reverse-lookup the encountered IP addresses in the DNS records for the convenience of the user. It is possible to exploit this behaviour by providing the host suspected of sniffing with packets destined to a fake IP address. If the machine is running a sniffer, it will try to resolve the fake IP via reverse-DNS requests, which can be monitored by a network administrator.
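As a minimal sketch of the monitoring side of this idea, the snippet below derives the reverse-lookup name that a sniffer would query for a bait address and matches it against observed DNS queries. The helper names and the `(source_host, queried_name)` input format are assumptions made for illustration; how the queries are captured is outside the scope of the sketch.

```python
import ipaddress


def expected_ptr_name(fake_ip: str) -> str:
    """Name that a reverse-DNS lookup of fake_ip would ask for."""
    return ipaddress.ip_address(fake_ip).reverse_pointer


def sniffing_suspects(observed_queries, fake_ips):
    """Return the hosts that issued a reverse lookup for any bait address.

    observed_queries: iterable of (source_host, queried_name) pairs,
    a hypothetical format for DNS queries seen by the administrator.
    """
    baits = {expected_ptr_name(ip) for ip in fake_ips}
    return sorted(
        {src for src, qname in observed_queries
         if qname.rstrip(".").lower() in baits}
    )
```

A host appearing in the returned list has looked up an address that exists only in the forged packets, which suggests that it processed traffic not addressed to it.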

2) FORGED MAC ADDRESSES APPROACH
During normal operation, the firmware of a network interface drops Ethernet frames that are irrelevant to the host. This means that frames with the destination MAC address not matching exactly the adapter's own or not being the special broadcast address, are removed to avoid their unnecessary processing in the network stack of the operating system. If this behaviour is disabled, which is done by switching the network card to the promiscuous mode, then all transmitted packets reach the operating system and are available and processed by the sniffing program.
The forged MAC address method [10] relies on the OS's network stack characteristic feature, i.e., lack of additional MAC address verification, which is a sensible optimization, as in a normal scenario this would be a double-check and a waste of the kernel's processing time.
Sending an IP packet, e.g., opening a TCP connection, UDP datagram or ICMP echo request with an invalid destination MAC address, but correct destination IP to a benign host should result in no action, because the packet will be dropped by the adapter. On the other hand, if encountered by a sniffing host then a response will be generated, because in that case the MAC address is no longer checked by the NIC (running in the promiscuous mode) and the higher layer is correct (IP address matches the host's).
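The probe described above can be sketched by assembling the frame bytes by hand. This is an illustrative construction only, not the implementation used in [10] or in our test-bed: the function names, the payload and all addresses are assumptions, and actually transmitting the frame (e.g., via a raw socket) is deliberately left out.

```python
import struct


def inet_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071) over data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries
    return ~total & 0xFFFF


def mac_bytes(mac: str) -> bytes:
    return bytes(int(part, 16) for part in mac.split(":"))


def ip_bytes(ip: str) -> bytes:
    return bytes(int(part) for part in ip.split("."))


def build_probe(forged_dst_mac, src_mac, src_ip, dst_ip, ident=1, seq=1):
    """Ethernet + IPv4 + ICMP echo request with a forged destination MAC.

    The destination IP is the real address of the examined host; only
    the destination MAC is wrong, as in the forged MAC address method.
    """
    payload = b"probe"  # arbitrary payload chosen for this sketch
    icmp = struct.pack("!BBHHH", 8, 0, 0, ident, seq) + payload
    icmp = struct.pack("!BBHHH", 8, 0, inet_checksum(icmp), ident, seq) + payload
    ip_hdr = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(icmp),  # version/IHL, TOS, total length
        0, 0,                     # identification, flags/fragment offset
        64, 1, 0,                 # TTL, protocol = ICMP, checksum placeholder
        ip_bytes(src_ip), ip_bytes(dst_ip),
    )
    ip_hdr = ip_hdr[:10] + struct.pack("!H", inet_checksum(ip_hdr)) + ip_hdr[12:]
    eth = mac_bytes(forged_dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x0800)
    return eth + ip_hdr + icmp
```

An echo reply to such a frame would indicate that the frame passed the NIC filter, i.e., that the interface is likely in the promiscuous mode.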

3) FEASIBILITY OF DNS-BASED AND FORGED MAC ADDRESS-BASED SOLUTIONS
The first two detection scenarios were realized in our virtualized SDN environment as illustrated in Figure 1. We utilized Open vSwitch version 2.10.1 and Ryu controller version 4.30. We developed a custom Ryu module that instructs the Open vSwitch to forge packets and send them to the sniffing host.
During the DNS-based experiments we sent TCP segments to the sniffing host with:
• randomized source and destination MAC addresses,
• randomized source and destination IP addresses,
• randomized source and destination TCP ports,
• randomized SEQ and ACK numbers and
• a payload of two null bytes.
It must be noted that a MAC address cannot be fully random, as, for instance, multicast frames can be ignored by the network equipment (switches) or NICs. Such a frame would never reach the sniffing host's TCP/IP stack, thus a false-negative situation can occur.
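One simple way to respect the constraint on random MAC addresses is to clear the group bit of a randomly drawn address, as in the sketch below. The helper name `random_unicast_mac` is hypothetical; this is not the code of our Ryu module.

```python
import random


def random_unicast_mac(rng=random):
    """Random MAC address that is guaranteed not to be multicast/broadcast."""
    octets = [rng.randrange(256) for _ in range(6)]
    # Clear the least significant bit of the first octet (the group bit),
    # so that the frame is treated as unicast by switches and NICs.
    octets[0] &= 0xFE
    return ":".join(f"{octet:02x}" for octet in octets)
```

Passing a seeded `random.Random` instance as `rng` makes the generated addresses reproducible between experiment runs.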
Using this approach we were able to detect reverse DNS requests on the SDN switch coming from the examined machine when the TcpDump (with the default configuration) was running, which matched the randomized IP addresses we provided in the forged packets.
During the MAC-based experiment we sent a correct ICMP echo request to the sniffing machine, but with a randomized destination MAC address.
However, it must be noted that we were unable to replicate the behaviour previously reported in [10], i.e., a response when the machine is sniffing, on the operating systems that we tested.
It is also worth noting that, due to the nature of the two detection methods presented above, i.e., that each mechanism is either successful in sniffing detection or not and practically no parameter tuning is possible, we verified only whether they are still feasible (i.e., whether the technique can still be used successfully). This nature also makes it difficult to provide any numerical results.
To summarize, the DNS-based approach can still be useful in sniffing detection, however, the MAC-based approach is not applicable any more. That is why we incorporated the DNS-based detection method within the developed IoRL security framework.

IV. PROPOSED ARTIFICIAL LOAD-BASED SNIFFING DETECTION METHOD
As previously discussed in Section I, the main motivation for developing countermeasures against sniffing attacks is to enrich the Integrated Security Framework of the IoRL system. That is the main reason why we developed and evaluated the proposed defensive mechanism within the SDN-based environment. The proposed detection method targets the sniffing application directly. The general concept of this approach is illustrated in Figure 2. The suspected machine is first selected according to the adopted security policy, e.g., every host is periodically investigated for a certain amount of time. Then, the investigated machine is continuously probed using, for instance, a ping tool or another tool that allows the measurement and collection of the resulting response times. Periodically, we execute the macof utility to flood the suspected machine with a large number of packets with randomized MAC addresses to inflict an artificial load.
If the machine is not sniffing it is expected that there would be no differences between the response times with and without macof as the flooding packets will be simply discarded at the hardware level, and thus not interfere with the operating system.
However, if the machine is sniffing then each of the flooded packets will be recorded, analyzed or displayed to the user. If the volume of the traffic is high enough, then the packet processing in the sniffing program should engage a significant fraction of the processing power, thus, there could be a notable difference in the response times between the periods with and without the flooding with macof.
To discover these fluctuations in the response times we employ ML algorithms. Thus, based on the collected results, machine learning techniques are utilized to decide whether the host is running a sniffer or not.
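The intuition behind comparing the two kinds of periods can be sketched as follows. This is only an illustration under stated assumptions: the feature set is hypothetical, and the fixed threshold rule is a plain stand-in for a trained ML model, not the classifier used in this work.

```python
from statistics import mean, median, pstdev


def rtt_features(idle_rtts, flood_rtts):
    """Summarize how the response times change under flooding.

    Both arguments are lists of RTTs (in ms) collected during the
    'idle' and 'flooding' periods, respectively.
    """
    return {
        "mean_ratio": mean(flood_rtts) / mean(idle_rtts),
        "median_ratio": median(flood_rtts) / median(idle_rtts),
        "flood_stdev": pstdev(flood_rtts),
    }


def looks_like_sniffer(features, ratio_threshold=1.5):
    """Stand-in decision rule; the threshold value is an arbitrary assumption."""
    return features["median_ratio"] >= ratio_threshold
```

In practice such feature vectors, labelled as sniffing or not sniffing, would be fed to a learning algorithm instead of being compared against a hand-picked threshold.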
In the remainder of this section we present the developed experimental test-bed and methodology for the proposed sniffer detection method.

A. EXPERIMENTAL TEST-BED
In the rest of this subsection we describe the details of the test-bed. For this approach we temporarily set aside the SDN and controller-generated packets due to the low maximum throughput of that solution (in our tests, around 1,000 packets per second). However, it must be noted that SDN can still be utilized as an efficient means to block all the incoming and outgoing traffic of the host suspected of running the sniffer. Moreover, SDN is a core component of our security framework as well as of the IoRL network itself. In such a case, using the SDN-based solution for blocking sniffing attempts and other nefarious actions seems to be a natural choice.
The utilized test-bed includes two setups:
• two hosts for running the measurement-based experiment,
• three hosts for training AI software and testing obtained results.
As our solution is based on a previously described measurement-based approach, two hosts were connected directly via an Ethernet cable without an Ethernet switch between them. The reason for such a setup is that a switch would introduce additional delays, which can change during long-running experiments. The measured values of, for example, the ping RTT are very small (less than one millisecond), thus for the measurement-based approach even a small deviation can be of significance. The resulting test-bed is presented in Figure 3.
Moreover, the list of utilized hardware within the test-bed is as follows:
• Probing host MSI GL72 with Intel i7-6700HQ 4-core, 8 […]
[…] It is also considered one of the best AI platforms on the market. Additionally, the AC922 has been a building block of Sierra, ranked second on the TOP500 supercomputer list, for the last 1.5 years as well.
During the experiments we also utilized the following software:
• tcpdump-4.9.2: a command-line sniffer,
• curl-7.29: a command-line tool for transferring data using various protocols, HTTP among others,
• apache-2.4.6: an efficient web server,
• a modified version of the macof tool (which is a part of the Dsniff suite).
We modified the source code of macof to control the number of generated packets per second.

B. EXPERIMENTAL METHODOLOGY
Compared to the experimental methodology that we used in our previous work [5], we not only utilize the ping-based RTT for measurements but also the data rate of a file transfer via the HTTP protocol. Moreover, we use a Linux operating system on the sniffing host instead of a Windows OS, as in [5] the used method turned out to be unsuccessful for Linux-based hosts. Additionally, we are using variable 'idle' and 'flooding' periods of 5, 10, 15 and 30s. Another novel aspect is that we are not only testing the behavior of the sniffing host running on battery and AC power, but also under heavy CPU consumption and in idle states, which better reflects how a real-life sniffing host functions (as sniffing might not be the only activity running on it).
Two connected hosts: probing (MSI) and sniffing (HP) were used during the first phase of the experiment, i.e., measurement-based. The probing machine was responsible for both generating the flood of packets using macof and establishing the baseline for the measured responses.
To demonstrate that not only the ping RTT (as suggested by the work reviewed in the related work section) but also application layer protocols can be used as a measurement metric, HTTP was used as an example.
Therefore, we assume that the sniffing host is serving any service over the network, HTTP for example, and there is access to such a service via a firewall. This might be an unusual situation, but such an approach serves as an example of using application layer protocols instead of ICMP (ping). Obviously, the utilization of other protocols is feasible as well, however, further studies in this aspect are needed. In [5], ping-based RTT was utilized to establish the baseline and this solution was tested here as well, but for the Linux OS as a sniffing host.
As in [5], we determined that in our experimental environment the probing machine using macof could handle a maximum packet flooding of around 10,000 packets per second. This is why we used this setting for conducting all our experiments.
Taking the above into account, the following experiments were performed:
• For the whole duration of the experiment, we either executed ping with the lowest possible interval (0.01s) targeted at the suspicious host and recorded the corresponding RTTs, or utilized the curl tool to download a 1MB file from the suspicious host and recorded the download data rate (in B/s),
• Then for the next 10, 20, 30 or 60s we activated the modified version of the macof tool; however, for the first half of this period (i.e., 5, 10, 15 or 30s) it is idle (i.e., no packets are sent; in the rest of the paper we call it an 'idle' period) and for the rest of the time (i.e., 5, 10, 15 or 30s) it generates 10,000 packets per second ('flooding' period). In both cases we noted the corresponding ping RTTs or curl download data rate,
• In half of the cases for both of these periods the sniffer was activated and in the remaining time it stayed idle (sniffing/no sniffing). This allowed us to compare the delays with the artificially inflicted load for the case of a normal user machine and for a device with the NIC set to the promiscuous mode and
• In terms of the sniffing host, another two factors were taken into consideration: working on battery/AC power and CPU intensive load/idle. The load was inflicted by computing the MD5 checksum of the /dev/zero pseudo-device. As the sniffing host (HP) had four threads, four such jobs were executed, which caused 100% CPU utilization.
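The alternation of 'idle' and 'flooding' periods described above can be sketched as a small labelling helper. This is only an illustrative sketch: it assumes that each measurement carries a timestamp relative to the start of the experiment and that every cycle begins with the 'idle' half, and the names flag_flood and label_samples are hypothetical, not taken from the original tooling.

```python
def flag_flood(elapsed_s: float, period_s: float) -> int:
    """Return 0 for the 'idle' half of a macof cycle and 1 for the 'flooding' half.

    Assumption: a cycle lasts 2 * period_s and starts with the 'idle' half
    at elapsed_s == 0.
    """
    if elapsed_s < 0 or period_s <= 0:
        raise ValueError("elapsed_s must be >= 0 and period_s must be > 0")
    cycle_s = 2 * period_s
    return 0 if (elapsed_s % cycle_s) < period_s else 1


def label_samples(samples, period_s):
    """Attach the flood flag to (elapsed_s, value) measurement pairs."""
    return [(flag_flood(t, period_s), value) for t, value in samples]


if __name__ == "__main__":
    # Hypothetical ping RTT samples (seconds since start, RTT in ms).
    demo = [(0.0, 0.41), (29.99, 0.40), (31.0, 0.39), (60.0, 0.42)]
    print(label_samples(demo, period_s=30))
```

With such a helper, every recorded RTT or download data rate can be matched to the period in which it was measured, regardless of the chosen period length.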
Experiments were executed for all considered configurations, therefore, in total 64 experiments were executed:
• 2x for curl/ping,
• 2x for battery/AC power,
• 2x for CPU load/idle states,
• 2x for sniffing/no sniffing activities and
• 4x for 5, 10, 15, 30s 'idle'/'flooding' periods.
In every experiment 100 'idle' and 100 'flooding' periods (5, 10, 15, 30s) were recorded. In total, more than 53 hours of network traffic was recorded across all experiments. The longest experimental group, using the 30s period, was recorded for more than 26 hours (one experiment took 1h 40min, times 16 for four binary parameters).
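The experiment count and the recording times quoted above follow directly from the configuration space. The short sketch below reproduces this arithmetic; the variable and function names are chosen here for illustration only.

```python
from itertools import product

PERIODS_S = (5, 10, 15, 30)   # length of one 'idle' or 'flooding' period
PERIODS_PER_KIND = 100        # 100 'idle' and 100 'flooding' periods per experiment

# Every combination of the four binary factors and the four period lengths.
configs = list(product(
    ("curl", "ping"),
    ("battery", "ac"),
    ("cpu_load", "cpu_idle"),
    ("sniffing", "no_sniffing"),
    PERIODS_S,
))


def experiment_seconds(period_s: int) -> int:
    """Duration of one experiment: 100 'idle' plus 100 'flooding' periods."""
    return 2 * PERIODS_PER_KIND * period_s


def group_hours(period_s: int) -> float:
    """Total recording time of all experiments that share one period length."""
    runs = sum(1 for cfg in configs if cfg[-1] == period_s)
    return runs * experiment_seconds(period_s) / 3600


total_hours = sum(group_hours(p) for p in PERIODS_S)

if __name__ == "__main__":
    print(len(configs), "experiments")
    print(f"30s group: {group_hours(30):.2f} h, all groups: {total_hours:.2f} h")
```

The 30s group accounts for 16 experiments of 6000 s each, which is consistent with the "more than 26 hours" stated above, and the four groups together give the "more than 53 hours".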
When the ping-based RTT and curl download data rate were finally recorded, they were used in the last phase of the experiment: sniffing detection using ML techniques. For this purpose we utilized the Driverless AI (dai) software, an AI framework that provides many useful features and supports GPU processing. One of its most important features is the auto-tuning of the ML algorithms, which allows us to focus on the main problem (sniffing detection) rather than on tuning ML parameters.
It should be noted that our solution assumes that we do not have access to the operating system of the host suspected of running a sniffer. In the IoRL project, the network is open, dynamic and available to all users. Thus, in such a scenario, some of them may perform malicious activity (like sniffing). In closed environments, dedicated software, like HIDS or antivirus software, can be pre-installed to prevent or detect the usage of such forbidden software. In fact, many simple tools (like ps in Linux) can be easily utilized to check whether a sniffer is running or not. However, note that in our case we have to treat every device as a black box. Therefore, we assume that we are not able to examine from the host-level perspective whether the sniffing software is running on the machine.
Obviously, differences in the response times of the probed host (i.e., expressed as ping RTT or curl download data rate) can be a result of the OS load caused by running an application not necessarily related to the sniffing activity. We addressed that issue by adding the two factors that we mentioned before, i.e., running on battery/AC power and introducing an additional CPU load. This simulates different behaviours of the suspected host.
The next section presents the obtained results for the proposed method.

V. EXPERIMENTAL RESULTS
In this section we present and describe the results of our experiments. Firstly, we will explain whether running a sniffer on the host will result in differences in its response to the measurements, which will justify using this method. Next, we will describe the obtained experimental results of the proposed detection solution, empowered by ML, using raw data and statistical representations of gathered datasets.

A. RATIONALE BEHIND PROPOSED METHOD
First, we want to establish whether there is indeed no significant difference in the ping response times and in the curl download data rates with and without macof flooding when the sniffer is not active. The obtained experimental results, presented in Figures 4a, 5a, 6a and 7a, confirm this.
Surprisingly, when the sniffer is active, the ping response time can either increase or decrease during the macof flooding, depending on the type of the experiment (see Figures 4b and 5b).
Finally, if we apply an artificial CPU load, the ping response time is noticeably higher, when comparing 'idle' and 'flooding' periods, which is an unexpected result (Figure 5b). This can be explained by intelligent overclocking technology implemented on modern Intel CPUs. On the other hand, in a different experiment (running on battery, CPU idle), the ping response time is decreased, which confirms our initial assumptions (Figure 4b).
When the sniffer is active, the inspected host is running on battery and its CPU is idle, the resulting curl download data rate surprisingly increases (see Figure 6b). In the other performed experiment, where the host was on AC power and under CPU load, the result is ambiguous: there is no noticeable difference between the 'idle' and the 'flooding' periods with and without an active sniffer (see Figures 7a and 7b).
Different results from the ping-based and curl-based experiments can be explained by the fact that the ping is handled in the Linux kernel, which has higher priority than user-space applications, like an HTTP server.
Note also that in our previous work [5], for each 'idle' and 'flooding' period, the mean, the median and the standard deviation of the obtained ping-based RTT were calculated. Then based on the obtained values three markers were determined using equation: where x is the mean, the median or the standard deviation. It turned out that using such metrics worked well for detecting the sniffer on the Windows OS. However, in the case of the Linux OS the method did not yield satisfactory results. That is why in this article we decided to use ML techniques to solve this problem.

B. ML-BASED APPROACH USING RAW DATA FROM MEASUREMENTS
First, we want to evaluate how the proposed ML-based detection method works on the raw data coming directly from the measurements. To verify this, the results of the experiments described in subsection IV-B were saved in CSV files. These files were then used as an input for the dai software. During the first run of the experiments, we decided to use the recorded raw data (instead of the mean, median, and standard deviation of each 'idle' and 'flooding' period). The CSV file contained the data described below:
• FlagCPU (Boolean): '0' means that the sniffing host was idle with only the sniffer running, while '1' denotes that additional CPU greedy processes were running,
• FlagAC (Boolean): '0' means that the sniffing host was running on the battery and '1' indicates that the sniffing host was connected to the AC power supply,
• FlagFlood (Boolean): '0' means that the macof tool was operating in the 'idle' mode while '1' denotes that it was operating in the 'flooding' mode,
• Value (Real number): the measured ping RTT [ms] or curl download data rate [B/s] and
• FlagSniff (Boolean): '1' means that the sniffer was running on the sniffing host, while '0' indicates that the sniffer was not activated. This was the parameter we wanted to determine using the ML-based approach.
Table 1 presents exemplary rows of such a file for the ping-based RTT experiment.
It should be noted that during real-world sniffing detection, the probing host is unaware of whether the sniffing host is running on battery or AC power and, moreover, whether it is under a heavy CPU load or not. That is why all the data from these cases were merged into one file. The only fact known by the probing host was whether the macof tool was operating in the 'idle' or the 'flooding' period, thus the final CSV used as an input for the dai software contained the following columns: FlagFlood, Value, FlagSniff.
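The merging step described above, in which the factors unknown to the probing host are dropped, can be sketched as follows. This is a minimal illustration rather than the original processing script: it assumes that each per-configuration CSV file has a header row with the five column names listed above, and the function name merge_and_project as well as the sample values are hypothetical.

```python
import csv
import io

# Columns visible to the probing host, plus the label to be predicted.
KEPT_COLUMNS = ("FlagFlood", "Value", "FlagSniff")


def merge_and_project(csv_texts):
    """Merge several per-configuration CSV files and keep only KEPT_COLUMNS.

    FlagCPU and FlagAC are dropped because the probing host cannot know them.
    """
    merged = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            merged.append({name: row[name] for name in KEPT_COLUMNS})
    return merged


if __name__ == "__main__":
    # Made-up example rows, not taken from Table 1.
    battery_idle = "FlagCPU,FlagAC,FlagFlood,Value,FlagSniff\n0,0,0,0.41,1\n0,0,1,0.38,1\n"
    ac_loaded = "FlagCPU,FlagAC,FlagFlood,Value,FlagSniff\n1,1,0,0.44,0\n"
    for row in merge_and_project([battery_idle, ac_loaded]):
        print(row)
```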
Experiments performed using ping and curl tools were executed and processed separately. There are two main reasons behind such a procedure. First of all, the units for ping and curl are not the same (RTT is in [ms] while download data rate in [B/s]). The second reason is their order of magnitude. An average value for all experiments for ping-based RTT was 0.41 ms and for the curl download data rate it was 78818100 B/s. Additionally, the experiments for different period lengths (5, 10, 15, 30s) were executed and processed separately as finding the most reasonable value was one of the aims of this research. Last but not least, the ML experiments were executed on three different hosts, i.e., x86, P8, and P9.
The CSV file with the recorded values contained many entries (one row per ping or curl download). The ping traffic directed to the suspicious host was generated at a high rate (100 ICMP packets/s), and downloading a 1 MB file over a 1 Gbit/s network gives a theoretical data rate of 125 MB/s. Regarding the curl tool, there were some delays between consecutive downloads of the file in the infinite loop (for every download, the current time had to be recorded to match the 'idle' or 'flooding' period), thus the theoretical data rate was not achievable. As a result, the file was downloaded at a rate of about 65 downloads/s (thus 65 MB/s, as a 1 MB file was downloaded). For the 30s period, the file with all the factors combined (CPU, battery, flooding, sniffer) contained more than 4 million entries for the ping-based experiments and more than 2.4 million entries for the curl-based ones.
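The rates quoted above can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 10^6 B) and reads "all the factors combined" as the eight experiments (2 CPU states × 2 power states × 2 sniffer states) that share one period length; the row count it produces is therefore only an upper bound for the ping-based file, not a reported value.

```python
LINK_BITS_PER_S = 1_000_000_000   # 1 Gbit/s link
FILE_BYTES = 1_000_000            # 1 MB file, decimal units assumed
PING_PER_S = 100                  # ping interval of 0.01 s

# Theoretical ceiling of the link in bytes per second and in downloads per second.
link_bytes_per_s = LINK_BITS_PER_S // 8
max_downloads_per_s = link_bytes_per_s // FILE_BYTES

# Rate implied by the observed 65 downloads per second.
observed_bytes_per_s = 65 * FILE_BYTES


def ping_rows_upper_bound(period_s: int, experiments: int = 8,
                          periods_per_kind: int = 100) -> int:
    """Upper bound on the raw ping rows for one period length (assumed reading)."""
    seconds_per_experiment = 2 * periods_per_kind * period_s
    return PING_PER_S * seconds_per_experiment * experiments


if __name__ == "__main__":
    print(link_bytes_per_s, max_downloads_per_s, observed_bytes_per_s)
    print(ping_rows_upper_bound(30))
```

Under these assumptions the bound for the 30s period is 4.8 million rows, which is in line with the "more than 4 million" ping entries mentioned above.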
Moreover, the input files were divided into training and test parts using the 75:25 ratio. As the dai performs internal validation, further dividing the test dataset into test and validation datasets was unnecessary. On the other hand, external cross validation was performed by additionally shuffling the datasets four times and dividing them again using the 75:25 ratio. Therefore, a total of 40 experiments were executed for each of the three hardware platforms (120 in total):
• 2x for ping/curl,
• 4x for 5, 10, 15, 30s 'idle'/'flooding' periods and
• 5x for cross validation.
Again the results are unsatisfactory. The best Area Under Curve (AUC) for the Receiver Operating Characteristic (ROC) curve for such an experiment was 0.765, as presented in Figure 8. Moreover, Table 2 presents the test confusion matrix for this experiment, and it should be noted that the obtained error rate is too high.
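The repeated 75:25 division used for the external cross validation can be sketched as below. This is an illustrative sketch only: it assumes that each of the five repetitions shuffles the rows before cutting them, and the seeds and function names are hypothetical choices made here for reproducibility.

```python
import random


def split_75_25(rows, seed):
    """Shuffle a copy of the rows and cut it into 75% training and 25% test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = (len(shuffled) * 3) // 4
    return shuffled[:cut], shuffled[cut:]


def repeated_splits(rows, repeats=5):
    """Produce `repeats` independent train/test divisions of the same dataset."""
    return [split_75_25(rows, seed) for seed in range(repeats)]


if __name__ == "__main__":
    dataset = list(range(1000))  # stand-in for the CSV rows
    for train, test in repeated_splits(dataset):
        print(len(train), len(test))
```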

C. ML-BASED APPROACH USING MEAN, MEDIAN AND STANDARD DEVIATION REPRESENTATION OF MEASUREMENTS
Next, to improve the results presented in the previous subsection we decided to use statistical values as proposed in our previous work [5]. For each 'idle' and 'flooding' period, the mean, the median and the standard deviation of the obtained ping-based RTT value and the curl download data rate were calculated. As a result, the input file for the ML-based experiments was orders of magnitude smaller: instead of millions of entries there were only 1600 entries (200 'idle'/'flooding' periods × 2 CPU load/idle × 2 battery/AC × 2 sniffer/no sniffer). This file was split into training and testing datasets (again using the 75:25 ratio) five times for cross validation.
Each CSV file used for training and testing contained the following data:
• FlagFlood (Boolean): '0' indicates macof was in the 'idle' period, not generating packets; '1' means macof was flooding the suspected host with artificial packets,
• Mean (real number): mean ping RTT or curl download data rate for each period,
• Median (real number): median ping RTT or curl download data rate for each period,
• Standard Deviation (real number): standard deviation of the ping-based RTT or the curl download data rate for each period and
• FlagSniff (Boolean): '1' means that the sniffer was running on the suspected host; '0' means the sniffer was not running on it. This was the result we were expecting to obtain to determine whether the host is malicious or not.
Providing these statistical measures for each period helped the dai to train a model that can reliably predict the FlagSniff, i.e., to discover whether the sniffer is active or not.
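The reduction of the raw measurements of one period to a single row can be sketched as follows. The sketch assumes the sample standard deviation (the text does not state whether the sample or the population variant was used), and the function name and the example values are hypothetical.

```python
from statistics import mean, median, stdev


def summarize_period(flag_flood, values, flag_sniff):
    """Collapse all measurements of one 'idle' or 'flooding' period into one row.

    Assumption: the sample standard deviation (statistics.stdev) is used,
    so at least two measurements per period are required.
    """
    return {
        "FlagFlood": flag_flood,
        "Mean": mean(values),
        "Median": median(values),
        "StandardDeviation": stdev(values),
        "FlagSniff": flag_sniff,
    }


# 200 periods per experiment times the three binary host factors.
ROWS_PER_DATASET = 200 * 2 * 2 * 2

if __name__ == "__main__":
    # Made-up RTT values in ms for a single 'flooding' period.
    print(summarize_period(1, [0.40, 0.42, 0.44], 1))
    print(ROWS_PER_DATASET)
```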

1) TIME-RELATED FACTORS
During the sniffing detection, the amount of time that can be spent on this activity should be considered. The sniffer can be running for a long time, but also only for short periods (as the attacker may be aware of the detection process and activate the sniffer only for a short time). It must be noted that our solution contains three time-related factors:
• T1, creating the input dataset: in our solution it depends on the length of the 'idle'/'flooding' periods,
• T2, ML training: depends on the processing power of the utilized hardware and
• T3, sniffing detection using the trained model (negligible compared to the previous components).
In our experiment we use 100 'idle' and 100 'flooding' periods. This means that during the first (measurement-based) phase, the duration of one experiment (T1) was:
• 1h 40min for the 30s period,
• 50min for the 15s period,
• ∼33.33min for the 10s period and
• ∼16.67min for the 5s period.
ML model training has to be periodically re-executed: the model has to be re-trained or at least re-tested, i.e., fed with artificial input data that should give the expected results, and re-trained if it does not. This takes time and, as mentioned, it depends on the performance of the hardware. It must be noted that it is not required to have enterprise-class hardware to perform the training phase. Commercial cloud providers, like IBM or Google, provide special platforms and products for AI applications. On the other hand, a trained model that gives good results can be used without any concerns. This means that the training time (T2) need not be counted as a fixed cost of every sniffing detection using the proposed solution.
Finally, sniffing detection using the trained ML model does not require high processing power and, more importantly, it does not require a GPU. It can be executed on a low-end computer (e.g., a Raspberry Pi), and the results will still be obtained very quickly, within a few seconds (T3), which is negligible when compared to, for instance, the 1h 40min for the 30s period (T1).

2) DETECTION RESULTS
Below we describe the exemplary results for the representative dataset with the overall highest ROC curve AUC (thus P9 as the AI training host and the 30s period) for both ping RTT and curl download data rates. In both cases the dai software searched for the optimal parameters of the XGBoost and LightGBM ML models by training these models with different parameters. However, in the end, LightGBM was not selected due to its low performance during the model tuning stage. On the other hand, XGBoost seems like a good choice in this case, as many winners of Kaggle competitions utilize it due to its parallelization, distributed computing, out-of-core computing, and cache optimization of the data structures and algorithms [16]. XGBoost is an optimized distributed gradient boosting library, designed to be highly efficient, flexible and portable. It implements ML algorithms under the Gradient Boosting framework. It also provides parallel tree boosting (also known as GBDT and GBM) that solves many data science problems in a fast and accurate way [18].
Figures 9 and 10 present ROC curves for the validation and testing of the ping-based exemplary experiment (P9, 30s). Additionally, Tables 3 and 4 provide confusion matrices for the validation and test phases of the ML technique. It should be noted that the resulting AUC is very high, i.e., 0.999, and thus the error rate is very small.
The obtained results for the curl-based experiment, although still very good, are slightly worse than for the ping-based one. Figures 11 and 12 present ROC curves for the validation and testing of the curl-based experiment (P9, 30s). Additionally, Tables 5 and 6 provide the confusion matrices for the validation and test phases of the ML algorithms. Again the resulting AUC is very high and it is in the range 0.994-0.996.
The detailed performance of the final models for the exemplary ping-based RTT and curl-based experiments is presented in Tables 7 and 8. It should be noted that the ROC curve and the AUC were used to evaluate the proposed detection method. Note that in this case we also utilized other performance indicators, which also yielded very good results. These indicators include [17]:
• AUC (Area Under the Receiver Operating Characteristic Curve): is used to evaluate how well a binary classification model is able to distinguish between true positives and false positives.
• Accuracy: is the number of correct predictions calculated as a ratio of all predictions made.
• AUCPR (Area Under the Precision-Recall Curve): is used to evaluate how well a binary classification model is able to distinguish between precision-recall pairs or points.
• F1 score: provides a measure of how well a binary classifier can classify positive cases (given a threshold value).
• F0.5 score: is the weighted harmonic mean of the precision and recall (given a threshold value). Unlike the F1 score, which gives equal weight to precision and recall, the F0.5 score gives more weight to precision than to recall.
• F2 score: is the weighted harmonic mean of the precision and recall (given a threshold value). Unlike the F1 score, which gives equal weight to precision and recall, the F2 score gives more weight to the recall than to precision.
• GINI (Gini Coefficient): is a well-established method to quantify the inequality among values of a frequency distribution and can be used to measure the quality of a binary classifier.
• LOGLOSS: is a logarithmic loss metric that can be used to evaluate the performance of a binomial or multinomial classifier. Only in this case a lower value is better.
• MACROAUC (Macro Average of Areas Under the Receiver Operating Characteristic Curves): for multiclass classification problems, this score is computed by macro-averaging the ROC curves for each class (one per class).
• MCC (Matthews Correlation Coefficient): the goal of the MCC metric is to represent the confusion matrix of a model as a single number. The MCC metric combines the true positives, false positives, true negatives, and false negatives.
Additionally, in Tables 9 and 10 we present the final model parameters for the exemplary ping-based and curl-based experiments. Note that these parameters were auto-tuned by the dai software. As described previously, for every experiment, the AUC was used as an indicator for the best results. Figure 13 presents the results for all ping-based RTT experiments and Figure 14 illustrates the results for experiments based on the curl download data rate.
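Several of the threshold-dependent indicators listed above (accuracy, the F-scores and the MCC) can be derived directly from the four cells of a confusion matrix, and the Gini coefficient of a binary classifier is related to the AUC by Gini = 2·AUC − 1. The sketch below uses the standard textbook definitions and is not the dai implementation; the function names and the example counts are hypothetical, and it assumes that no denominator is zero.

```python
from math import sqrt


def confusion_metrics(tp, fp, tn, fn):
    """Compute accuracy, F0.5, F1, F2 and MCC from confusion-matrix counts.

    Assumption: every denominator below is non-zero, i.e. both classes occur
    and both are predicted at least once.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)

    def f_beta(beta):
        # Weighted harmonic mean of precision and recall; beta < 1 favours
        # precision, beta > 1 favours recall.
        b2 = beta * beta
        return (1 + b2) * precision * recall / (b2 * precision + recall)

    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f05": f_beta(0.5),
        "f1": f_beta(1.0),
        "f2": f_beta(2.0),
        "mcc": mcc,
    }


def gini_from_auc(auc):
    """Gini coefficient of a binary classifier expressed through its ROC AUC."""
    return 2 * auc - 1


if __name__ == "__main__":
    # Made-up counts, not taken from Tables 3-6.
    print(confusion_metrics(tp=90, fp=10, tn=85, fn=15))
    print(gini_from_auc(0.999))
```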
Based on the obtained results it may be concluded that, apart from the 5s period, which turned out to be too short for the proposed detection method, all other periods (i.e., 10s, 15s and 30s) achieved very good AUC results, higher than 0.98, and the impact of the hardware used was negligible.

D. SUMMARY
As with any ML-based solution, it is not possible to provide a threshold for the amount of input data needed to perform the experiment correctly (i.e., to successfully detect the sniffing activity). The general rule is that the more data is provided, the better and more accurate the results. Note however, that our main aim in this article was to show that the presented method can be successfully utilized for sniffing detection and not to determine how many 'idle'/'flooding' periods are required. Therefore, the obtained results can be summarized as follows:
• It is not important which hardware was used for the ML training. There were some differences between the x86, P8 and P9 platforms, but they are negligible. The reason for these differences is the fact that the ML algorithms are nondeterministic, and for the same input the output can vary,
• The best results are obtained for the longer 'idle'/'flooding' periods (i.e., 30s). This is not surprising as we are using statistical measures, therefore, a larger dataset results in more representative values and
• Experiments using the ping-based RTT provide better results than experiments based on the curl download data rate.
It should be noted that even the worst obtained result, i.e., 0.9641 (x86, curl, 5s), can still be considered very good and is comparable with the best results (around 0.97) obtained in [5]. This also proves that the application layer protocols can be utilized for sniffing detection with the proposed solution as well.

VI. CONCLUSION AND FUTURE WORK
In this article we first revisited the existing sniffing detection solutions and showed that many of them are outdated and thus no longer effective. Motivated by these findings we proposed a novel approach that allows the identification of sniffing hosts, i.e., those that have NICs set to the promiscuous mode. Our approach is measurement-based and uses an artificial traffic load as well as the ping and curl tools for network traffic probing, together with the ML techniques. The obtained experimental results prove that the proposed detection method is very promising as it achieves an accuracy of 99%. In our future work we would like to evaluate the proposed detection method with respect to virtualization environments and incorporate it further into the solution developed within the IoRL project. This would allow full utilization of the SDN potential. Moreover, we would like to transform the solution presented in this article into a usable, practical mechanism that would meet the requirements of current, modern communication networks.