Detecting Internet of Things Bots: A Comparative Study

Since the Mirai botnet attacks in 2016, research into Internet of Things (IoT) botnet malware has increased substantially. IoT botnet-related threats continue to rise, impacting businesses and users. This paper aims to contribute to the problem space by compiling and synthesizing the relevant literature from the last five years, providing an overview of the most recent advances in IoT botnets and their detection and prevention, and laying out the future research directions required to better address this ever-growing threat.


I. INTRODUCTION
As computing has become more miniaturized over time, smaller devices can be attached to networks. This started with industrial control systems, in areas such as electricity generation and distribution, and water treatment and pumping, with physical devices such as pumps or relays being actuated remotely. As prices dropped in the 2010s, small devices started to appear in homes in wealthy countries as convenience devices. These included programmable space heaters, lighting, and air conditioning devices. With the rise of smartphones and always-connected Internet services, these devices, along with remote industrial and scientific instruments, are becoming ubiquitous. And so the Internet of Things (IoT) was born [1].
Malicious software has existed almost since computers were first connected together. A botnet is a network of computing devices hijacked by malware that can be controlled remotely by an attacker, called a 'botmaster'. The botmaster sends commands to the bot network instructing it to perform a number of different tasks, ranging from attacks such as distributed denial of service (DDoS) and disseminating spam to simply spreading and infecting more devices [2].
According to Spamhaus's 2019 Botnet Threat Report [3], which measures the number of botnet command and control servers (abbreviated as C&C or sometimes C2) over time, 17,602 C&C servers were detected in 2019. This is roughly 2.4 times the 7,314 C&Cs reported in 2014. The report for the first quarter of 2021 indicates a 24% increase in just a few months [4] (i.e., compared with the last quarter of 2020). The growing number of botnets is therefore an increasing cyber security concern [5].
The Internet of Things (IoT) presents a unique set of security challenges, as devices are often unmonitored, remotely located, and have limited computing resources. Many IoT devices are based on low-power-consumption chips such as MIPS, ARM or Arduino [6]. Many use limited functionality chips that are purpose-made, such as sensors and actuators.
Attackers have found an opportunity in the heterogeneous IoT landscape, in which many manufacturers create devices with price and time-to-market, rather than security, as their primary objectives [7]. There are several open standards for IoT communication protocols including Zigbee, Constrained Application Protocol (CoAP), and Bluetooth Low Energy [8]. Many manufacturers, however, choose to use proprietary protocols instead, or implement the standards in their own way.
Firmware updates may be unavailable or poorly controlled and communicated, and default credentials are often used, making IoT devices an attractive target for attackers.

A. METHODOLOGY
It was found that the 2016 Mirai malware was a driver for research in this area, and much of the research that was analyzed focuses on preventing a repeat of similar attacks. Therefore, papers from the last five years were collected to expose the current state of research on IoT botnets.
We searched Google Scholar and the EBSCOhost database for the terms "IoT Botnet" and "Internet of Things Botnet". The titles of the resulting list of approximately 1,200 conference and journal papers were reviewed for relevance. Papers published in well-established and top-ranking venues, including Q1 on SJR (SCImago Journal & Country Rank) or A/A* on CORE 2020 (Computing Research and Education), were selected, given their stringent peer review process and traditional impact in the field¹. We initially selected 19 core papers that gave a range of IoT botnet detection methods, which informed the formulation of our research questions: 1) What are the methods for identifying and classifying malware detection in IoT devices? 2) What IoT malware detection methodologies are better suited to detect IoT bots? Drawing on the literature review guidelines given by Snyder [9], we reviewed each paper within the targeted five-year time frame, leading to our selection of around 50 of the most relevant papers. We read and summarized these papers to classify them, and then followed up the reference lists to 'snowball' more relevant literature, which was again categorized and summarized.

B. RELATED WORK
Several surveys on botnets have been conducted, as have several IoT security surveys, but only one could be found that specifically addresses the issue of botnets in IoT devices. Vu et al. [5] provided an extensive survey on botnets, focusing on the incentives for their creation, how they have evolved and current trends. Cozzi et al. [6] traced the evolution and code-sharing of IoT malware and provided valuable insight into the relationships between malware families. Costin & Zaddach analyzed malware in IoT but did not look specifically at botnets in their 2018 BlackHat paper [10]. They conducted an earlier survey in 2014 [11], which focused solely on the security of embedded firmware. Meneghello et al. [8] provided a comprehensive survey of vulnerabilities in IoT devices. Rytel, Felkner & Janiszewski [12] conducted another survey of vulnerabilities but focused on the data sources for vulnerability sharing.
Wazzan et al. [13] wrote the paper most similar to this one, having conducted a literature review on botnets in IoT. However, the research questions, motivations, and scope are different and we find their survey complementary to ours. Wazzan et al. primarily focus on the phases of infection detection, with the majority of papers reviewed focusing on early detection of IoT bots. Several high-quality and influential research publications from the past few years on IoT botnets (e.g., [10], [11], [14], [15] among others) are overlooked. Recent significant IoT botnet threats such as Hajime are only discussed in a few words, the paper has limited focus on mitigation, and it does not establish future research directions systematically. Our contribution differs from [13] in that we compare detection and mitigation methods (e.g., network blocking, software patching) and establish a framework for future research in detecting and mitigating IoT bots. More importantly, we focus on different papers within the same problem space (i.e., IoT botnets instead of botnets broadly) in the form of a targeted literature review focusing on the more stringently reviewed research published at higher-quality conferences and journals.
¹ When looking at future research directions, this requirement was relaxed for a few papers to accommodate inclusion of promising early ideas.

C. CONTRIBUTIONS
To the best of our knowledge this paper is the first IoT botnet specific survey which compares detection methods and suggests a framework for future research. Our main contributions are:
• A detailed review of IoT botnet attack impacts to identify the problem space.
• [...] research questions, and a framework for future research.
In Section II we explore the history of botnet malware and methods of infection for traditional PC malware versus embedded systems such as IoT devices. Section III details methods of botnet detection in IoT devices, which is split into two subsections. Section III-A looks at the research into host-based detection and Section III-B explores network-based detection. Network detection is further split into remote (over the network) detection and local detection via the router/gateway at the network edge. Section IV explores new research into emerging directions as well as emerging threats. In Section V we discuss a framework for future research and the open questions that have not been answered. In Section VI we provide our final thoughts.

A. TRADITIONAL BOTNETS
The threat of botnets was first recognized in the 1990s, with early botnets such as Eggdrop, SDBot and Sub7 spreading through Trojan horses or email worms to infect hundreds of personal computers and their associated networks. These botnets were then used to send spam or conduct DDoS attacks against the botmaster's targets. Botnets of this era often used Internet Relay Chat (IRC) servers as a central point for the botmaster to control the bots [20].
As malware evolved so did botnets, with peer-to-peer (P2P) botnets being developed in the mid-2000s to avoid the single point of failure of a central C&C server. According to Cimpanu [21], the 2010s were the decade when botnets started becoming monetized, first as 'DDoS for hire' and more recently as cryptocurrency-mining botnets. Sophos Labs [22] disagree with the dating, stating that the use of botnets for pharmacy spam from 2006 was the start of botnet monetization.
Crypto botnets are being used to 'mine' cryptocurrencies such as Bitcoin or Monero [23] to directly add to the bot-  [24] running on embedded devices such as routers, security cameras and set-top boxes; however, these are less effective than higher-powered computing devices. The most recent monetization method employed by botmasters is ransomware, whereby an infected machine's files are encrypted and a ransom demand is displayed before the files are irretrievably removed.
Botnets have an impact in two ways: the device is no longer under the control of its legitimate owner, and it is then used to generate malicious activity on the Internet. Bots will remain quiet and behave as normal until the malware is triggered, which complicates their detection and mitigation. Botnets are a problem for the Internet as a whole, as they are used to send large amounts of spam from unwitting hosts and can also be used to create huge DDoS attacks. Some of the largest botnet attacks were carried out by botnets composed of IoT devices, as we will explore in the following section.

B. IOT ATTACKS
Traditional botnets are able to infect PCs which have high computing resources and the ability to run anti-malware programs. IoT devices, with lower resources, do not have the capacity to run anti-malware software which makes them more vulnerable to infection and less able to mitigate threats. Vulnerabilities that can turn IoT devices into bots include: 1) brute-force password guessing [25], 2) unsecured services [26], 3) leaked/reused passwords, 4) network stack vulnerabilities in unpatched firmware [10], and 5) physical access to the device.
There is no central database of IoT vulnerabilities; however, Rytel, Felkner & Janiszewski [12] have explored the USA's and China's National Vulnerability Databases (NVD and CNVD respectively) and other databases. From these, they extracted features they plan to use in a future IoT vulnerability reporting database².
Mirai and its variants are the most widely studied botnets, due to the scale of the attacks. The first reported attack, on the "Krebs on Security" blog (discussed below), peaked at 623 Gbps using simple TCP port flooding, which knocked the website offline. This was followed by the attack on Dyn, estimated at between 1 and 1.5 Tbps [27], which disrupted domain name resolution for Dyn's customers, including Amazon, GitHub, Netflix and Twitter. The botnet was created by exploiting default credentials such as 'root/root' or 'admin/admin' [14] that were left on the devices when they were installed.
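As a defensive illustration of the default-credential weakness described above, the following minimal sketch flags inventory entries that still use factory defaults. Only the two credential pairs named in the text are listed; the inventory format and the function name `find_default_credentials` are hypothetical and not taken from any cited work.

```python
# Hypothetical audit helper: flags devices that still use factory-default
# credentials. Only the two pairs named in the text are included; a real
# audit would need a vendor-specific list.
DEFAULT_CREDENTIALS = {("root", "root"), ("admin", "admin")}


def find_default_credentials(inventory):
    """Return the names of devices whose (username, password) pair is a known default.

    `inventory` is an assumed format: a list of dicts with the keys
    'name', 'username' and 'password'.
    """
    return [
        device["name"]
        for device in inventory
        if (device["username"], device["password"]) in DEFAULT_CREDENTIALS
    ]


# Made-up inventory used only to show the call.
example_inventory = [
    {"name": "camera-1", "username": "admin", "password": "admin"},
    {"name": "router-1", "username": "admin", "password": "x9!fQ2#k"},
    {"name": "dvr-1", "username": "root", "password": "root"},
]
flagged = find_default_credentials(example_inventory)
```

A check of this kind only covers credentials that appear in the list, so it complements rather than replaces forcing a password change at installation time.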
Antonakakis et al. [14] assembled an army of researchers in 2017 to analyze the Mirai botnet. Mirai peaked at 600,000 infected IoT devices in November 2016, one of the largest botnets at the time. When it was used to attack the Krebs on Security blog the botnet was smaller at around 120,000 hosts, but had grown to its peak by the time of the Dyn attack. It is suspected that attackers were attempting to bring down Sony's PlayStation network, whose name servers are hosted by Dyn. Other targets include Lonestar Cell, a Liberian telecommunications provider and OVH, a French cloud hosting provider. After Mirai's source code was publicly released, variants started appearing [14], attacking new targets using new tactics such as reflection attacks, where DDoS traffic is bounced off a third party.
Gu et al. [28] predicted in 2008 that a peer-to-peer (P2P) IoT botnet was coming, eight years before the emergence of Hajime, the first IoT P2P botnet. Between 2016 and 2019 Hajime was examined in depth by Herwig et al. [17]. Hajime peaked at around 300,000 infected hosts, before returning to a steady state of about 90,000. Herwig et al. believe the worm was created to secure vulnerable devices, as no attacks have ever been observed from Hajime. However, other researchers such as [21] disagree, claiming the botnet may be used for proxying, that is, masking malicious Internet traffic.
Bashlite (also called Gafgyt), the precursor to Mirai, displayed similar characteristics, but was missing Mirai's encrypted traffic and had hard-coded IP addresses for its C&C servers [2]. Brickerbot is another significant botnet from 2017 that was used to create a 'permanent denial of service' attack by exploiting default credentials, then wiping vulnerable devices' storage and network capabilities [18], turning them into 'bricks'. Costin et al. [10] estimate that over 10 million devices could have been affected by this attack.
Persirai [19] was a 2017 Mirai variant that exploited an empty password bug in certain IP cameras that allowed it to gain user passwords in clear text [26]. Other vulnerabilities have been exploited and combined with the released Mirai code-base to create smaller botnets such as Reaper in 2017.
As the number of IoT devices continues to grow, botnets will become larger, and their attacks more devastating. It is therefore vital that ways are found to protect these devices from malware [27]. Table 1 gives an overview of the most
VOLUME X, 2021 Ben Stephens et al.: Detecting Internet of Things Bots

III. METHODS OF DETECTION
There are two main groupings of botnet detection methods: network-based and host-based. Network-based detection methods can be used remotely with any networked device.
Host-based methods require the firmware from a device to be loaded onto a computer and studied either statically (not running) or dynamically (running). An advantage noted by several researchers when comparing botnet-infected IoT devices with general-purpose computing devices is that IoT devices are not multipurpose machines, so they will usually follow only specific patterns of execution and network usage. This leads researchers to explore the two available methods of detection: host-based, discovering malware by examining the device firmware, or network-based, where network traffic is analyzed. Table 2 summarizes the most recent literature focusing on host-based detection. It gives the citation count of each paper as a measure of impact, along with accuracy (the proportion of correctly predicted observations) and precision (the proportion of positive predictions that are correct), where reported.
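To make the two metrics used in the comparison tables concrete, the sketch below computes them from the four standard confusion-matrix counts. The function names and the example counts are illustrative and are not taken from any of the surveyed papers.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all observations (malicious or benign) predicted correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no observations")
    return (tp + tn) / total


def precision(tp, fp):
    """Fraction of positive (malware) predictions that were actually malware."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)


# Made-up counts for a detector evaluated on 200 samples.
example_accuracy = accuracy(tp=90, tn=95, fp=5, fn=10)
example_precision = precision(tp=90, fp=5)
```

A detector can score well on one metric and poorly on the other, which is why the tables report both where the original papers provide them.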

A. HOST-BASED DETECTION
Host-based detection describes the methodology for analysis of code on a device. It can be categorized into two distinct methods. First the static method, where binaries or source code are examined without executing the code. The second method is dynamic analysis, where a sandbox is created and monitored, and the code is executed to observe its effects.
Static analysis is slow but more conservative, and when applied to malware it is much less likely to cause an unexpected consequence such as infection. Dynamic analysis is faster; however, the analyst cannot anticipate all paths of execution and variable values, so some functionality of the malware may be missed [31].
Costin et al. [11] provided a comprehensive survey of embedded systems firmware in 2014. They updated their research in 2018 [10] with a similar survey of malware, specifically on IoT devices. This section will add to their work with novel techniques that have been discussed since then.
Pa et al. [25] proposed and implemented an IoT honeypot, which presented itself to the Internet as varied unprotected IoT devices to capture malware. It emulated telnet services of various devices on the front end, with a back-end connected to a series of virtual environments emulating embedded CPU architectures. During its 81 days of operation, the honeypot had 79,935 download attempts by malware from 180,581 Internet hosts. The researchers manually downloaded 106 samples and analyzed them in their emulation system, identifying 5 families of IoT malware.
Su et al. [29] proposed a technique where IoT firmware binaries (executable low-level software) were converted to gray-scale images, then passed through a shallow (2-layer) convolutional neural network. The neural network classified the image as malware or goodware. Basic firmware-to-image processing can be done on the IoT device, then the image can be passed to a cloud-based classifier. They do, however, warn that this method may be vulnerable to binary obfuscation.
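The binary-to-image step can be sketched as follows, treating each byte as one pixel intensity. The text does not give the image dimensions or padding used by Su et al., so the row width of 64 and the zero padding here are assumptions, and `firmware_to_grayscale` is a hypothetical name.

```python
def firmware_to_grayscale(data: bytes, width: int = 64):
    """Map each byte (0-255) to one pixel intensity and wrap the pixels into rows.

    The final row is zero-padded so that every row has exactly `width` pixels.
    Both the width and the padding value are assumptions for illustration.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# A five-byte input wrapped into rows of four pixels.
example_image = firmware_to_grayscale(b"\x00\x7f\xff\x10\x20", width=4)
```

The resulting list of rows could then be handed to any image classifier; the classifier itself is outside the scope of this sketch.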
Nguyen, Ngo & Le [30] outlined a method of static analysis of firmware source code or binary executables that extracted printable strings and fed them to a convolutional neural network. The neural network, which had been trained on known good and malware samples, put the strings into context and classified them as malware or goodware, whether or not they had been obfuscated. They did this by leveraging a control flow graph, which traverses paths of execution in the sample. As shown in Table 2, they achieved 92% accuracy and 91% precision.
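The string-extraction step can be illustrated with a short sketch that behaves like the Unix `strings` utility. The extraction rules used by Nguyen, Ngo & Le are not described in the text, so the printable-ASCII range and the minimum length of four characters are assumptions.

```python
import re


def printable_strings(data: bytes, min_length: int = 4):
    """Return runs of printable ASCII characters at least `min_length` long."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(b"[\\x20-\\x7e]{" + str(min_length).encode("ascii") + b",}")
    return [match.decode("ascii") for match in pattern.findall(data)]


# Made-up byte string standing in for a firmware sample.
example_strings = printable_strings(b"\x00\x01/bin/busybox\x00ab\x00wget http://\xff")
```

The extracted strings would then serve as input features to a classifier, which is not shown here.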
Zaddach et al. [15] described Avatar, a dynamic approach that relied on a hybrid of hardware to provide the input/output of the system, and software running on an external emulator to dynamically analyse (possibly malicious) firmware.
They created an open-source framework based on QEMU³ with debugging interfaces that could be used to analyze and change device execution. The authors provided three examples of Avatar in action: an analysis of a hard drive's on-board firmware, a vulnerability assessment of a Zigbee device, and manipulation of a mobile phone GSM network stack, demonstrating the versatility of their platform.

B. NETWORK-BASED DETECTION
Several methods have been suggested, modeled and prototyped for network-based detection of IoT malware using network traffic. The advantage of these methods is that the device can stay in place, connected to the network, and continue performing its function. Table 3 provides a summary of the most recent literature with a focus on network-based detection. Citation counts are included as a measure of impact on further research, and accuracy (the proportion of correctly predicted observations) and precision (the proportion of positive predictions that are correct) are included where available.
Emerging research aimed to leverage recent network advances to enhance detection and mitigation of IoT bot threats. For instance Software Defined Networking (SDN) is a recent network paradigm that splits network operations into data and control planes, decoupling the functions from the hardware. It adds separate layers for policy definition, enforcement, and implementation, allowing the network to be reconfigured dynamically in real time using an "intelligent orchestration and provisioning engine" independently from the hardware used [32].
Similarly, Network Function Virtualization (NFV) uses virtual machines to emulate hardware. This can be useful for adding network resources such as firewalls, domain name resolvers, virtual routing or traffic control on an as-needed basis [33]. This technology could be used in the future for mitigation of botnet malware.

1) Local Detection
Habibi et al. [35] proposed a software solution called Heimdall that they implemented on a Linksys router. This solution has two parts: a traffic manager, which continuously validated traffic, and a whitelist manager, which managed allowed and blocked addresses. A profile of each device was built when it was connected to the network, and once patterns were established the system moved to an enforcement phase for that device. DNS requests were mediated through the system, which checked the validity of each DNS response to prevent DNS poisoning attacks. While the results in Table 3 look significant, they are based on just a few test devices.
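The learn-then-enforce behaviour described above can be sketched as a per-device allowlist. This is not Heimdall's implementation: the class name `DeviceProfile` is hypothetical, and switching to enforcement after a fixed number of observed flows is an assumption, since the text only says the switch happens once patterns are established.

```python
class DeviceProfile:
    """Per-device allowlist with a learning phase followed by enforcement."""

    def __init__(self, learning_budget: int = 100):
        self.allowed = set()
        # Number of flows still to be observed before enforcement starts
        # (an assumed criterion, chosen for simplicity).
        self.remaining = learning_budget

    @property
    def enforcing(self) -> bool:
        return self.remaining <= 0

    def check(self, destination: str) -> bool:
        """Return True if a flow to `destination` is permitted."""
        if not self.enforcing:
            # Learning phase: record the destination and let the flow pass.
            self.allowed.add(destination)
            self.remaining -= 1
            return True
        # Enforcement phase: only previously seen destinations are allowed.
        return destination in self.allowed
```

Because IoT devices tend to contact a small, stable set of endpoints, even a simple profile of this kind can block connections to unfamiliar hosts, at the cost of trusting whatever traffic occurs during learning.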
Miettinen et al. [38] proposed methods of detection and mitigation at the network edge on the gateway in conjunction with a web-based IoT security service provider. The gateway fingerprinted the connected IoT device, then sent the fingerprint to the service provider, who sent back a classification of restricted, trusted or strict which was applied by the gateway depending on whether vulnerabilities exist. They also provided a method for fingerprinting devices as they were inducted into the network.
Hafeez et al. [34] built on the work of Miettinen et al. [38]. In both works the gateway classifies devices, but where Miettinen et al. proposed a central service provider, Hafeez et al. proposed that all the work be done by the network gateway. They created a prototype that could be run on a regular consumer-grade router with minimal impact (1.8% increased latency). It was a modular system with monitoring, detection and enforcement modules, which used fuzzy C-means clustering, after feature extraction, to classify network traffic. They then used SDN to create ad hoc network overlays to modify traffic flows.
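For readers unfamiliar with fuzzy C-means, the sketch below implements the standard algorithm on one-dimensional data, where each point receives a degree of membership in every cluster rather than a single label. The features, number of clusters and parameters used by Hafeez et al. are not given in the text; the fuzzifier `m = 2`, the fixed iteration count and the caller-supplied initial centres are assumptions.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """One-dimensional fuzzy C-means.

    points: list of floats; centers: initial cluster centres; m: fuzzifier (> 1).
    Returns (centers, memberships), where memberships[j][i] is the degree to
    which points[j] belongs to cluster i, as computed in the final iteration
    before the last centre update.
    """
    if m <= 1:
        raise ValueError("m must be greater than 1")
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            distances = [abs(x - c) for c in centers]
            if 0.0 in distances:
                # The point coincides with a centre: assign it fully there.
                hit = distances.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(len(centers))]
            else:
                row = [
                    1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
                    for d_i in distances
                ]
            memberships.append(row)
        # Move each centre to the membership-weighted mean of the points.
        for i in range(len(centers)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            if total > 0:
                centers[i] = sum(w * x for w, x in zip(weights, points)) / total
    return centers, memberships


# Made-up feature values forming two well-separated groups.
example_points = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
example_centers, example_memberships = fuzzy_c_means(example_points, [0.0, 10.0])
```

The soft memberships are what distinguish this from k-means: a flow that sits between clusters keeps a partial membership in each, which a gateway could treat as uncertain rather than forcing a hard decision.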
Meidan et al. [37] ignored the infection stage of botnet malware entirely, under the assumption that some malware will get past any filters, and concentrated their network anomaly detection at the point when devices are given the attack order by the botmaster. They used autoencoders, a type of neural network that learns a compressed representation of its input, training them on benign traffic. Once the autoencoder was trained for a particular device it could detect anomalous network behavior. They infected 9 commercial IoT devices with Mirai and Bashlite to test their detection method. Their method detected 100% of attacks in the samples, with a false-positive rate of 0.007 and detection times of 174-386 ms.
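An autoencoder trained only on benign traffic reconstructs benign inputs well and anomalous inputs poorly, so detection reduces to thresholding the reconstruction error. The sketch below shows only that thresholding step and assumes the per-sample errors have already been produced by a trained model, which is not shown. Setting the threshold to the mean plus one standard deviation of the benign errors is an assumption made for illustration, not a rule confirmed by the text.

```python
import statistics


def anomaly_threshold(benign_errors):
    """Threshold = mean + one population standard deviation of benign errors.

    `benign_errors` are reconstruction errors measured on held-out benign
    traffic; the mean-plus-one-deviation rule is an assumed choice.
    """
    return statistics.mean(benign_errors) + statistics.pstdev(benign_errors)


def flag_anomalies(errors, threshold):
    """Mark each sample whose reconstruction error exceeds the threshold."""
    return [error > threshold for error in errors]


# Made-up reconstruction errors.
example_threshold = anomaly_threshold([0.1, 0.2, 0.3])
example_flags = flag_anomalies([0.15, 0.9], example_threshold)
```

A lower threshold catches more attacks but raises the false-positive rate, so in practice the rule would be tuned per device.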

2) Remote Detection
Nõmm & Bahşi [39] used Machine Learning (ML) to detect anomalies in IoT network traffic by training only on benign data. Their system detected outlying data points and classified them as suspicious. Nõmm & Bahşi performed multiple tests to determine the best ML algorithm for accuracy (low false-negative rate) and precision (low false-positive rate) when a system is trained on benign data and then exposed to combined normal and botnet data from the Mirai and Bashlite botnets. They concluded that the most effective ML algorithms were five feature-point entropy for feature selection and isolation forests for unsupervised learning. These performed better than local outlier factor, support vector machines, and Hopkins statistics for botnet traffic detection. The authors updated their work in [42] with an examination of a hybrid feature selection model.
[...]ing method called Bidirectional Long Short Term Memory based Recurrent Neural Network (BLSTM-RNN). This method fed whole packet data into a neural network over long time periods to extract text features, then contextualized the data. The self-learning capabilities and knowledge of the past that the RNN provided allowed for the detection of botnet traffic even when there was a large time gap between attacks. This came at a processing cost but provided very high levels of accuracy, even where malware had mutated.
Vinayakumar et al. [40] explored converting domain names extracted from malware binaries into images, then using Siamese Neural Networks. They used this to analyze whether domain names had been computer-generated or were legitimate domains, in order to detect domain generation algorithms (DGA). The same researchers have explored the use of deep learning in PC botnet detection in previous papers [43], [44].
Sriram et al. [45] built on the work of Vinayakumar et al. [40] by comparing multiple ML and Deep Neural Network (DNN) algorithms applied to classifying normal and botnet traffic. They explored which algorithms use the least time for training and detection and presented their results. They showed that the most promising techniques were Decision Tree (DT) for differentiating botnet from normal traffic, and a 4-hidden-layer DNN for classifying which botnet is operating at real-time speeds. In this paper they also used a t-distributed stochastic neighbor embedding visualization, which separated the attack and normal traffic graphically. They suggested that this visualization could be run through a convolutional neural network to achieve differentiation using existing computer vision algorithms.
Alauthaman et al. [41] proposed a method targeted at detecting P2P botnet traffic by passively monitoring network traffic, extracting TCP headers and reducing the data to a feature set. This method did not require deep packet inspection and so was scalable. The remaining features were fed to a resilient back-propagation neural network using a classification and regression tree (CART) algorithm. Experimental results showed that the CART algorithm was faster and more effective than random forest (RFtree) and principal component analysis (PCA) neural network algorithms.
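Header-only feature extraction of the kind described above can be sketched with the standard 20-byte fixed TCP header layout, which avoids touching the payload at all. The specific features returned here are an illustrative choice; the feature set used by Alauthaman et al. is not given in the text, and `tcp_header_features` is a hypothetical name.

```python
import struct

# Fixed 20-byte TCP header in network byte order: source port, destination
# port, sequence number, acknowledgement number, data offset + flags,
# window size, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")


def tcp_header_features(segment: bytes):
    """Extract a few header-only features from a raw TCP segment."""
    if len(segment) < TCP_HEADER.size:
        raise ValueError("segment shorter than the fixed TCP header")
    (src_port, dst_port, _seq, _ack, offset_flags,
     window, _checksum, _urgent) = TCP_HEADER.unpack_from(segment)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        # The data offset is stored in the top four bits as 32-bit words.
        "header_length": (offset_flags >> 12) * 4,
        "fin": bool(offset_flags & 0x01),
        "syn": bool(offset_flags & 0x02),
        "rst": bool(offset_flags & 0x04),
        "ack": bool(offset_flags & 0x10),
        "window": window,
    }


# A made-up SYN segment sent to the telnet port, with no options or payload.
example_segment = TCP_HEADER.pack(51000, 23, 1, 0, (5 << 12) | 0x02, 1024, 0, 0)
example_features = tcp_header_features(example_segment)
```

Because only fixed-offset fields are read, the cost per packet is constant regardless of payload size, which is the scalability argument made for avoiding deep packet inspection.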
While machine learning and AI models are effective when trained on mixed botnet and normal traffic, they have a couple of drawbacks. They use more computing power than other methods, and for large volumes of traffic they may become a bottleneck unless appropriate resource planning is undertaken. They can also be tricked using adversarial machine learning techniques [46].
Local detection is useful when the scale of the IoT deployment is not large and the limited resources of a consumer router are able to scan and classify network traffic in real time. Remote detection suits larger networks such as an enterprise or campus as the remote resources are used on faster devices such as servers or high-end workstations. A combination of detection methods would be ideal for sensitive IoT devices such as industrial control systems.

IV. EMERGING DIRECTIONS
In this section we will explore future threats that have been theorized and emerging directions in botnet detection and mitigation. These are broken down into Emerging Threats and Emerging Detection Methods, the latter of which are further categorized by the themes of Trust and Patch Delivery. This is not a comprehensive list, but rather research that the authors believe has not had the attention it deserves, and as such we highlight it here.

A. EMERGING THREATS
Soltan, Mittal & Poor [47] described a theoretical attack they call BlackIoT on a power grid by a group of malware-infected high-wattage IoT devices such as water heaters, air conditioners and space heaters. In this scenario the Supervisory Control and Data Acquisition (SCADA) devices of power distribution are not attacked directly. Instead, the botmaster initiates large changes in power consumption by controlling consumer high-wattage appliances in certain regions, overloading or under-loading the electricity grid and causing blackouts and potentially cascading failures in power transmission systems.
Kamenski et al. [48], [49] proposed a threat model for increasing the resilience of botnets by leveraging blockchain technology in place of traditional centralized C&C servers. This would make bots harder for security teams to take over, as they would not be able to impersonate the botmaster. If a public blockchain such as Bitcoin were used to store botnet data, then the data would become part of the immutable blockchain and be distributed to all nodes. This also makes the C&C structure more resistant to government shutdown.
In February 2021, as this paper was being written, the attack predicted by Kamenski et al. was observed by Saias [50]. The Skidmap cryptocurrency-mining botnet, identified in 2019, was seen in 2021 attempting to download malware to Akamai's honeypot with a Bitcoin wallet address that contained encoded IP addresses for backup C&C servers. Nagy [51] also observed in June 2020 that the Glupteba botnet had been using the Bitcoin blockchain since 2019 to update C&C servers.
Future IoT device threats must be classified by the technology layer that they attack [52]. Application-level attacks are the most common to date, but vulnerabilities in network stacks are a real threat as evidenced by Karliner [53], who described discovering thirteen vulnerabilities in the FreeRTOS operating system used in embedded devices.
Vulnerabilities in IoT devices may come from unexpected sources, such as attacks on cloud-based service providers or the headline-making work of Sugawara et al. [54], who used lasers to remotely control voice assistant software. This demonstrated that vulnerabilities can exist in surprising places such as sensors, network bridges, hardware or software, and that security should be a high-priority design consideration for device manufacturers.

B. EMERGING DETECTION METHODS
In this section we will examine new and experimental methods of detection and mitigation that fall outside of the host/network-based paradigm. Table 4 summarizes the emerging directions in detection and mitigation of IoT bots. Citations are listed for impact on future research.
Zheng et al. described IoTAegis [55], a model security platform that worked on a workstation within a large network to discover and secure devices, both Internet-facing and internal. It used active and passive network scanning to discover IoT devices, then connected to and identified them. It then checked for security vulnerabilities and could be used to change default passwords and update firmware remotely. It was successfully used on a university campus to update passwords and firmware of Hewlett-Packard printers as their test cases. Their scan of 2399 hosts discovered 1701 IoT devices, which were then analyzed. 66% of VoIP phones and 51% of IP printers were discovered to have default or no password, and 59% of printers were found to have out-of-date firmware.
Jung et al. [56] discussed a method for detecting botnet traffic by monitoring power consumption in IoT devices. They attached a power monitor to a simulated IoT device built on a Raspberry Pi, then measured changes in current when the device was working normally compared to when it was infected with malware. They found that botnets generate a detectable pattern of electricity usage.
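The idea of spotting infection through power draw can be illustrated with a simple baseline comparison. A plain z-score test is swapped in here for illustration; it is not the classifier used by Jung et al., whose method is not detailed in the text, and the readings, units and limit of three standard deviations are assumptions.

```python
import statistics


def power_anomalies(baseline_ma, readings_ma, z_limit=3.0):
    """Flag current readings that deviate strongly from a clean baseline.

    baseline_ma: current samples (milliamps) from a known-clean device;
    readings_ma: new samples to test; z_limit: assumed cut-off in standard
    deviations. Returns one boolean per reading.
    """
    mean = statistics.mean(baseline_ma)
    spread = statistics.stdev(baseline_ma)
    if spread == 0:
        raise ValueError("baseline has no variation")
    return [abs(reading - mean) / spread > z_limit for reading in readings_ma]


# Made-up measurements: a steady baseline and one reading far above it.
example_power_flags = power_anomalies([100, 102, 98, 101, 99], [101, 130])
```

A threshold test like this only detects changes in overall draw; recognising the pattern of a specific botnet would require a model trained on labelled traces.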
Demeter, Preuss & Shmelev [57] described a production honeypot-as-a-service run by Kaspersky Labs. This aimed to record and analyze new malware targeting IoT devices to protect enterprise networks from intrusion attempts. Results from their monitoring show a marked increase in infection attempts from 2018 to 2019, with mostly Mirai-based attacks observed.
The authors of [58] introduced a PLuggable And Reusable (PLAR) architecture for firmware development aimed at giving IoT device manufacturers tools to create more secure devices. They suggested modularizing software components, with middleware that mediates between the components in the device, so modules can be swapped for tested and secure components depending on the manufacturer's needs.
While most traditional computing research separates its detection methods into the same host-based and network-based categories as the IoT literature, there is some early research into combining the two. Almutairi et al. [59] described an algorithm for host and network analysis to detect botnet activity at early stages of infection, before communication with the C&C server. They used a combination of file state from the host being monitored and network traffic to determine anomalies through a common detection engine. This method is impractical in the current generation of IoT devices; however, future generations may be able to include self-checking software, which could be combined with network-based detection for more transparency of device configuration and software.

1) Trust
Devices must communicate between themselves and back to controllers such as a service provider or home hub to be able to function. This has led researchers to examine the subject of trust between devices, and how trust can be established that a device has not been compromised.
Xia et al. [60] suggested using social features to establish trustworthiness between devices. They explained that a PageRank algorithm can be used to promote distribution of information (whether patches or data) from more trustworthy sources and reduce the impact of untrustworthy devices.
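To make the ranking idea concrete, the following is a minimal sketch of PageRank by power iteration over a small, hypothetical "vouches for" graph between devices. The graph, the device names and the interpretation of an edge as a trust endorsement are assumptions for illustration and are not taken from [60].

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank scores by power iteration.

    `links` maps each node to the list of nodes it endorses.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same small baseline share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node that endorses nobody spreads its score evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical endorsement graph: three devices vouch for the hub,
# and the hub vouches for the sensor.
trust_links = {
    "camera": ["hub"],
    "lamp": ["hub"],
    "sensor": ["hub"],
    "hub": ["sensor"],
}

scores = pagerank(trust_links)
most_trusted = max(scores, key=scores.get)
print(most_trusted)
```

Devices with higher scores could then be preferred as sources when distributing patches or data, while low-scoring devices have correspondingly less influence.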
England et al. [61] described RIoT, or Robust Internet of Things, a system for establishing hardware-based trust using simple hashing cryptography in IoT devices. An immutable bootloader was used to read a device secret during boot, which was never revealed to higher layers of software. Rather, a derived key was generated using an HMAC algorithm, which could be used to establish trust with upper-layer software and external devices, such as for attestation of the software being run.
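The key-derivation step can be sketched as follows. This is a minimal illustration that assumes the derived key is an HMAC-SHA256 of a hash of the next software layer, keyed by the device secret; the exact inputs and construction used in [61] may differ, and the function name and byte strings are hypothetical.

```python
import hashlib
import hmac


def derive_layer_key(device_secret: bytes, next_layer_image: bytes) -> bytes:
    """Derive a key for the next software layer without exposing the secret.

    Assumption: the key is HMAC-SHA256 over the SHA-256 measurement of the
    software image, keyed with the device secret.
    """
    measurement = hashlib.sha256(next_layer_image).digest()
    return hmac.new(device_secret, measurement, hashlib.sha256).digest()


# Hypothetical values: in a real device the secret never leaves the bootloader.
device_secret = b"example-device-secret-not-for-production"
genuine_firmware = b"firmware v1.0 image bytes"
tampered_firmware = b"firmware v1.0 image bytes with implant"

genuine_key = derive_layer_key(device_secret, genuine_firmware)
tampered_key = derive_layer_key(device_secret, tampered_firmware)

# The same firmware always yields the same key; modified firmware does not.
print(hmac.compare_digest(genuine_key, tampered_key))
```

Because the derived key depends on both the secret and the measured software, a verifier that knows the expected key can detect that different software is running without ever learning the device secret.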
Samaniego & Deters [62] suggested using a blockchain middleware to establish trust. They followed a zero-trust model where each device must validate its credentials and configuration each time it participates in a network, before it can send a message on the network, ensuring transactions are legitimate. The IoT devices can store their configuration on a blockchain, where it is immutable and updated only by consensus with other devices. They outlined a two-level hierarchy of mining for identity-trust and transaction-trust.
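The check-before-send idea can be illustrated with a greatly simplified, single-node hash chain of device configurations. This sketch omits consensus, mining and the two-level hierarchy entirely, so it is not the middleware of [62]; all function names, device identifiers and configuration fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64


def block_hash(prev_hash: str, device_id: str, config: dict) -> str:
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"prev": prev_hash, "device": device_id, "config": config},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_config(chain: list, device_id: str, config: dict) -> None:
    """Record a device configuration as a new block at the end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev_hash,
        "device": device_id,
        "config": config,
        "hash": block_hash(prev_hash, device_id, config),
    })


def chain_is_valid(chain: list) -> bool:
    """Recompute every hash and check that the blocks link together."""
    prev_hash = GENESIS
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["prev"], block["device"], block["config"]):
            return False
        prev_hash = block["hash"]
    return True


def may_send(chain: list, device_id: str, reported_config: dict) -> bool:
    """Allow a message only if the reported configuration matches the
    most recent configuration recorded for that device."""
    if not chain_is_valid(chain):
        return False
    for block in reversed(chain):
        if block["device"] == device_id:
            return block["config"] == reported_config
    return False  # Unknown devices are never trusted.


ledger = []
append_config(ledger, "thermostat-01", {"firmware": "1.2.0", "telnet": False})
append_config(ledger, "camera-07", {"firmware": "3.4.1", "telnet": False})

print(may_send(ledger, "camera-07", {"firmware": "3.4.1", "telnet": False}))
print(may_send(ledger, "camera-07", {"firmware": "3.4.1", "telnet": True}))
```

In this sketch a device whose reported configuration has drifted from the recorded one, for example because malware enabled a remote shell, is refused before it can send anything.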

2) Patch Delivery
An effective method for defending against botnet infection is keeping IoT device firmware up to date, as manufacturers release patches from time to time to plug security bugs and improve functionality [63].
Choi et al. [7] proposed an ecosystem for securely updating device firmware. In their system, an IoT device would not run software that had not been properly signed. Signatures were chained from the manufacturer to a central server on the Internet, then to middleware running on a home gateway. Each party in the chain signed the firmware with its private key, and the device used public key cryptography to verify the signature of each step in the chain from the manufacturer to the device, refusing to run firmware that did not pass every check. They proved their method mathematically.
Chandra et al. [63] proposed a method of pushing firmware updates to very low power or capacity devices via a lightweight mesh over-the-air protocol. They used a gateway device to download the firmware, then a hub to distribute it across the mesh network, with any devices at too great a distance from the hub being able to acquire the updated firmware from their end-device peers in the mesh. The authors did not provide details on use-cases, but this method could be used for diverse applications of wireless IoT devices such as sensor networks, distributed weather monitoring or even micro satellites.
IoT updates are primarily delivered through a client-server architecture. Evidently, this approach is not scalable given the exponential growth in device numbers. Furthermore, the existing mechanisms to ensure integrity of updates are challenging given the IoT devices' limited computing power (e.g., only lightweight cryptographic primitives can be implemented). Puggioni et al. [64] proposed CrowdPatching to address the aforementioned challenges in IoT update delivery. CrowdPatching is a blockchain-based decentralized protocol that allows device manufacturers to delegate the delivery of software updates to self-interested distributors in exchange for cryptocurrency. Compared with similar work proposed in [65], CrowdPatching allows the involvement of an unrestricted number of distributors, leveraging recent IoT deployment architectures, and rewards trustworthy distributors in the network.

V. FINDINGS
After reviewing the literature, the authors have formed the following conclusions:
• The vulnerability exploited to create most IoT botnets (around 95%) is use of default credentials. This could be replaced with a per-device unique password generation system to alleviate much of the botnet infection activity.
• Host-based detection is not feasible on the current generation of IoT devices and has limited application. Artificial intelligence and machine learning models are a promising avenue of research into botnet detection and can be used in mitigation. Researchers contributing in this space, however, do not present side-by-side comparisons. Their results are often communicated in different ways, and their experimental methods are not detailed enough for the results to be reproduced. This presents a problem when trying to decide which is the most effective or efficient algorithm for detecting botnet activity. Sharing of full experimental setup and methodology, as well as publishing of data sets such as network traffic captures, would help future researchers verify results and build upon the work of others. Authors in [66] have recently made an attempt to provide a comparison given the ongoing inconsistencies in results.
• Malware analysis is, by nature, reactive to threats discovered in the wild or on honeypots. If device manufacturers could be convinced of the value of designing with security prioritized, then their devices could be much less vulnerable. Provisions need to be made for decommissioning IoT devices once they have served their purpose, and patch management should be automated or very simple for a typical end-user to perform until the device reaches its end of life.
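The per-device unique password suggestion in the first finding above could be sketched as follows. This is a minimal illustration of one possible provisioning step, assuming passwords are generated at manufacture time from a cryptographically secure random source; the function names, serial numbers, alphabet and length are hypothetical choices.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_device_password(length: int = 16) -> str:
    """Generate a random password from a cryptographically secure source."""
    if length < 12:
        raise ValueError("password length should be at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def provision(serial_numbers):
    """Assign an independent random password to every device serial number."""
    return {serial: generate_device_password() for serial in serial_numbers}


# Hypothetical serial numbers for a small production batch.
batch = provision(["SN-0001", "SN-0002", "SN-0003"])
for serial, password in batch.items():
    print(serial, password)
```

Generating each password independently means that learning the credentials of one device reveals nothing about any other device, unlike a shared factory default.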
Over the course of this research we have discovered that while there are many researchers aiming to solve the problems facing IoT devices and detecting botnets in particular, the work is disparate and each researcher uses their own metrics to measure their success. We therefore propose a framework that will allow future researchers to compare and contrast results in an accurate and methodical way.
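As one example of what reporting against common metrics might look like, the sketch below derives several widely used detection measures from the four confusion-matrix counts. The choice of metrics and the counts are illustrative assumptions and are not part of the framework proposed in this paper.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common detection metrics from confusion-matrix counts.

    tp: botnet flows flagged as botnet     fp: benign flows flagged as botnet
    tn: benign flows passed as benign      fn: botnet flows missed
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall) else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "f1": f1,
    }


# Hypothetical counts from evaluating a detector on 1000 labelled flows.
report = detection_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Publishing the raw counts alongside the derived values lets other researchers recompute any metric they prefer, which addresses part of the comparability problem described above.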

A. FRAMEWORK FOR FUTURE RESEARCH
We have devised a framework for future research in IoT botnet detection and mitigation. We have noticed that research in this area is rarely repeatable or comparable, which has slowed practical progress in the domain. The main goal we are pursuing with this framework is, therefore, to ensure that research in this critical domain does not suffer from such limitations. Generally, research in IoT botnets can be categorized into the matrix shown in Table 5. This framework table can be used to assist researchers in moving their research from the early exploration phase to an operational product that can perform detection and mitigation of botnets in IoT devices.
Examples of research papers in the exploration phase include [42], [60], [62], [63] and [64] which are early experiments that explore whether a concept is worth pursuing. Papers that are in the solution phase like [30], [36], [39] and [41] take their research a step further, comparing algorithms against malware to measure effectiveness. Finally, operational phase papers such as [34], [35], [38] and [55] provide more fully-fledged mitigation solutions to the IoT botnet problem, having built on previous research.
Future researchers are encouraged to use a standard set of characteristics for reporting results so that different method-
Our research has shown that there are two methods for detecting botnets in IoT devices: host-based and network-based. Research can be categorized further into exploration, solution or operation, depending on the stage of research and the researchers' goals. We reviewed the most recently emerging threats and solutions and report that there seems to be agreement among researchers that network-based detection suits the heterogeneous, distributed, and sometimes remote nature of IoT devices. Innovative approaches for detecting IoT botnets, such as monitoring power usage, hybrid approaches (i.e., local and remote detection), and leveraging other technologies such as blockchain and Software-Defined Networks, seem to be promising early-stage efforts that require further exploration. Specifically, we note that the real-world efficacy of proposals will depend on deployment assumptions, about which recent efforts seem to be too idealistic.
Future researchers in the IoT botnet domain can plan their work based on the findings reported in this paper, including the suggested framework, to increase the effectiveness and improve the positioning of their contributions.
BEN STEPHENS (Student Member, IEEE) is in the final trimester of his Bachelor's degree in Cyber Security at Deakin University, where he is among the highest achieving students in the course with a GPA of 92/100. Ben has secured competitive scholarships including the Deakin Scholarship for Excellence and a research scholarship from Cyber Security Research & Innovation (CSRI). Ben has also been sponsored by Deakin University to complete cyber security certifications including Certified Ethical Hacker (CEH) because of his outstanding university performance. He was involved in malware analysis, botnet takedown, removal and reporting, as well as distribution of new malware to antivirus companies in the 2000s, when he ran IRC servers on the EFNet and Undernet IRC networks. His research interests include security in consumer and industrial IoT, malware analysis and privacy-enhancing technologies. He is currently a member of the Australian Information Security Association and the IoT Security Institute and has contributed as a peer reviewer, and as a student member of Deakin's major course review for the Bachelor and Masters of Cyber Security.
ARASH SHAGHAGHI (Member, IEEE) is a Senior Lecturer in Cyber Security at RMIT University. At RMIT, he is a member of the RMIT University Centre for Cyber Security Research and Innovation (CCSRI). Arash is also a Visiting Fellow at the School of Computer Science and Engineering of UNSW Sydney, Australia. He has previously been affiliated with Deakin University, UNSW Sydney, Data61 CSIRO, The University of Melbourne, and The University of Texas at Dallas. Arash completed his PhD at UNSW Sydney in Computer Science and Engineering, his MSc in Information Security at University College London (UCL) and his BSc at Heriot-Watt University. Arash is a multi-award-winning cyber security educator and researcher with a track record of publications at competitive international conferences and journals.
Arash currently serves as the Associate Editor for the Journal of Ad Hoc Networks and has had roles (TPC, organising member, and reviewer) at prestigious journals and conferences. To date, Arash has received total funding of more than 300,000 AUD (as PI and CI combined) for his cyber security research from various internal and external sources, including the Australian Government. Arash's research activities have received media coverage on several occasions, including by the Australian Broadcasting Corporation (ABC).
ROBIN DOSS (Senior Member, IEEE) is a Professor and the Research Director of the Strategic Centre for Cyber Security Research & Innovation (CSRI) at Deakin University. In this role, he provides scientific leadership for this multidisciplinary research centre focused on the technical, business, human, policy and legal aspects of cybersecurity. In addition, he also leads the Next Generation Authentication Technologies theme for the Critical Infrastructure Security research program of the national Cyber Security Cooperative Research Centre (CSCRC). Prior to this role, he was the Deputy Head of School for the School of Information Technology at Deakin University. Robin has an extensive research publication portfolio and in 2019 was the recipient of the 'Cyber Security Researcher of the Year Award' from the Australian Information Security Association (AISA). His research interests include the broad areas of system security, protocol design and security analysis with a focus on smart, cyberphysical and critical infrastructures. His research program has been funded by the Australian Research Council (ARC), government agencies such as the Defence Signals Directorate (DSD), Department of Industry, Innovation and Science (DIIS) and industry partners. He has contributed to large multiyear projects under the European Union's Framework Program (FP6) and been funded by the Indian Government under the Scheme for Promotion of Academic and Research Collaboration (SPARC). He is a member of the executive council of the IoT Alliance Australia (IoTAA). He is founding chair of the Future Network Systems and Security (FNSS) conference series and is an associate editor for the Journal of Cyber Physical Systems.
SALIL S. KANHERE (Senior Member, IEEE) received the M.S. and Ph.D. degrees from Drexel University, Philadelphia. He is currently a Professor of Computer Science and Engineering with UNSW Sydney, Australia.
He also holds affiliations with CSIRO's Data61 and the Cyber Security Cooperative Research Centre (CSCRC). His research interests include the Internet of Things, cyberphysical systems, blockchain, pervasive computing, cybersecurity, and applied machine learning. He is a Senior Member of the ACM, a Humboldt Research Fellow, and an ACM Distinguished Speaker. He serves as the Editor in Chief of the Ad Hoc Networks journal and as an Associate Editor of the IEEE Transactions on Network and Service Management, Computer Communications, and Pervasive and Mobile Computing. He has served on the organising committee of several IEEE/ACM international conferences. He has co-authored a book titled Blockchain for Cyberphysical Systems.