Zero Trust Architecture (ZTA): A Comprehensive Survey

We present a detailed survey of the Zero Trust (ZT) security paradigm, which has a growing number of advocates in the critical infrastructure risk management space. The article employs a descriptive approach to present the fundamental tenets of ZT and reviews the numerous potential options available for successful realization of this paradigm. We describe the role of authentication and access control in Zero Trust Architectures (ZTA) and present an in-depth discussion of state-of-the-art techniques for authentication and access control in different scenarios. Furthermore, we comprehensively discuss the conventional approaches to encryption, micro-segmentation, and security automation available for instantiating a ZTA. The article also details various challenges associated with contemporary authentication mechanisms, access control schemes, trust and risk computation techniques, micro-segmentation approaches, and Software-Defined Perimeters that can impact the implementation of ZT in its true sense. Based on our analysis, we pinpoint potential future research directions for successful realization of ZT in critical infrastructures.


I. INTRODUCTION
The rapid growth and adoption of the Internet of Things (IoT) and edge computing platforms has challenged the ability of traditional perimeter-based security architectures to effectively protect both enterprise assets and critical infrastructures. The notion of a Zero-Trust Architecture (ZTA) has been gaining momentum and is increasingly seen as the security architecture of choice for such infrastructures. As the name implies, ZTA is built on the notions of least privilege, granular access control, and dynamic and strict policy enforcement, wherein no user or device is implicitly trusted, irrespective of stature or location. In this paper, we undertake a horizon scan to identify the current state of the art as relevant for effective implementation of ZTA in a critical infrastructure context. Although there are some existing works, such as [1] and [2], which review the working principles of ZT, our focus in this article is on the basic tenets of ZT and how state-of-the-art approaches can be used to accomplish these. We critically analyse the individual components of ZTA to see whether existing techniques suffice for realization of ZTA in critical infrastructures. Based on this comprehensive survey of high-quality research publications relevant to the ZTA tenets, we then present recommendations that can be referenced as a guiding framework for the crafting of future cyber security strategies for the protection of critical infrastructures and their operations. (The associate editor coordinating the review of this manuscript and approving it for publication was Maurizio Casoni.)
As per the National Institute of Standards and Technology (NIST) report on Zero Trust Architecture [3], ZTA is not a single network architecture which can be achieved using just one technology. Rather, ZTA comprises various guiding principles that need to be strategically implemented to secure enterprise assets such as data, devices, users and other components of infrastructure. The key principles for achieving ZTA are authentication and access control, as these are the means by which the user's identity is established and privileges ascertained for the conduct of different operations involving protected resource(s). For implementing ZTA in a critical infrastructure context, a strong authentication scheme which identifies both users and devices is required. Beyond this, rather than simply relying on entry-point authentication, the use of a context-aware and continuous authentication scheme which takes into account both the user and device contexts on an ongoing basis to ''actively'' authenticate is also recommended. In terms of access control, ZTA strategies should include a risk-aware access control scheme which determines the risk associated with the granting of an access request. Further to these two primary principles, ZTA realisation necessitates the adoption of lightweight encryption schemes to account for resource-constrained devices in cyber-physical systems. Micro-segmentation and software-defined perimeters are also recommended by NIST as core ZTA implementation strategies. However, these technologies require customisation to secure the edge network of IoT devices. Threat intelligence is also critical, as it can serve as a key feedback mechanism to drive automated security technologies within the defence environment. (VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/)
A tailored, reliably responsive system is mandatory for continuous trust evaluation and access control. Given the large volume of data coming from heterogeneous sources within a complex system, a Machine Learning (ML) supported approach is recommended for effectively deducing trust. This article offers a comprehensive survey of state-of-the-art approaches for achieving a zero trust architecture (see Figure 1 for an overview). Our goal is to support informed decision-making by practitioners when drafting security strategies appropriate for their own particular operational contexts. After thorough investigation of each requirement depicted in Figure 1, we have arrived at the following conclusions:
• Conventional user authentication mechanisms (static, active, or context-aware) are either vulnerable or have limitations. Therefore, lightweight and scalable continuous authentication mechanisms are essential for enabling trust for all tangible and non-tangible organisational resources such as users, devices, systems and processes.
• Different IoT-enabled environments have entirely different access control requirements and therefore demand customised arrangements. For most critical infrastructures, risk-aware access control which incorporates capabilities congruent with a fine-grained access control scheme (such as FBAC, or Function-Based Access Control) seems the optimal choice, as this can actually be used to evaluate the risk inherent to a particular access request. This evaluation is performed by leveraging information collected from diverse sources, including reported threat-intelligence and outputs from cyber-physical systems (CPS) designed to support situation awareness. The result is context-specific access at a high-level of granularity.
• In essence, the goal of ZTA is to protect data, whether it is at rest, in transit, or being processed. Therefore, encryption is an important requirement for achieving a zero trust environment. Conventional encryption techniques are now at a greater risk of compromise due to recent developments in quantum computing. If encryption is to remain a viable means of securing data, post-quantum cryptosystems which can withstand quantum computing attacks are necessary. Fortunately, standardization of post-quantum cryptosystems is currently underway at NIST. However, successfully incorporating such systems across a host of different applications and devices, many of them resource-constrained, is a major challenge that will require dedicated and incremental research efforts.
• Another important element of ZTA is micro-segmentation, which enforces policies closer to the protected resources, thereby breaking the network into smaller logical segments and precluding lateral movement by attackers. Micro-segmentation in the IoT context can be achieved using software-defined networking (SDN) along with network function virtualisation (NFV) and a software-defined perimeter (SDP) that acts as an overlay network to protect resources. While this approach has shown some promise, it necessitates significant changes to networks, clients, and servers. Another problem is that the central controller then becomes a single point of failure which can be targeted by malicious entities to severely impair SDP functionality. Thus, a federated, network-and-application-layer-aware segmentation and SDP technique that is resilient to contemporary cyber attacks is required for diverse defence network architectures.
• Threat intelligence (TI) and Security Situation Awareness (SSA) are important elements of ZTA because they provide necessary feedback to the policy engine for making informed decisions. However, defining indicators of compromise (IoC) is generally difficult. The heterogeneity of sources and the sheer volume of data involved also make it difficult to accurately identify potential threats within a complex system. Thus, it is essential to have an effective feedback system in place which can incorporate input from heterogeneous data sources and device state monitoring logs to effectively recognise and react to threats.
• The dynamic enforcement of access policies within a ZTA requires a reliable trust evaluation capability. However, trust schemes and risk evaluation frameworks face challenges in cyber-physical systems (CPS) and critical infrastructures due to the numerous sources of input and high volumes of contextual data collected. To meet these challenges, the services of Machine Learning (ML) algorithms are needed to automate the detection and prevention processes (with minimal false positives) for effective implementation of a ZTA. Deep learning has already shown great utility toward cyber attack detection and it will continue to improve as computing capabilities advance and more data is gathered to support the growing body of knowledge. ML is an indispensable component of the strong security automation framework that is required for a functional ZTA.
• Likewise, the heterogeneity of sources and data containing numerical and imprecise information makes it challenging to have a trust mechanism that can leverage the varying input data (e.g. contextual information, behavioral data, device-related information, and location information). The implementation details of risk computation remain obscure in current literature: more work must be done to elaborate the procedures for implementing access control policies at the network and application levels for critical applications.

A. REVIEW APPROACH
The presentation of our literature review is structured as follows. We begin by listing the basic tenets or logical components of ZTA as outlined by NIST. Elaborating on these, in subsequent sections we critically analyse each identified tenet with reference to current practices intended to instantiate or embody it. This is illustrated by means of different use-case scenarios. In each section pertaining to a particular ZTA tenet, the weaknesses of the current practices are identified and explained. In total, we reviewed more than 180 articles and reports relating to the ZTA tenets. The keywords we used for retrieving these documents were zero trust architecture, authentication, access control, encryption, microsegmentation, and security automation (i.e. the requirements depicted in Figure 1). While the topic of zero trust has been explored by other authors in previously published work, our approach for the current literature review differs significantly (see Table 1 for a comparison). Our approach is both new and more comprehensive because we examine prevailing practices in context to identify their shortcomings vis-à-vis NIST's full list of identified ZTA tenets.

B. TARGET AUDIENCE
This article is intended for developing cyber-security researchers and aims to 1) provide basic information on ZTA; 2) present the state of the art for practices intended to meet ZTA requirements; 3) identify the problems with these current approaches; and 4) present a broad overview of future research directions across the ZTA tenets.

C. ORGANISATION
The rest of this article is structured as follows. Section II presents a generic discussion on ZTA and its essential tenets. State-of-the-art authentication and access control mechanisms are presented in Sections III and IV, respectively. Encryption mechanisms are discussed in Section V, and segmentation techniques are detailed in Section VI. Section VII expands on security automation and orchestration. Finally, the discussion and conclusion are presented in Sections VIII and IX, respectively.

II. ZERO TRUST ARCHITECTURE
The operative definition of zero trust and zero trust architecture (ZTA) in accordance with NIST [3] is as follows: zero trust refers to the collection of ideas that may help to lessen (or ideally eliminate) the ambiguity involved in enforcing precise access decisions for each and every request by viewing the network as compromised, and ZTA refers to the actual overall system design intended to support this.
An abstraction of zero trust access is shown in Figure 2, which illustrates the roles of authentication and authorization via Policy Decision/Enforcement Points (PDP/PEP) to enforce access control for every connection request. The access control relies on device security posture and potential consideration of other contextual factors (e.g. time and location, prior access behaviour) that may impact confidence level before access to a resource is granted as per the defined policies.
NIST defines seven basic tenets for a ZTA [3], which are aimed at achieving the optimal goal of implementing ZT (with the option of selectively implementing some tenets and not others, in accordance with perceived need):
• Resource -any data source or computing service.
• Communication Security -communication is secured irrespective of location.
• Session Security -access to resources is granted on a per-session basis, and authentication and authorization for one resource may not extend privileges to others.
• Access Control -access to resources is determined by dynamic policy, including the observable state of client identity, application, and requesting asset.
• Minimum-Security Posture -the enterprise ensures that all owned and associated devices are in the most secure state possible and monitors assets to ensure this.
• Continuous Authentication -all resource authentication and authorization is dynamic and strictly enforced. An enterprise that intends to implement ZTA may have an identity, credential, and access management (ICAM) system and multi-factor authentication (MFA) for added security. Continuous inspection throughout the user's interaction, with the possibility of friction-free re-authentication/re-authorization, can support this tenet.
• Information Logging -the enterprise collects as much information as possible about the current state of the network infrastructure and communications, and uses this information to improve its security posture.

Similar zero trust principles are described in Forrester's extended zero trust model [6]. The principles in Forrester's model are data-protection centred, with all entities such as user, device, network and workload protected through the analysis and automation of all network operations. Table 2 shows a mapping between the two sets of tenet terminologies, revealing their interchangeability and suggesting that either scheme is useful for guiding zero trust security practice. Since the two schemes effectively sort the same core concepts under different headings, Table 2 situates these concepts within a matrix to illustrate the relationships between them.
A. ZTA LOGICAL COMPONENTS

ZTA comprises various services consisting of numerous logical components, which are operated either onsite or offsite via a cloud. Of these components, NIST describes three as core: policy enforcement point (PEP), policy administrator (PA), and policy engine (PE), as shown below in Figure 3. The functionalities of these three core components are as follows:
• Policy engine (PE) -makes the access decision in accordance with enterprise policies by feeding the external inputs to a trust algorithm which functions as the ''brain'' of the entire system.
• Policy administrator (PA) -works closely with the PE and either allows or denies access as per the PE's decision. It may be incorporated into the PE and it talks to the PEP for policy enforcement.
• Policy enforcement point (PEP) -enables, monitors, and finally terminates the connection between the subject and the resource. It can be further divided into subcomponents, namely, client (e.g. agent on a device) and resource (e.g. a gateway). The area beyond the PEP is usually a trust zone.

In addition to the aforementioned core components, a number of external components are mentioned in [3] that facilitate the realization of zero trust security (e.g. continuous diagnostics and mitigation, data access policies, identity management, security information and event management (SIEM), activity logs, etc.). As described above, the actual access decision is made by the PE by leveraging a Trust Algorithm (TA).
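To make the division of responsibilities concrete, the interaction of the three core components can be sketched as a minimal control flow. The class and method names below are invented for illustration only; a real PE would run a full trust algorithm over many inputs rather than check a single flag.

```python
class PolicyEngine:
    """Makes the access decision (the ''brain'' of the system)."""
    def decide(self, request: dict) -> bool:
        # Placeholder trust logic: a real TA would weigh many inputs.
        return request.get("authenticated", False)

class PolicyAdministrator:
    """Relays the PE's decision and configures enforcement."""
    def __init__(self, engine: PolicyEngine):
        self.engine = engine
    def evaluate(self, request: dict) -> bool:
        return self.engine.decide(request)

class PolicyEnforcementPoint:
    """Enables, monitors, and terminates subject-resource connections."""
    def __init__(self, admin: PolicyAdministrator):
        self.admin = admin
    def handle(self, request: dict) -> str:
        allowed = self.admin.evaluate(request)
        return "connection enabled" if allowed else "connection terminated"

pep = PolicyEnforcementPoint(PolicyAdministrator(PolicyEngine()))
print(pep.handle({"authenticated": True, "resource": "db"}))   # connection enabled
print(pep.handle({"authenticated": False, "resource": "db"}))  # connection terminated
```

Note that the PEP never decides anything itself; it only enforces what the PA relays from the PE, mirroring the separation of duties described above.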

1) TRUST ALGORITHM (TA)
The trust algorithm (TA) is the process employed by the PE to make a decision by considering inputs such as entries in the policy database, user role, attributes, behavioral information, threat-related information, etc., as per the needs of a particular deployment, as shown in Figure 4.
• Access Request -Whenever a user makes an access request, the basic information regarding the resource and the requester is used by the TA (e.g. operating system, patch level, and application used).
• User Identity, Attributes, and Privileges -User-related information, authentication performed by the PE, and other attributes such as time and location may also be used by the TA to compute the confidence level. The collection of privileges given to different users as encoded in the ID management and policy database may also be utilized by the TA.
• Asset Database and Observable Status -This database comprises the status of all the resources, which are compared against the observable status of the requester (i.e. operating system, patch level, location) by the TA to make an access decision.
• Resource Access Requirement -This includes policies that are based on minimal requirements to access a resource as set by the custodian, e.g. the requirement of multi-factor authentication at a new location.
• Threat Intelligence -This includes information such as attack signatures or malware operating on the Internet, as provided by either internal or external sources and used by the TA.

In the TA, all the aforementioned data sources are assigned different weights as per their importance and the needs of the enterprise. Once the TA has made the decision, the PE passes this to the PA, which configures all the corresponding PEPs to either enable or disable the communication; for example, it may send the configurations to gateways and agents, thereby requiring re-authentication, re-authorisation, or termination of the connection, as per the defined policies. In accordance with the NIST framework [3], the TA can be implemented in various ways, as described below.
• Criteria vs. Score-based TA -Criteria-based TA necessitates that a combination of attributes be fulfilled before allowing an action, whereas score-based TA uses the weighted values of input data to compute a confidence level which is compared against a threshold value. If the confidence level is greater than the set threshold, access is granted; otherwise it is denied.
• Singular vs. Contextual TA -Decisions in singular TA do not take the user's historical information into account, which may lead to a faster decision-making process but can have repercussions in the sense that some threats may go undetected. In contrast, contextual TA makes use of a user's historical behavioural patterns while making a decision, which is helpful, as deviation from a normal usage pattern may indicate a threat, one that a contextual TA has the ability to recognise and flag. As indicated in the NIST guidelines [3], contextual score-based TA may be an optimal choice, as it can compute a confidence score by leveraging contextual information, thereby increasing the possibility of detecting a potentially malicious request. Different methods used for trust computation in different conventional and cyber-physical systems are detailed in Section VII-F.
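The two evaluation styles can be contrasted in a few lines of code. The sketch below is purely illustrative: the attribute names, weights, and threshold are hypothetical and are not drawn from [3].

```python
def criteria_based(signals: dict, required: set) -> bool:
    """Criteria-based TA: every required attribute check must pass."""
    return all(signals.get(name, False) for name in required)

def score_based(signals: dict, weights: dict, threshold: float) -> bool:
    """Score-based TA: weighted confidence level compared to a threshold."""
    confidence = sum(weights[name] for name, passed in signals.items() if passed)
    return confidence >= threshold

# Hypothetical input signals for one access request.
signals = {"mfa_passed": True, "known_device": True, "usual_location": False}
weights = {"mfa_passed": 0.5, "known_device": 0.3, "usual_location": 0.2}

# Criteria-based: the unusual location fails a mandatory check -> deny.
print(criteria_based(signals, required=set(signals)))   # False
# Score-based: confidence 0.8 exceeds the 0.7 threshold -> allow.
print(score_based(signals, weights, threshold=0.7))     # True
```

The example also illustrates the trade-off noted above: the score-based variant tolerates one weak signal when the overall confidence is high, whereas the criteria-based variant denies outright.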

B. ZTA IMPLEMENTATION TECHNIQUES
ZTA can be implemented in an organization by adopting various technologies and techniques. As explained above, the PE consists of policy rules that are derived from a number of data sources [3]. The data sources can be categorised into:
• Identity management (validity of user accounts, cryptographic certificates, etc.)
• Access control (access levels of various users, past access history, and contextual attributes)
• Network and application logs collected through SIEM
• Threat intelligence (internal and external threat knowledge bases)
• Continuous Diagnostics and Mitigation (CDM)
• Compliance with various industry-defined standards

These data sources are integrated in the TA of the PE, which makes policy decisions based on various factors in compliance with the ZTA principles. Zero trust architectures can also vary based on the components deployed and the data sources that drive the organisation's PE [3]. Zero trust architectures can be based on:
• Identity Governance: In this approach, the identities of users and devices are the primary factors incorporated in the PE. Policy decisions consider the validity of user identities, contextual information, and asset sensitivity to measure access risks.
• Micro-Segmentation: In this approach, the PEP is deployed closer to the resource or the data to protect it from unauthorised access. This prevents lateral movement by an intruder. Micro-segmentation can be achieved by placing virtual firewalls to eliminate unauthorised access.
• Network Infrastructure and Software Defined Perimeters: In this type of ZTA implementation, an overlay network is created to control access to resources which are otherwise not accessible directly, thereby creating a software-defined perimeter (SDP). The advantages of software-defined networks are leveraged in this approach to enforce access control and secure communication with back-end applications.
An implementation of zero trust requires identifying the main zero trust tenets applicable to the situation, as well as the techniques required to enforce these tenets. Although ZTA principles and their logical components have been defined by various organisations, the implementation strategy required for a critical infrastructure (CI) is still unclear. The reason for this is that CIs use a wide range of technologies (new and legacy systems) and endpoints (e.g. IoT devices, CPS devices, traditional network endpoints), which presents various challenges to the effective implementation of ZTA. Hence, it is essential to identify the most suitable techniques for realising ZTA tenets in a CI. Table 3 lists the techniques associated with the implementation of ZT tenets, which will be discussed in detail in subsequent sections. We maintain that to successfully realize ZT in CI, the underlying techniques requiring thorough investigation include authentication, access control, encryption, network segmentation, SDP, and security automation. A detailed discussion of these is provided in subsequent sections (see Sections III through VII). We believe this will provide the reader with a succinct overview of the current state of the art for each technique while also pinpointing the weaknesses and knowledge gaps which future research needs to address.

III. AUTHENTICATION
Authentication is the process of verifying the identities of users (or devices, in machine-to-machine scenarios) when they attempt to access resources. This is important for determining whether a subject requesting access is legitimate, which is a fundamental consideration when attempting to establish zero trust. In this section, we present an overview of state-of-the-art techniques for user and device authentication.

A. CONVENTIONAL USER AUTHENTICATION MECHANISMS AND RELATED ISSUES
User authentication is critical to both personal devices and online services. However, it has been repeatedly demonstrated that traditional methods of authentication are vulnerable to subversion. For example, the use of passwords, the most popular method of authentication, has numerous vulnerabilities. Users often choose easy-to-guess passwords (e.g. asd123). It is also not uncommon for users to replicate passwords across multiple accounts; consequently, if any one of these accounts is compromised, then all others are susceptible as well. Moreover, even well-crafted, ''strong'' passwords are prone to hacking and can be inferred by conducting sophisticated side-channel attacks such as those mentioned in [7]- [9]. Authentication mechanisms based on physical biometrics such as fingerprints, face recognition, and iris scans are also not difficult to bypass. For example, fingerprints are easy to lift from a surface and use to fabricate a dummy fingerprint. It has been shown that face recognition based mechanisms can be compromised by using the victim's photograph or a 3D-printed reproduction of the user's head [10]. Similarly, iris-based mechanisms can be bypassed using a photograph of the user's iris superimposed onto a contact lens [11]. Vein-based approaches [12] and typing/tapping characteristics captured through sound, as proposed in [13], are promising but may be impractical if they only work on limited devices or in controlled environments.
In view of the above considerations, the popularity of multi-factor authentication (MFA) is increasing. A widely used instantiation of MFA is referred to as two-factor authentication (2FA). 2FA generally combines any two of the following (traditional) authentication factors: knowledge (e.g. password), inherence (e.g. fingerprint), and possession (e.g. hardware or software token). Through its requirement for two authentication factors, 2FA provides an additional layer of security because if one of the factors (e.g. password) is compromised, the other is still in place (e.g. token) to thwart illicit access. The second factor of authentication can be divided into two broad categories, namely, hardware and software tokens. For example, [14], [15] provide an overview of hardware token based solutions, which require the user to have specific hardware which generates a unique one-time code (OTC) to be used as the second factor of authentication. However, the problem with this approach is that it requires users to carry a dedicated piece of hardware with them at all times, which users commonly find inconvenient [16]. Another disincentive for using this approach is that it incurs an extra cost to the service provider. An alternative to such hardware-based solutions is the software token. A typical example of this is the one-time code sent to a user's pre-registered mobile phone number via SMS. A significant vulnerability of this approach is that it is susceptible to interception [17]. This approach can also have privacy implications because providing a personal mobile number to multiple service providers can lead to spam. These kinds of shortcomings have been redressed by application-based solutions in which the OTC is generated by an app (and not transmitted). Either approach is less than optimal because it requires significant interaction: the user must wait for the code and then manually enter it (or accept a push notification).
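For illustration, the code produced by such an authenticator app commonly follows the time-based one-time password (TOTP) construction standardised in RFC 6238. A minimal sketch using only the Python standard library is given below; the shared secret is a made-up example.

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    """RFC 6238-style code: HMAC-SHA1 over the current time-step counter,
    dynamically truncated as specified in RFC 4226."""
    counter = int(at // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

secret = b"example-shared-secret"        # illustrative only
print(totp(secret, time.time()))         # a 6-digit code that changes every 30 s
# Client and server derive the same code from the shared secret and clock,
# so nothing needs to be transmitted out of band.
assert totp(secret, 1000) == totp(secret, 1000)
```

Because both sides compute the code locally from a shared secret and the clock, this construction avoids the SMS interception risk noted above, though it still suffers from the manual-entry usability cost.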
Numerous prior studies have indicated this extensive user interaction as a reason for the low adoption rate of 2FA. Apart from this, the dependence of these approaches on a secondary device (e.g. a mobile phone) is also problematic [18], [19]. If the 2FA device is lost, stolen, without power or otherwise inoperable, then the dependent service may not be accessible. The aforementioned shortcomings of most 2FA solutions necessitate the creation of a mechanism that is easy to use (to improve the adoption rate) and which is not dependent upon any secondary device (such as a mobile phone). A potential alternative could be the use of behavioural biometrics (e.g. recognition of some type of gesture) as a second factor. Such a solution would mean that the user does not have to carry any additional device for the purposes of user authentication.

B. CONTEXT-AWARE USER AUTHENTICATION
Given the aforementioned problems, the need for context-aware and continuous/active authentication is gaining greater recognition. The word 'context' can be defined as any information that can establish the situation of an entity [20]. Context-aware security uses this situational information (e.g. identity, geolocation, time) to decide whether to provide access to a particular resource.
Modern mobile computing devices increasingly integrate sensors (e.g. GPS, accelerometer, gyroscope) and possess considerable computing capabilities that could be used to support context-aware authentication. For instance, many online services such as online banking and email ask users additional questions when they attempt to log in from a new IP address. In this case, the IP address is the contextual information being used to flag that the requester may be an adversary despite having provided the correct credentials. However, many commonly asked security questions have answers (e.g. place of birth, pet's name) that are often readily available online (e.g. in social media posts and profiles) and are therefore no deterrent at all for an attacker willing to do some research. An interesting extension of this approach is presented in [21], where a user's mobile location provides passive contextual information (using WiFi and cell tower data) and, based on this information, an active factor of authentication (PIN, password, or none) is modulated to ascertain the identity of the user. The paper proposes a probabilistic framework that leverages the location information and decides (through risk assessment) which active form of user authentication should be deployed. This provides a particular usability advantage in that the user need not be authenticated through an active mode (i.e. PIN or password) when the location in which the transaction occurs is consistent with the user's normal behaviour. Similarly, the authors of [22] showed that the mobility pattern of the user can be modelled by leveraging an n-gram model to determine anomalous instances where a mobile device may have been stolen (and access to private data should therefore be restricted). The authors of [23] also used behavioural features (i.e. GPS location, time since email was checked) to calculate a score which can be compared against a threshold to decide whether to perform implicit authentication.
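The mobility-pattern idea of [22] can be approximated with a simple bigram model: transitions that were frequent in the user's history receive a high probability, while unseen transitions score zero and can be flagged as anomalous. The locations and history below are invented for illustration; [22] uses a richer n-gram formulation.

```python
from collections import Counter, defaultdict

def train_bigrams(history):
    """Count location-to-location transitions in the user's history."""
    counts = defaultdict(Counter)
    for prev, cur in zip(history, history[1:]):
        counts[prev][cur] += 1
    return counts

def transition_prob(counts, prev, cur):
    """Maximum-likelihood probability of moving from prev to cur."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0

# Invented daily movement history for one user.
history = ["home", "work", "home", "work", "gym", "home", "work", "home"]
model = train_bigrams(history)

print(transition_prob(model, "home", "work"))    # 1.0: every observed 'home' leads to 'work'
print(transition_prob(model, "home", "casino"))  # 0.0: unseen transition, flag as anomalous
```

A deployment would compare such probabilities against a tuned threshold before restricting access, rather than treating any zero-probability transition as theft.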
The authors of [24] devised a context-aware user authentication mechanism for a scenario in which the user attempts to access an application or service hosted on a cloud. When a user attempts to access a particular cloud-hosted service, an agent installed on the user's mobile device (e.g. phone) gathers contextual information and sends it to the cloud-hosted context-aware authentication system, which in turn compares this information against the saved information of the user and makes an authentication decision. Specifically, the mobile agent collects the time zone and GPS location, which are checked for consistency before proceeding further. The system then checks OS details such as OS type, phone manufacturer, and model. Finally, the mechanism computes a cosine similarity between the applications installed and the processes running on the mobile device. If the similarity of applications and processes is greater than the defined threshold, access is granted (and denied otherwise).
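For sets of installed applications, the cosine similarity used in the final step of [24] reduces to |A ∩ B| / sqrt(|A| · |B|) over binary presence vectors. The sketch below uses invented application names and an assumed threshold, as these details are not reproduced here from [24].

```python
import math

def cosine_similarity(a: set, b: set) -> float:
    """Cosine similarity of the binary presence vectors of two sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

saved_profile = {"mail", "maps", "banking", "camera"}      # stored for the user
current_device = {"mail", "maps", "banking", "weather"}    # observed at login

similarity = cosine_similarity(saved_profile, current_device)
THRESHOLD = 0.7                      # assumed policy value, not from [24]
print(round(similarity, 2))          # 0.75
print(similarity >= THRESHOLD)       # True -> access granted
```

Set-based cosine similarity is convenient here because a user's app list changes gradually, so small additions or removals degrade the score smoothly rather than triggering an outright mismatch.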
The authors of [25] present an interesting contextual authentication mechanism (context-aware multimodal FIDO authentication, or CAMFA) for mobile phones used to access any service hosted remotely. CAMFA is compliant with FIDO (Fast IDentity Online, a technical standard for authentication systems). Here, the FIDO server defines the relying party's level of authentication (RP LoA) required for accessing a particular service. The FIDO client (on the relying mobile device) then utilizes this to request that CAMFA meet the level of authentication required for that service by means of explicit (PIN, face) or implicit (keystroke, location, placement) methods. The CAMFA mechanism monitors the risk level associated with the user's current situation through utilization of sensors embedded in the device. For example, the user's risk level changes in accordance with location and where the mobile phone is placed (e.g. hand, table, pocket). When the user's sensed behavioural information corresponds with the situational information, the risk level is computed to be low. With reference to both the computed risk level and the LoA required for the service the user is attempting to access, CAMFA combines different implicit and explicit methods (e.g. PIN, keystroke, location, face) to authenticate users. The authors of [26] propose a context-aware authentication mechanism for smart homes that utilizes the user's location, time, and other behavioural data for accessing the devices integrated into these homes. For making an access decision, this mechanism assigns different weights to different pieces of contextual information (e.g. location = 0.2, time = 0.1, calendar = 0.2, password = 0.3, preference = 0.2). The confidence level is computed using the defined weights and then compared against a threshold value to make the access decision.
The evaluation demonstrates the flexibility of the approach for assigning security levels to different users, as well as the appropriateness of the aforementioned contextual information (which can be obtained reasonably quickly) for making access decisions.
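The weighted confidence computation of [26] can be sketched directly from the weights quoted above; the 0.6 threshold is an assumed value, since [26] tunes it per user and security level:

```python
# Weights quoted from [26]; each contextual factor contributes its
# weight to the confidence score when it matches the stored profile.
WEIGHTS = {"location": 0.2, "time": 0.1, "calendar": 0.2,
           "password": 0.3, "preference": 0.2}
THRESHOLD = 0.6  # illustrative assumption

def confidence(matches):
    """`matches` maps each factor name to True/False: did the observed
    value agree with the user's stored profile?"""
    return sum(WEIGHTS[factor] for factor, ok in matches.items() if ok)

def access_granted(matches):
    return confidence(matches) >= THRESHOLD
```

With these weights, a matching password, location, and calendar (0.7) clears the threshold, while time alone (0.1) does not.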
Issues With Context-Aware Authentication: Table 4 presents a summary of contextual information used in different mechanisms. It is conspicuous that, in almost all of the mechanisms, location is used as the primary source of contextual information to be combined with some other form of information, such as time, phone details, or behavioral data. All of the approaches involve comparing different types of contextual information to compute a risk score which determines further requirements for authentication.
Although the context-aware authentication mechanisms described above have shown success in specific scenarios, they do have some limitations. For example, modern users access the same online services using many different types of devices (mobile phones, laptops, desktops, etc.) and the sensors required for establishing the necessary contextual information may not be present in many devices. For example, context-aware mechanisms that need an accelerometer to determine the position of a device for risk assessment (e.g., in [25]) will not work for laptops or desktops that have no accelerometer. Therefore, having a mechanism that can leverage rich contextual information-but which is also widely usable across all sorts of devices-is challenging. A suitable alternative might be to pair the user's location (i.e. contextual information) with the user's daily mobile phone activity to generate questions. Numerous works such as [28], [29] propose such authentication mechanisms, which reference call, SMS, and web logs to ask questions such as ''Who did you call first today?'' However, a problem with this approach is that such questions are easy to answer for close relations (e.g. partners, family or friends) who may in some cases be the adversary. To counter this issue, the user's historical location information may be referenced when presenting such questions. For example, when a person is at home, the authentication mechanism may ask questions related to activities performed at work and vice versa. Another anticipated problem with this approach is that it requires asking a series of questions, which may result in usability issues (and may only be suitable for fall-back or second-factor authentication). The selection of questions that are easy to answer for an actual user but difficult for an attacker to guess is also an open problem.

C. CONTINUOUS AUTHENTICATION
Traditional modes of authentication (e.g. passwords, biometrics) only provide entry-point security, which is to say that they only establish the identity of the subject (user, device, process) when the subject is attempting to access a secure service (or device). Once the subject passes this stage, there is generally no procedure to ensure that the authenticated subject is in on-going control of the session (or device). As mentioned earlier, passwords are frequently leaked or hacked. If any critical service is password protected and is somehow compromised, there is no way to confirm the subject's identity beyond the login stage. To address this issue, continuous authentication (also referred to as active, transparent, or implicit authentication) is widely advocated. For example, the authors of [30] use the colours of a user's clothes and facial skin to continuously authenticate the user during a login session. However, continuously capturing the user's photograph and sending it (to a remote service) for authentication may not only be computationally expensive, but can also have serious privacy implications. In [31], the authors proposed a continuous authentication mechanism that leverages the pattern of the user's hand movement while typing on the keyboard. The webcam (as on a laptop) is pointed towards the keyboard and a continuous video stream is fed to the algorithm, which in turn attempts to repeatedly authenticate the user. There are two significant issues with this approach. First, continuously streaming the video will be onerous and may degrade the performance of the system if any other intensive computation is also being carried out. Second, requiring that a webcam remain pointed toward a keyboard means it cannot be used for anything else, which necessitates the use of an additional external device that can be oriented in this way without interfering with use of the keyboard or monitor.
Likewise, the authors of [32] also use the user's typing behaviour for continuous user authentication. They use two features for accomplishing continuous authentication: key hold-time and inter-key time. However, this procedure may only work in situations where the user is actively engaged in typing activity. In many scenarios, the user may not be performing any task that involves typing, thus rendering the approach useless. A number of works have proposed continuous authentication mechanisms for mobile devices (e.g., smartphones). As smartphone designs incorporate numerous sensors (e.g. touch sensors, accelerometer, and gyroscope), they offer the opportunity to profile the user's behavioural characteristics (e.g. how the screen is tapped) and reference this profile for continuous authentication. For example, [33], [34] use the touch dynamics on the screen and extract features such as orientation of the finger, pressure exerted on the screen, area occluded, and time instances. These features are converted to vectors and fed to machine learning classifiers, which then model the user's behavioural profile using the extracted features for future reference.
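The two timing features used in [32] can be sketched as follows; the event format (key, press timestamp, release timestamp) is an assumption made for illustration:

```python
# Key hold time = release minus press for each keystroke;
# inter-key time = next press minus current release.
def keystroke_features(events):
    """Return (hold_times, inter_key_times) from timestamped key events,
    where each event is a (key, press_ts, release_ts) tuple."""
    hold = [release - press for _key, press, release in events]
    inter = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return hold, inter
```

Feature vectors of this kind would then be fed to a classifier trained on the legitimate user's typing profile, with deviations triggering re-authentication.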
Many works have utilized gait patterns (i.e. the way a person walks) to enable continuous user authentication. Gait-related data is captured by leveraging accelerometer and gyroscope sensor data. These methods either use the raw data corresponding to a gait pattern (and then use correlation or machine learning to perform continuous authentication, e.g. [35], [36]), or they extract features (e.g. fast Fourier transform or wavelet coefficients) from the captured data to train a classifier and perform continuous authentication (e.g. [37], [38]). A class of continuous authentication techniques taps passive biometrics such as medical signals, which do not require the user's active cooperation, unlike methods such as facial recognition, fingerprint recognition, and speaker voice recognition, which can distract the user. Many medical signals, such as brain activity tracked by electroencephalogram (EEG) [39], heart rate monitored by electrocardiogram (ECG) [40], and electrical activity of muscles monitored by electromyogram (EMG) [41], as well as ensembles of various other medical signals [42], have also been explored.
Continuous authentication mechanisms for devices which do not maintain physical contact with humans cannot leverage the aforementioned authentication schemes that monitor human behaviour or rely on biometric parameters. In such scenarios, ubiquitous parameters, such as radio frequency (RF) signals, and ambient parameters, such as light, temperature, and sound, which can be acquired by the devices without human interaction, have been proposed [43], [44]. Wireless channel parameters such as channel state information (CSI) and received signal strength (RSS), which change in the presence of users and their movements, can be used to continuously authenticate devices without human interaction. Ambient parameters can be similarly leveraged to verify the device location, as such parameters do not change drastically within short periods of time.
Issues With Continuous Authentication: Although the aforementioned approaches have demonstrated success in enabling continuous authentication in some particular scenarios, a few limitations are rather conspicuous. Almost all of these approaches are device-specific, i.e. they only work on the devices they are designed for. For example, an approach that uses touch dynamics (e.g., [33]) for continuous authentication will not work on other devices such as laptops or desktops. Therefore, the extension of such approaches for enabling continuous authentication for online services will be difficult, as such services can generally be accessed from a variety of devices (e.g. smartphones, laptops, desktops). The aforementioned approaches for continuous authentication are also scenario-specific. For example, approaches based upon gait patterns will only work while the user is walking. Likewise, most of the continuous authentication mechanisms are based on behavioural biometrics (e.g. the way a person types, taps, or walks) that tend to change under different circumstances (e.g. the tapping behaviour of a user may differ while sitting or walking). Therefore, the development of a continuous user authentication mechanism that can work across all sorts of devices and situations remains an open research problem. The aforementioned discussion reveals that most of the approaches for continuous authentication leverage the user's physical or behavioural biometrics. This presents obvious challenges for device design, as it requires the integration of new and improved components that have been optimised for authentication purposes.

D. DEVICE AUTHENTICATION
The Forrester extended ZTA ecosystem [6] considers devices such as IoT devices as potential threats to enterprise networks and suggests a ZTA principle that allows enterprises to segment, protect, and restrict devices connecting to the network. This principle is also referred to as Zero Trust Device (ZTD). According to the NIST ZTA, all data sources and computing services are considered resources. Resources can include several categories of devices, such as servers, workstations, mobile devices, and IoT and operational technology (OT) devices. Devices can also be categorised as enterprise-owned or personal. In order to implement zero trust for devices, it is essential to identify all the devices that connect to the network and what they access. The traditional methods used to authenticate people may not be appropriate for identifying and authenticating IoT and OT devices. Some of the main challenges include: • Most IoT and OT devices operate without human assistance, hence human-associated authentication factors are irrelevant.
• Machine-to-Machine (M2M) communications exist in IoT and OT networks which require new authentication mechanisms (e.g. mutual authentication) to authenticate devices [45].
• Due to the typical computational limitations of IoT devices, existing identity verification methods might be impractical for authentication between them [46].
In an IoT or OT-based system, more devices than people will connect to the network, and all must be authenticated. In the context of a ZTA, devices must be authenticated before messages can be exchanged between them for M2M communication. Popular authentication methods include symmetric key authentication, lightweight public key infrastructure (PKI), and Open Authorization 2.0 (OAuth2.0). In symmetric key authentication, a shared key is used between the sender and the receiver. Even though this method is easier to deploy, it is less secure than asymmetric key techniques. In asymmetric key authentication, digital certificates can be utilised to prove the identity of a device before communication is established. OAuth2.0 is a token-based authentication and authorisation scheme appropriate for authenticating IoT devices. In order to enhance the authentication process, hardware-based methods such as Trusted Platform Module (TPM) and Trusted Execution Environment (TEE) are being increasingly used to secure and process authentication. As most IoT devices are resource constrained and most authentication schemes discussed thus far are incorporated during device enrolment, the authors of [47] proposed an authentication and access control scheme for the entire IoT device life-cycle. The IoT device life-cycle consists of pre-deployment, ordering, deployment, functioning, and retirement. The main advantage of this scheme is that an IoT device might exist in multiple domains during its life-cycle and the authentication scheme can work across multiple domains. The proposed scheme is based on attribute-based access control, which is certificateless and can reduce computation costs on constrained devices.
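Symmetric-key M2M authentication of the kind discussed above is often realized as an HMAC challenge-response exchange. The following is a minimal sketch under assumed message formats; a real protocol would also need replay protection and key rotation:

```python
import hmac, hashlib, os

def respond(key: bytes, challenge: bytes) -> bytes:
    """Prove possession of the pre-shared key for a given nonce."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def mutual_auth(key_a: bytes, key_b: bytes) -> bool:
    """Each side challenges the other with a fresh nonce and verifies
    the peer's response against its own copy of the shared key."""
    ch_from_a, ch_from_b = os.urandom(16), os.urandom(16)
    # A verifies B's answer to A's challenge; B does the symmetric check.
    a_ok = hmac.compare_digest(respond(key_b, ch_from_a),
                               respond(key_a, ch_from_a))
    b_ok = hmac.compare_digest(respond(key_a, ch_from_b),
                               respond(key_b, ch_from_b))
    return a_ok and b_ok
```

Authentication succeeds only when both devices hold the same pre-shared key, which illustrates both the simplicity of the scheme and its weakness: compromise of one device exposes the shared key.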
In addition to the above mentioned authentication schemes for IoT devices, constrained devices require unique identities which are tamper-proof and cannot be easily imitated. This is discussed in detail in the following section.

1) IDENTITY OF DEVICES
Research efforts are being made to specify an identity of things (IDoT) scheme that can be used to enforce strong access policies within a ZTA. Simple device attributes such as International Mobile Equipment Identity (IMEI) number, manufacturer details, and model and firmware version would not be sufficient, as these can be sniffed from the network and used for imitating genuine devices [46]. A viable IDoT scheme should support the following properties:
• Unique device identification
• Tamper resistance and unclonability
• Adaptive authentication and access control
• End-to-end encryption
• Scalability
Similar to user authentication factors, four categories of information can be leveraged to verify the identity of devices during authentication [46]. These are shown in Figure 5. Inherited information is that which a device inherits from its hardware components. Information of this type is fetched by recognising attributes such as physically unclonable functions (PUFs). PUFs are design characteristics unique to an individual piece of hardware. For example, circuits can be integrated in a specific way that is verifiable via challenge-response behaviour between components [46]. By applying an electrical stimulus to the PUF, a response based on the interaction of the stimulus and the physical micro-structure of the device is produced. Due to the unclonable and unpredictable nature of the PUF response, this technique has emerged as a way to create lightweight, identity-based cryptosystems [48]. To date, PUF-enabled signature detection has been proposed for identity verification [49], [50] and securing IoT communications [51], [52]. However, this approach has also been shown to be vulnerable to modelling attacks that can be used to infer characteristics well enough to support cloning and redistribution [53]. In addition, multiple challenge-response pairs need to be collected to enrol the devices.
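The enrolment and verification flow for PUF challenge-response pairs (CRPs) can be illustrated with a toy simulation; here a per-device random secret stands in for the physical micro-structure, and the noise present in real PUF responses is ignored:

```python
import hashlib, os

class SimulatedPUF:
    """Stand-in for a hardware PUF: the hidden `_structure` models the
    unclonable physical variation of one device (illustrative only)."""
    def __init__(self):
        self._structure = os.urandom(32)

    def response(self, challenge: bytes) -> bytes:
        return hashlib.sha256(self._structure + challenge).digest()

def enrol(puf, n=4):
    """At manufacture time, collect n challenge-response pairs."""
    return {c: puf.response(c) for c in (os.urandom(16) for _ in range(n))}

def verify(puf, crps):
    """In the field, replay a stored challenge and compare responses."""
    challenge, expected = next(iter(crps.items()))
    return puf.response(challenge) == expected
```

Because each stored challenge should be used only once (to resist replay), a real deployment must enrol many CRPs per device, which is precisely the collection burden noted above.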
Adversaries can use machine learning based techniques to map detected responses against various challenges, which can in turn be used for device masquerading [54]. These issues have been addressed by using cryptographic schemes such as elliptic curve cryptography [54]. Similarly, the authors of [55] proposed a device authentication protocol for wireless devices which leverages the frequency response of the speaker-to-microphone (S2M) channel. Such hardware components have unique characteristics due to factors such as minute differences in manufacturing processes and other uncontrollable variables, which result in unique design instances that behave in ways partner devices can be made to recognise. Whenever device D1 must authenticate itself to another device, D2, D1 sends a sound to D2, which computes the received frequency response and compares it against the known fingerprint of D1 to make an authentication decision. While experiments demonstrated that this approach is resilient against replay attacks, it is only applicable to devices with embedded speakers and microphones. Numerous IoT devices do not feature these components, rendering the approach unusable in such instances. The authors of [56] proposed a continuous device-to-device (D2D) authentication protocol that leverages Wi-Fi channel state information and uses a dynamic function for frequent key updating to accomplish authentication continuously. However, this mechanism may only be employed on Wi-Fi-enabled devices.
The second category of IoT identity is the association a device has with other devices such as IoT gateways or smartphones. As IoT devices do not possess any identity-generating information such as hardware tokens, their association with IoT gateways can be used instead, as such associations are not expected to change regularly [46]. A common example of this is that a personal wearable device will transmit data to the cloud via the user-owned smartphone that this device has been paired with. Though this is practicable in the context of personal devices owned by specific individuals, it is not a feasible approach to identification in the context of an integrated industrial environment. The third category of IoT device identity is knowledge about the device. Information such as manufacturer, IMEI number, firmware version number, and serial number can all be used as ways to establish the identity of a device. The fourth category is contextual information about the operational environment within which IoT or OT devices are being used. Such information can be collected by monitoring the behaviour of a device in relation to its neighbouring devices to establish a baseline of historical patterns. Current behaviour can then be compared against this baseline to detect agreement or deviation.
Commercially, Public Key Infrastructure (PKI) has been widely adopted to provide unique identities to IoT devices. However, with PKI, certificates need to be managed (distributed, revoked, stored and provisioned) with regard to their use by IoT devices. Intrinsic ID, on the other hand [57] provides a leaner solution by using a static random-access memory (SRAM) PUF to generate a unique identity. IoT Identity management tools such as Ericsson's IoT identity access management (IAM) platform and Vouch's decentralised IoT IAM are two commercially available solutions currently being used to take IAM beyond traditional human-centric frameworks [58].
Issues With Device Authentication: The broader categories of device authentication are summarized in Table 6. However, these approaches have some associated problems: they are vulnerable to known attack vectors and cannot be extended to many of the device types ubiquitous in critical infrastructures. Various open research directions for device authentication are presented in [59].

a: NIST's REQUIREMENTS FOR DIGITAL IDENTITY ESTABLISHMENT
Verifying digital identity over open networks presents unique technical challenges associated with impersonation and other forms of attack. NIST provides recommendations for authentication processes and authenticators (called tokens in some specifications) which can be used to achieve various Authenticator Assurance Levels (AALs) [60]. The three AALs are defined as follows:
AAL1 - AAL1 provides some confidence that the claimant has control over an authenticator associated with the subscriber's account. AAL1 requires single-factor or multi-factor authentication, which can be accomplished using a variety of secure authentication mechanisms that the claimant can use to establish custody and control of the authenticator.
AAL2 - AAL2 offers higher confidence that the claimant has control over an authenticator. Through use of secure authentication protocols, possession and control of two separate authentication factors is established. At AAL2 and higher, approved cryptographic algorithms are necessary.
AAL3 - AAL3 gives a high level of assurance that the claimant is in control of the authenticator(s) associated with the subscriber's account. Authentication at AAL3 is based on a cryptographic protocol that verifies possession of a key. AAL3 authentication requires both a hardware-based authenticator and an impersonation-resistant authenticator; the same device can meet both requirements. Claimants must verify possession and control of two separate authentication elements using a secure authentication protocol to authenticate at AAL3. Approved cryptographic techniques are required.
Table 7 shows the requirements for different AALs. Interested readers are referred to [60] for more details on requirements and elaboration on different means of authentication (i.e. authenticator types) such as look-up secrets, out-of-band devices, and single-factor and multi-factor one-time password (OTP) devices. Various standards exist that can be used for authentication processes within ZTA. An overview of these standards is presented in Table 8. One or more of these standards can be used by an organization to achieve a particular authentication level. Although some of these standards do offer certification (e.g. FIDO functional certification to measure compliance and ensure interoperability among products and services that support FIDO specifications), it is advisable that certification be offered which covers all that is required at each AAL. As it stands, the higher the AAL, the closer it is to fulfilling the theoretical requirements for a ZTA.

IV. ACCESS CONTROL
The foundational requirement for ZTA is access control-the ability to ascertain the privileges of a subject (an authenticated user or a process executed on that user's behalf) and restrict access accordingly. In essence, the purpose of logical access control is to protect resources such as devices, data, and applications (referred to as objects) with respect to operations available to a subject (e.g. read, write, execute). To accomplish a particular operation on a specific object, the subject must satisfy the access control policies, i.e. if the policy is satisfied, access is granted to an object. These policies are part of the organisation's access control mechanism (ACM), and are derived from the business and security requirements of the enterprise. As described by NIST, the ACM is a logical component that assesses access requests and decides whether the subject is authorized to execute the requested operation [68]. ACMs can deploy numerous methods to define and enforce access control policies.

A. IDENTITY-BASED ACCESS CONTROL
Identity-based access control (IBAC) is a simple and coarse-grained approach to access control, in which access authorization is directly mapped to the subject's identifier. One approach to adopting IBAC is through the use of an Access Control List (ACL), which requires the system administrator to define, for each object, the access rights of every identity that can be recognized as a subject. The problem with this approach is that it does not scale in settings with dynamically changing groups of subjects and objects, because access authorisations must be revised every time a change is made. Moreover, access decisions take no account of contextual considerations such as business functions or characteristics; they rest solely on the identifiers themselves.
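A minimal sketch of IBAC via an ACL follows; the subjects, objects, and rights are illustrative:

```python
# Per-object ACL: each object maps recognized identities to the
# operations they may perform on it.
ACL = {
    "payroll.db": {"alice": {"read", "write"}, "bob": {"read"}},
    "design.doc": {"carol": {"read", "write"}},
}

def check(subject, operation, obj):
    """Grant only if this exact identity holds this exact right."""
    return operation in ACL.get(obj, {}).get(subject, set())
```

The scalability problem is visible in the structure itself: every new subject or object requires the administrator to add or revise entries, and nothing about the subject beyond its identifier influences the decision.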

B. ROLE-BASED ACCESS CONTROL (RBAC)
In contrast to IBAC, RBAC utilizes the roles of the different subjects within the organization to implement the access control mechanism. For example, a subject with the designated role of ''research-scientist'' will be allowed to access ''R&D'' related documents. In contrast, a ''procurement officer'' may not be allowed to operate on the aforementioned documents. In RBAC, access is predefined at the time of defining the roles (which in turn are explicitly related to the privileges). Whenever a subject attempts to access a resource, it is the subject's role which the ACM compares against rules to allow or deny an access request. In theory, RBAC enables the central management of access control without need for unwieldy ACLs. However, in practice, RBAC can easily result in ''role-explosion,'' or the accumulation of roles and privileges that endure beyond the times they are justified, which can also require significant amounts of work to prevent or correct. Variations of RBAC have been proposed in the access control literature which improve upon the original RBAC scheme. However, adoption of these solutions has been rather limited due to factors such as projected deployment costs, the infeasibility of certain assumptions in real world settings, and inherent limitations on achieving fine-grained access control when compared to more recent schemes (such as attribute-based access control).
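The role indirection can be sketched as follows, mirroring the research-scientist example above; role and permission names are illustrative:

```python
# Privileges attach to roles; subjects attach to roles.
ROLE_PERMS = {
    "research-scientist": {("read", "rnd-docs"), ("write", "rnd-docs")},
    "procurement-officer": {("read", "purchase-orders")},
}
SUBJECT_ROLES = {"alice": {"research-scientist"},
                 "bob": {"procurement-officer"}}

def allowed(subject, operation, obj):
    """Grant if any of the subject's roles carries the permission."""
    return any((operation, obj) in ROLE_PERMS.get(role, set())
               for role in SUBJECT_ROLES.get(subject, set()))
```

Granting a new hire access is a one-line role assignment rather than an ACL edit per object; conversely, ''role-explosion'' corresponds to `ROLE_PERMS` accumulating ever more narrowly scoped roles.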

C. ATTRIBUTE-BASED ACCESS CONTROL (ABAC)
As the name suggests, ABAC bases its access control mechanism on multiple attributes. In a sense, IBAC and RBAC can be seen as special cases of ABAC: IBAC uses the subject's ''identity,'' while RBAC leverages the subject's ''role'' to enforce access control. The difference is that ABAC policies define a complex Boolean rule set through which multiple attributes can be checked. As described in [68], ABAC can be defined as an ACM in which subjects' access requests to perform operations on objects are evaluated by taking into account the attributes of the subject, object, and environment, along with the policies defined around the aforementioned attributes and conditions. Environmental conditions refer to the context in which the access request is made, e.g. time, day of week, location, and risk level. An overview of ABAC is presented in Figure 6. Extensible Access Control Markup Language (XACML) and Next Generation Access Control (NGAC) support and implement the ABAC model. However, they differ significantly from each other in their approaches to defining and managing attributes, as well as in how access decisions are made and enforced. A comparison of XACML and NGAC is conducted by NIST and discussed in [69]. In a nutshell, XACML defines the policies by using logical formulae involving attributes, while NGAC utilizes enumerations that involve configurations of relations. Below, an overview of the advantages NGAC has over XACML is offered:
• NGAC utilizes a single linear decision-making algorithm that is applied over non-conflicting policies. By contrast, XACML involves multiple complex processes around the collection of attributes, condition matching, computation of rules, and resolution of any conflicts.
• NGAC allows complete separation of access control logic from the operational environment, unlike XACML, which only allows for partial separation.
• NGAC allows easy incorporation of Discretionary Access Control (DAC), which is difficult to accomplish in XACML.
• Unlike XACML, NGAC allows per-subject and per-object reviews of combined policies.
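Setting the XACML/NGAC machinery aside, the core ABAC evaluation described in [68] reduces to a Boolean rule over subject, object, and environment attributes. A minimal sketch follows; the attribute names and the rule itself are assumptions for illustration:

```python
# An ABAC policy is a Boolean rule over attributes of the subject,
# the object, and the environment in which the request is made.
def policy(subject, obj, env):
    return (subject["department"] == obj["owner_department"]
            and subject["clearance"] >= obj["sensitivity"]
            and 8 <= env["hour"] < 18          # office hours only
            and env["risk_level"] == "low")

request = {
    "subject": {"department": "R&D", "clearance": 3},
    "obj": {"owner_department": "R&D", "sensitivity": 2},
    "env": {"hour": 10, "risk_level": "low"},
}
decision = policy(**request)   # → True
```

The same request made at hour 22, or with an elevated risk level, is denied without any change to the subject's identity or role, which is precisely the granularity ABAC adds over IBAC and RBAC.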
Recently, access control has become of particular interest for IoT systems such as those employed in homes. For example, the authors of [70] proposed a lightweight ABAC scheme for IoT networks. Specifically, they considered a scenario in which a more powerful mobile device is accessing a smart device (e.g. a smart camera). This scenario is of particular interest as powerful devices could potentially take control of any device in a smart environment to violate user security and privacy (e.g. by accessing a smart camera to see inside the home). To enable ABAC in an IoT context, [70] proposes to collect contextual information, i.e. location, time, and network information such as MAC address and network type. Using this contextual information, the authors first compute trust locally to ensure that the requesting device meets a predefined trust threshold before being allowed access to any smart device or sensor. Once the local trust value is established, the access request, along with the trust value and context information, is sent to a cloud-based decision point. If the access request is assessed as safe, an access grant decision is sent to the gateway that manages access to the smart device.
The authors of [71] calculate the risk associated with an access request based on the user's contextual information and use this to control access to a device by an unauthorized person while also protecting information about the context of the device from an adversary. They presented an access control mechanism that captures and adapts to the user's perception of the context (i.e. feedback through user interaction) and conducts autonomous classification of this contextual information. Specifically, the location context was detected via a Global Positioning System (GPS) and Wi-Fi access points, and the social context was established by detecting people's proximal devices via Bluetooth. Subsequently, the context was classified by machine learning fed with carefully computed features from the contextual information. The outcome of the analysis was retrieved as a score, which was then used to make a conclusive decision about the access request. Similarly, the authors of [72] also proposed a combined trust- and attribute-based access control scheme for use in the IoT environment. They determined the user's trust by leveraging his/her behavioural characteristics. To compute the trust level, they leveraged fuzzy sets to determine the current trust level and then combined it with previous trust levels to arrive at a final trust level. Once the trust was computed, they updated the trust attributes database and combined it with the other static attributes, i.e. subject, object, operation, and environmental condition.
The authors of [73] also investigated the problem of an IoT device disrupting other devices within a home context, either by accident or maliciously. They demonstrated that ABAC is the appropriate way to enforce access control in smart homes as it can incorporate user, device, and environmental conditions into the access decision process. The authors argue that XACML is not an appropriate choice within an IoT context and that NGAC affords an easier and more efficient approach to adaptive policy definition and management.

D. RISK-BASED ACCESS CONTROL (RbAC)
Compared to the previously discussed access control models, risk-based access control is covered in far fewer proposals, as interest in it has been limited predominantly to the context of military operations. Recently, however, a few proposals have emerged which investigate the adoption of RbAC in the general IoT context. With RbAC, rather than relying on static and predefined policies, risk analysis is used to determine the risk associated with each particular request. Upon receiving an access request, the computed risk is compared against the access policies and the acceptable risk level to ascertain whether access should be granted.
Authors in [74] used fuzzy inference systems that leverage the security levels of the subject and the object to determine the risk for making access decisions. However, this approach is not scalable for the IoT environment as it requires an excessive amount of time to establish the risk value. Authors in [75] also used fuzzy modeling for measuring risk using action severity, risk history, and data sensitivity. However, instead of using real-time contextual information, the fuzzy rules were based on prior knowledge about the deployed scenario. More recently, authors in [76] used fuzzy inference in conjunction with expert knowledge to estimate risk when granting access in an IoT context. This approach used real-time contextual attributes of the subject making the access request to estimate the risk and make the appropriate access decision. These attributes included the user's context, resource sensitivity, action severity, and risk history to determine the risk.
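A simplified numeric stand-in for such risk estimation is sketched below. The cited works use fuzzy inference over these factors; here a weighted sum replaces the fuzzy rule base, and the weights and acceptable-risk threshold are illustrative assumptions:

```python
# Each risk factor is normalized to [0, 1]; the request is permitted
# only while the aggregate risk stays below the acceptable level.
WEIGHTS = {"action_severity": 0.4, "data_sensitivity": 0.4,
           "risk_history": 0.2}
ACCEPTABLE_RISK = 0.5  # illustrative threshold

def risk(factors):
    """Aggregate weighted risk score from normalized factor values."""
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)

def decide(factors):
    return "permit" if risk(factors) < ACCEPTABLE_RISK else "deny"
```

Unlike a static ABAC rule, the same subject and object can yield different decisions from one request to the next as the risk history or action severity changes.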

E. CAPABILITY-BASED ACCESS CONTROL
Capability-based access control leverages the concept of a capability to define the privileges of the subject. This concept was initially introduced in [77] as a token that gives its possessor privileges to access an entity within a computer system. The scheme relies on cryptographically signed tokens which determine the privileges of a subject to conduct a particular operation on an object. The authors of [78] proposed a distributed capability-based access control scheme for an IoT environment. This solution has two phases. In the first phase, the approach computes a session key by performing an authenticated key exchange. In the second phase, the session key is used to establish secure communication, and a capability token is used to gain access to the protected resource. However, capability-based access control demands that all devices act as a policy decision point (PDP), which may be deemed precarious for constrained devices (such as those which typically predominate in IoT applications).
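The signed-token idea can be sketched with an HMAC. The field names and the placeholder issuer key are assumptions; real schemes such as [78] additionally bind tokens to a session key and an expiry time:

```python
import hmac, hashlib, json, base64

ISSUER_KEY = b"issuer-secret"  # placeholder pre-shared key, for illustration

def issue(subject, obj, operations):
    """Issuer signs (subject, object, operations) into a capability token."""
    body = json.dumps({"sub": subject, "obj": obj, "ops": sorted(operations)},
                      sort_keys=True).encode()
    tag = hmac.new(ISSUER_KEY, body, hashlib.sha256).digest()
    return base64.b64encode(body).decode(), base64.b64encode(tag).decode()

def permits(token, operation):
    """A device holding the issuer key verifies the token locally."""
    body = base64.b64decode(token[0])
    tag = base64.b64decode(token[1])
    expected = hmac.new(ISSUER_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False  # forged or tampered token
    return operation in json.loads(body)["ops"]
```

Because the resource device verifies tokens itself, it effectively acts as its own PDP, which is exactly the burden the paragraph above flags for constrained devices.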

F. USAGE CONTROL (UCON)
The Usage Control Model is a more flexible model for authorization which focuses on the granularity of access decisions. In conventional models, the attributes of subject and object can only change either before or after the authorization, but not once the authorization is permitted. UCON introduces the concepts of mutable attributes, obligations, and conditions. The mutable attributes can change their values during the time that the subject is accessing a particular object, thereby enabling policy enforcement before the authorization and also continuously during access of the object. Therefore, the proper remedial actions (e.g. blocking access) can be taken immediately after the attributes are updated, even if the subject is still in the process of accessing the object. The authors of [79] showed the efficacy of UCON toward the enforcement of energy-saving and safety policies in a smart home. For example, they demonstrated the suitability of UCON in implementing policies such as ''the smart oven may only turn on if there is an adult present,'' so as to ensure that there is no gas leakage or safety risk to children in the home. The authors demonstrated that their access control scheme was capable of registering local and remote attributes in real-time, allowing permissions to be revoked if any of the attributes were updated. However, this approach requires a separate usage control system as well as an attributes manager on every node, which may again be precarious where resource-constrained devices are concerned. In line with the UCON approach, a number of proposals have emerged which focus on enhanced granularity in access authorizations. One of the main threats motivating this research has been increased recognition of insider threats within organizations.
For instance, the authors of [80] propose a Linux container-based solution for isolating system administrators from resources irrelevant to their current ticket's task while enabling them to obtain additional permissions when approved by the permission broker. A more granular and generic approach is proposed by Desmedt and Shaghaghi in [81]. Inspired by the concept of Functional Encryption, the authors propose Function-Based Access Control (FBAC). With FBAC, access authorizations are no longer stored as a two-dimensional Access Control Matrix (ACM). Instead, FBAC stores access authorizations as a three-dimensional tensor (called an access control tensor). Hence, applications no longer blindly give execution rights, and users can only invoke commands which have been authorized at different levels, such as data segments. Simply put, one might be authorized to use a certain command on one object while being forbidden to use the same command on another object. Evidently, this level of granularity and customization cannot be efficiently modeled using the classical access control matrix. The authors discuss the theoretical foundations of FBAC and argue that their proposed model enables a new generation of applications capable of enforcing access restrictions at unprecedented granularity. The proposed solution has not to date been studied in the context of the IoT, but it may bring enhanced access control capabilities to this domain if so deployed. It may be particularly relevant for emerging IoT software architectures such as the modular framework proposed in [82].
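The access control tensor at the heart of FBAC can be sketched as a set of permitted (subject, object, function) triples; the subjects, objects, and functions below are hypothetical:

```python
# Sketch of an FBAC-style access control tensor: authorizations are indexed
# by (subject, object, function) rather than by (subject, object) alone.
# Stored sparsely here as a set of permitted triples.

tensor = {
    ("alice", "payroll.db", "read"),
    ("alice", "reports.db", "read"),
    ("alice", "reports.db", "aggregate"),
    ("alice", "reports.db", "export"),
    ("bob",   "reports.db", "aggregate"),
}

def allowed(subject, obj, function):
    return (subject, obj, function) in tensor

print(allowed("alice", "payroll.db", "read"))    # True
print(allowed("alice", "payroll.db", "export"))  # False: same subject and
                                                 # object, different function
```

The second query illustrates the point made above: the same subject may hold one function on an object while being denied another function on that same object, which a flat subject-object matrix cannot express.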

G. ARCHITECTURES FOR ACCESS CONTROL
As indicated in [83], there are three main architectures for access control, namely, Policy-based, Token-based, and Hybrid architectures. In the following, we succinctly review the basics of these architectures:

1) POLICY-BASED ARCHITECTURE
A typical example of this architecture is XACML, which comprises a PEP, a PDP, a Policy Administration Point (PAP), and a Policy Information Point (PIP). Figure 7 illustrates the interaction between the different modules involved. The policies are designed by the PAP and are made available to the PDP (1). The subject makes the access request (2), which is received by the PEP and in turn forwarded to the PDP module (3). The PDP evaluates the access request against the set of available policies and, if any further information is needed, obtains it by consulting the PIP (4,5). Finally, the PDP sends the access decision to the PEP (6), which implements the restrictions.
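The request flow can be sketched as a minimal simulation of the four modules; the policy contents and attribute names are illustrative and do not follow the actual XACML policy language:

```python
# Minimal sketch of the policy-based (XACML-style) flow: the PAP supplies
# policies to the PDP, the PEP forwards requests, and the PDP consults a
# PIP for attributes missing from the request. Default decision is "deny".

PAP_POLICIES = [
    {"role": "nurse", "resource": "patient-record", "decision": "permit"},
]

PIP_ATTRIBUTES = {"alice": {"role": "nurse"}}  # attribute store queried by the PDP

def pdp_evaluate(request):
    # Fetch any attributes not supplied with the request from the PIP (steps 4, 5).
    role = request.get("role") or PIP_ATTRIBUTES.get(request["subject"], {}).get("role")
    for policy in PAP_POLICIES:                     # policies provided by the PAP (1)
        if policy["role"] == role and policy["resource"] == request["resource"]:
            return policy["decision"]
    return "deny"

def pep(request):
    # The PEP receives the request (2), forwards it to the PDP (3),
    # and enforces whatever decision the PDP returns (6).
    return pdp_evaluate(request)

print(pep({"subject": "alice", "resource": "patient-record"}))    # permit
print(pep({"subject": "mallory", "resource": "patient-record"}))  # deny
```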

2) TOKEN-BASED ARCHITECTURE
The policy-based solution generally has a single, centralized point for policy evaluation and enforcement, which may not be suitable for situations where the resources are distributed across multiple nodes, as in an IoT context [83]. Therefore, the token-based architecture has recently been introduced as a suitable alternative in which permissions to subjects are encoded in tokens by the authorization services. These tokens are then used for granting access to the objects. Amongst the many token-based standards, OAuth is widely used for allowing client applications to access the resources hosted on HTTP servers. Figure 8 shows the different components of a typical token-based access control flow. The subject makes an access request to the resource owner (1), which provides the authorization grant depicting the authorization for the subject (2). Using the authorization grant, the client requests an access token from the authorization server, which generates a token upon verifying the request. Once the token is obtained by the client, it requests the desired resource from the resource server, which validates the token and serves the request only if the token is deemed valid.
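A minimal sketch of this flow, with an authorization server issuing a scoped, expiring token that the resource server later validates; the token format is illustrative and is not the actual OAuth 2.0 wire format:

```python
# Sketch of token-based access control in the OAuth style: the authorization
# server signs a scoped, expiring token; the resource server validates the
# signature, scope, and expiry before serving the request.
import hmac, hashlib, time

AS_KEY = b"authorization-server-key"  # hypothetical signing key

def issue(subject, scope, ttl=3600):
    exp = int(time.time()) + ttl
    body = f"{subject}|{scope}|{exp}"
    sig = hmac.new(AS_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}|{sig}"

def resource_server(token, required_scope):
    body, _, sig = token.rpartition("|")
    good = hmac.compare_digest(
        sig, hmac.new(AS_KEY, body.encode(), hashlib.sha256).hexdigest())
    subject, scope, exp = body.split("|")
    if good and scope == required_scope and int(exp) > time.time():
        return f"resource served to {subject}"
    return "invalid token"

token = issue("client-app", "read:photos")
print(resource_server(token, "read:photos"))    # resource served to client-app
print(resource_server(token, "delete:photos"))  # invalid token
```

The key point of the architecture survives even this simplification: the resource server needs no contact with the policy store at request time, only the ability to validate the token.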

3) HYBRID ARCHITECTURE
As indicated in [83], the token-based approach requires a level of user interaction which may be considered burdensome by the user. In view of this, the authors of [84] proposed a new approach referred to as User Managed Access (UMA), which extends OAuth, thereby allowing the configuration of policies in the authorization server so as to automate the generation of tokens without user interaction. In this way, UMA combines features of the policy- and token-based approaches.

Table 9 presents a cross-comparison of the aforementioned approaches to access control [85]. As indicated in [83], there is no one-size-fits-all approach which will work across all the different scenarios, from smart homes, to smart buildings, to IoT-based health applications. The reason for this is that different use-cases impose different requirements for policy management and evaluation. Irrespective of the specific scenario, an authorization framework should be fine-grained and context-aware. Both of these conditions are satisfied when either ABAC or UCON is used as the access control model. The most important requirement for an access control framework for smart homes is usability, meaning that the framework should demand minimal effort from the home owner to define policies. The optimal solution for this scenario will be a centralized and policy-based architecture in which policies are autonomously generated by leveraging contextual information. In addition, the PAP should aid the home owner in configuring and modifying policies. Latency can be tolerated to a certain extent in a smart home, thereby allowing for a run-time policy evaluation strategy. The PDP can be deployed on an edge device, such as an IoT gateway or a local cloud. As indicated in [83], most of the frameworks designed for smart homes do not fulfil the aforementioned requirements. They tend to define policies which are either coarse-grained or involve overly simplistic consideration of environmental conditions.
Similarly, the authorization framework for IoT applications in healthcare should require minimal effort for administering the policies for multiple devices. In addition, it should have low latency (as latency in this context can potentially lead to a life-threatening condition), and should also take into account the constrained capabilities of medical devices. Therefore, the PDP should be deployed on edge devices, while the PEP might reasonably be located in the device itself. Most of the authorization frameworks proposed for healthcare either do not reasonably consider usability requirements or adopt run-time evaluations of policies that can lead to latency issues. By contrast, contexts such as smart buildings and interconnected vehicles mostly involve direct device-to-device (D2D) communication with little or no requirement for user interaction. This means that usability is not a significant requirement in a framework designed for this scenario. Instead, interoperability, latency, and automated decision-making capability are the most important requirements for fully automated scenarios. A framework similar to that described for health IoT applications could work for this use-case, with the added functionality of a PDP that allows the correct interpretation of policies from different administrative domains. As indicated in [83], however, most of the frameworks proposed for this scenario suffer from latency issues.

H. BLOCKCHAIN BASED ACCESS CONTROL
The shortcomings of traditional centralized access control models have been well documented and analyzed. For example, the authors of [86] indicated that the reliance on a trusted third party for access control can lead to a single point of failure.
Recently, blockchain has shown some promise as a suitable alternative for access control. As the blockchain is inherently distributed, it helps to alleviate problems associated with the centralized approach (i.e. single point of failure and risk of privacy leakage). Furthermore, it also aids in maintaining a trusted log, and a smart contract can help with enforcing complex access privileges. For instance, the authors of [87] leverage blockchain to implement a solution that continuously inspects access authorisations with the goal of providing auditability of data access for personal health record system users and facilitating detection of anomalies. Authors in [88] proposed leveraging blockchain for publishing the access policies corresponding to a resource and for distributed transfer of access rights amongst the users. This allows the users to monitor the policies related to a particular resource and determine who has rights to access that resource. This also helps to prevent a fraudulent denial of access rights granted by the enforceable policy. The authors demonstrated the efficacy of this approach by deploying the policies defined in XACML on the Bitcoin blockchain. The authors of [89] used blockchain-enabled smart contracts to enforce the access control policies, thereby enabling subjects to verify that the policies are correctly enforced while also minimizing the chances of fraudulent denial of access by a malicious third party. They store the smart contract representing the access control policy on the blockchain with a proper transaction whenever it is created by the resource owner. When a subject makes an access request, a transaction is generated with a reference to evaluate the policy and make the access decision. They demonstrated the efficacy of this approach by codifying the XACML policies into a smart contract (using the Solidity language) and deploying it on Ethereum. This approach provides a benefit against malicious denial of access (e.g. where a policy enforcement party fraudulently causes the system to deny access), as the subject can see how the policies are being enforced. Likewise, the authors of [90] proposed attribute-based access control using a consortium blockchain for the IoT. This scheme has two main components: attribute authorities and IoT devices. The attribute authorities simultaneously act as consortium nodes and as the key generation center. They act as the managers of the blockchain, and use a consensus mechanism to jointly manage the distributed ledger. Simultaneously, for every IoT device that registers with the system, they generate a pair of public and secret keys based upon its identity (i.e. by using identity-based cryptography), which the devices can use to mutually authenticate one another and agree on a session key. IoT devices use the attributes assigned by the attribute authorities to prove permission before they can exchange data.
There are a few challenges in using blockchain for access control. For example, the blockchain needs all transactions to be recorded on all peers, for which a consensus mechanism is used. Recently, lightweight consensus approaches have been adopted to improve performance; however, this performance is still not comparable to that of centralized solutions [91]. In addition, the transactions in a blockchain are inherently transparent, which is not desirable from a privacy perspective. For this reason, the permissioned blockchain emerged, which provides privacy at the cost of decentralization. Likewise, as pointed out in [91], maintaining and improving the security of smart contracts and blockchain is also challenging.
The evolution of quantum computers poses a serious threat to public-key cryptosystems and digital signatures, thereby warranting modifications of blockchain for the post-quantum era. In view of this, many efforts are currently being undertaken to standardize post-quantum cryptosystems (PQCs). For example, the authors of [92], [93] proposed some modifications to blockchains for the post-quantum era. However, since the standardization of post-quantum cryptosystems is still in process, the proposed modifications to blockchains will only be validated once the standardization process is concluded. Similarly, there are numerous issues with post-quantum blockchains, as pointed out in [94]. These include large key and signature sizes with corresponding impacts on performance. We refer readers to [94] for more detailed discussion on post-quantum blockchains and related challenges.

V. ENCRYPTION
As indicated in Section I, zero trust implies tight control over data. Given this, encryption is important to protect data at rest, in transit, and during processing. Encryption must be used to protect important enterprise data stored (i.e., at rest) in computing devices and portable storage devices (e.g. USB flash drives). However, modern attackers have crafted numerous methods to retrieve encrypted data at rest; methods involving insiders and cryptographic or data integrity attacks have proven successful. Different methodologies such as data fragmentation and active defence attempt to remedy these problems. For example, [95] showed an effective way (referred to as ''Horus'') for data encryption in high-performance computing systems. Data fragmentation techniques such as the Tahoe Least-Authority File Store and Storj [96], and active defence technologies such as CryptoMove [97], help to protect data by distributing, transferring and mutating the encrypted data in such a way that it is difficult to identify, retrieve or damage the data. In addition to the aforementioned simplistic situation in which data at rest is encrypted, it is also important to protect data while in the processing stage. Data is increasingly being managed and processed in public (or private) clouds due to their ubiquity and other associated advantages. This creates a problem, as the cloud server would need access to the encryption keys for processing the data, leading to security concerns. Processing the data locally (i.e. by downloading the data and decrypting it using the secret key) is challenging and computationally expensive. Given this, two emerging techniques, homomorphic encryption and Secure Multi-Party Computation (SMPC), are of particular interest where computation on encrypted data is concerned.
Unlike conventional encryption methodologies, homomorphic encryption allows computations to be performed on encrypted data without needing the secret key, producing an encrypted output (of the computation) from which the owner of the data can retrieve the plaintext using the key. Prior research shows that homomorphic encryption-being based upon Ring Learning with Errors (RLWE) and its relation to the hard mathematical problem of high-dimensional lattices-is secure against current quantum computers. This makes it more secure than RSA and other cryptographic approaches that are based upon elliptic curves. Given that more and more enterprises are switching towards cloud environments for storing and computing on data, the importance of homomorphic encryption is becoming apparent. However, the lack of standardization of homomorphic encryption is making it difficult to enable its widespread use. This also makes it difficult to have a uniform and simplified Application Programming Interface (API), making it hard for application developers to understand and use APIs in this area. In addition to outsourcing data to clouds, homomorphic encryption can also help critical infrastructures such as healthcare to enable privacy-sensitive computation which is otherwise not possible. For example, due to data privacy issues in healthcare, predictive analytics is difficult to conduct. However, homomorphic encryption can realize such analysis by performing computation on encrypted data, thereby reducing the privacy concerns. Numerous implementations of homomorphic encryption are listed in [98]. The recent advancements in pervasive computing necessitate cooperative computation on data shared by many parties while maintaining the data confidentiality of the individual parties. This joint computation can be accomplished in a privacy-preserving way through the cryptographic primitive of SMPC.
Existing SMPC solutions, including an overview of cloud-assisted cooperative methods, their architectures, and SMPC protocols for different scenarios such as the privacy-preserving machine learning needed in many applications, are discussed in [99].
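The basic idea behind many SMPC protocols can be illustrated with toy additive secret sharing, in which each party splits its private input into random shares so that no individual share reveals anything, yet the shares jointly determine the result; this is only a teaching sketch of the primitive, not any specific protocol from [99]:

```python
# Toy additive secret sharing: each private value is split into random
# shares over a prime field; each party sums one share of every input, and
# the partial sums combine to the joint result without revealing inputs.
import random

P = 2**61 - 1  # arithmetic over a prime field

def share(secret, n_parties):
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)  # shares sum to the secret mod P
    return shares

def secure_sum(private_inputs):
    n = len(private_inputs)
    all_shares = [share(x, n) for x in private_inputs]
    # Party i holds the i-th share of every input and publishes only its partial sum.
    partials = [sum(column) % P for column in zip(*all_shares)]
    return sum(partials) % P

print(secure_sum([12, 30, 7]))  # 49, with no party revealing its own input
```

General SMPC also handles multiplication and comparisons, which require interaction between the parties; the sketch above covers only the additive case.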

A. LIGHTWEIGHT ENCRYPTION
The abundance of constrained sensing and computing devices, such as IoT and sensor networks (and M2M communication in some situations), necessitates lightweight cryptographic methods which can conveniently work on devices with limited processing, storage, and power resources. Although conventional methods like AES (encryption), SHA-256 (hashing), and RSA/elliptic curve (signing) perform well on most computing devices such as laptops, desktops, and smartphones, they fail to perform optimally on IoT devices and embedded systems (e.g. RFID devices and sensor networks). For such constrained devices, lightweight cryptography is highly desirable, as indicated in NIST's report [100]. The purpose of lightweight cryptography is to consume fewer resources (i.e. processing power, memory, and energy) so that it can run on constrained devices. To accomplish this, lightweight designs often use smaller blocks and keys and simpler rounds of calculation. However, this simplification comes at a cost in security (e.g. [101] demonstrated that 128-bit AES implemented on an Arduino could be broken in only 30 minutes by leveraging differential and correlation power analysis). In hardware and software implementations of lightweight cryptography, RAM, energy, implementation size, and throughput are the important metrics to be considered.
The NIST-recommended methods for hashing (as a part of its early initiative on lightweight cryptography [100]) are SPONGENT, Quark, PHOTON, and Lesamnta-LW, as they all have a small memory footprint and a small input block. SPONGENT makes use of a sponge function and is based upon a finite state machine which cycles through states as the input data is added. In a sponge construction, a fixed-length permutation and padding are used to transform an input of arbitrary length into an output whose length is defined as part of the process. Precisely, the sponge construction makes use of a function (f) and has two phases, namely, ''absorption'' and ''squeezing.'' In the absorption phase, r bits of input at a time are XORed into the state, interleaved with applications of f. In the squeezing phase, r bits of the state at a time are read out as output, again interleaved with applications of f. The length of the output (in bits) is defined as part of the hashing process. Lesamnta-LW is based on AES (with an S-box structure similar to that of AES), is 5 times faster than SHA-256, and requires only 50 bytes of RAM on an 8-bit processor. Quark also uses a sponge function and can be used for both hashing and stream encryption. The three variants of Quark (i.e., u-Quark, d-Quark, and s-Quark) achieve 64- to 112-bit security. PHOTON, on the other hand, is also based on AES and creates an 80- to 256-bit hash. PHOTON can accept input of any length and produce an output of variable length. The detailed method of PHOTON is described in [102]. A comparison of different lightweight methods for hashing is presented in [103].
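The absorb/squeeze structure described above can be illustrated with a toy sponge hash; the permutation f below is a bijective stand-in with no security properties, not the SPONGENT, Quark, or PHOTON permutation, and the padding is deliberately simplistic:

```python
# Toy sponge construction: absorb r-bit blocks of input by XORing them into
# the state and applying a permutation f, then squeeze r-bit blocks of output.
RATE, CAPACITY = 8, 24                 # r and c bits; state is r + c = 32 bits
MASK = (1 << (RATE + CAPACITY)) - 1

def f(state):
    # Placeholder permutation: multiply-and-xorshift rounds (bijective on 32 bits).
    for _ in range(8):
        state = (state * 0x9E3779B1) & MASK
        state ^= state >> 13
    return state

def sponge_hash(data: bytes, out_bytes: int = 4) -> bytes:
    state = 0
    for byte in data + b"\x80":        # absorb r = 8 bits per step (simple padding)
        state = f(state ^ byte)
    out = bytearray()
    while len(out) < out_bytes:        # squeeze r bits of state per step
        out.append(state & 0xFF)
        state = f(state)
    return bytes(out)

print(sponge_hash(b"hello").hex())
```

Because the output is squeezed block by block, the same construction yields a variable-length output, which is exactly the property the text attributes to PHOTON.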
One promising alternative to AES for lightweight encryption is PRESENT [104], which uses either an 80- or 128-bit encryption key. It operates on 64-bit blocks and employs a substitution-permutation network (SPN) to produce the encrypted output. PRESENT generally has 32 rounds, with key-operation, S-box and P-box layers in each round. In operation, the key round performs an XOR between the key and the input data, followed by a 4x4-bit S-box (i.e., substitution) which helps in reducing the processing power in comparison with AES. Another alternative is XTEA [105], which also operates on 64-bit blocks and uses a 128-bit key. XTEA is fast and has a small code size. Other options are SIMON (for optimized hardware implementation), with a key size of 64-256 bits and a block size of 32-128 bits, and SPECK (for optimized software implementation). MICKEY v2, Trivium, Grain and Enocoro are lightweight stream ciphers with low resource requirements, as indicated in [103]. Similarly, CLEFIA is a lightweight block cipher that requires only 6k gates for implementation and has a block size of 128 bits and a variable key size ranging from 128 to 256 bits. CLEFIA [106] is also included in ISO/IEC 29192 as a standard for lightweight encryption. RC5 is a flexible method that can support key sizes of up to 2048 bits and block sizes of 32-128 bits; these parameters can be matched to the resources available on the device and its security requirements. For the lightweight signing of messages, Chaskey is a suitable option that uses a 128-bit key and requires around three thousand gates, in contrast with SHA-256, which needs approximately fifteen thousand gates.
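To make the simplicity of such designs concrete, a pure-Python sketch of XTEA [105] is shown below: 64-bit blocks held as two 32-bit words, a 128-bit key as four 32-bit words, and 32 cycles of add/shift/XOR operations:

```python
# Sketch of XTEA encryption/decryption. Each cycle updates the two 32-bit
# halves of the block using additions, shifts, and XORs keyed by the
# round constant schedule; decryption runs the same steps in reverse.
DELTA, MASK = 0x9E3779B9, 0xFFFFFFFF

def xtea_encrypt(v0, v1, key, cycles=32):
    s = 0
    for _ in range(cycles):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + key[s & 3]))) & MASK
        s = (s + DELTA) & MASK
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + key[(s >> 11) & 3]))) & MASK
    return v0, v1

def xtea_decrypt(v0, v1, key, cycles=32):
    s = (DELTA * cycles) & MASK
    for _ in range(cycles):
        v1 = (v1 - ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + key[(s >> 11) & 3]))) & MASK
        s = (s - DELTA) & MASK
        v0 = (v0 - ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + key[s & 3]))) & MASK
    return v0, v1

key = [0x00010203, 0x04050607, 0x08090A0B, 0x0C0D0E0F]
ct = xtea_encrypt(0xDEADBEEF, 0x01234567, key)
print(xtea_decrypt(*ct, key) == (0xDEADBEEF, 0x01234567))  # True
```

The entire cipher fits in a handful of arithmetic operations per round, which is precisely what makes it attractive for code-size-constrained devices.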
It is noteworthy that NIST is currently standardizing lightweight cryptography methods for constrained devices through an open, competition-like process (the second round of which was completed in 2020) [107]. Therefore, the aforementioned methods, which were included in NIST's initial report on lightweight cryptography, may only be seen as a reference until the standardization process is concluded. We refer readers to [107] for further details on the standardization process and recent entries. Table 10 presents a summary of lightweight hashing and encryption methods. However, the so-called Cryptographically Relevant Quantum Computer (CRQC) [108] is envisaged to be particularly precarious for asymmetric encryption (and, to a lesser degree, for symmetric encryption and hashing). Though symmetric encryption mechanisms (e.g. AES) and hashing are secure against quantum attacks [109], they still need larger key sizes and hash lengths to maintain the necessary level of security [110]. This is particularly problematic for resource-constrained devices with limited memory and weak computational power. In view of this, recent research works [111], [112] have attempted to design quantum-safe lightweight cryptosystems using quantum permutation pads. However, significant research efforts are needed to analyze these approaches against known attack vectors and to evaluate their suitability for legacy hardware.

B. LIGHTWEIGHT MUTUAL AUTHENTICATION
Mutual authentication is of particular interest in the IoT environment. For example, an attacker can take over a vulnerable device (e.g. a sensor) and feed falsified data to the server to intentionally induce a bad decision, which in critical infrastructure can prove to be fatal (consider a traffic control system). This suggests that both sensor and server should mutually authenticate one another prior to data exchange. Again, as the devices in a typical IoT environment are generally constrained, the protocol must be lightweight. In this pursuit, a number of lightweight mutual authentication approaches have been introduced. For example, [113] presents a lightweight key agreement protocol (Algebraic Eraser, AE) which makes use of one-way E-multiplication; however, its complexity increases linearly with the required security level. Besides this, vulnerabilities of [113] are indicated in [114]. Likewise, another approach, referred to as NTRU, is proposed in [115]. NTRU is based upon probability theory and polynomial algebra, and has received much attention recently due to its speed; unfortunately, it suffers from a large key size. Similarly, the authors of [116] proposed a public-key encryption approach and demonstrated the efficacy of using it for a mutual authentication mechanism on constrained devices. The proposed encryption scheme is not computationally expensive, so it may be suitable for constrained devices. This mutual authentication protocol is shown to have benefits over other methods such as AE, NTRU, and Elliptic Curve Cryptography (ECC). Furthermore, the authors demonstrated that this scheme takes only around 125 ms for mutual authentication on constrained devices. However, the evaluation has only been conducted on a Texas Instruments development kit; thus, its actual performance on constrained devices within an operational environment remains unclear.
Likewise, the authors of [117] proposed a lightweight encryption, key management, and authentication suite for IoT devices. They use one-key-for-one-file encryption, where the encryption key is generated using a random number and a keystroke seed which is hard-coded in the hardware security module of the device, thereby requiring no key to be either maintained by the devices or transported between them. For encryption, the device picks a random number and uses it along with the keystroke seed to generate the key. Once encryption is done, the key is deleted and the random number is sent to the receiving party. The authors assume that all IoT devices within a network have obtained the unique identification of every other device during the configuration process. For mutual authentication, the sender transmits a random number (n1) to the receiver, together with its identification and a time stamp encrypted with the key generated (as described above) from n1. Upon receipt, the receiver uses the random number (n1) and the keystroke seed to generate the key and decrypts the message to ascertain the device identity and time stamp. The receiver then sends back a random number (n2), together with the modulo-2 addition of its unique identification and the sender's identification, along with a time stamp, all encrypted using the key generated from n2. The sender can decrypt this message using the key generated from n2 to confirm the receiver's identity and complete the mutual authentication. Although this approach is lightweight and could potentially replace IPsec in IoT deployments, the assumption that all devices within the network possess each other's unique identities may not always hold.
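The nonce-based exchange of [117] can be sketched as follows; the key-derivation function (HMAC-SHA256) and the XOR-keystream cipher are stand-ins chosen for illustration, not the primitives used in [117]:

```python
# Hedged sketch of nonce-based mutual authentication in the style of [117]:
# each side derives a one-time key from a transmitted random number and a
# pre-shared "keystroke seed", so no key ever travels on the wire.
import hmac, hashlib, os, time

SEED = os.urandom(16)  # keystroke seed, pre-provisioned in both devices

def derive_key(nonce: bytes) -> bytes:
    return hmac.new(SEED, nonce, hashlib.sha256).digest()

def xor_crypt(key: bytes, msg: bytes) -> bytes:
    stream = (key * (len(msg) // len(key) + 1))[:len(msg)]
    return bytes(a ^ b for a, b in zip(msg, stream))

# Sender -> receiver: nonce n1 plus identity and timestamp encrypted under k(n1).
n1 = os.urandom(8)
msg1 = xor_crypt(derive_key(n1), b"device-A|" + str(int(time.time())).encode())

# Receiver derives the same key from n1 and recovers the sender's identity...
sender_id = xor_crypt(derive_key(n1), msg1).split(b"|")[0]
print(sender_id)  # b'device-A'

# ...then replies with its own nonce n2 and an encrypted response, which the
# sender decrypts with k(n2) to complete the mutual authentication.
n2 = os.urandom(8)
msg2 = xor_crypt(derive_key(n2), b"device-B|" + sender_id)
print(xor_crypt(derive_key(n2), msg2))  # b'device-B|device-A'
```

The sketch also makes the scheme's stated assumption visible: both sides must already share the seed and each other's identities before the exchange begins.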
Issues With Lightweight Mutual Authentication: In addition to the specific problems identified above, the generic problem for PKI-based mutual authentication is the threat of Cryptographically Relevant Quantum Computer (CRQC) attacks. As discussed above, CRQCs are envisaged to be capable of breaking the conventionally difficult mathematical problems which form the basis of PKI. Therefore, research efforts are needed to develop quantum-safe lightweight mutual authentication mechanisms. NIST is currently conducting a standardization process for post-quantum public-key cryptography. However, it is not clear whether the methods proposed in Round 3 of the process (code-based, isogeny-based, hash-based, lattice-based, and multivariate-system-based solutions) will be sufficiently lightweight to work on the kinds of resource-constrained devices ubiquitous in critical infrastructures.

VI. SEGMENTATION AND SOFTWARE DEFINED PERIMETER
According to NIST, ZTA can be applied in an enterprise using various approaches [3]. These approaches inform how a PEP is implemented and what the driving policies are. The two most common approaches, which we will discuss here, are micro-segmentation and software defined perimeters.

A. MICRO-SEGMENTATION
Micro-segmentation defines security measures and where those measures are implemented in the network. The core principle of micro-segmentation is the implementation of security policies closer to the resource being protected, effectively breaking a network infrastructure into smaller logical ''segments'' to efficiently protect a single resource (or logical group of them). Micro-segmentation enables only authorised entities within the data centre to access the application or data on protected resources, thereby preventing lateral movement by an attacker. Devices such as Next Generation Firewalls (NGFW) or security gateways, which will act as a PEP and enforce policies defined in the PE, have been proposed by NIST in its ZTA proposal [3].
Traditional network segmentation techniques like Virtual LANs (VLAN), routers and firewalls prove to be ineffective in providing granular security to workflows. In order to protect the east-west traffic flowing in data centres, granular security controls are required to enforce strict security policies between individual resources. Figures 9 and 10 show the traditional and micro-segmented network architectures. The micro-segmented network architecture has a micro-perimeter (firewall or security gateway) enforcing access policies to applications and the data in resources. Micro-segmentation techniques can be applied using various deployment models [118]:
• Native micro-segmentation: In this model, micro-segmentation is achieved ''natively'' using the underlying infrastructure, such as the hypervisor or Operating System (OS) used to deploy the application servers. This allows the access policies to be deployed on the infrastructure or OS directly, without the use of external hardware or software solutions. In addition, the security policies can be applied to all applications instead of just a few high-priority ones. The advantage of this approach lies in applying fine-grained security controls closer to the applications without additional hardware. However, this model can only be implemented in the virtual environments where the workloads are operating.
• Third-party model: In this model, virtual firewalls supplied by third-party firewall vendors are deployed and used to configure access policies between virtualised servers. This allows a large number of distributed virtual firewalls to be controlled from a single location and global access policies to be applied according to workload needs. However, this approach requires giving the virtual firewalls visibility of the workload traffic.
• Overlay model: This model uses agent software running on servers and a central controller or orchestration device to gain visibility into workflow communications and enforce dynamic access policies. The advantage of this approach is that the agents have greater visibility over an individual workload's communication patterns, allowing for dynamic deployment of access policies using a central controller.
Most micro-segmentation implementations are network-dependent and require programmable software-based network equipment such as firewalls and switches, where access policies are managed using centralised controllers [119]. In contrast, implementations that use virtual firewalls or overlay networks are network-independent and can operate independently of the underlying network technologies.
The policies that need to be configured on these micro-perimeters require an understanding of the complete life-cycles of workflows and the complex interactions these entail within the enterprise network. In a network-dependent approach, the identified network flows related to a particular workflow need to be translated into network-based access rules using the network identities of the applications that require access. An example of such an access rule would be ''application-A can access database-A,'' which can be translated into network-based policies as ''IP-address-App-A can access IP-address-DB-A:port2.'' The drawback of this approach is that workflows are identified and granted access solely on the basis of their network identities, which can be spoofed or forged. Network-independent approaches, on the contrary, use workload identities to create fine-grained policies. A few network-independent micro-segmentation approaches, given in [119], are discussed below.

1) TRANSPORT-LEVEL ACCESS CONTROL [120]
In this approach, access control is achieved by packet authentication at the start of a TCP/IP communication. A steganographic overlay technique is used to embed identity tokens in the TCP connection initiation packet (TCP-SYN). The identity is first verified, and the remainder of the TCP handshake is carried out only if access is granted for the requesting identity. The drawback of this approach is that it can only be used with the TCP protocol and is not compatible with other (connectionless) protocols such as UDP. In addition, the heavy cryptographic load incurred before TCP connection establishment can be exploited by denial-of-service attacks to overwhelm server resources.
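The network-dependent rule translation described earlier, and the spoofing weakness it carries, can be sketched as follows; the application names, addresses and ports are hypothetical placeholders, not drawn from any real deployment:

```python
# Illustrative sketch: translating a workflow-level access rule into a
# network-level policy via an inventory of network identities.
inventory = {
    "application-A": {"ip": "10.0.1.10"},
    "database-A": {"ip": "10.0.2.20", "port": 5432},
}

def translate(rule):
    """Turn ('application-A', 'database-A') into an IP/port allow rule."""
    src, dst = rule
    return {
        "action": "allow",
        "src_ip": inventory[src]["ip"],
        "dst_ip": inventory[dst]["ip"],
        "dst_port": inventory[dst].get("port"),
    }

policy = translate(("application-A", "database-A"))
# The resulting rule matches on network identities only: anything that
# spoofs 10.0.1.10 inherits application-A's access, which is precisely
# the weakness noted above.
```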

2) LABEL-BASED ACCESS CONTROL [121]
This approach assigns labels to various workflows, which are used to group workflows and apply access policies based on those labels. This makes the access control policies independent of the protocols used and applicable to various types of workflows. However, as is the case with network-based identities, these labels can be susceptible to spoofing attacks.
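The label-based idea can be illustrated with a minimal sketch; the labels, workloads and the single policy below are invented for illustration:

```python
# Label-based access control: policies reference labels rather than
# network identities or protocols. All names here are hypothetical.
workload_labels = {
    "wl-frontend": {"tier:web", "env:prod"},
    "wl-db": {"tier:db", "env:prod"},
    "wl-test": {"tier:web", "env:test"},
}

# A policy allows traffic when source and destination carry the required
# labels, regardless of protocol or address.
policies = [
    {"src": {"tier:web", "env:prod"}, "dst": {"tier:db", "env:prod"}},
]

def allowed(src, dst):
    s, d = workload_labels[src], workload_labels[dst]
    return any(p["src"] <= s and p["dst"] <= d for p in policies)

# Note: if labels can be forged, a spoofed workload claiming
# {"tier:web", "env:prod"} would be granted the same access.
```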

3) DPI-BASED ACCESS CONTROL
In this approach, Deep Packet Inspection (DPI) engines are used to inspect packet contents at various layers to either allow or reject connections [119].

4) API-AWARE ACCESS CONTROLS
This approach depends on breaking workflows up into smaller container-based (e.g., Docker and Kubernetes) services that communicate with each other using APIs [122]. Table 11 lists various commercially available micro-segmentation products and their adopted deployment approaches.

a: GENERIC ISSUES WITH GRANULAR POLICY ENFORCEMENT
Allowing access only to specific authorised hosts from within the LAN reduces the scope for lateral movement in malicious activities; nevertheless, such translation of workflow access rules into network-level access rules can lead to misconfiguration [119]. In addition, in a complex data center with a large number of workflows, identifying all the possible interactions between workflows and translating them into accurate access policies can be challenging and can disrupt existing workflows. Furthermore, maintaining and updating these access policies in response to workflow reconfiguration, new business policies and/or the introduction of new workflows is a challenge for these micro-perimeters. Moreover, with constantly evolving cyber-security threats, network-level access policies may not provide sufficiently granular perimeter security to prevent sophisticated cyber attacks against workflows. Hence, workflow access control policies must be context-aware and adaptable to changes in workflows, as highlighted in [119] and [3].

b: NETWORK-BASED PERIMETERISATION
An improvement over network-based perimeterisation, in which security efforts are concentrated on workflows rather than network endpoints, is proposed in [119]. This network-independent perimeterisation approach (eZTrust) can be realised using microservices which run in lightweight containers and which can be monitored. In the eZTrust model, the packets generated by a microservice are stamped with a tag which contains a detailed set of identities of that microservice. Identities such as name, version, kernel version, library version, user details and deployment-specific identities can be fetched from the microservice or the service orchestrator. These contextual attributes are used to build context-driven access policies that verify the identity and current state of access-requesting workloads before access is granted. This approach has been found to result in 2-5 times lower packet latency and 1.5-2.5 times lower CPU overhead when compared with other network-based or network-independent techniques such as transport-level, label-based, Deep Packet Inspection (DPI)-based or proxy-based perimeterisation.
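The context-driven policy check at the heart of such tag-based verification can be approximated as follows. This is a simplified sketch in the spirit of eZTrust, not its actual tag format or enforcement path; all field names, version numbers and policy values are assumptions:

```python
# Each packet carries a tag with the sender's detailed identity; the
# policy verifies identity *and* current state before granting access.
def make_tag(name, version, kernel, libssl, user):
    return {"name": name, "version": version, "kernel": kernel,
            "libssl": libssl, "user": user}

# Hypothetical policy: only a specific microservice, on a patched kernel
# and a non-vulnerable TLS library, may reach the protected database.
policy = {
    "name": "checkout-svc",
    "min_kernel": (5, 10),
    "banned_libssl": {"1.0.1f"},  # e.g. a known-vulnerable build
}

def access_granted(tag):
    return (tag["name"] == policy["name"]
            and tag["kernel"] >= policy["min_kernel"]
            and tag["libssl"] not in policy["banned_libssl"])

ok = access_granted(make_tag("checkout-svc", "2.3", (5, 15), "3.0.2", "svc"))
bad = access_granted(make_tag("checkout-svc", "2.3", (5, 15), "1.0.1f", "svc"))
```

Because the decision keys on the workload's state (kernel, library versions) rather than its network address, a spoofed IP alone is not sufficient to gain access.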
The micro-segmentation techniques mentioned thus far only bring perimeters closer to the applications hosted in data centers. Lateral movement by attackers, however, occurs not only in data centers but also in edge networks where machine-to-machine (M2M) communications take place. The IoT and OT devices deployed at the edge lack advanced defence capabilities due to their resource constraints. With the increased deployment of IoT devices, and with the boundaries between IT and OT technologies progressively blurring, the need for IoT and ICS device security must be addressed urgently.

c: MICRO-SEGMENTATION IN IoT
For IoT devices, micro-segmentation can be achieved using Software-Defined Networking (SDN) along with Network Function Virtualisation (NFV). The authors of [129] proposed an SDN-based IoT framework which leverages NFV technology to implement fine-grained network functions, such as routing and access control, close to the IoT device, in order to secure IoT communication and enable Quality of Service (QoS) for critical network traffic. An SDN OpenFlow-based controller and an IoT gateway were configured to deploy IoT devices and implement fine-grained network functions. In addition, the IoT gateway creates a secure IPsec tunnel with the application servers to secure the communication between the IoT devices and the application. The authors of [130] coin the term Policy Enforcement as a Service (PEPS), which enables the provision of a network-level enforcement point to which access control systems (both application-layer and network-layer) can subscribe, whether they share the same network domain or are external. The resulting inter-layer and inter-domain access control makes it possible to limit threats closer to their originating source (e.g. a compromised IoT device).
In order to provide fine-grained access control for users of various types of IoT or ICS devices, the authors of [131] propose an SDN-based micro-segmentation approach. End-users are considered the regular users of the IoT/ICS device for their daily tasks; administrators manage access for various end-users and devices; and maintenance personnel provide specialised services such as maintenance and repairs. These user types require different levels of access to the devices. The fine-grained access rights proposed in [131] are controlled by deploying a separate device proxy per user profile, loaded into a container in the IoT gateway. Thus, a single gateway will contain multiple device-proxy containers, one per user profile, so that different access rights can be provided upon request. The SDN-based network equipment that implements the SDN micro-segmentation rules manages the automatic routing of user requests to the appropriate device proxies in the IoT gateways and prevents any unauthorised access to unrelated device proxies.
The authors of [132] have proposed a novel security architecture, based on micro-segmentation principles, to implement fine-grained security policies closer to the edge of an IoT network. The authors argue that traditional centralised access controls do not provide sufficient security at the IoT network edge, where most computing is expected to occur as the use of IoT devices increases. The security architecture proposed in [132] consists of a traffic policer, an asset policy database and a network discovery module. The traffic policer is a transparent bridge device (which can be implemented using low-cost hardware such as a Raspberry Pi) that is deployed close to the edge and controls the traffic flowing through it. The policy database contains fine-grained policies that are used by the traffic policer to make control decisions on the packets originating from or destined for the edge. The network discovery module updates the information about the various devices discovered on the edge network. The policy database contains device-specific policies as well as generic device policies which are applied when new devices are discovered. Administrators can add fine-grained policies as required. The aforementioned approaches are summarized in Table 12.
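The policy-database lookup performed by such a traffic policer can be sketched as a device-specific policy with a generic fallback; the device identifiers and policy contents below are hypothetical, not taken from [132]:

```python
# Device-specific policies take precedence; newly discovered devices
# fall back to a restrictive generic policy until an administrator adds
# fine-grained rules for them.
device_policies = {
    "cam-01": {"allow_dst_ports": {443}},       # admin-added, fine-grained
}
generic_policy = {"allow_dst_ports": {53, 123}}  # DNS and NTP only

def decide(device_id, dst_port):
    policy = device_policies.get(device_id, generic_policy)
    return "forward" if dst_port in policy["allow_dst_ports"] else "drop"
```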

5) ISSUES WITH REALISATION OF EFFECTIVE MICRO-SEGMENTATION
The main challenge associated with using micro-segmentation for ZTA is that the complex workload interactions which often exist in large networks make it difficult to achieve effective segmentation of applications. In addition, effectively translating workload access requirements into network- or application-level access control policies is a challenging task; network-level access restrictions might not prevent malicious activities at the application layer. Furthermore, managing and maintaining the various access control policies becomes difficult with constantly changing application requirements and the introduction of new applications, which can also result in misconfiguration and errors. For these reasons, dynamic workflow access detection techniques and access control policies, which can dynamically identify workflow interactions and update access rules accordingly, are very much needed.

B. SOFTWARE DEFINED PERIMETERS
Another approach proposed by NIST for implementing ZTA is the use of a software defined perimeter (SDP) which acts as an overlay network to secure resource access [3]. The main principle of SDP is to verify and authenticate the client's identity before communication is established with the client [133]. This is in contrast to traditional networks which allow the client to establish a connection (e.g. TCP/IP) before authenticating. The SDP implementations consist of an SDP controller (which authenticates and authorises the clients) and an SDP gateway (which connects to the applications). The client has no visibility over the application servers and the client's communication to application servers is authorised and facilitated by the controller and the gateway (as shown in Figure 11).
The SDP architecture relies on five layers of security to protect client-to-application-server communication [134]. The five layers are: • Single Packet Authentication (SPA): This is a passive authentication technique which allows legitimate clients to connect securely to the application servers.
With SPA implemented, the SDP controller listens for connections arriving on closed ports and does not respond to any connection requests; it thus remains invisible to the kinds of port-scanning tools typically used in reconnaissance attacks. Only when a client sends a valid SPA packet does the controller authenticate and verify the client before access to the service is allowed. One drawback of the SPA process is the lack of a link between the authentication phase and the subsequent TCP connection establishment [135], which can result in session-hijacking attacks.
• Mutual Transport Layer Security (mTLS): A strict mutual authentication mechanism is enforced for communication between the device/client and the server. In this scheme, both server and client need to present valid identities to authenticate, as opposed to only the server identifying itself. This ensures only valid clients can connect to the back-end resources upon being authenticated and authorised. In resource-constrained systems, such encryption-based authentication schemes are challenging to implement due to the computational overhead involved. Several lightweight mutual authentication schemes are discussed in Section V-A.
• Device Validation: Beyond merely matching the cryptographic identities presented by users or devices to the corresponding rights under policy, the legitimacy of the entity behind the identity must be confirmed. As certain device identities can be mimicked or stolen, spoof-resistant device identities need to be used for validating IoT and Industrial IoT (IIoT) devices. The authors in [136] have proposed a unified definition of spoof-resistant, validatable device identities based on existing protocols and currently available device resources. The authors argue that device information from the physical and data link layers of the Open Systems Interconnection (OSI) model is more resistant to spoofing and can be leveraged to create unique device signatures. Information from RF waveforms, device hardware, frame inter-arrival times at the Medium Access Control (MAC) layer, and details from the Bluetooth Low Energy (BLE) protocol stack can be extracted to create device fingerprints [136]. Section III-D contains further elaboration on the various device identities that can be leveraged and implemented in the SDP.
• Dynamic Firewalls: SDP implements dynamic firewalls that contain explicit deny rules, instead of multiple static access rules, to deny all incoming connections by default. Allow rules are created dynamically once the preceding authentication steps are completed.
• Binding Secure Tunnels to Applications: This step encrypts the communication between the device and the application back-end, ensuring that the client-to-server communication is protected from various communication-channel attacks.
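The SPA and dynamic-firewall layers above can be sketched together: the controller stays silent unless the first packet carries a valid HMAC over the client identity and a fresh timestamp, and only then installs a narrow allow rule (the mTLS and device-validation steps between them are omitted for brevity). The packet layout, shared-key scheme and rule format are assumptions for illustration, not the SDP specification's actual wire format:

```python
import hashlib
import hmac
import time

SHARED_KEYS = {"client-42": b"pre-provisioned-secret"}  # hypothetical client
MAX_AGE = 30          # seconds a SPA packet stays valid (limits replays)
rules = []            # dynamic firewall: an empty table means default-deny

def make_spa(client_id, key, now):
    msg = f"{client_id}|{int(now)}".encode()
    return client_id, int(now), hmac.new(key, msg, hashlib.sha256).hexdigest()

def handle_spa(packet, src_ip, now):
    """Silently drop invalid packets; open a pinhole for valid ones."""
    client_id, ts, digest = packet
    key = SHARED_KEYS.get(client_id)
    if key is None or now - ts > MAX_AGE:
        return False                  # no response at all: the port stays dark
    expected = hmac.new(key, f"{client_id}|{ts}".encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, digest):
        return False
    rules.append({"src_ip": src_ip, "dst_port": 443})  # dynamic allow rule
    return True

def firewall_allows(src_ip, dst_port):
    return any(r["src_ip"] == src_ip and r["dst_port"] == dst_port
               for r in rules)

now = time.time()
pkt = make_spa("client-42", SHARED_KEYS["client-42"], now)
```

Note that the timestamp only narrows the replay window; it does not bind the SPA exchange to the subsequent TCP session, which is the session-hijacking gap noted above.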
Issues With SDP: These comprehensive authentication and authorisation steps, along with encrypted tunnels, make SDP a potential candidate for implementing ZTA in IoT or IIoT networks [137] without depending heavily on perimeter firewalls and network segmentation. SDP can also secure existing communications based on IoT protocols, such as MQTT, that lack support for end-to-end encryption [134]. Even though SDP provides increased security, challenges remain for its successful implementation. One challenge is that SDP requires comprehensive changes to the network, because it differs significantly from traditional networking practices: clients as well as servers require modification to meet SDP requirements. Furthermore, the central SDP controller can become a target for malicious cyber attacks, which can adversely impact the entire SDP-based network [134].

VII. SECURITY AUTOMATION AND ORCHESTRATION
Security automation is one of the important principles that can help with the successful realization of ZT security. Security automation can abstractly be defined as a process that aims to curtail frequent mediation by security professionals through the automated detection and prevention of threats. Conventional security logging approaches typically record a large volume of information that is subsequently used to generate alerts for security operations teams, who then investigate the perceived threats. Often, these alerts are repetitive and include a large number of false positives, resulting in time and resources wasted on inconsequential analysis. ML techniques can be adopted in such scenarios to support the automatic detection of anomalies and the determination of appropriate courses of action in light of the threats and vulnerabilities. This results in quick and seamless actions which help security teams focus on the threats at hand, as most of the repetitive threats and false positives are handled automatically. In the context of ZTA implementation, security automation focuses on automating access decisions, re-evaluating trust in existing connections, and refining policy generation and enforcement using threat intelligence feeds, situational awareness, network activity logs, and system activity logs. The state-of-the-art approaches to accomplishing these individual components of security automation are presented below, before a final comment on the challenges in automating and orchestrating this process for the realization of ZTA.

A. THREAT INTELLIGENCE
To prevent catastrophic consequences in a highly connected world comprising networks of IoT devices, organisations have the responsibility of ensuring quick and effective reactions to cyber threats. In order to achieve this, it is necessary to collect and analyse information from various internal and external sources. Such information relates to threats, vulnerabilities and cyber attacks, and provides the requisite data to enable Threat Intelligence (TI). TI thus provides organisations with the latest information regarding cyber threats and also informs the techniques for countering them. With numerous sources of such information, an effective feedback system that automates security control is required to enable the necessary actions.
Threat intelligence involves gathering information on existing and future threats, with the ultimate goal of preventing and mitigating such threats. It also involves the dissemination of threat-related information to support decision-making about what proactive security measures are appropriate for the effective management of existing and emerging threats. The authors in [138] proposed a conceptual design to integrate the disparate cyber-security domains of threat detection and policy-controlled systems in order to automate the task of threat response. The authors argue that effective security countermeasures require risk-based security policies that use contextual information to calculate the risk or threat level. However, such risk-based policies pose challenges due to the nature of the contextual information collected, as information local to the system does not provide sufficient insight into threats emerging from outside the system. In a Cyber-Physical System (CPS) or an IoT-network-based critical infrastructure (CI), there exist many challenges for gathering threat intelligence, including the use of heterogeneous devices, standards and protocols; the various layers of data sources (physical, Fog and Cloud); and the generation of large volumes of data. To overcome these challenges, the authors in [139] proposed a novel threat intelligence scheme for Industry 4.0 systems based on Beta Mixture and Hidden Markov Models (HMM). They proposed a threat intelligence architecture comprising smart data management, data pre-processing and threat discovery modules. The smart data management module collects data from heterogeneous sources, such as logs obtained from middleware platforms that connect sensors and actuators, as well as network activity obtained from intrusion detection systems. The evaluation is carried out on publicly available CPS-based power system logs.
The pre-processing involves independent component analysis to reduce the dimensionality of features, and the threat discovery uses a Beta Mixture Model that fits the multivariate time series obtained from CPS and network traffic. The output is then fed to the HMM, which learns the legitimate and suspicious states of CPS and network activity. Instead of relying solely on the physical-layer parameters of the CPS, the authors in [140] have proposed a cyber threat intelligence framework which combines information from monitoring at the cyber and physical layers to feed actionable countermeasures to the CPS. The proposed approach analyses active malware samples and network intrusions on a CPS honeypot to gather cyber-layer threats. In addition, the physical-layer measurements observe the state of physical processes, operational states and physical plant parameters. The CPS threat detector combines the malware attack signatures from the cyber layer with the CPS data flows from the physical layer using semantic behavioural graphs. As these graphs can become unwieldy with complex network activity, a subgraph approach is utilised to extract a compact representation of the threat activity. In [141], the authors proposed an actionable threat intelligence framework that implements a security response incorporating feeds received from various sources. The proposed model uses Structured Threat Information eXpression (STIX) documentation, which structures the threat information received from various TI sources to feed a Threat Response Module (TRM). The TI document contains the details of the cyber incident, indicators of compromise, and suggested action points. The TRM then communicates the possible actions to various security enforcement tools, including host security endpoints, SDN controllers and DNS sinkholes.
The authors used the Extensible Messaging and Presence Protocol (XMPP) for communication between the TI sources and the TRM so as to leverage the scalability and security features of the protocol, with the ultimate goal of securing the communication channel.
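The dispatch step performed by such a threat response module can be sketched as follows. The threat record below is a simplified stand-in for a STIX document (not the actual STIX schema), and the tool names and incident identifier are hypothetical:

```python
# Map a simplified threat-intelligence record to enforcement actions for
# different tools, in the spirit of the TRM described above.
def plan_response(report):
    actions = []
    for ioc in report["indicators"]:
        if ioc["type"] == "ipv4":
            actions.append(("sdn-controller", f"drop traffic to {ioc['value']}"))
        elif ioc["type"] == "domain":
            actions.append(("dns-sinkhole", f"sinkhole {ioc['value']}"))
        elif ioc["type"] == "file-hash":
            actions.append(("host-endpoint", f"quarantine hash {ioc['value']}"))
    return actions

report = {
    "incident": "phishing-campaign-007",   # hypothetical incident ID
    "indicators": [
        {"type": "domain", "value": "login-example-bank.evil"},
        {"type": "ipv4", "value": "203.0.113.66"},
    ],
}
actions = plan_response(report)
```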

B. DEVICE STATE MONITORING
Internally, threat intelligence data can be gathered from the various resources that form the network; this is also referred to as Technical Threat Intelligence (TTI) [142]. For effective policy enforcement, to prevent compromised devices from accessing existing resources, and to mitigate the lateral movement of attacks, the TTI system needs to be acquainted with the various factors that indicate a compromise, referred to as Indicators of Compromise (IOC) [142]. The two main categories of resources from which technical information can be obtained are: • Network: these indicators can be obtained from network activities and include features such as IP addresses, domain names, or URLs. Network-based indicators have short lifetimes and may contain spoofed identities, as adversaries constantly change such parameters.
• Host-Based: these indicators can be obtained from the Operating System and from software artifacts such as the hashes of malware binaries, Dynamic Link Libraries (DLL), registry keys, etc.
IOCs are essential artifacts that help in the early detection of attacks as well as in the implementation of mitigation policies. However, most IOCs are defined by cyber-security experts and, given the growing sizes of networks and the numbers of connected devices they comprise, tasks at this level quickly become time-consuming and labour-intensive. To overcome these challenges, the authors in [143] have proposed the automatic identification of IOCs using neural sequence modelling of data obtained from various cyber-security reports. A bi-directional long short-term memory model with conditional random fields was applied to extract IOCs from cyber-security reports obtained from news articles and patient notes. This technique can be implemented in a threat intelligence network to gather IOCs, so that security policies can subsequently be implemented to great effect. However, in IoT networks, obtaining device states requires collecting information such as resource usage, device activation state, firmware versions, and hardware and application states [144]. Recently, Manufacturer Usage Description (MUD) data has been proposed to define the intended behaviour of IoT devices in order to enforce behaviour-based access control [145]. A typical MUD profile contains inbound and outbound access control entries that specify the intended behaviour of the IoT device in question. The authors in [146] have proposed an approach to automatically generate the MUD profile of an IoT device from its network traces. MUD profiles can be effectively used to enforce security compliance and also to identify anomalous device behaviour by comparing current behaviour with pre-existing default MUD profiles.
An example network access profile of an IoT device will consist of the Domain Name Service (DNS) resolution of the Fully Qualified Domain Names (FQDN) of the application servers to which the device is pre-configured to connect, as well as data about the actual connection requests made, based on TCP or UDP on specific ports to the application. There are, however, challenges in using MUD-based access policies, as many commercial IoT devices, such as the Amazon Echo, may connect to a wide range of IP addresses, as validated in [146]. Furthermore, constrained IoT devices, as well as industrial control system devices such as embedded devices and programmable logic controllers, can have limited on-device hardware resources to store the logs and events necessary for state monitoring. The heterogeneity of the protocols and applications implemented by such devices adds to the challenge of acquiring such security-related information.
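A behavioural check against a MUD-style profile can be sketched as follows. The profile is a simplified, hypothetical rendering of the outbound access-control entries described above, not the full RFC 8520 YANG model, and the device and domain names are invented:

```python
# Compare an observed outbound connection against the entries that
# describe the device's intended behaviour.
mud_profile = {
    "device": "smart-bulb",
    "outbound": [
        {"fqdn": "updates.example-vendor.com", "proto": "tcp", "port": 443},
        {"fqdn": "ntp.example-vendor.com", "proto": "udp", "port": 123},
    ],
}

def conforms(fqdn, proto, port):
    return any(e["fqdn"] == fqdn and e["proto"] == proto and e["port"] == port
               for e in mud_profile["outbound"])

# A connection outside the profile is a behavioural anomaly worth flagging.
anomaly = not conforms("evil.example.net", "tcp", 443)
```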

C. SECURITY SITUATION AWARENESS
As the concept of ZTA gains acceptance in various domains, effective implementation requires integrating network and system state into the policy decision framework. This would enable a policy engine to make situation-aware access decisions. Situational awareness with regard to cyber-security threats can be gathered from both internal and external networks to effectively calculate the potential risks to existing resources. In addition, resource access policies in ZTA depend on the level of trust of the users and devices rather than on the location of the requester in the network, which requires greater contextual information such as authentication logs, device states and network activity [147]. Access policies also need to be dynamically enforced and constantly re-evaluated [3], which facilitates effective decision making based on various data sources [147].
In [147], the authors present a policy management framework for ZTA referred to as FURZE (Fuzzy Risk Framework for ZT Networks), which incorporates the enterprise's security situation awareness (SSA) into a Risk-Adaptable Access Control (RAdAC) scheme to facilitate dynamic access policy application reflecting changes in the network security situation. The SSA provides information regarding the current state of threats to assets and the criticality of those assets, which collectively indicate the potential impact on enterprise missions. One advantage of the proposed framework is that the RAdAC scheme adopts a dynamic and probabilistic calculation of a risk score to make access control decisions, instead of a static comparison of context-based system attributes.
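The risk-adaptive flavour of such access decisions can be illustrated with a toy calculation: a score built from contextual signals is compared against a threshold that tightens with asset criticality. The signals, weights and threshold below are invented for illustration and are not FURZE's actual fuzzy-logic formulation:

```python
# Risk-adaptive access sketch: all signals are normalised to [0, 1].
def risk_score(signals):
    weights = {"failed_logins": 0.3, "new_device": 0.4, "threat_level": 0.3}
    return sum(weights[k] * signals[k] for k in weights)

def access_decision(signals, asset_criticality):
    # Critical assets tolerate less risk: the threshold shrinks as
    # criticality (also in [0, 1]) grows.
    threshold = 1.0 - 0.6 * asset_criticality
    return "grant" if risk_score(signals) < threshold else "deny"

low_risk = {"failed_logins": 0.0, "new_device": 0.0, "threat_level": 0.2}
high_risk = {"failed_logins": 1.0, "new_device": 1.0, "threat_level": 0.8}
```

Because the score is recomputed from live signals, the same requester can be granted access at one moment and denied the next as the security situation changes, which is the dynamic re-evaluation ZTA calls for.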
ICS have received considerable attention in recent times with regard to cyber situational awareness, owing to the repercussions of the Stuxnet malware, according to the review conducted in [148]. Simulated environments have been deployed to show the effectiveness of different kinds of information sources in CI, ranging from measurements of voltage and current waveforms in power grids to field sensor measurements, in helping to identify the potential impact of a cyber threat on the operations of the CI.
In [149], the authors proposed a Big Data analysis-based situational awareness architecture for smart grids, which incorporates information obtained from electrical devices, substation buses, network devices, station controllers and control centers to calculate awareness scores for the various threats in the smart grid. The awareness score is calculated using a neural network and game-theory-based big data analysis. Such awareness scores can be incorporated into the RAdAC scheme proposed in [147] to dynamically deploy access policies in CIs such as smart grids.
However, challenges such as heterogeneous devices and systems, the need to respond quickly to emerging threats, and interoperability issues among deployed security tools impede the implementation of a comprehensive automated security system [150]. To alleviate these challenges, the authors in [150] have proposed a SIEM-based security automation scheme which relies on near-real-time data collected from the various existing resources of a network. SIEM provides several advantages which can be leveraged to automate the operation of security controls. The centralised logging and analysis of various network and device activities provides greater visibility over the security state of the system being monitored. Correlated events collected by the SIEM system can be used to identify the root causes of security issues and to automate and apply effective mitigation controls at the most apt locations in the network. Furthermore, Machine Learning (ML) algorithms can be implemented to reduce false alarms and to increase the effectiveness of the various security controls. Integration with various security tools is another advantage of SIEM solutions, as these can be plugged into and adapted to various security platforms.

D. MACHINE LEARNING FOR SECURITY AUTOMATION
With a large volume of security logs being collected, ML can act as an enabler for security automation by learning the behaviour of security threats and recognising patterns so as to automatically take defensive actions, as highlighted in [151]. AI has not yet been explored to its fullest potential for realizing an effective security automation procedure [151]. Nevertheless, ML has shown promise in a number of critical scenarios for threat detection and prevention. For example, SDNs are being increasingly adopted for automated security monitoring in cloud computing platforms. As described in [152], SDN platforms are more susceptible to threats because of the separation between their control and data planes. Conventional approaches for preventing DoS attacks in SDNs are of limited help, as they largely depend upon the flow characteristics of the packets, allowing attackers to deceive the system with slight modifications to the traffic patterns (e.g., changing packet headers to make the traffic appear legitimate) [152]. Given this, ML approaches are deployed in SDN infrastructures to trace malicious traffic. For example, the authors in [153] adopted Support Vector Machines (SVM) to detect DDoS attacks in SDNs. The scheme extracted features from the traffic flow, including the number of packets, bytes, packet variation, duration, etc. Using these features, the system is trained to serve as a classifier that labels traffic as normal or anomalous. Likewise, the authors of [154] adopted SVMs to detect anomalous traffic and thereby prevent DDoS attacks in SDNs, leveraging extracted features such as the standard deviation of the packets, bytes and flow entries. Similarly, a number of works have demonstrated the efficacy of deep learning for intrusion detection. For example, the authors of [155] adopted stacked autoencoders in combination with SVM and Artificial Neural Networks (ANN) for detecting impersonation attacks in Wi-Fi networks.
Similarly, the authors of [156] argued that novel cyber threats appear frequently, making it difficult for conventional detection systems that rely on prior modeling of attacks to detect them. To overcome this issue, they argued that deep learning can help to successfully identify such novel attacks in social networks by leveraging their similarities to known attacks. This is achievable because deep learning comprises numerous layers of non-linear processing elements that learn the precise features that may help in attack detection in social networks, through sophisticated training on diverse mathematical models representative of the cyber-threat data.
The authors of [156] demonstrated that a long short-term memory recurrent neural network identifies anomalous traffic in social networks more accurately than methods that rely on traditional ML algorithms.
Likewise, the authors of [157] demonstrated the efficacy of convolutional neural networks (CNNs) for attack detection. Similarly, ML (decision tree, SVM, MP) has also been shown to be useful for detecting attacks in IoT networks [158]. The authors of [158] ran experiments on a dataset consisting of network traffic obtained from nine IoT devices that were infected by the Mirai and BASHLITE botnets. They extracted statistical features from this data and, after appropriate data processing, trained three different conventional ML algorithms and demonstrated their efficacy in accurately classifying legitimate and malicious traffic. Likewise, the authors in [159] demonstrated the effectiveness of the Random Forest algorithm for detecting different attacks in IoT, such as DoS, data-type probing, and other malicious activities. The authors in [160] proposed an entropy-based method for detecting DDoS in IoT based on a stateful SDN data plane. They demonstrated that the entropy of features such as source and destination IP addresses and ports changes significantly during DoS and DDoS attacks in IoT. They also illustrated the ability of SDN to mitigate these attacks by simply adding entries to the flow tables of network switches. Likewise, the authors in [161] adopted deep learning for detecting various attacks on IoT devices. Specifically, they extracted generic features from the headers of individual packets and subsequently applied a feed-forward neural network to detect four different types of attack, namely DoS, DDoS, reconnaissance, and information theft. Similarly, the authors in [162] showed that deep learning can be employed for the online detection of network attacks on IoT gateways.
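The entropy signal exploited in [160] can be reproduced in a few lines: the Shannon entropy of a header field over a traffic window shifts sharply when traffic concentrates on a single victim. The traffic windows and detection threshold below are synthetic, chosen only to make the effect visible:

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Normal window: destinations spread across 50 hosts -> high entropy.
normal = [f"10.0.0.{i % 50}" for i in range(1000)]
# Attack window: traffic concentrates on one victim -> entropy collapses.
attack = ["10.0.0.1"] * 950 + [f"10.0.0.{i}" for i in range(50)]

def is_ddos(window, threshold=2.0):
    return entropy(window) < threshold
```

In an SDN setting, a detector like this could feed the controller, which then installs the drop entries in switch flow tables that [160] uses for mitigation.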
The aforementioned examples of ML-based attack detection, specifically in SDN-based networks, can be effectively deployed as a feedback mechanism for SDN controllers, which can subsequently enforce security policies on SDN-compatible network devices. An example of such a feedback system to protect IoT systems is proposed in [163]; it consists of a mitigation module that takes input from the IoT communication path and feeds the output generated by the IoT device back to a detection module, which then detects malicious output through the application of ML techniques. The mitigation module is recommended to be a polymorphic hardware and software component that can adapt to the system's security requirements. An SDN-based system can be adopted as a mitigation module, as it can be quickly reconfigured to meet the stipulated security requirements.

E. POLICY ENFORCEMENT
In a ZTA, the contextual and threat intelligence feeds are leveraged to control access to the enterprise resources. As indicated in Section I, the policy decisions are taken by a PE and enforcement of policy is done by a PEP. A few representative policy enforcement frameworks for IoT networks are discussed below.
Automating access policy definition and enforcement in an IoT environment, where a large number of devices are connected to the network, is a challenging task. In addition, groups of IoT devices might work collectively on a specific workflow, which increases the complexity of manually implementing fine-grained access controls for M2M communications [164]. In consumer environments, many end-users will also lack the skills to manually configure such access policies. The heterogeneity of IoT devices and protocols further exacerbates security policy enforcement. In such circumstances, SDN and Network Function Virtualisation (NFV) technologies can be used to orchestrate and enforce security closer to the edge. NFV is based on virtualisation technology that allows organisations to virtualise traditional hardware-based network functions, such as routing, switching, and firewalls, and deploy them on commodity hardware with shared computing resources. SDN and NFV have also become key enablers for the deployment of 5th-generation (5G) telecommunication networks, which support the superior network performance needed to deal with the billions of devices expected to be connected to the Internet [165]. Using SDN and NFV, dynamic security policies and virtual network security functions can be applied to control access to end resources [166]. Using these technologies, virtual network security functions such as a virtual Firewall (vFirewall), a virtual Intrusion Detection System (vIDS), and virtual Authentication and Authorization services can be quickly deployed to protect the resources. Such network security orchestration can be achieved by combining NFV and SDN, where NFV allows rapid deployment of network security services and SDN paves the way for quicker enabling of network connectivity to the network function virtualisation infrastructure (NFVI).
Furthermore, to bring NFV closer to the edge, container-based network security virtualisation has been proposed, wherein security functions are deployed in light-weight containers that require fewer resources than a virtual machine and can run on a single operating system [167]. With advancements in virtualisation technology, unikernels have now been proposed for virtual network function instantiation. Compared to containers, unikernels do not share the same kernel and are therefore more secure than containers running in shared kernel space [168]. However, containers appear to outperform unikernels in terms of instantiation delays and TCP performance, as highlighted in [169].
These technologies have an important role in realizing the ZTA model of security by deploying security functions closer to the resources. This requires the PE and PEP to be SDN- and NFV-compatible and able to automate the deployment of virtual security functions in a reactive manner. In [170], the authors deployed a machine-learning-based security orchestration function that uses monitoring agents to identify IoT attacks and a dynamic reaction module that deploys a virtual security function to secure the IoT devices. In [171], the authors presented a security management architecture for NFV/SDN-aware IoT systems called ANASTACIA. This framework leverages a security enforcement plane composed of NFV Management and Orchestration (MANO), SDN, and IoT controllers to manage resources (computing, storage, and network) for IoT devices interacting in the data plane. The framework adopts a two-tier policy enforcement technique, where a default preventive security policy is applied to secure IoT devices from known attacks and reactive policy enforcement is used as an active countermeasure against an ongoing attack. When an attack is detected, a virtual network function is deployed at appropriate locations in the network using the NFV, SDN, and IoT controllers. Instead of users manually selecting devices, the authors of [164] have proposed an automatic scheme to abstract the workflows and automatically select the suitable IoT devices required to achieve the policy management task, which can also facilitate the specification of accurate access policies in M2M communications and help enforce least-privilege access. The authors use search algorithms to select the most suitable devices for a particular workflow and to generate network access policies that can be implemented through SDN-based devices. In [172], the authors proposed an access control policy enforcement scheme for zero trust networks that requires dynamic and per-connection access control.
The proposed framework, called the Fuzzy Risk Framework (FUZRE), is based on RAdAC, which makes access control decisions considering the operational need and the security risk posed by an access request. The FUZRE architecture uses XACML, which defines the structure for an attribute-based access control implementation. It comprises a PEP, such as a WiFi router, which forwards access requests to a context handler. The context handler subsequently communicates with various policy decision-making modules: the environment evaluation, risk evaluation, topology awareness, and access decision modules. The FUZRE framework also incorporates continuous evaluation of access, to adapt or re-evaluate access policies in response to changing network situations based on the balance between operational needs and posed security risks. The risk evaluation module is based on a variant of fuzzy logic that works well with subjective and vague inputs to make clear policy decisions. IoT-IDM (Intrusion Detection and Mitigation for IoT) [173] is a host-based intrusion detection and mitigation framework for smart home systems realised using the SDN OpenFlow platform. The framework consists of a device manager, a sensor element, a feature extraction module, and detection and mitigation modules. The device manager contains an inventory of all the devices in the smart home network, their potential vulnerabilities, and any defence mechanisms in play. Its database is based on public repositories that contain datasheets of available and new devices for smart home systems. The sensor element uses an inline sensor deployed in the SDN controller to monitor the traffic of registered devices on the network. OpenFlow-based network switches are then configured to redirect the IoT device traffic to the sensor element, and features are subsequently extracted from the captured traffic.
The detection module is responsible for detecting anomalous device behaviour through ML algorithms and helps to identify the attack source specifications, which are then used to configure access rules on the OpenFlow switches through the mitigation module.
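The RAdAC-style balance between operational need and security risk that FUZRE strikes can be sketched with a toy fuzzy decision. The triangular membership functions, normalised inputs, and grant rule below are illustrative assumptions, not the implementation in [172].

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership: 0 at a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def access_decision(security_risk, operational_need):
    """RAdAC-style decision: grant when the degree to which the need is
    'high' outweighs the degree to which the risk is 'high'.
    Both inputs are normalised to [0, 1]."""
    risk_high = tri(security_risk, 0.4, 1.0, 1.6)     # membership of 'high risk'
    need_high = tri(operational_need, 0.4, 1.0, 1.6)  # membership of 'high need'
    return need_high >= risk_high
```

Fuzzy membership degrees let subjective, vague inputs (e.g. analyst-estimated risk) still yield a crisp grant/deny outcome, which is the property the FUZRE risk evaluation module relies on.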
The Adaptive Risk-Based Access Control Model for IoT proposed in [174] adopts a risk estimation process that depends on the features of the IoT environment to calculate the risk associated with an access request. The risk estimation algorithm considers the context information of the user, the sensitivity of the resource being accessed, the severity of the requested action, and past risk scores associated with the user. The quantified risk is calculated from the likelihood of a security incident occurring and the potential impact of such an incident. In addition, an adaptive approach is incorporated to evaluate risk continuously throughout the session. The resource sensitivity module assigns a risk metric to individual resources based on the sensitivity of the data or resource being requested. The risk estimator is responsible for analysing all possible risk features to estimate the risk of granting resource access, while adhering to policy decisions. For continuous evaluation of user behaviour, the authors propose the use of smart contracts to ensure that the terms of the contract are adhered to throughout a given communication session.
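The likelihood-times-impact risk estimate described above can be sketched as follows. The specific weights, feature scalings, and grant threshold are illustrative assumptions, not values from [174].

```python
def estimate_risk(context_anomaly, resource_sensitivity, action_severity, risk_history):
    """Quantified risk = likelihood of an incident x potential impact.
    All inputs are scores in [0, 1]; the weights are illustrative."""
    # Likelihood: driven by how anomalous the request context is and by
    # the user's past risk scores.
    history = sum(risk_history) / len(risk_history) if risk_history else 0.0
    likelihood = 0.6 * context_anomaly + 0.4 * history
    # Impact: driven by what is accessed and what the action could do to it.
    impact = 0.5 * resource_sensitivity + 0.5 * action_severity
    return likelihood * impact

def grant_access(risk, threshold=0.25):
    """Grant only when the estimated risk stays below the policy threshold."""
    return risk <= threshold
```

Re-running this estimate throughout the session, rather than only at connection time, is what makes the scheme adaptive.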

F. TRUST COMPUTATION
As indicated in Section II-A, a trust algorithm is considered to be the cornerstone of the PE. Since there is no direct literature available on trust computation for enterprise networks or CI, we describe a few trust computing techniques employed in ad-hoc IoT networks which can be extended for trust calculation in CI. For example, the authors of [175] proposed a trust-based access control and management technique for fog computing platforms. For computing the trust, they use a score-based method that assigns different weights to a user's historic behaviour, the type of device used, and its location, and then compares the computed score against a threshold, thereby making it a contextual score-based algorithm, as discussed in Section I. Likewise, the authors of [176] proposed a weight-based trust evaluation technique that makes use of information entropy and demonstrated its advantages over traditional weight-based methods. In [177], the authors applied a Fuzzy-logic-based Trust Based Access Control (FTBAC) mechanism, which incorporates contextual experience, knowledge, and recommendations about the device to calculate trust. According to the authors, the access granted is directly proportional to the trust established for the device requesting access.
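The contextual score-based approach of [175] reduces to a weighted sum of normalised factors compared against a threshold; a minimal sketch follows, in which the weights and threshold are illustrative assumptions rather than values from [175].

```python
def trust_score(history, device_type, location, weights=(0.5, 0.3, 0.2)):
    """Contextual score-based trust: weighted sum of factors, each
    normalised to [0, 1] (e.g. history of benign behaviour, how managed
    the device type is, how expected the location is)."""
    w_h, w_d, w_l = weights
    return w_h * history + w_d * device_type + w_l * location

def grant(history, device_type, location, threshold=0.6):
    """Grant access only when the contextual trust clears the threshold."""
    return trust_score(history, device_type, location) >= threshold
```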
In [178], the authors present a trust management system for securing the data plane of an ad-hoc network. The proposed method incorporates a fuzzy logic and graph theory based trust calculation model that allows individual nodes to calculate trust scores for all other nodes in the network. The authors use average delay (AD) and packet delivery ratio (PDR) as the statistical parameters for trust calculation, since data plane attacks can affect the delay of critical packets or even target the successful delivery of packets. The PDR and AD are calculated from the packet sequence numbers and timestamps of the acknowledgement packets received on the path. First, the path trust value is determined, and then the node trust value is calculated. The idea is that all paths traversing a malicious node will have lower trust scores than paths passing through normal nodes. Considering the network as a connected graph, matrices of path trust values over the vertices of the graph are obtained and flooded on the ad-hoc network, which enables all the participating nodes to calculate the node trust values for the entire network.
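The path-then-node trust derivation in [178] can be sketched as follows. The equal weighting of PDR and delay, and the use of the mean over observed paths, are illustrative assumptions rather than the fuzzy model of [178].

```python
def path_trust(pdr, avg_delay, max_delay):
    """Trust of a path from its packet delivery ratio (PDR in [0, 1]) and
    average delay, normalised against a tolerable maximum delay."""
    delay_score = max(0.0, 1.0 - avg_delay / max_delay)
    return 0.5 * pdr + 0.5 * delay_score

def node_trust(node, path_trusts):
    """Trust of a node = mean trust of all observed paths through it.
    'path_trusts' maps a path (tuple of node ids) to its path-trust value.
    Paths through a malicious node drag its node trust down."""
    through = [t for path, t in path_trusts.items() if node in path]
    return sum(through) / len(through) if through else 0.0
```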
Issues in Integrating Individual Components for Trust Computation to Automate Security Actions: Trust algorithms have been applied in peer-to-peer (p2p) and ad-hoc communications, taking into account feedback from neighbouring nodes, central trust controllers, and historical transactions. However, a trust algorithm that takes threat intelligence, device security status, contextual information, and cyber activity logs into consideration is lacking. Such a trust algorithm will encounter numerous challenges compared to p2p and ad-hoc trust algorithms, as it needs to handle heterogeneous data sources that generate both numerical and imprecise information. The solution proposed in [172] considers contextual information and situational awareness, fed into a RAdAC system using the XACML access control policy language. However, the proposed work lacks implementation details of the risk calculating function and the inner workings of how the access control policies are enforced at both the application and network layers.

VIII. DISCUSSION AND FUTURE RESEARCH DIRECTIONS
This section summarises our findings on the successful implementation of ZTA, with particular emphasis on the knowledge gaps identified in the various state-of-the-art methodologies.
Discussions outlined in the previous sections indicate that there are numerous challenges in accomplishing zero trust. All the techniques considered essential for achieving zero trust, such as authentication, access-control, encryption, and security automation have shortcomings. In this concluding section of the paper, we pinpoint the weaknesses of these techniques and also identify directions for future work.

A. AUTHENTICATION
Despite the universal deployment and use of authentication technologies, there are still several dimensions of authentication that are yet to be fully realised. Below, we detail some of these as potential future research directions.
• User Authentication: As indicated in Section III-A, almost all user authentication mechanisms, such as passwords, fingerprints, facial recognition, and iris scans, have vulnerabilities. Multi-factor authentication (MFA) is a solution endorsed by many, but most MFA solutions rely on a secondary device, such as a mobile phone, and also demand significant user interaction. These requirements deter the widespread usage of MFA solutions. In view of this, solutions that demand minimal participation from users are needed. For example, the proximity of two devices (i.e., a login device and a pre-registered mobile phone), established using the ambient sound recorded by both, has been proposed as a potential second factor in [179]. However, this solution is still reliant upon a secondary device, thereby rendering it useless if the mobile phone is stolen, discharged, or lost. Therefore, MFA solutions that do not depend on a secondary device and require minimal interaction from the user are needed. In view of this, entirely new directions for authentication are being pursued. For example, [180] recently proposed using the frequency response of a person's ear canal as a distinctive feature for authentication. However, this solution demands an earpiece with a microphone for simultaneously sending a sound signal into the ear cavity and recording its reflections, which will be onerous for users. Similarly, the authors of [181] proposed using the whizzing sound of human breath as a feature for authentication. However, this solution necessitates a deliberate action from the user, i.e., it requires the user to place the microphone very close to the nose and perform a deep breath or sniff. This discussion reveals that even the most recent advancements in the field of user authentication come with drawbacks, thereby demanding significant research efforts on user authentication.
• Continuous User Authentication: In addition to entry-point authentication, continuous (or active) authentication is an active area of research. However, most continuous authentication mechanisms rely on a user's behavioural biometric features associated with typing, tapping, and gait patterns. The problem with this approach is its reliance on sensors embedded in the devices. For continuous authentication across numerous devices (e.g., mobile phones, laptops, tablets), this reliance is an issue, as not all devices possess similar embedded sensors. Besides this, behavioural biometrics are also dependent upon the situation in which they are captured. For example, a user's tapping or typing patterns may differ while walking at different speeds. This necessitates a thorough processing and machine-learning pipeline that can extract the discriminating features of a user.
• Context-Aware User Authentication: Context-aware authentication is also being frequently adopted for user authentication. We observe that most context-aware authentication mechanisms rely on location information concatenated with some other information, such as device time or proximity. However, some of the sensors deployed for capturing contextual information (e.g., the accelerometer or gyroscope) may not be available on some devices, and thus such mechanisms may not be usable across all sorts of devices. Therefore, a suitable alternative for context-aware user authentication could be to pair the user's location with their daily activity on the mobile phone and to generate questions from that activity (e.g., call and message logs). However, this solution demands a series of questions that take into account the user's location information, which may pose privacy concerns for users. Nevertheless, this can be a suitable option for context-aware fallback authentication, which is only rarely required. Identifying the appropriate set of questions and crafting a location-based contextual framework are other open research directions.
• Device Authentication: In addition to user authentication, device authentication is also required for M2M communication and CPS. Popular mechanisms indicated in the literature include Physically Unclonable Functions (PUFs), which are built into these devices to generate a unique identifier by leveraging the hardware characteristics of the device. However, the literature suggests that PUFs are vulnerable to modelling attacks that clone the device identifier. This vulnerability can be addressed by using cryptographic methods, which are also widely adopted for mutual authentication in ad-hoc and opportunistic networks such as VANETs. However, cryptographic methods are at greater risk of subversion with the advances made in quantum computing. Some recent research on quantum computing reveals that most current cryptosystems could become vulnerable in the future to exhaustive key search attacks. Therefore, to counter these issues, post-quantum cryptography is being actively researched (see the section on Post-Quantum Cryptography detailed ahead). A suitable alternative for device authentication can be inspired by context-aware authentication, wherein a device can leverage its sensors or other peripherals to capture information related to its ambience or other parameters to enable authentication.
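The challenge-response flow behind PUF-based device authentication can be sketched as follows. Since a real PUF derives its response from physical hardware variation, the sketch stands in an HMAC keyed with a device secret for the physical function; this substitution, and the enrolment table size, are assumptions for illustration only.

```python
import hashlib
import hmac
import secrets

def simulated_puf(device_secret: bytes, challenge: bytes) -> bytes:
    """Stand-in for a PUF: a keyed function of the challenge. A real PUF
    derives this response from manufacturing variation, not a stored key."""
    return hmac.new(device_secret, challenge, hashlib.sha256).digest()

def enroll(device_secret: bytes, n: int = 4) -> dict:
    """Verifier records challenge-response pairs (CRPs) at enrolment time."""
    return {c: simulated_puf(device_secret, c)
            for c in (secrets.token_bytes(16) for _ in range(n))}

def authenticate(crp_table: dict, respond) -> bool:
    """Consume one unused challenge and compare the device's live response."""
    challenge, expected = crp_table.popitem()  # each CRP is used only once
    return hmac.compare_digest(respond(challenge), expected)
```

Consuming each CRP only once is what limits replay; the modelling attacks mentioned above succeed precisely when an attacker observes enough CRPs to predict responses to fresh challenges.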

B. ACCESS CONTROL
Zero Trust necessitates fine-grained and context-aware access control. The literature suggests that these access control requirements can be accomplished by attribute-based and usage-based access control. However, different IoT-enabled environments, such as smart homes, smart grids, healthcare IoT, and smart buildings, have entirely different access control requirements, requiring different arrangements of access control components. Most of the current frameworks for the aforementioned scenarios do not identify the appropriate requirements of the corresponding scenarios, thereby leading to inadequate access control instantiations. Recently, blockchain has also emerged as a candidate for distributed access control. However, blockchain is still immature for use within this domain given its dependence on a consensus mechanism, making it less attractive than traditional centralised systems. Risk-aware access control incorporating the capabilities of a fine-grained access control scheme such as Function-Based Access Control (FBAC) is recommended. This will enable evaluation of the risk posed by an access request, by leveraging information derived from diverse sources including threat intelligence and CPS situational awareness systems, and granting access decisions at a high level of granularity, as highlighted in [172]. However, such risk evaluation frameworks will face challenges due to the numerous sources of inputs and the volume of contextual data collected. Similar challenges will be encountered by frameworks that use trust levels to grant or deny access to resources. The main challenges that such frameworks will encounter, which may be addressed in future, are summarized below:
• Converting numerous sources of threat intelligence and SSA into actionable security policies is challenging due to varying data formats and the heterogeneity of devices such as firewalls, IPS, SDN controllers, and other network appliances.
Interactions between such frameworks and the devices necessitate compatible communication standards. With no specific standards in place, realising such interactions is challenging.
• Contextual and behavioural data can be obtained from various security and logging appliances as well as from network hosts. Combining and correlating these large volumes of heterogeneous data for threat identification and risk quantification in a reasonable time is a complex task. Besides, the TA will also face challenges in combining the different forms of information obtained from TI, SSA, SIEM, and contextual and behavioural data sources to make decisions.
• Risk evaluation frameworks or TA that incorporate security state of devices will face challenges in acquiring such information from constrained IoT and CPS devices used in critical infrastructure.
• Current TAs are designed for ad-hoc or P2P networks and are not directly applicable to trust evaluation in enterprise networks or CPS with heterogeneous devices and applications, since the input parameters for calculating trust in ad-hoc and P2P networks differ considerably from the parameters that exist in enterprise networks and those specified in a ZTA.
• Current risk based access control techniques do not adequately enforce access control policies at both the network and application level as applying controls only at a particular level (layer) will lead to resources becoming vulnerable to cyber attacks at other protocol layers.
• Adoption of a central risk score or trust evaluation framework will make the central PE node vulnerable to cyber attacks such as DoS attacks. This requires either a robust PE calculation engine or a distributed PE to be in place to prevent such attacks. Countering these challenges requires technologies that enable the processing of large volumes of data and the correlation of events from multiple sources to perceive the threat to the system. Recent advancements in deep learning techniques have potential use in such scenarios, where useful intelligence is to be obtained from these data sources.

C. MICRO-SEGMENTATION AND SDP
Micro-segmentation has been proposed as an effective strategy for implementing ZTA, as it allows network perimeters to be positioned close to the application servers, enabling simpler enforcement of fine-grained access controls. It also enables creating and managing policy enforcement per access group, preventing unauthorised access to resources within the network perimeter. However, some associated challenges in implementing micro-segmentation, which may be addressed in future, are described below:
• Effective segmentation of applications is difficult due to the complex workflows that exist in the interactions and dependencies of large networks.
• Legacy, monolithic and non-virtualised applications are not suitable for micro-segmentation.
• Effectively translating workflow access requirements into network or application level access control policies is a challenging task. For example, network level access restrictions might not prevent malicious activities at the application layer. This has been addressed in a limited manner through the use of micro-services in granular applications running in container services, as highlighted in [119].
• Managing and maintaining various access control policies is another challenge, which is exacerbated by constantly changing application requirements and the introduction of new applications, leading to misconfigurations and errors. To counter this issue, dynamic workflow access detection techniques and access control policies that can identify the workflow interactions and update access policies accordingly are needed.
• As micro-segmentation is only applicable for application level access control, protection of physical devices and resources necessitates the consideration of SDN-based mechanisms such as those proposed in [131] and [132].
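At its core, the policy-enforcement side of micro-segmentation reduces to a default-deny allow-list between workload segments; everything not explicitly permitted is dropped. A minimal sketch follows, in which the segment names, ports, and rules are illustrative assumptions.

```python
# Per-segment allow-list: traffic is permitted only when an explicit rule
# exists for (source segment, destination segment, port); default is deny.
POLICY = {
    ("web", "app", 8080),   # web tier may reach the app tier's API port
    ("app", "db", 5432),    # only the app tier may reach the database
}

def allowed(src_segment: str, dst_segment: str, port: int) -> bool:
    """Default-deny lookup: no rule means no connectivity."""
    return (src_segment, dst_segment, port) in POLICY
```

Note how the web tier cannot reach the database directly even on a port the database serves: lateral movement must traverse every intermediate segment boundary, which is precisely the containment property micro-segmentation targets.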
In addition to micro-segmentation, SDP has been adopted for implementing ZTA. Even though SDP provides increased security to networks, challenges exist in the successful implementation of ZTA. One of the challenges of SDP is that it requires comprehensive changes to the network, as SDP-based networks differ significantly from traditional networking practices. In addition, clients and servers require major modifications to their communication processes to be compatible with SDP requirements. Furthermore, the central SDP controller can become a target for malicious cyber attacks, which can adversely affect operations of the SDP-based network [134].

D. POST-QUANTUM CRYPTOGRAPHY
Recently, quantum computing has been progressing at a rapid pace. However, with all the anticipated benefits of quantum computing, there are some issues that require careful thought. For example, quantum computers are likely to subvert current cryptographic defences such as RSA and ECC. In [182], it was demonstrated that 2048-bit RSA could be cracked in around eight hours by a quantum computer consisting of 20 million qubits. Although quantum computers of such enormous power are currently non-existent (i.e., only 128 qubits at present), they are likely to reach such potential in the future. Similarly, the authors of [183] showed that almost all current public-key mechanisms can be broken by Shor's algorithm [184]. Similarly, for hash functions, Grover's algorithm [185] can facilitate a quantum birthday attack to find collisions, thereby necessitating outputs of larger size. Therefore, to safeguard against such advancements in the foreseeable future, post-quantum cryptography is being actively researched. Post-quantum cryptography is essentially a new approach to cryptography that can be implemented on legacy computers, one that can withstand attacks launched by a powerful quantum computer. One obvious choice is to increase the length of encryption keys, which exponentially increases the number of permutations that a quantum algorithm (e.g., Grover's algorithm) will have to search. The authors of [183] identified key approaches for which no known way of applying Shor's algorithm exists: code-based encryption, lattice-based encryption/signatures, multivariate-quadratic-equation signatures, and hash-based signatures. However, numerous shortcomings of the aforementioned approaches are mentioned in [183]. The standardization of post-quantum cryptosystems is currently underway at NIST [186].
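Grover's quadratic speedup reduces a brute-force search over 2^n keys to roughly 2^(n/2) evaluations, halving the effective symmetric security level, which is why doubling key lengths is the standard countermeasure. A minimal illustration of this arithmetic:

```python
def grover_security_bits(key_bits: int) -> int:
    """Grover's search reduces an exhaustive key search from 2^n to roughly
    2^(n/2) evaluations, i.e. it halves the effective symmetric security level."""
    return key_bits // 2

# Doubling the key length restores the pre-quantum security level:
assert grover_security_bits(128) == 64   # AES-128 -> ~64-bit quantum security
assert grover_security_bits(256) == 128  # AES-256 -> ~128-bit quantum security
```

The same reasoning motivates larger hash outputs against the quantum birthday attack mentioned above.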
The candidates (for encryption and digital signatures) included as finalists in Round 3 of NIST's standardization project are summarized in Table 13 (interested readers are referred to [187] for more details). Even ahead of standardization, Google has conducted experiments with lattice-based cryptosystems and demonstrated their viability in large-scale deployments [188], [189]. As pointed out in [183], post-quantum cryptosystems still need varied designs alongside techniques for optimization and implementation. Besides, attack analysis of currently available approaches is also somewhat limited, and their incorporation into existing (legacy) systems and applications is challenging and expected to take more time [190]. All of these dimensions are active research areas that need thorough investigation for the continued and sustained realization of zero trust in the future.

IX. CONCLUSION
This paper presents a comprehensive description of the new security paradigm, zero trust. We present the basic tenets of ZTA along with its logical components and an analysis of a range of implementation techniques. As there is no single technology or architecture that can successfully implement a ZTA, the paper reviews various techniques and approaches that have been identified as essential to realise its adoption. Based on this, we highlight the role of authentication and access control techniques that take into consideration an organisation's context, behaviour and perceived threats, so as to constantly re-evaluate the trust in ongoing connections for a successful realization of ZTA. In addition, encryption, micro-segmentation, and software defined perimeters are also essential components in realising a comprehensive ZTA. The paper articulates state-of-the-art approaches to instantiate the identified security techniques for adoption in various deployment scenarios and finally concludes with a description of various challenges that are posed by contemporary authentication mechanisms, access control schemes, trust and risk computation techniques, micro-segmentation approaches, and SDP. The identified challenges may serve as potential future research directions for instantiating ZTA in its true spirit.

SYED W. SHAH (Member, IEEE) received the M.Sc. degree in electrical and electronics engineering from the University of Bradford, U.K., and the Ph.D. degree in computer science and engineering from The University of New South Wales, Sydney, Australia. He is currently a Research Fellow with Deakin University, Australia. His research interests include pervasive/ubiquitous computing, user authentication/identification, the Internet of Things, signal processing, data analytics, privacy, and security. He has received multiple awards, scholarships, and fellowships.
His research work has been featured by numerous leading media outlets, including CNN, the Australian Broadcasting Corporation, New Scientist magazine, and CJAD 800 Montreal.
ARASH SHAGHAGHI (Member, IEEE) received the B.Sc. degree from Heriot-Watt University, the M.Sc. degree in information security from University College London, and the Ph.D. degree in computer science and engineering from The University of New South Wales (UNSW), Sydney, Australia. He is currently a Senior Lecturer in cyber security with the Royal Melbourne Institute of Technology, Australia. He is also a Visiting Fellow with the School of Computer Science and Engineering, UNSW Sydney. He has previously been affiliated with Deakin University, Data61 (CSIRO), The University of Melbourne, and The University of Texas at Dallas. He is a multi-award winning cyber security educator and researcher with a track record of publications at competitive international conferences and journals. To date, his cyber security research has garnered over AU$300,000 (across Chief Investigator and Partner Investigator roles) from various internal and external sources, including the Australian Government and the Cyber Security Cooperative Research Centre. He has reviewed numerous journal article submissions and has served as a technical program committee member, organizing member, and reviewer roles at various prestigious security and networking conferences. He currently serves as an Associate Editor for the journal Ad Hoc Networks. He is a member of the Australian Information Security Association. His research activities have been covered by the Australian Broadcasting Corporation and other media outlets. Visit arashshaghaghi.com for more information.
ADNAN ANWAR (Member, IEEE) received the master's degree (by Research) and the Ph.D. degree from UNSW. He has previously worked as a Data Scientist and Analytics Team Leader at Flow Power. He has over ten years of industrial, research, and teaching experience in universities and research laboratories, including La Trobe University, Deakin University, The University of New South Wales (UNSW), and National Information and Communications Technology, Australia (now merged with Commonwealth Scientific and Industrial Research Organisation as Data61). He is currently a Lecturer and the Deputy Course Director of Postgraduate Cyber Security Education with the School of Information Technology, Deakin University, Australia. He has had over 60 published works, including articles in Q1-ranked journals and conference papers and book chapters published by prestigious outlets. His broad research interests include the security of sensor-connected Internet of Things devices, cloud security, the security of supervisory control and data acquisition systems in critical infrastructure, data-driven intelligent techniques, and other data science applications. He was a recipient of several awards, such as the University Postgraduate Award, the UNSW Tuition Fee Remission Scholarship, and the Best Paper Award. He has received industry funding and several travel grants through the Association of Computing Machinery and the Postgraduate Research Support Scheme. He is active in the IEEE Computer Society Technical Committee on Data Engineering as well as in the IEEE Cybersecurity Committee.