Building In-the-Cloud Network Functions: Security and Privacy Challenges

Network function virtualization (NFV) has been promising to improve the availability, programmability

worked out the framework of network function virtualization (NFV) and accordingly proposed the white paper [1]. NFV relieves the burden of complicated deployments of networking and communication hardware infrastructures by transferring the communication data to standard hardware (e.g., standard high-volume servers, switches, and storage) in the network nodes. It also simplifies the design and operation of network functions, and increases their flexibility, programmability, and expansibility [1]-[3]. Fig. 1 summarizes many representative applications of NFV, as also suggested in [1]. With the increasing demand for advanced network infrastructures, the global NFV market size is expected to grow to $36.3 billion by 2024 [4].
Meanwhile, with the rapid development of cloud technologies [5]-[8], individuals and enterprises tend to outsource virtualized network functions to the cloud, so as to reduce the local burden of provisioning and managing specialized hardware resources. Today, network functions, also known as middleboxes, such as firewalls, deep packet inspection (DPI), and load balancers (LBs), can be easily implemented on cloud platforms. Many cloud platforms publicly offer in-the-cloud network function services, e.g., Amazon CloudFront [9], Microsoft Azure cloud firewalls [10], and Google's networking services [11], including cloud NAT, cloud load balancing, and cloud router. Very recently, VMware Inc. announced that they had enhanced their VMware Ready for Telco Cloud to enable virtualized network functions with their VMware Telco Cloud platform [12].
The powerful computing capabilities of cloud infrastructure can provide the high-performance packet processing demanded by various network functions. Individuals and enterprises no longer suffer from cumbersome network hardware configuration and management, especially inconvenient network function updates.
Though promising, it is worth noting that, once the network function is outsourced and the traffic is redirected to the cloud, the cloud server has root access to both the outsourced functions and the communication traffic, which may raise severe security concerns. The cloud, by its nature as an outsourced service offering, can be vulnerable to internal threats (e.g., abuse of root access) and is a high-value target for external threats (e.g., various malicious attacks) [64]. The communication traffic over the network may carry confidential information, and thus, directly redirecting the packets to the cloud will violate the user's privacy. Furthermore, the outsourced function itself (e.g., the inspection rules for intrusion detection systems (IDSs), which may be generated by a third-party professional service provider, such as McAfee [65]) might contain proprietary assets. These commercially sensitive data could be exposed if deployed in an environment without security attestation. Today's secure communication practices often employ end-to-end encryption to protect data in transit. However, such a practice inevitably hinders the processing and computation of the packets and rules at the middlebox. To support the middlebox functionality, the traffic may have to be decrypted in the middlebox, which would violate end-to-end security [66]. In light of this need, a relevant line of research designs new transport layer security (TLS) protocols that are compatible with middleboxes [67]-[70]. These protocols enable traffic-owner-controlled decryption at the middleboxes. In other words, the traffic is visible to the middleboxes to some extent. This line of work has great practical and systematic significance but does not target the security level that we discuss in this article. In our survey, we focus on the designs where middleboxes are never supposed to learn the plaintext content of the traffic but can still inspect the encrypted traffic. As we show later, how to process encrypted packets with respect to protected rules without revealing any confidential information about the communication remains a big challenge.
Network functions can be described as matching and computing on the network flows with pregenerated rules and, finally, obtaining the action to be performed on the packets [37]. With regard to secure in-the-cloud network functions, the first challenge lies in how to support enriched operations on encrypted data. It is easy to complete the operations of matching and computation on plaintext traffic. However, even the simplest single pattern matching can be costly to perform on encrypted traffic. Therefore, before diving into the specific privacy-preserving techniques, it is important to categorize the network functions from the perspective of matching versatility. For example, the matching rules can be classified into equality matching and more enriched ones, such as range matching and regular expression matching. Furthermore, some network functions may also require modification of packets [48] (e.g., the network address translation (NAT) protocol) and inspection over stateful packets [15], [37]. Besides functionality, privacy issues are also important in cloud scenarios. The traffic should be kept private, and the inspection rules are commercially sensitive. Therefore, it is essential to build market-compliant outsourcing protocols to protect both of them [25], [71]. Furthermore, untrusted or even malicious cloud servers, possibly with financial incentives, may deliberately miscalculate results to reduce costs or disobey the procedures to obtain more confidential information, and they should also be considered [58], [59]. In conclusion, the goals of outsourcing network functions in a privacy-preserving way are to: 1) complete general network functions; 2) preserve the privacy of the traffic and/or rules; and 3) enable verifiable computation on the cloud middleboxes.
We summarize the development and roadmaps of noteworthy secure in-the-cloud network functions over the period of 2012-2021 in Table 1. From the perspective of functionality, we categorize the solutions into different types of middleboxes, e.g., DPI/IDS, firewalls, and NAT. The pioneering work, BlindBox [24], first enabled an outsourced middlebox for DPI without decrypting the traffic in the middle by using the technology of garbled circuits (GCs) [72]. Follow-up works based on different cryptographic tools achieve more enriched network functions, such as range matching [26], [37], [40] and wildcard matching [38], [39]. Outsourced middleboxes based on trusted hardware can achieve more general network functions [13]-[16] and also enable verification. So far, efforts have been made to promote the development of in-the-cloud network functions, but much room remains to explore. Hence, we believe that it is necessary to give a comprehensive study of the current progress of network function outsourcing, so as to make the remaining challenges and opportunities clearer to interested researchers.

A. Related Work
To the best of our knowledge, we are the first to provide a deep insight into the topic of outsourcing network functions from the aspect of protecting the privacy of the communication traffic. There are surveys about the issues of NFV [1]-[3], [73] and SDN [74], [75], which mainly focus on the challenges, technologies, and implementations of NFV and SDN and do not consider the issues of privacy. Pattaranantakul et al. [73] take security issues into consideration. However, their survey focuses on the security of the process on the virtual machine of the service provider, and the privacy problem is out of scope. Zave and Rexford [76] focus on very general security issues of network interactions. Therefore, the complementary viewpoints of [73] and [76] are orthogonal to ours.
The most related works are [77]-[79]. Compared to our work, the surveys [77], [78] discussed a smaller portion of the research on network function outsourcing and did not classify and summarize the state of the art as extensively as we do. Poh et al. [79] surveyed a larger set of solutions than [77] and [78] from the perspective of different techniques. Unlike [79], our survey presents existing works in a new light, i.e., from the perspective of the complexity of network functions. Our angle focuses more on the problems than the solutions, which may give researchers a clearer understanding of the challenges and corresponding privacy protection mechanisms. Besides, the comparisons among the solutions and cryptographic tools in our survey are much more diversified, and we also give metrics to evaluate the effectiveness of existing mechanisms. Therefore, our work presents a more in-depth tutorial that can help interested researchers quickly grasp the motivations, challenges, and techniques of the fast-evolving in-the-cloud network functions.

B. Contribution
In this article, we systematically survey the problems and solutions of in-the-cloud network functions over the period of 2012-2021. Our contributions are listed in the following.
1) We, for the first time, extensively survey the privacy and security issues of in-the-cloud network functions from the perspective of function complexity. We systematically classify network functions into equality matching, function-enriched matching, and general functions, and summarize corresponding outsourcing mechanisms with pros and cons.
2) We introduce detailed definitions of the NFV architecture, outsourcing model, usage scenarios, and the threat model, providing a concrete description of the issues of secure network function outsourcing. We also give a rich background of virtualized network functions and cryptographic techniques, which can help lay out the comprehensive ground field for subsequent research.
3) We give metrics to evaluate existing mechanisms and carry out meticulous comparisons among the privacy preservation techniques from the perspectives of functionality, security, and efficiency. We also provide possible future research directions on this topic to encourage readers to explore more practical and secure outsourcing constructions.

C. Organization
The remainder of this article is organized as follows. Section II introduces the background of NFV and the related outsourcing practice, including the definitions and examples of NFV and the explanation of pattern matching and range matching in the cloud settings. Section III discusses the security issues, including the introduction to the cryptographic tools and the detailed description of the threat model of in-the-cloud middleboxes. In Section IV, we introduce the state of the art from the perspective of functional completeness, such as equality matching, function-enriched matching, and general network functions. Section V provides a deep comparison of the state of the art on functionality, security, and efficiency. Finally, open research directions are proposed in Section VI.

II. NFV AND THE OUTSOURCING PRACTICE
In this section, we start by introducing the concepts of NFV and several examples of network functions. Then, we categorize the outsourced network functions and introduce the outsourcing model. Finally, we present three representative usage scenarios of outsourcing network functions.

A. Network Function Virtualization
Traditional network architecture suffers from the incompatibility of diverse network hardware appliances. Middleboxes are designed and implemented under different standards, which dramatically hinders network application development and maintenance. It is also difficult and costly to implement new network functions by adding new middleboxes in an existing network architecture with incumbent hardware settings.
NFV is proposed to solve the above problems. It aims to standardize network middleboxes by consolidating various networking equipment, e.g., servers, switches, and storage, in network nodes and datacenters [1]. As shown in Fig. 2, a typical NFV architecture includes three components: 1) virtualized network functions; 2) the NFV infrastructure, including virtual compute, virtual storage, virtual network, and related hardware resources; and 3) NFV management and orchestration. In the following, we present several typical examples of network functions.
1) Deep Packet Inspection: DPI is a network traffic analysis technique that performs traffic analysis, inspection, and filtering [80], [81]. The rule generator determines the inspection rules in middleboxes. Different from common filtering functions that only inspect the IP header, the rules in DPI may involve features of both the header and the application layer payload. A DPI middlebox compares the incoming packets with the inspection rules. If there is a match, the middlebox performs certain actions on the packets, such as accept or drop, according to the rules. DPI can be used for malicious network traffic detection, precise advertising, and so on.
Fig. 2. Architecture and examples of NFV [1]. The hardware-based middlebox (left) can be realized in an NFV architecture as virtualized network functions (right). The architecture can be divided into three levels: the network functions, the NFV infrastructure, including virtual compute, virtual storage, virtual network, and related hardware resources, and the NFV management and orchestration.

2) Intrusion Detection System: IDS is a network function that can prevent malicious access to internal networks [82]-[84]. IDS has three variations: 1) signature-based detection, which utilizes pattern matching to detect malicious traffic; 2) anomaly-based detection, which relies on machine learning to train a model that distinguishes benign from malicious traffic; and 3) reputation-based detection, which scores each traffic flow to recognize potential threats. The widely used signature-based approach includes two steps: preprocessing and attack signature matching. The packets are first decoded and reassembled into IP packet fragments. When the flow is acquired, the IDS first conducts multistring pattern matching to detect specific keywords (or key ports). If nothing is found, the flow is tagged as "innocent." If there is a match, the IDS further performs signature detection, which includes the metadata pattern match (e.g., IP range) and the string pattern or binary pattern match in the payload.
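The two-step signature-based workflow described above can be sketched in a few lines of Python on plaintext traffic. This is only an illustrative sketch: the `Signature` fields, the `inspect` function, and the choice to tag a flow as "innocent" when the second stage finds no match are assumptions made for the example and are not taken from any particular IDS.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature layout used only for this sketch."""
    name: str
    keyword: bytes          # cheap keyword checked in the prefilter stage
    src_net: str            # metadata condition, e.g., a source IP range
    payload_pattern: bytes  # string pattern checked in the second stage


def inspect(src_ip: str, payload: bytes, signatures: list) -> str:
    # Stage 1: multistring pattern matching over the payload.
    candidates = [s for s in signatures if s.keyword in payload]
    if not candidates:
        return "innocent"
    # Stage 2: full signature detection (metadata match plus payload pattern).
    for sig in candidates:
        in_range = ip_address(src_ip) in ip_network(sig.src_net)
        if in_range and sig.payload_pattern in payload:
            return f"alert:{sig.name}"
    # Assumption: a flow that passes the prefilter but fails stage 2 is benign.
    return "innocent"
```

The prefilter keeps the expensive second stage off the common path; on encrypted traffic, both stages would have to be replaced by their privacy-preserving counterparts.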

3) Firewall: The firewall is a widely used network function that monitors and controls the network traffic between the internal network and the external network [85]. Traditional firewalls run on a specialized middlebox and work as a packet filter based on the ruleset. According to the ruleset, the firewall checks the five tuples in the header, i.e., the source IP address, the destination IP address, the source port number, the destination port number, and the protocol, to decide whether to accept or block the packet. To tackle the problem that traditional firewalls do not keep track of packet contexts, stateful firewalls have been proposed [86], [87]. A stateful firewall maintains a state table to store the context of each packet so that new connection requests can be associated with previous connections [88]. In this way, stateful firewalls can support more expressive policies and provide a stronger security guarantee.
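As a minimal plaintext sketch of the difference between five-tuple filtering and stateful filtering, the following Python code accepts inbound packets only when they reverse a previously accepted outbound connection. The class and method names, and the policy of keying outbound rules on the destination port and protocol, are assumptions chosen for illustration.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


class StatefulFirewall:
    """Toy stateful firewall (hypothetical names, plaintext headers only)."""

    def __init__(self, allowed_outbound):
        # Ruleset: (destination port, protocol) pairs allowed for outbound traffic.
        self.allowed_outbound = set(allowed_outbound)
        # State table: five tuples of accepted outbound connections.
        self.state_table = set()

    def outbound(self, pkt: FiveTuple) -> str:
        if (pkt.dst_port, pkt.protocol) in self.allowed_outbound:
            self.state_table.add(pkt)
            return "accept"
        return "block"

    def inbound(self, pkt: FiveTuple) -> str:
        # Accept only replies whose reversed five tuple is in the state table.
        reverse = FiveTuple(pkt.dst_ip, pkt.src_ip,
                            pkt.dst_port, pkt.src_port, pkt.protocol)
        return "accept" if reverse in self.state_table else "block"
```

A stateless filter would have to decide on each inbound packet from its header alone; the state table is what lets the reply be associated with the earlier request.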

4) Network Address Translation: NAT is typically deployed at network edges to allow a large number of hosts to connect to the Internet through the same public IP addresses [89], [90]. When a user in an internal network wants to communicate with the Internet, the NAT gateway replaces the internal address with a public IP address according to the mapping table. NAT shields the internal network such that all computers within the intranet are invisible to the public network. NAT mitigates the problem of IP address exhaustion by enabling multiple computers to share Internet connections.
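The mapping-table behavior can be illustrated with a small Python sketch. It assumes a port-based variant in which every internal (address, port) pair is mapped to a distinct port on one public address; the `Nat` class, its method names, and the starting port are hypothetical choices for the example.

```python
class Nat:
    """Toy port-based NAT gateway (illustrative assumption, not a full NAT)."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}  # (private_ip, private_port) -> public_port
        self.in_map = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip: str, private_port: int):
        key = (private_ip, private_port)
        if key not in self.out_map:
            # Allocate a fresh public port and record both directions.
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out_map[key]

    def translate_inbound(self, public_port: int):
        # Returns None when no internal host owns this public port.
        return self.in_map.get(public_port)
```

Because translation rewrites header fields, an outsourced NAT needs packet modification in addition to matching, which is why it is harder to support over encrypted headers.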
5) Load Balancer: LB assigns the incoming traffic to multiple network devices (e.g., firewalls) or links, which effectively improves the processing capability while guaranteeing high reliability. The LB acts as a scheduler in a server cluster, which first receives all requests from the clients and then assigns the requests to the backend servers according to the load condition of each server to optimize the overall network performance [91]-[93].
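One simple scheduling policy consistent with this description is to send each request to the backend with the fewest active requests. The Python sketch below assumes that "load condition" means the number of in-flight requests; the class name and backend labels are hypothetical.

```python
class LeastLoadedBalancer:
    """Toy scheduler that picks the backend with the fewest active requests."""

    def __init__(self, backends):
        # Load condition per backend, assumed here to be in-flight request count.
        self.load = {name: 0 for name in backends}

    def assign(self) -> str:
        # Ties are broken by the order in which backends were registered.
        backend = min(self.load, key=self.load.get)
        self.load[backend] += 1
        return backend

    def finish(self, backend: str) -> None:
        # Called when a backend completes a request.
        self.load[backend] -= 1
```

Real load balancers typically combine such counters with health checks and weights, which are omitted here.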
6) Content Delivery Networks: Content delivery networks (CDNs) [94], [95] aim to deliver large-scale content, e.g., video streams. A CDN is a distributed network consisting of proxy servers and data centers that can improve the response speed and hit ratio for users. Through load balancing, content distribution, scheduling, and other functions of the central data centers, a CDN relies on the proxy servers deployed in various places to enable users to get the required content nearby. In-network caching and   [96] can be leveraged to increase the efficiency of content delivery.

B. Outsourcing Network Functions
In an NFV architecture, network functions are performed on virtual machines, making it suitable to leverage cloud services. Outsourcing network functions to a cloud server can relieve the local computational burden and reduce the costs of deploying network equipment. Nonetheless, the traffic of clients has to be transmitted to the untrusted cloud. The clients should encrypt the packets before sending them to the cloud to preserve the privacy of the traffic.
The studies of traditional NFV do not specifically classify the network functions from the perspective of matching granularity. However, the operations on encrypted traffic are much more complicated than those on plaintext. To distinguish the difficulty of realizing different network functions on encrypted traffic, we need to classify the network functions in terms of computational effort. Following [24] and [77], we classify the general network functions into two classes: equality matching and function-enriched matching. The function-enriched matching-based network functions can be further divided into range match, substring match, wildcard match, and regular expression match. We present a toy example of matching types in the style of Snort rules in Table 2. In addition to the basic matching, stateful matching and packet modification are also common in network functions. In the context of outsourced network functions, stateful match and packet modification can be regarded as advanced functions. Besides, according to Chiosi et al. [1], NFV is applicable to switching elements like routers and router functions. Thus, we can also regard routing as a network function from a high-level view. The explanations of the above functions are listed in the following.
• Equality match checks whether a packet contains one or more specific keywords (e.g., signatures or watermark) and matches the keywords with an action, such as drop or accept.
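To make equality matching over protected traffic concrete, the following Python sketch replaces keywords and payload windows with keyed deterministic tokens so that the middlebox compares tokens rather than plaintext. It is a simplified illustration and not the construction of any surveyed system: the fixed window length, the use of HMAC-SHA256, and the function names are assumptions, and the sketch ignores how the key is kept away from the middlebox and what deterministic tokens leak.

```python
import hashlib
import hmac

WINDOW = 8  # assumed fixed keyword length in bytes


def token(key: bytes, chunk: bytes) -> bytes:
    # Keyed deterministic token; equal chunks give equal tokens under one key.
    return hmac.new(key, chunk, hashlib.sha256).digest()


def tokenize_payload(key: bytes, payload: bytes) -> set:
    # Sender side: one token per sliding window of the payload.
    return {token(key, payload[i:i + WINDOW])
            for i in range(len(payload) - WINDOW + 1)}


def encrypt_rules(key: bytes, rules: dict) -> dict:
    # Rule side: map each keyword token to its action (e.g., drop or accept).
    return {token(key, keyword): action for keyword, action in rules.items()}


def middlebox_match(enc_rules: dict, payload_tokens: set) -> str:
    # Middlebox side: equality test on tokens only, never on plaintext.
    for rule_token, action in enc_rules.items():
        if rule_token in payload_tokens:
            return action
    return "accept"  # assumed default action when no rule matches
```

The sketch only supports exact keywords of one length; range, wildcard, and regular expression matching require richer encodings or different primitives.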

C. Outsourcing Model
There are two types of outsourcing models [26], [60]: the bounce model [97] (also called the APLOMB system) and the go-through model [24] (also called the NFV system). As shown in Fig. 3, in the bounce model, when interacting with external sites, the gateway "bounces" the traffic to the service provider to perform network functions, after which the middlebox sends the traffic back to the gateway. The rules in the middlebox are generated by the enterprise (or the gateway) that requests the network function outsourcing service. The go-through model includes four parties: the sender, the receiver (or the server), the rule generator, and the service provider. In the setup phase, the rule generator outsources the rules to the middlebox. The rule generator can be a third-party professional corporation, such as McAfee, or an intranet administrator. During the communication, the sender directs the traffic to the middlebox that runs the outsourced network functions, and the middlebox sends the traffic to the receiver.

D. Usage Scenarios
To clarify the motivation of outsourcing network functions to the cloud in a privacy-preserving way, we further introduce three representative usage examples.
1) Parental Controls: This example is first introduced in [24]. Alice has registered for parental control services from an Internet service provider (ISP) to monitor the traffic and filter adult content for her kids. However, Alice is concerned that the ISP would sell her personal communication data to other organizations. Therefore, Alice would like the ISP to inspect her encrypted traffic instead.
2) Enterprise Network Monitoring: Bob's company has subscribed to an Azure Firewall [10] for real-time malicious traffic filtering with high availability. However, Bob is not comfortable exposing his commercially sensitive data to Azure or to possible eavesdroppers from rival companies. In such a scenario, a privacy-preserving in-the-cloud firewall will meet Bob's requirements.
3) Video Applications: Hulu is an American video platform that offers live TV streams [98]. To achieve a better user experience, Hulu subscribes to Amazon CloudFront, a fast and programmable CDN running on the Amazon cloud [9]. In a CDN, the video contents are stored in distributed edge servers to enable quick video delivery. However, Hulu is worried that their clients may be unsatisfied if their private data (e.g., video access history) are exposed to other parties. Hence, Hulu would like to keep their data encrypted in the edge servers, while the CDN in the cloud can still eliminate redundant network traffic, distribute the video contents, and so on.

III. SECURITY ISSUES OF OUTSOURCING
In this section, we introduce the privacy and security issues regarding network function outsourcing. We formally define the threat model to address the challenges of outsourcing network functions. We also present several cryptographic tools that are commonly utilized to design outsourced middleboxes for privacy preservation.

A. Threats in the Outsourcing Practice
Outsourcing network functions to a remote cloud greatly relieves the demand on local computation resources, but, at the same time, it brings the risk of exposing sensitive data of both the traffic and the network functions.
Here, we introduce possible threats that the clients (or enterprises) may encounter when outsourcing their traffic and network functions, especially from the perspective of privacy. Specifically, we describe the threats from three aspects: adversaries in different network domains, attack surfaces in different network layers, and the attack means. Note that, since this article focuses on the privacy and security issues of building in-the-cloud network functions, we especially discuss the threats arising from the outsourcing practice, where network functions are deployed in untrusted settings. Generally speaking, our considered security and privacy threats are also relevant to other network function deployment scenarios where the context of the deployment might not necessarily be in the trusted domain.
1) Threats in Different Network Domains: From the perspective of administrative domains, threats may come from different network entities involved in the communication process. We consider three types of threats: 1) adversaries residing at the cloud; 2) malicious endpoints; and 3) eavesdroppers in the communication channel.
a) Threats in the cloud: In the outsourcing model, the third-party cloud can get full access to the traffic and the function. There are two types of cloud servers: honest-but-curious servers and malicious ones [59]. An honest-but-curious server strictly follows the protocols but tries to infer as much private information as possible from the traffic, outsourced functions, and processing results [24], [26]. For example, some companies that provide cloud services are reported to sell the clients' private data [99]. More seriously, the cloud is under various external threats, e.g., frequent cloud data breaches [100]. To tackle such a threat, encrypting private data is essential. Malicious cloud servers may disobey the protocol, forge the results to avoid computational expenses, or even learn more sensitive information. In the face of a malicious server, a computational verification mechanism is required to ensure the correctness of the results [101], [102].
b) Threats on the endpoints: This type of threat is the same as that considered by traditional network functions, i.e., the original adversaries considered by regular network functions, such as IDS [24]. A malicious endpoint may send illegal packets to try to pass network functions like DPI and firewalls. Traditional intrusion detection assumes that at least one of the endpoints is honest [82], [83]. Similarly, in the outsourcing context, at least one of the endpoints will actively use the outsourcing service, and we regard it as fully trusted. The case of two malicious endpoints is generally not considered in the current literature. If both endpoints are malicious, they can communicate with each other with a secret encryption key in some secret channel without using NFV. Detecting this kind of encrypted communication requires traffic pattern analysis [103]-[105], which is orthogonal to the topic of this survey.
c) Threats in the channel: The cloud and the communication channels are vulnerable to eavesdroppers. The eavesdroppers may intend to snoop on client traffic or even jam some packets [106]. Encrypting traffic and network function strategies is the most common countermeasure to defend against such threats. For example, security protocols, such as TLS, can defend against eavesdroppers by encrypting the communication traffic. In the design of privacy-preserving network functions, the traffic and the functions are kept encrypted in the communication channel to prevent eavesdroppers from learning private information.
2) Attack Surfaces in Different Network Layers: Since we focus on privacy issues, the attack surface here especially refers to the privacy vulnerabilities of network flows. To be more specific, the private information includes contents in the transport-layer (i.e., L4) payloads and other header information in the data link and network layers (i.e., L2 and L3). Ideally, the design of the outsourced network functions should keep all the private information from the adversaries and enable specific computation in corresponding fields. Following Duan et al. [20], we also divide the private information into two aspects: L2-L4 headers and L4 payload.
a) Private information in L2-L4 headers: The most important information in L2-L4 headers is the five tuples, i.e., the source IP address/port number, destination IP address/port number, and the protocol. It is obvious that such data in plaintext can reveal the identity of the sender and the receiver. Sometimes, clients may not want their private information (e.g., destination addresses) to be exposed to the cloud server or the adversaries in the communication channel. The problem is that, in the actual network architecture, it seems to make no sense to forward a packet without letting the routers learn the destination IP address. However, in the bounce outsourcing model [see Fig. 3(a)], the real destination IP address is encrypted and forwarded to the in-the-cloud middlebox to perform network functions, such as firewalls and NAT [26]. Such a process protects the private information in the header from the in-the-cloud middleboxes and does not involve physical routers on the path between the sender and the receiver.
With regard to defending against the eavesdroppers in the communication channel, the clients can use security protocols like IPsec [107] and TCPcrypt [108] to protect the private data in the headers.
b) Private information in L4 payloads: The L4 payloads carry the specific contents of the communication, and the adversaries strive to learn as much sensitive information from them as possible. The goal of designing an in-the-cloud network function is to protect private information while preserving the cloud's ability to inspect the traffic.
3) Attack Means: When outsourcing the network functions, we should consider two types of attacks: 1) the original attacks that traditional network functions already face and 2) the attacks on the primitives used to realize computation on encrypted data. The former includes the spying and tampering attacks, the denial-of-service (DoS) attacks, and so on. The latter includes the side-channel attacks on trusted hardware, the cryptanalysis-based attacks against customized cryptographic techniques, and so on.
a) Spying attack: In this attack, the adversaries listen to the communication traffic of two innocent end-users, trying to extract as much private information as possible [76]. We have detailed the private information in Section III-A2, and encrypting the traffic can protect such private information well. Preventing spying attacks is the basic security goal of network function designs.

b) Denial-of-service attack: DoS attacks form a broad category whose goal is to make the target network service unavailable [109]. In DoS attacks, the adversaries strive to exhaust the target network's bandwidth by controlling botnets to launch the User Datagram Protocol (UDP) flood, the Internet Control Message Protocol (ICMP) flood, the Domain Name System (DNS) flood, the HTTP flood, and so on. Many traditional network functions can prevent DoS attacks, e.g., firewalls, IDS, and DPI. Fayaz et al. [110] also proposed to leverage SDN/NFV techniques to defend against DoS attacks. As for the outsourced network functions, whether a design can detect or prevent DoS attacks can be regarded as an advanced feature [25], [111].

c) Side-channel attack: A line of research on outsourced network functions relies on trusted hardware, especially Intel SGX. However, SGX is vulnerable to various side-channel attacks [112]. For example, in SGX, the memory pages need to be loaded into the enclave page cache (EPC), which has limited memory. Since the operating system has direct access to memory management, a malicious OS can decide whether to flush the translation lookaside buffer (TLB). Thus, the adversary can analyze the code of SGX applications to locate the attack address and then learn what the SGX enclave has accessed by flushing the TLB and recording the memory footprint. Unfortunately, most existing hardware-based outsourced network function designs do not consider side-channel attacks. For security considerations, we emphasize that it is essential to consider such attacks when applying the outsourced network functions in practice.
d) Cryptanalysis-based attack: Besides trusted hardware, another line of work utilizes customized cryptographic tools, which may face potential security threats from cryptanalysis. Take brute-force attacks as an example. In this attack, the adversaries analyze the cryptographic protocols by exhausting the input space. Ideally, the cryptographic protocols are secure against brute-force attacks. However, the design of cryptography-based outsourced network functions may have defects in how the encryption schemes are applied. For example, Ning et al. [34] pointed out that PrivDPI [32] is vulnerable to brute-force guessing when the rule set is small.
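The intuition behind exhaustive guessing can be shown with a deliberately weak toy scheme in Python. This is a generic illustration and not the specific attack described in [34]: it assumes that rule tokens are unkeyed, deterministic hashes of keywords and that the adversary holds a small list of candidate keywords; `weak_token` and `brute_force` are hypothetical names.

```python
import hashlib


def weak_token(keyword: bytes) -> bytes:
    # Assumed flawed design: a deterministic tag that anyone can recompute.
    return hashlib.sha256(keyword).digest()


def brute_force(observed_tokens: set, candidates: list) -> set:
    # Exhaust the (small) candidate space and look up each observed token.
    lookup = {weak_token(c): c for c in candidates}
    return {lookup[t] for t in observed_tokens if t in lookup}
```

The cost of the attack grows only linearly with the number of candidates, so a small or predictable input space offers little protection whenever the adversary can evaluate the token function.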

B. Privacy and Security Goals
In the paradigm of outsourcing the network functions, both the traffic and the outsourced functions (rules) should be protected from the cloud server and other possible adversaries.
1) Rule Privacy: Typical rules contain inspection strategies, such as the filtering rules in DPI/IDS or firewalls [84] and the export policy in routing [42]. With the rules, the middlebox calculates and compares the content of the packet to match the corresponding rule and, finally, informs the gateway what action it should take on the packet. Disclosing rules to adversaries will cause significant potential risks to network security. With the knowledge of the rules, attackers can deliberately construct malicious traffic that can circumvent these rules or even infer the contents of the packet from the results of the inspection process.
There are three types of rule generators: 1) the receiver; 2) the administrator of the sender; and 3) a professional third-party corporation. In all cases, the rules should be kept private from the in-the-cloud middleboxes. The cloud server can only perform the virtualized network functions without knowing the content of the rules. For the first two cases, one of the endpoints can learn the rules, but the other endpoint should not learn them. If the rules are generated by a third party (i.e., case 3), the rules are trade secrets, so they should be revealed to neither the endpoints nor the cloud middlebox [71].
2) Packet Privacy: The privacy goal of network function outsourcing is that the private information in packets (as discussed in Section III-A2) should not be known to third parties other than the sender and receiver [24], [67].Therefore, outsourced network functions will be performed on encrypted packets.For outsourcing, the encryption scheme of the packets should be designed together with that of the function to enable computation and comparison.One of the tremendous challenges is the allocation of keys.The endpoints can encrypt the packets, but the secret key should not be revealed to the rule generator who encrypts the rules.The one who encrypts the rules should not decrypt the packets with the rule keys, either.Hence, it is necessary to support middlebox functions over encrypted packets and rules using different keys.

C. Cryptographic Primitives and Trusted Hardware
To enable privacy-preserving computation on encrypted data, cryptographic tools, such as searchable encryption [113]- [115], homomorphic encryption (HE) [116], [117], secure multiparty computation (MPC) [72], [118], and trusted hardware [119], can be used as the building blocks.Although these tools have been well studied in the field of cryptography, how to apply them in the context of network functions remains to be fully explored.To help the readers better understand the privacy-preserving network function outsourcing designs, we briefly overview a comprehensive list of security and cryptographic tools that could be used to enable privacy-preserving network functions without exposing the packet content and/or the rules.A summary of the abbreviations is presented in Table 3.
1) Oblivious Transfer: Oblivious transfer (OT) [118], [120], [121] addresses the setting in which a party A holds a dataset $D = (D_1, D_2, \ldots, D_n)$ and another party B wants a single item $D_b$. A does not want to share any data other than $D_b$, and B does not want to reveal $b$. OT ensures that B obtains $D_b$ without learning $D \setminus \{D_b\}$ and without revealing $b$. OT can be built on asymmetric primitives, such as RSA [122].
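To make the OT interface concrete, the following is a minimal sketch of a classic RSA-based 1-out-of-n OT, run inside a single function for readability. It is not taken from [118]-[122]; the toy key size, the function name `oblivious_transfer`, and the integer message encoding are assumptions made for illustration only.

```python
import secrets

# Toy RSA key built from two Mersenne primes; far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)

def oblivious_transfer(messages, b):
    """Return messages[b]; comments mark which party performs each step."""
    # Sender A: publish one random value per message.
    x = [secrets.randbelow(N) for _ in messages]
    # Receiver B: blind the chosen x[b] with a fresh random k.
    k = secrets.randbelow(N)
    v = (x[b] + pow(k, E, N)) % N
    # Sender A: derive one pad per message. Only pad b equals k,
    # but A cannot tell which one, so b stays hidden.
    pads = [pow((v - xi) % N, D, N) for xi in x]
    masked = [(m + pad) % N for m, pad in zip(messages, pads)]
    # Receiver B: strip k from the chosen entry; other entries stay masked.
    return (masked[b] - k) % N
```

In a real deployment the two parties would exchange `x`, `v`, and `masked` over the network, and messages would have to be encoded as integers smaller than `N`.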
2) Garbled Circuit: The notion of GC was first proposed by Yao [72] to enhance secure MPC (SMPC). A GC allows two parties to compute results jointly without revealing their own inputs. At a high level, a GC protocol includes two steps: 1) GC preparation and 2) circuit evaluation. In the preparation phase, one party (the generator) generates the circuit according to the function to be computed. Afterward, the other party (the evaluator) evaluates the GC using the keys for her own input and the generator's input keys. The evaluated results are sent back to the generator to obtain the final results. Note that the evaluator obtains the keys for her input obliviously with the help of OT. Thus, the generator will not learn the input of the evaluator.
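The two GC phases can be illustrated on a single AND gate. This is a simplified sketch, not Yao's original construction or the circuit used by any scheme in this survey: the label length, the zero tag used to recognize the correct row, and the names `garble_and` and `evaluate` are all assumptions. In a full protocol, the evaluator would receive the label for her own input bit through OT.

```python
import hashlib
import secrets

LABEL = 16           # wire-label length in bytes
TAG = b"\x00" * 8    # redundancy that lets the evaluator recognise the valid row

def _pad(label_a, label_b):
    # One-time pad derived from the two input labels.
    return hashlib.sha256(label_a + label_b).digest()[:LABEL + len(TAG)]

def _xor(x, y):
    return bytes(i ^ j for i, j in zip(x, y))

def garble_and():
    """Preparation phase: the generator builds a garbled AND gate."""
    # wires[w][bit] is the random label that stands for `bit` on wire w.
    wires = {w: (secrets.token_bytes(LABEL), secrets.token_bytes(LABEL))
             for w in "abc"}
    table = [_xor(_pad(wires["a"][x], wires["b"][y]), wires["c"][x & y] + TAG)
             for x in (0, 1) for y in (0, 1)]
    secrets.SystemRandom().shuffle(table)  # hide which row is which
    return wires, table

def evaluate(table, label_a, label_b):
    """Evaluation phase: exactly one row decrypts to a label followed by TAG."""
    for row in table:
        plain = _xor(_pad(label_a, label_b), row)
        if plain.endswith(TAG):
            return plain[:LABEL]
    raise ValueError("no row decrypted correctly")
```

The evaluator only ever sees one label per wire, so she learns the output label without learning which bits the input labels encode; the generator maps the returned label back to 0 or 1.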
3) Homomorphic Encryption: HE is a class of public-key encryption schemes with homomorphic properties [123], where the operations (i.e., addition, multiplication, or both) on the ciphertext exactly match the same operations on the plaintext. There are roughly three types of HE systems: partially HE (PHE), somewhat HE (SHE), and fully HE (FHE). In PHE, the operations on the plaintext correspond to either standard arithmetic addition or standard arithmetic multiplication, e.g., the famous Paillier cryptosystem [116]. In SHE, the operations are limited to some "low-degree" polynomials, e.g., the famous BGV system [117]. In FHE, the operations can be an arbitrary composition of addition and multiplication [124]- [128].
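As an illustration of the PHE case, the sketch below implements textbook Paillier encryption with the common generator choice g = N + 1 and shows that multiplying ciphertexts adds the plaintexts. The toy primes and function names are assumptions for illustration; real deployments use moduli of 2048 bits or more.

```python
import math
import secrets

# Toy primes (Mersenne primes); far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                               # valid because g = N + 1

def encrypt(m):
    # Pick a random r that is invertible modulo N.
    while True:
        r = secrets.randbelow(N)
        if r > 0 and math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    # L(u) = (u - 1) / N, applied to c^lambda mod N^2.
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def add(c1, c2):
    """Homomorphic addition: Dec(c1 * c2) = m1 + m2 (mod N)."""
    return (c1 * c2) % N2

def scale(c, k):
    """Multiplication by a public constant: Dec(c^k) = k * m (mod N)."""
    return pow(c, k, N2)
```

A middlebox holding only ciphertexts can therefore sum encrypted header fields or scale them by public constants, but it cannot multiply two encrypted values with this scheme.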

4) Multilinear Map: The multilinear map is an extension of the bilinear map. Multilinear maps of the symmetric form and the asymmetric form were first realized in [129] from ideal lattices. Later, Coron et al. [130] constructed multilinear maps over the integers. Shortly afterward, Gentry et al. [131] came up with a scheme from general lattices. Brakerski and Rothblum [132] proposed a construction based on the asymmetric multilinear map to obfuscate conjunctions.

5) Order-Preserving Encryption: Order-preserving encryption (OPE) [133] is designed to map nonuniformly distributed plaintext data into ciphertext intervals that are uniformly distributed such that the characteristics of the data distribution can be hidden. In OPE, the order relation of the ciphertexts is the same as that of the plaintexts. However, this means that OPE reveals the original order. Therefore, OPE is vulnerable to statistical inference attacks [134]. According to Cash et al. [135], OPE reveals more information than the order of the plaintexts. In the literature, many schemes have been devoted to improving the security of OPE [136], [137].
6) Order-Revealing Encryption: Order-revealing encryption (ORE), also known as efficiently orderable encryption, is a symmetric encryption scheme that, like OPE, can be used for range search on the ciphertext. Compared with OPE, ORE achieves stronger security by obscuring the order relations of the plaintexts. Chenette et al. [138] and Lewi and Wu [139] proposed modified solutions to ensure its security. Cash et al. [135] presented the first ORE construction that uses the bilinear map rather than the multilinear map [140] to reduce the leakage of sensitive information. Nevertheless, Grubbs et al. [141] and Durak et al. [142] demonstrated that ORE remains vulnerable to recent leakage-abuse attacks.
7) Searchable Symmetric Encryption: Searchable symmetric encryption (SSE) [113]- [115], [143]- [145] was first introduced by Song et al. [143]. Searchable encryption allows the client to outsource his/her private data while preserving the ability to search on the encrypted database. Curtmola et al. [113] further defined the formal security model of SSE and proposed an SSE scheme of optimal search complexity (which is linear in the number of matched results). Later, Kamara et al. [114] proposed a dynamic SSE based on the work of Curtmola et al. [113]. Typically, an SSE scheme is built on symmetric primitives, such as pseudorandom functions (PRFs) and pseudorandom permutations (PRPs). Thus, SSE has the advantages of lightweight computation and high run-time efficiency. In an SSE scheme, the dataset is first parsed as a keyword-identifier index, which is called an inverted index, and then, the index is encrypted and uploaded to the server. The encrypted index is generated with trapdoors for later search. With the search token (trapdoor), the server can search encrypted data containing the related keyword without learning either the keyword or the data. Similar to OPE and ORE, recent results have shown that certain SSE constructions may be subject to leakage-abuse attacks with various underlying adversarial assumptions [146]- [150].
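The inverted-index workflow can be sketched with HMAC-SHA256 standing in for the PRF. This is a simplified illustration, not the construction of [113] or [114]: the two-part trapdoor, the 32-bit identifier masking, and the names `trapdoor`, `build_index`, and `search` are assumptions.

```python
import hashlib
import hmac

def prf(key, msg):
    # HMAC-SHA256 used as a pseudorandom function.
    return hmac.new(key, msg, hashlib.sha256).digest()

def trapdoor(key, word):
    """Client side: a search token plus a per-keyword decryption key."""
    w = word.encode()
    return prf(key, b"tok|" + w), prf(key, b"enc|" + w)

def _mask(k_w, counter):
    return int.from_bytes(prf(k_w, counter.to_bytes(4, "big"))[:4], "big")

def build_index(key, docs):
    """Client side: docs maps a 32-bit identifier to its list of keywords."""
    index = {}
    for doc_id, words in docs.items():
        for w in set(words):
            tok, k_w = trapdoor(key, w)
            bucket = index.setdefault(tok, [])
            bucket.append(doc_id ^ _mask(k_w, len(bucket)))
    return index

def search(index, tok, k_w):
    """Server side: locate the bucket by token and unmask the identifiers."""
    return sorted(c ^ _mask(k_w, i) for i, c in enumerate(index.get(tok, [])))
```

The server stores only pseudorandom tokens and masked identifiers. Given a trapdoor, it learns which identifiers match (the access pattern) but not the keyword itself, which is the kind of leakage that the attacks in [146]-[150] exploit.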
8) Trusted Hardware: Trusted hardware, also known as a trusted execution environment (TEE), provides confidentiality and integrity guarantees for processes running inside the specific hardware.Here, we introduce a typical TEE design, i.e., Intel Software Guard Extension (SGX).SGX is a set of special CPU instructions along with a specially designed CPU architecture called secure enclave (enclave for short).Enclave mainly refers to a particular container with software codes, confidential data, and memory stacks.SGX places confidential information inside the CPU package, where the data and codes are transparent to the inside software, and the integrity of the software can be verified remotely.The container cannot be accessed or tampered with by either malicious adversaries or curious infrastructure owners unless opening the CPU package [119].
The enclave cannot be accessed or tampered with because of its special hardware design. In SGX, there is a special memory region inside a CPU package called EPC [119]. EPC can only be accessed from the inside of the enclave. However, EPC has a constrained 256-MB space as of today. To run a normal application inside the enclave, SGX leverages the memory encryption engine to handle the data swapping between untrusted memory and EPC. The memory encryption engine encrypts traffic between the processor package and main memory and verifies its integrity. The sophisticated hardware design of SGX makes its trusted computing base (TCB) smaller than that of the vanilla approach. The TCB of the SGX-based approach is the enclave and the hardware itself, while a traditional application includes an operating system and a virtual machine monitor inside its TCB. Fig. 4 shows the TCB area of both approaches. Programs running inside the enclave must be trustworthy to ensure that the secrets are not leaked. To achieve this, SGX leverages the attestation and sealing technology [151]. The instantiation of the enclave program can be attested at the very beginning of the deployment process, so that subsequent confidential information can be transported to the enclave securely.

IV. PRIVACY-PRESERVING NETWORK FUNCTIONS
To preserve the privacy of network communication, the clients should encrypt the packets before transmission. In the early years, network functions were performed on encrypted traffic by letting the middlebox mount a man-in-the-middle attack: the traffic is decrypted in the middle, and the middlebox then inspects the decrypted traffic [66]. Although this method is intuitive and effective, decryption in the middle of the communication violates end-to-end security, which leaks private information and brings security risks.
Network functions can be abstracted as inputs that include packet contents and predefined rules and outputs that include the actions for the packet. Most network functions require matching on the packets and rules, or computation on the header of the packets. Functions such as NAT also require modification of the header. In this section, we introduce state-of-the-art approaches that achieve network functions in a privacy-preserving way by using cryptographic tools, such as symmetric encryption, HE, and MPC, and trusted hardware, such as Intel SGX. We give an overview of the properties of the network function outsourcing schemes in Table 4.

A. Equality Match
Equality (pattern) matching is one of the most basic functions.Network functions, such as signature/watermark detection in DPI or exact IP matching in firewalls, require exact string match over a packet and predefined rules in the middlebox.Besides, matching can also be performed between packets, e.g., in-network caching for content delivery.Table 5 summarizes the main characteristics of the representative equality matching schemes.
1) Packet-Rule Matching: Generally, the rules, which can be regarded as sensitive keyword-action pairs in Snort [84], are tokenized, encrypted, and outsourced to the middlebox in the setup phase. Then, the traffic is also tokenized and encrypted by the sender. There are generally two types of packet parsing methods: delimiter-based segmentation used in [24] and window-based n-gram used in [25] and [27]. In some schemes, additional information, such as the offset of tokens, is attached to the tokens to fit more complex rules, such as multikeyword matching and domain matching. The middlebox performs the inspection on the encrypted rule database. Actions, such as accept or drop, will then be matched, and the middlebox will inform the gateway of the receiver of the result.
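The two parsing methods mentioned above differ only in how token boundaries are chosen. The sketch below contrasts them on a raw payload; the delimiter set, the window length, and the function names are illustrative assumptions rather than the exact choices made in [24], [25], or [27].

```python
import re

# Hypothetical delimiter set; real systems tune this to the protocol.
_DELIMS = re.compile(rb"[\s/?&=.:;,]+")

def delimiter_tokens(payload: bytes):
    """Delimiter-based segmentation: cut at punctuation and whitespace."""
    return [t for t in _DELIMS.split(payload) if t]

def window_tokens(payload: bytes, n: int = 8):
    """Window-based n-gram: one token per byte offset, each n bytes long."""
    return [payload[i:i + n] for i in range(len(payload) - n + 1)]

def window_tokens_with_offsets(payload: bytes, n: int = 8):
    """Same as window_tokens, but keeps offsets for multikeyword rules."""
    return [(i, payload[i:i + n]) for i in range(len(payload) - n + 1)]
```

Delimiter-based parsing yields far fewer tokens per packet, while the sliding window can match keywords that do not align with any delimiter, at the cost of roughly one token per payload byte.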
The main challenge of packet-rule matching is to encrypt the rules and packets while preserving the data association between the packets and rules.An intuitive idea is to encrypt the rule (watermark/signature) tokens and the packet tokens, respectively, and then compare the packet tokens with the rule tokens.In the following discussions, we introduce the OT-based schemes and searchable encryption-based schemes in detail.
a) Secure computation-based method: OT and GC can be used to exchange data privately. Sherry et al. [24] first introduced the GC to perform DPI on encrypted traffic without decrypting the packet in their system, BlindBox. As shown in Fig. 5, the traffic is first tokenized and then encrypted through a deterministic symmetric encryption scheme, such as AES. The middlebox needs to encrypt the rules in the same way without learning the secret key of the packet. To hide the secret key from the middlebox, BlindBox designs a GC [72] to encrypt the rules by AES. Then, the encrypted rule tokens are kept in a search tree that enables logarithmic lookups. For single keyword matching, e.g., document watermarking, the middlebox checks whether there is a match between the search token and the rules in the search tree. For multiple-keyword matching, additional information, such as absolute and relative offset, is sent together with the token to the middlebox to check if the offset is a match. However, BlindBox suffers from a long setup time and exposes the inspection rules to the middlebox.
To reduce the setup time in BlindBox [24], Lin et al. [152] replaced the AES GC with one-way hash functions and XOR functions to encrypt the rules and messages. During the setup phase, the sender randomly generates m pairs of strings (k0, k1) for every rule of m bits. Then, the middlebox uses the OT protocol [118] to obliviously get the random strings in the rule sets and encrypts the rule by XORing the strings. The packet is encrypted similarly. Different from BlindBox, the middlebox obliviously obtains the keys from the sender and encrypts the rules by itself, rather than through the GC. The encryption needs no interaction, and the shared keys do not reveal the content of the forwarded packet. However, the search tree cannot be used in the processing phase because the rule tokens cannot be prepared in advance. The matching procedure needs to compute the one-way hash and XOR function on the encrypted rules, rather than checking the equivalence of the rule token and the packet token.
A recent work, PrivDPI [32], also managed to lower the overhead caused by the one-time GC in BlindBox by providing a reusable encrypted rule generation method following the idea of a practical and simple OT protocol [154]. In PrivDPI, the encrypted packet tokens in a new session can be derived from the encrypted tokens in the last session by preserving a count table, which makes the encrypted ruleset in the middlebox reusable. However, according to Ning et al. [34], PrivDPI is vulnerable to brute-force attacks, where the middlebox can forge any encrypted rules by itself and then infer the content of the encrypted traffic. Ning et al. [34] then proposed an improved scheme called Pine, which is more efficient and secure than PrivDPI. Pine also supports updatable rule sets, i.e., additions to the rules after the preprocessing step.
b) Searchable encryption-based method: For virtual network functions that mainly focus on pattern matching and filtering, searchable encryption can be used to perform matching operations directly on encrypted traffic.Searchable encryption, especially the index-based SSE, is efficient to search for specific keywords in an encrypted rule index while preserving both the privacy of keywords and the content of the packet.
When checking whether the content of a packet includes a malicious signature, the middlebox needs to search for the malicious rule tokens in the payload tokens. Decryptable searchable encryption (DSE) [155] can be leveraged to solve this problem in a privacy-preserving way. DSE can detect whether a particular token is in a given ciphertext by using a bilinear map and XORing the results. However, once there is a match, the plaintext of the token (the keyword) would be revealed. BlindIDS [71] fixes this weakness by adding a secret key and makes sure that there is no leakage over the rules (or the keywords). During the detection, the middlebox compares every token in the encrypted traffic with each preuploaded rule token on a DSE-based construction. Unfortunately, this method suffers from considerable detection time due to the one-to-one comparison and the private-key setting. Nonetheless, compared to BlindBox [24], the above two designs do reduce the setup time by encrypting the patterns only once for all connections.
Index-based searchable encryption [114], [115], [143]- [145] is suitable and efficient for exact signature matching because of the sublinear search on the encrypted index.Yuan et al. [25] first used an index-based SE to realize the single keyword match on encrypted traffic.In their method, an admin server parses the rules as string-action pairs and encrypts them with trapdoors in a way that the trapdoors can be searched later.The encrypted rule index is built based on cuckoo hashing [156], [157].The pairs are stored in two hash tables, and the locations of the pairs in the hash tables serve as the trapdoors, which can be generated from the encrypted strings in the traffic.The payloads in the traffic are parsed into strings based on predefined principles.Then, the string is transformed into a random token using a PRF with a pregenerated secret key.When the middlebox receives both the encrypted traffic and tokens, it can search for the tokens on the encrypted rule index in the middlebox.Once a token matches an entry in the filter, the middlebox takes the result action on the encrypted traffic, e.g., dropping the packet.Note that the cryptographic primitives are all symmetric so that their method can be quite efficient.Nonetheless, the encrypted rule index is static, and only equality-match rules are supported.
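To illustrate why a cuckoo-hashing index gives constant-time lookups, the following sketch stores token-action pairs in two tables whose slot positions are derived from a keyed PRF. It is a plaintext-structure illustration under assumptions, not the encrypted index of [25]: entries are stored unencrypted, a failed insertion is simply reported instead of triggering a rehash, and the class and method names are hypothetical.

```python
import hashlib
import hmac

def prf(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()

class CuckooIndex:
    def __init__(self, key, size=64, max_kicks=32):
        self.key, self.size, self.max_kicks = key, size, max_kicks
        self.tables = [[None] * size, [None] * size]

    def _slot(self, t, token):
        # Slot of `token` in table t; each table uses a different PRF input.
        return int.from_bytes(prf(self.key, bytes([t]) + token)[:8], "big") % self.size

    def insert(self, token, action):
        entry, t = (token, action), 0
        for _ in range(self.max_kicks):
            i = self._slot(t, entry[0])
            # Place the entry and pick up whatever occupied the slot.
            entry, self.tables[t][i] = self.tables[t][i], entry
            if entry is None:
                return True
            t ^= 1  # move the evicted entry to the other table
        return False  # a real index would rehash here; the evicted entry is dropped

    def lookup(self, token):
        # At most two probes, one per table.
        for t in (0, 1):
            e = self.tables[t][self._slot(t, token)]
            if e is not None and e[0] == token:
                return e[1]
        return None
```

In the setting described above, `token` would be the PRF output of a rule string, so the gateway can recompute it from the traffic and the middlebox can probe the two candidate slots without seeing the string.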
In a dynamic network environment, an updatable rule index is essential for enterprises to upgrade their network functions, such as firewalls and DPI. Guo et al. [30] extended the broadcast encryption (BE) construction to support updates on the rule index, i.e., the rules for DPI can be added or deleted dynamically. However, updates may lead to more leakage than static searchable encryption schemes [148], [158]. The method of Guo et al. achieves forward privacy [159] (which prevents adversaries from inferring the keywords in newly added data with search tokens). Their scheme consists of two noncolluding servers: a rule server and a filter server. The rules are encrypted as a searchable index. During the inspection, the gateway parses the packet as a header and a message, then generates the search token, and delivers the token to the rule server to search for an encrypted message. If there is a match, the filter server will check the corresponding action and inform the gateway. For forward privacy, the entries for a string will be reencrypted under a fresh key when it appears in a newly added rule. The disadvantage of this work is that it considers only watermark-like rules; string-like rules (e.g., watermark fragments of different lengths) are not supported.
2) Packet-Packet Matching: In-network caching can be well leveraged to increase the efficiency of content delivery, especially for scalable media, such as images and videos. In-network caching is helpful to mitigate a large amount of video traffic. However, the cache-enabled routers are vulnerable to potential attacks and, thus, threaten the privacy of user data. Secure transport protocols, such as HTTPS, make it difficult to leverage in-network caching because encryption hides the redundancy in the traffic. Different from functions such as DPI, there is no specific rule in an in-network caching scheme, and only matching between packets is required. Generally, packets are processed and outsourced on middleboxes, such as routers, and then, they are compared with subsequent packets. RE, which was first introduced in [96], is one of the essential functions of efficient video delivery. The architecture of RE is illustrated in Fig. 6. The video chunks are cached in a router. When the client requests a certain video chunk, he/she can first check whether there is a preloaded chunk in the router. If there is a match, the client can directly download the chunk without having to request it from the video provider. Technically, the middlebox first generates fingerprints for incoming packets by tokenizing the traffic with a fixed-length sliding window and hashing each token, and then checks whether any of the fingerprints of a packet matches one in the fingerprint cache. If so, the corresponding packet is taken out of the packet cache to maximize the matching region. Otherwise, the fingerprints and the packet are put into the cache. Taking privacy into consideration, the client may be reluctant to expose the content of the requested video to the middlebox. Thus, it is crucial to design a secure RE method to protect privacy while preserving the search function.
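The plaintext RE procedure described above (sliding-window fingerprints, a fingerprint cache, and a packet cache) can be sketched as follows. The window length, the optional sampling mask, and the names `fingerprints` and `RECache` are assumptions; the step that extends a hit to the maximal matching region is omitted for brevity.

```python
import hashlib

def fingerprints(payload: bytes, window: int = 8, sample_mask: int = 0):
    """Map fingerprint -> first offset; sample_mask=0 keeps every fingerprint."""
    fps = {}
    for i in range(len(payload) - window + 1):
        h = int.from_bytes(hashlib.sha256(payload[i:i + window]).digest()[:8], "big")
        if h & sample_mask == 0:  # value-based sampling of fingerprints
            fps.setdefault(h, i)
    return fps

class RECache:
    def __init__(self):
        self.fp_cache = {}      # fingerprint -> (packet id, offset in cached packet)
        self.packet_cache = []  # cached payloads, indexed by packet id

    def process(self, payload: bytes):
        """Return (packet id, cached offset, new offset) on a hit, else None."""
        fps = fingerprints(payload)
        for h, off in fps.items():
            if h in self.fp_cache:
                pid, cached_off = self.fp_cache[h]
                return pid, cached_off, off
        # Miss: remember the packet and all of its fingerprints.
        pid = len(self.packet_cache)
        self.packet_cache.append(payload)
        for h, off in fps.items():
            self.fp_cache[h] = (pid, off)
        return None
```

A secure RE scheme has to provide the same hit-or-miss decision while the middlebox sees only encrypted fingerprints and encrypted payloads.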
In the early days, Misra et al. [52] utilized BE to control access to the encrypted video chunks in nearby routers for different clients. Wu et al. [53] protected the confidentiality of the scalable video coding videos by attribute-based encryption in a content centralized network. These are the earliest attempts to achieve secure video delivery. However, the work of Misra et al. is not designed for general contexts but only for its specific content centralized network, and the work of Wu et al. does not leverage the efficient in-network cache.
Yuan et al. [54] first combined searchable encryption with efficient video delivery.They proposed a secure RE method to perform high-efficiency video delivery through encrypted in-network caching without revealing sensitive information about the videos.To ideally make use of the in-network caching, video chunks should be encrypted in a way that the routers can locate and access them easily without learning the content of the video.An encrypted fingerprint index is generated in advance by the application server, e.g., YouTube, and stored in a middle semihonest request handler.The encrypted index is built from a cuckoo hashing-like method.Each video chunk has a pseudonym that will be inserted in the hash tables.The address in the hash tables is calculated by the video fingerprint and several PRFs, which can be used as the search token for the handler to locate the pseudonym in the encrypted index.When the handler succeeds in locating the related pseudonyms, it looks up the addressing table for the pseudonyms and sends the pseudonyms along with the user addresses to targeted routers.Then, the routers can forward the requested chunks to the users.
Fan et al. [56] proposed REET, which can support both intrauser and interuser REs over encrypted network traffic.As shown in Fig. 7, the sender encrypts fingerprints and payloads with two-layer encryption.During the first encryption, the sender uses AES to encrypt every chunk of the fingerprints.As for the second layer, the sender continues random encryption on the payload chunks.This framework uses the public-key traitor tracing scheme proposed in [160] and extends the present BE algorithm to deliver the content only to legitimate users with high-level security.Besides, they managed to cache the content at a nearby router such that valid users can receive the content even when the providers are offline.
Near-duplicate detection (NDD) is a general data caching function that can help to reuse near-duplicate data and alleviate network traffic congestion.Cui et al. [55], [57] proposed a secure NDD function by resorting to multikey searchable encryption (MKSE) [161] to enable queries on encrypted content uploaded from multiple users under different secret keys.Using MKSE, the service provider can transform the user's query encrypted with her own secret key into the form of different content providers' keys.Besides, their method adopts the locality-sensitive hashing (LSH) functions [162] to label the data items, which can be regarded as the keywords in MKSE.Because LSH may bring false positives, a secure two-party computation protocol based on Yao's GC is implemented to determine whether the difference between the candidate fingerprints and the query item is small enough (i.e., the distance between the hashes generated from LSH is under an expected threshold).However, due to the one-time security of the GC, the GC needs to be refreshed every time, thus leading to a long detection delay.

B. Function-Enriched Matching
While the equality match is adequate for certain applications, it is essential to support more enriched functions, such as range match [26], [37], [40], [41], [163], [164], substring match [27]- [29], [33], wildcard match [27], [36], [38], [39], and regular expression match [28]. The rules that require enriched matching are much more complicated than single keyword matching. A simple solution is to decrypt the suspicious traffic and evaluate the regular expression on the plaintext, as introduced in BlindBox [24]. However, this solution discloses the content of the packet to the middlebox. In this section, we will introduce several methods that succeed in performing the function-enriched matching on encrypted packets without leaking the private data. Table 6 illustrates the main characteristics of the approaches to enriched matching.
1) Range Matching: Range match is a common function in firewalls, NAT, and DPI, which checks whether the value in a field matches a predefined range. For example, a firewall may have a policy that requires dropping the packets whose port number is between 0 and 2048.
Khakpour and Liu [163] first attempted to outsource the firewall to the cloud privately.They proposed Ladon, a design leveraging bloom filter [165] as an efficient tool to anonymize the firewall rules.The bloom filter is a binary vector data structure that can be used to detect whether an element is a member of a set or not.The bloom filter maps data to bit vectors by multiple hash functions, with the corresponding position being 1. Ladon is built on a new data structure named the Bloom Filter Firewall Decision Diagram (BFFDD), which is a range-based decision tree generated from the general firewall decision diagrams (FDDs) [166].Fig. 8 shows the process of BFFDD.The key idea of Ladon is to use a bloom filter to represent the edge sets in an FDD constructed from a given ACL such that the ISPs could only access anonymized firewall policies.Another work [164], however, points out that, although the cloud service providers cannot learn the exact information of the original firewall policies, they are still aware of the final decision of a certain packet.In this case, the cloud service provider can know whether a packet is good or not in a period anyway.Instead of building a set of BFFDDs to eliminate the ambiguities as Ladon did, Kurek et al. [164] decided to use a single BFFDD and get multiple decisions on purpose.Afterward, the packets resulting in the same decisions are processed as they are supposed to be, while packets resulting in multiple decisions need to be filtered additionally in the private cloud.
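The Bloom filter that Ladon builds on can be sketched in a few lines. The parameters (m bits, k hash functions) and the derivation of the k hash functions from SHA-256 with a one-byte index are assumptions for illustration; they are not the settings used in [163].

```python
import hashlib

class BloomFilter:
    def __init__(self, m: int = 1024, k: int = 4):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit position, for clarity

    def _positions(self, item: bytes):
        # k hash functions derived from SHA-256 with a distinct prefix byte.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: bytes):
        # May return a false positive, never a false negative.
        return all(self.bits[p] for p in self._positions(item))
```

Because only hashed positions are stored, a party holding the bit vector can test membership without being handed the inserted values, which is what allows the edge sets of an FDD to be anonymized; the price is a tunable false-positive rate.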
Embark [26] is the first system designed to support outsourcing a wide range of network functions. Compared with BlindBox, Embark enables more middleboxes, such as firewalls, NAT, and HTTP proxies, to be outsourced with the privacy protected using a combination of three cryptographic tools, namely, traditional AES, KeywordMatch from the BlindBox construction in [24], and their innovative scheme PrefixMatch. PrefixMatch allows an encrypted value to be compared with the encrypted endpoints of a certain range using the operators ≤ and ≥ so that the middlebox can decide if the encrypted value is situated in the encrypted range. Specifically, when the prefixes are encrypted, the endpoints of the prefixes or ranges $[s_i, e_i]$ are arranged in increasing order, and the endpoints divide $P_0 = [0, 2^{len} - 1]$, where $len$ is the size of the endpoints, into several nonoverlapping intervals $I_i$. The intervals belonging to the same set of prefixes are assigned one encrypted prefix, which is a random value with the same size as the prefixes. This encrypted prefix of the interval $I_i$ is also assigned to each value $v \in I_i$, and the suffix of $v$ is randomly chosen. From the perspective of performance, it is discussed in [26] that the long-lived connection between the gateway and the in-cloud middlebox saves handshake time. In addition, compared with existing schemes based on OPE [137], [167], Embark achieves significantly better performance with a higher security level. However, we should not underestimate the prohibitive cost of rule updates in Embark, as it is possible that the new prefixes or ranges overlap with the old ones.
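The interval-partitioning step of PrefixMatch can be sketched as follows. This is a simplified illustration under assumptions, not Embark's actual algorithm: the middlebox here tests set membership of the encrypted prefix instead of comparing with ≤ and ≥, random prefixes are 8 bytes, and the class name `PrefixMatchSketch` and its methods are hypothetical.

```python
import bisect
import secrets

class PrefixMatchSketch:
    def __init__(self, ranges, length=16):
        """ranges: list of inclusive (s_i, e_i) pairs within [0, 2**length - 1]."""
        top = (1 << length) - 1
        # Endpoints cut [0, top] into nonoverlapping intervals.
        cuts = sorted({0, top + 1} | {s for s, _ in ranges} | {e + 1 for _, e in ranges})
        self.starts = cuts[:-1]
        prefix_of_set, self.prefixes = {}, []
        for lo in self.starts:
            # The set of ranges covering an interval is constant inside it.
            covering = frozenset(i for i, (s, e) in enumerate(ranges) if s <= lo <= e)
            if covering not in prefix_of_set:
                prefix_of_set[covering] = secrets.token_bytes(8)
            self.prefixes.append(prefix_of_set[covering])
        # Encrypted rule i: the prefixes of all intervals that range i covers.
        self.rule_prefixes = [{p for cov, p in prefix_of_set.items() if i in cov}
                              for i in range(len(ranges))]

    def encrypt_value(self, v):
        """Gateway side: interval prefix followed by a random suffix."""
        j = bisect.bisect_right(self.starts, v) - 1
        return self.prefixes[j] + secrets.token_bytes(8)

    def matching_rules(self, enc):
        """Middlebox side: decide range membership from the prefix alone."""
        return {i for i, ps in enumerate(self.rule_prefixes) if enc[:8] in ps}
```

The sketch also hints at why updates are costly: adding a range introduces new cut points, so intervals and their encrypted prefixes may have to be reassigned.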
However, the prefix match in Embark becomes ineffective if the filtering rules cannot be represented in the form of a prefix. To tackle this issue, Guo et al. [40], [41] proposed an efficient ORE scheme to realize more general range matching with limited leakage. In their schemes, the header strings and rules are decoded as data values and encrypted into blocks with their order information preserved. Guo et al. [40], [41] also protect the location information of the blocks by randomly permuting them with searchable encryption. The extended version [41] improves the efficiency of range matching for contiguous rules. The key idea is to formulate range matching as a fuzzy search problem. The contiguous range is represented as wildcards and encrypted by fuzzy searchable encryption (FSE) [168]. The wildcard-based design needs only one round of fuzzy search, which greatly improves the efficiency.
2) Substring Matching: Some rules have particular constructions, where multiple segments of unequal lengths in different locations in a packet must be inspected [84]. For such rules, the construction information also needs to be securely outsourced. One idea is to outsource the construction information on an encrypted index, which is similar to equality matching. SPABox [28] supports many keyword-based functions over encrypted data, including single keyword matching, keyword sequence matching (i.e., rule matching), and regular expression matching. The substring matching uses a hierarchical hash table to build the encrypted rule index. The first level stores the first five bytes of the first keyword in every rule. The value contains a pointer to another hash table whose entries are related to all possible following keyword tokens. If there is a match through all the tables, the packet is related to some rules. In contrast to [24], [25], and [30], the matching process is linear in the rule size, which may be inefficient when dealing with large rule sets. For more general regular expression matching, SPABox adopts the technique of garbled-DFA [169] to do the regular expression matching on the receiver side without letting the receiver learn the rules.
A similar rule outsourcing method is CloudDPI [27], which, unlike [25], uses the reversible sketch instead of cuckoo hashing to avoid insertion failures. CloudDPI parses a rule into several fragments by the wildcards defined in ClamAV [170] and uses a sliding window of a specific size to segment the fragments into tokens. The tokens are then stored in the reversible sketch. Before the insertion, the hash bucket is checked through a Bloom filter to ensure that the bucket is available. Each hashed token is associated with a pointer list of related signature fragments, each of which is stored together with pointers to a rule, and the previous and next fragments. The pointer lists are uploaded to the middlebox for fragment checking and rule checking. Thus, the relationship between the rules, fragments, and hashed tokens is revealed. Using the pointer lists, CloudDPI supports complex rule types, such as substring matching and wildcards. However, CloudDPI reveals too much information about the rules, e.g., the number and repetition of fragments in a rule, and the co-occurrence of the same fragment in different rules. As an extended version of CloudDPI, Li et al. [33] further extended the well-known Aho-Corasick (AC) pattern matching algorithm [171], [172] to operate on encrypted data. They replace the plaintext character, which is the input to the goto function, with the hashed token, and then, every ending state is associated with a pointer to the related signature fragment list. Because of the adoption of a finite-state pattern matching machine, the inspection throughput is independent of the size of the ruleset. However, the storage in the middlebox becomes larger as the ruleset grows. Similar to CloudDPI, it also suffers from the leakage of the rule structure.
To solve the pattern matching problem where the search tokens may require arbitrary keywords of arbitrary lengths, Desmoulins et al. [29] designed a new searchable encryption mechanism, named Searchable Encryption with Shiftable Trapdoors (SEST). SEST is a public-key searchable encryption scheme that deals with substring search. Given a ciphertext and an appropriate search token, SEST can return whether the corresponding substring is in the corresponding plaintext and the positions where the pattern appears. The scheme is constructed from bilinear groups that consist of three cyclic groups and a bilinear map. The plaintext string is encrypted character by character. Every element in a string has a corresponding secret key and a public key. Based on the properties of bilinear groups, the encrypted string and the keyword (in other words, the trapdoor) retain a relationship that can be used to check the occurrence of the pattern. One of the benefits of SEST is that it can generate arbitrary trapdoors after the encryption of the original string. The search tokens can be universal because they can be generated from arbitrary keywords and used for arbitrary ciphertexts. However, the size of the secret keys and public keys is linear in the plaintext size, and asymmetric cryptographic primitives are not as efficient as the symmetric ones in [25] and [30]. To address this problem, Lai et al. [35] proposed a practical matching protocol, SHVE+, based on symmetric hidden vector encryption (SHVE) [173]. SHVE+ encodes the encrypted messages into query trapdoors of SHVE and lets the middlebox search the trapdoors on the precomputed encrypted rulesets. SHVE+ achieves better inspection performance than SEST [29] and supports a wider range of matching functions than [25], [30].
3) Wildcard Matching: Firewall policies contain a large number of matching rules with wildcards. A naive way is to label the positions of the wildcards explicitly and skip the labeled positions when matching [27], which exposes the location of the wildcards. Based on the multilinear map, Shi et al. [36] built a framework in SDN, named Secure framework for Outsourcing FirewAll (SOFA). In SOFA, middleboxes are obfuscated by the cryptographic multilinear map before outsourcing so that the policies remain confidential from the service providers. There are two basic phases in SOFA: the obfuscation phase and the execution phase. In the obfuscation phase, the local control plane sets up the parameters of the multilinear map to construct an obfuscator. In the execution phase, the cloud service provider filters inbound and outbound network traffic to execute the corresponding functionality, while the confidentiality of the detailed configurations is preserved. Sheng et al. [38] and Wei et al. [39] leverage HE to encrypt the firewall policies and support wildcard matching. In their designs, the rules are abstracted as r = (v, W, A), where v denotes the bitwise value of the rule, W is the set of wildcard positions, and A is the action.

Fig. 9. Example of the wildcard matching scheme [38] that applies PHE to bit-level representation for the rules and packets with respect to wildcards.
For every bit in v, the client who generates the rules needs to precompute all possible results for the corresponding value in the packet and calculate trapdoors for values 0 and 1. When obfuscating the firewall, the client chooses the same trapdoor for 0 and 1 for every bit in the wildcard set W. An illustrative example is shown in Fig. 9. To hide the action, the schemes split the cloud middlebox into two: one for packet matching and the other for action processing. The packet is transmitted to the matching middlebox (which is loaded with obfuscated policies) in plaintext. If there is a match, the encrypted action is sent to the action middlebox for decryption. The main disadvantage of the above schemes is that the content of the packet is disclosed. Furthermore, the two clouds must not collude; otherwise, the adversary can observe the packets, matching results, and actions to infer the policies.
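As a point of reference, the plaintext semantics of a rule r = (v, W, A) that the encrypted schemes have to reproduce can be sketched as follows. The sketch contains no encryption and is not the construction of [38] or [39]; the `Rule` class, the 8-bit width, and the convention that bit position 0 is the least significant bit are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    value: int            # v: bitwise value of the rule
    wildcards: frozenset  # W: wildcard bit positions (0 = least significant bit)
    action: str           # A: action taken on a match
    width: int = 8        # assumed field width in bits


def matches(rule: Rule, packet_bits: int) -> bool:
    """Return True if the packet equals v on every bit position outside W."""
    care_mask = sum(1 << i for i in range(rule.width) if i not in rule.wildcards)
    return ((packet_bits ^ rule.value) & care_mask) == 0


# The four low-order bits are wildcards, so only the high nibble is compared.
rule = Rule(value=0b10100000, wildcards=frozenset({0, 1, 2, 3}), action="drop")
print(matches(rule, 0b10101111))  # high nibble equal, low nibble ignored
print(matches(rule, 0b10110000))  # bit 4 differs and is not a wildcard
```

Choosing the same trapdoor for 0 and 1 at a wildcard position has the same effect as clearing that position in `care_mask`: the packet bit there cannot influence the outcome.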

C. Routing
Unlike network functions that are mainly based on rule matching (such as IDS), secure routing involves computation on the outsourced rules and encrypted packets. Here, we specifically discuss the progress on secure routing, i.e., realizing the Border Gateway Protocol (BGP) on encrypted data. Although the widely used BGP can manage the available routes among different groups, i.e., corporations and administrations, it has disadvantages in reliability [174], efficiency [175], and privacy [176]. For example, information such as routing policies can be inferred from BGP settings [177]. Internet eXchange Points (IXPs) provide a centralized route server (RS) service for ranking, selecting, and distributing BGP routes [178]. However, IXP members may be reluctant to explicitly forward their private routes to an RS. Ideally, both the routing results and the routing policies should be kept from the IXPs. Generally, route policies include the import policy, the next-hop policy (which includes the local preference and the shortest path computation), and the export policy. The challenge lies in how to protect both the routing policies (e.g., the local route preference) and the communication information (e.g., topology, destination, and distance information) while still obtaining the best route. Different from network functions such as DPI and firewalls, the routing function requires computation on the packets and policies rather than simple matching. Thus, most schemes are based on SMPC and HE. Table 7 summarizes the characteristics of several representative schemes of privacy-preserving routing.
1) Secure Computation-Based Methods: In the very early years, Gupta et al. [42] first introduced SMPC to solve the secure interdomain routing problem. They elaborate on the advantages and challenges of the problem and explore a common cryptographic scheme to implement it. This work presents a heuristic idea. However, it does not meet practical needs. Asharov et al. [45] took a more in-depth look at secure interdomain routing. They use a two-party secure Boolean circuit protocol, Goldreich-Micali-Wigderson (GMW) [179], to calculate the routes. The GMW protocol can efficiently evaluate the same subcircuits in parallel. Asharov et al. [45] converted two interdomain BGP routing algorithms into MPC-based ones, which support two routing policies: the neighbor relation-based policy [180] and the neighbor preference-based policy [42]. The routing choices are calculated by precomputed AND and MUX gates implemented on GMW. Different from the above schemes, instead of designing a substitute solution for BGP, SIXPACK [46] focuses more on privacy issues at IXPs. In SIXPACK, there are two noncolluding RSs: one runs on the IXP, and the other is outside the IXP but located close to it. The application of MPC ensures that the RSs do not learn anything about the input and the output of the route computation, e.g., the route exportation and route selection. The routing procedure can be divided into three phases. First, taking the route policies and BGP routes as input, the RSs compute all exportable routes using the ABY framework [181] and forward the resulting routes to the members. Then, the members locally rank the routes and forward the ranking values to the RSs. Finally, the RSs select the best route according to the next-hop ranking and their performance-related information using a MaxIdx tree circuit [182]. SIXPACK relieves the computation burden by performing the most complex ranking computation on plaintext at the IXP member instead of in the RSs, where more rounds are required.
MPC-based schemes can handle various functions in routing (e.g., shortest path computation) while preserving the privacy of the routes. However, MPC-based schemes incur complex setup and bandwidth overhead due to the large circuit size and the usage of OT, which could hurt the scalability of the routing designs.

2) Homomorphic Encryption-Based Methods:
To deal with the shortest path computation, Henecka and Roughan [43] proposed the Secure Transitive Routing Information Protocol (STRIP), a privacy-preserving interdomain routing protocol, which allows participants (routers) to find the shortest path in a network without learning the topology information. To conceal the path lengths, the authors use additive HE to compute the weight sum in the shortest path computation approach, path-vector protocols (PVPs). When the origin router receives announcements from its neighbors, it sends a probe message, including essential information, to all the neighbors who have announced a route in order to probe the path. Along the path, every intermediate router adds its own path length (encrypted by Paillier [116], an HE scheme) to the original one. When the destination router has received all the probe messages arriving within a limited time, it decrypts the sums of the lengths using the secret key to find the smallest one. Then, it sends a response along the shortest path until the origin router receives it. In this scheme, the intermediate routers only know their last and next hops and the lengths from themselves to the next router. As for the destination router, it can only learn the last hop and the distance of the paths. In this way, topology information is protected. Besides, the distance information is also private because no one knows the distance of the paths except the destination router. Though STRIP ensures strong privacy of the routes and routing rules, it only realizes limited routing protocols, that is, it cannot cover as many routing functions as BGP does. Moreover, the HE system, Paillier, introduces 20% extra overhead over PVP, which heavily impacts the efficiency of STRIP.
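The additive aggregation of path lengths can be illustrated with a toy Paillier sketch. It is not the STRIP implementation: the primes are far too small to be secure, the probe and response messages are reduced to function calls, and all function names and the example paths are hypothetical. The sketch assumes Python 3.8 or later for the modular inverse via `pow(x, -1, n)`.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy Paillier key generation with g = n + 1 (p and q are distinct primes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because L(g^lam mod n^2) = lam mod n for g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m under public key n with fresh randomness r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n: int, secret, c: int) -> int:
    lam, mu = secret
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def probe(n: int, link_lengths) -> int:
    """Each intermediate router adds its encrypted link length to the running sum."""
    total = encrypt(n, 0)
    for length in link_lengths:
        total = add(n, total, encrypt(n, length))
    return total


# The destination router holds the secret key; routers on the path only see n.
n, secret = keygen(1009, 1013)  # toy primes, for illustration only
paths = {"via_A": [3, 4, 5], "via_B": [2, 2, 10]}  # hypothetical link lengths
encrypted = {name: probe(n, lengths) for name, lengths in paths.items()}
lengths = {name: decrypt(n, secret, c) for name, c in encrypted.items()}
best = min(lengths, key=lengths.get)
print(best, lengths[best])
```

Only the holder of the secret key learns the total lengths, which mirrors the statement above that no one except the destination router knows the distance of the paths.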
While the above schemes focus on interdomain routing, Chen et al. [44] proposed PYCRO, a secure cross-domain routing design that supports policy-compliant shortest path computing. To compute the shortest path, PYCRO constructs an equivalent cost graph for the significant nodes and uses the cost graph to build a privacy-preserving shortest path tree whose root is the source switch. All the domain controllers use additive HE to encrypt the intradomain path lengths in their own domain and the interdomain link lengths from their own domain. Then, they send them to the source domain controller to perform the subsequent computation. Note that the operations above only involve addition and rerandomization based on the "secure-if" operation proposed by the authors (which allows choosing between two options based on a condition in a privacy-preserving way). Finally, the tree is leveraged to establish the shortest path. In this protocol, only the distances from the source node to every significant node and the parent node are leaked. Similar to STRIP, PYCRO also suffers from a long delay. The extended version of PYCRO [47] improves the efficiency by performing a one-time offline preprocessing. In this way, the execution time is reduced to 20 ms, and the computation cost is significantly reduced. The communication cost is also reduced since the online part shrinks to 1 kB.

D. Stateful Matching and Packet Modification
In this section, we further introduce more advanced functions, e.g., stateful matching and packet modification.

1) Stateful Packet Inspection:
Stateful packet inspection, e.g., a stateful firewall, is an advanced function that inspects traffic according to dynamic network states and can thus detect a wider range of illegal packets.
Melis et al. [37] utilized public-key encryption with keyword search (PEKS) to handle stateful network functions. The state table describing the traffic context can be regarded as a set of dynamic rule-action pairs, where the rules contain additional virtual fields, including "timeout" (which records the valid time of the packets) and "state" (e.g., newly arriving or established). The client maintains the dynamic encrypted state table with PEKS as the network state changes. To keep the cloud from learning the packets and rules, the client (rule generator) computes the trapdoors for the necessary fields, shuffles them, and encrypts the state and action. Upon receiving the trapdoors, the cloud creates an entry in the encrypted state table and returns the entry identifier to the client. The client can then decrypt the packet and check the ACK and FIN flags. Based on the changes in the flags, the client tells the cloud middlebox to update or delete the state table entry.
PEKS is based on asymmetric cryptographic primitives, which do not meet practical efficiency requirements. Trusted hardware is a more efficient tool and can achieve general functions. However, considerable effort is needed when dealing with stateful processing. SGX-Box [15] integrates mOS [88], a networking stack that monitors stateful flows, inside the system to support stateful flow-level processing. An mOS built-in event is triggered every time the flow changes. SGX-Box thus gains the ability to handle stateful processing. LightBox [20] stores the large volume of state encrypted on disk and develops a special data structure to index and search for the needed flow table entries on disk while keeping the most frequently required entries available inside the EPC (a protected memory region inside the CPU). With this performance improvement, LightBox can perform stateful processing at near-native speed.
Note that both SGX-Box [15] and LightBox [20] rely on trusted hardware, in particular, Intel SGX, to achieve secure network functions. As such, they are capable of supporting arbitrary network functions as in the plaintext domain, as discussed in Section IV-E.
2) Packet Modification: Network functions, such as NAT, require modification of the packet. Intuitively, it is hard to modify an encrypted packet. To address this issue, Splitbox [48] realizes modification of the content by splitting the packets into many parts and forwarding them to different middleboxes, as shown in Fig. 10. The entry middlebox A encrypts the traffic, splits the packet into t pieces, and forwards them to a distributed set of middleboxes. The distributed middleboxes B(t) perform the matching and action process collaboratively through a t-out-of-t secret sharing scheme. The match rule is preencrypted by a trusted middlebox C (the client's middlebox) and forwarded to A, so the matching pattern is hidden. The ruleset is described as a tree, where each edge denotes a matching function $m : \{0, 1\}^n \to \{0, 1\}$, and each node denotes an action function $a : \{0, 1\}^n \to \{0, 1\}^n$. The opposite result of the matching function is denoted as $\bar{m}$. C also splits the actions using the secret-sharing method and forwards the pieces to B(t). Thus, the action (how to modify the packet) is hidden. During the inspection, each middlebox B(t) traverses the policy tree by calculating each matching result (the edge). If m(x) is 1, then traverse to the left node; otherwise, traverse to the right node to perform x = a(x). Finally, the last action function outputs the resulting packet x. The results of the middleboxes are merged in middlebox C. Splitbox can support network function chaining due to the policy tree. However, Splitbox does not tackle the issue of packet privacy protection since the entry middlebox can directly obtain the plaintext of the traffic.
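The t-out-of-t splitting step can be illustrated with a minimal sketch. XOR-based sharing is used here as one common way to instantiate t-out-of-t secret sharing and is an assumption, not necessarily the construction used by Splitbox; the collaborative matching and action evaluation on the shares is not modeled, and the function names are hypothetical.

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(packet: bytes, t: int) -> List[bytes]:
    """Split a packet into t shares; all t shares are needed to recover it."""
    if t < 1:
        raise ValueError("t must be at least 1")
    shares = [secrets.token_bytes(len(packet)) for _ in range(t - 1)]
    # The last share is the packet XORed with all random shares.
    shares.append(reduce(xor_bytes, shares, packet))
    return shares


def combine(shares: List[bytes]) -> bytes:
    """Recover the packet by XORing all t shares together."""
    return reduce(xor_bytes, shares)


packet = b"GET /index.html"
shares = split(packet, 3)        # one share per distributed middlebox
print(combine(shares) == packet)
```

Any subset of fewer than t shares is uniformly random and therefore reveals nothing about the packet, which is why a single distributed middlebox cannot read the traffic on its own.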

E. Generic Network Functions
Generic network functions refer to the design of the middlebox that can handle arbitrary network functions. General privacy-preserving computation tools include HE and trusted hardware, such as Intel SGX. Here, we introduce existing designs based on FHE and trusted hardware.
1) Homomorphic Encryption-Based Method: Melis et al. [37] put forward ways to address the privacy issue of general network functions. The authors abstract network functions (e.g., firewall, LB, carrier-grade NAT, IDS, and simple DPI) as a function ψ(x), where the input x represents the original packet, and the output vector ψ(x) represents the packet processed by the function ψ. ψ is split into a pair of functions (m, a), where m(x) denotes the match process and a(x) denotes the action part. The goal of the network function is that, when the result of m(x) is valid, we can obtain the final result of the network function from a(x). The above ψ can be represented with the formula ψ(x) = a(x) if m(x) = 1, and ψ(x) = x otherwise. Based on the above representation of network functions, Melis et al. [37] proposed a general private NFV (PNFV) system on FHE, with the goal of protecting the inspection rules and results. By the nature of FHE, all the computations can be performed on ciphertexts without revealing the content of the rules and results. However, in the design of PNFV, the cloud middlebox is assumed to obtain packets in plaintext to lighten the computation on the client side. Hence, the packets are not protected in this work.
2) Trusted Hardware-Based Method: Trusted hardware enables hardware support for applications to run securely. Migrating network functions toward trusted hardware is natural, considering the efficiency and security of TEEs. Among the various TEE designs, Intel SGX provides a state-of-the-art choice, which includes a smaller TCB and better efficiency [119]. However, the limitation lies in the relatively small enclave space. Applications running in the secure enclave suffer from a performance penalty whenever they exceed the space limitation.
The common workflow of a network function on SGX is depicted in Fig. 11. The service provider deploys the network function application into the enclave on the remote service vendor host [119]. After deploying a basic application (which is open to the public and can be verified by everyone), the service provider requests an attestation from the targeted enclave. The enclave then generates a report of its metadata and sends it to the Intel Quoting Enclave. The quoting enclave then verifies the report with its report key. After obtaining the quote, the service provider verifies the legitimacy of the quote and establishes a secret channel with the enclave. Through this channel, the service provider secretly sends the configuration and secret information.
Table 8 summarizes the main characteristics of several representative SGX-based designs. Kim et al. [13] took the first step toward leveraging a hardware-assisted approach to solve the problem of efficiency. SGX is used to verify the promise while keeping its privacy intact. Kim et al. [13] also demonstrated the usability of SGX for in-network functionalities on TLS sessions. They break a TLS session into two sessions. Each endpoint exchanges a unique session key with the middlebox and, thus, makes the middlebox "the Man in the Middle." Both key exchanges are completed inside the enclave, and the keys are stored in a special memory region, the EPC. Every packet is decrypted inside the enclave. Thus, arbitrary functions can be enforced on the payload of the traffic. All packets are then encrypted again and sent back to the network. Fig. 12 shows the general system model of an SGX-based scheme.
Secure as it is, the man-in-the-middle approach breaks the end-to-end encryption, which leads to a considerable modification to existing utilities. PRI [14], SGX-Box [15], and TrustedClick [16] establish a secure channel between a single endpoint and the middlebox, share the session key through the secure channel with the other endpoint, and, thus, ensure the end-to-end policy. EVE [23] provides programmer-friendly Rust APIs, which makes it flexible to set the client's own strategies. These approaches follow the privacy-preserving manner and simplify the work needed to modify TLS. A very recent work, Phoenix [22], explored the possibility of achieving secure CDNs from a new perspective, i.e., protecting session keys. Phoenix is the first keyless CDN that protects both sensitive key materials and web content. Phoenix is built on a novel design called conclaves, containers of enclaves, which enables Phoenix to run on multiple processes and achieve scalability. It also supports web-application firewalls and multitenancy, with only minor efforts to modify existing web architectures.

Table 8 Comparison Among Trusted Hardware-Based Approaches
Aside from payloads, many network functions are performed on the header. Modifications have to be made to enable hardware-assisted designs to support general network functions. Take the routing function as an example. TrustedClick [16] and EndBox [111] leverage the Click modular router to make header reading and modification possible, where Click is a tool containing network elements that perform various tasks, such as IP tunneling configuration and Ethernet switching. TrustedClick integrates the trusted hardware into Click to support arbitrary NFV applications. Protecting the sensitive information of the header and metadata is essential. The work of [18] adds a security extension to the traditional IP and MAC protocols, providing network and link layer end-to-end security. SafeBricks [19] leverages an IPSec tunnel to protect the header. The work of Mastorakis et al. [18] and SafeBricks [19] use a similar approach to protect the forwarding IP and MAC addresses. However, they both leave some metadata, such as packet size, count, and timestamps, unprotected, which leads to potential exploits. LightBox [20] segments the flow into indistinguishable parts and encapsulates them into a secure tunnel, thus eliminating the possible metadata leakage.
Secure NFV chaining lets SGX-based middleboxes achieve better performance and provide rich functions. Trach et al. [17] and Mastorakis et al. [18] leverage the Intel Data Plane Development Kit (DPDK) [183] to establish a secure channel between middlebox instances. LightBox [20] designed etap (enclave tap, a virtual network device similar to tun/tap) to enable NFV multithreading while tracking the flow states without data races. Though different network function instances run independently on each socket, they share huge memory pages in untrusted memory. These designs adopt packet buffers stored on the shared memory pages. Each network function instance gets a new packet from the receiving (Rx) rings and writes the packet back to the transmitting (Tx) rings. An Rx ring can be chained with Tx rings if the network functions are chained together. SafeBricks [19] achieves NFV chaining from a different perspective. SafeBricks utilizes the Rust programming language to isolate different network functions. Rust, with a reliable compiler, makes it impossible for irrelevant code to access private data, thus isolating different network functions inside a single enclave instance. The approach reduces extra overhead by eliminating unnecessary processing, since information is exchanged inside a single enclave.
Trusted hardware-based methods preserve security and privacy while maintaining decent performance compared to software-based methods. However, due to the limitations of SGX, these approaches are vulnerable to side-channel attacks and may also suffer from the complicated implementation of existing network architectures. Specifically, SGX-based methods may require modification of the common network protocols to fit inside the enclave, which leads to additional deployment workload.

F. Verifiable Network Functions
When the server is malicious, the middlebox may not always follow the protocols faithfully, for malicious purposes such as saving computing costs or even stealing more private information. Fayazbakhsh et al. [59] first considered the problem conceptually and proposed a verifiable network function outsourcing design, vNFO. vNFO mainly consists of two parts: a trusted shim running inside the TPM of the cloud and a trusted central logging entity (CLE). The trusted shim samples the inbound and outbound traffic and generates reports about the traffic. The CLE instruments each shim instance to work properly, collects the reports, and sends them back to the consumer. As a conceptual design, vNFO does not intend to meet practical efficiency requirements.
In the literature, there are both software and hardware methods to guarantee the integrity of network functions. The trusted hardware-based methods naturally support verification with the help of attestation and sealing technology. For example, the work of Kim et al. [13] runs the traditional verification program in the secure enclave. Yuan et al. [60], [61] utilized lightweight cryptographic primitives, such as hashing and PRFs, to assure the correctness of the inspection results and, thus, improve the likelihood that the middlebox captures the malicious packets. They proposed a ringer-based construction, which is a cryptographic sampling method to probabilistically check the correctness [184]. Ren et al. [63] designed a two-layer architecture with two noncolluding servers. In their design, a large volume of packets is first filtered by a Bloom filter, and the middlebox verifies the tokens based on an efficient no-dictionary verifiable SSE scheme [185]. The above works are practical due to the use of lightweight primitives. However, they only support verification of pattern matching.
As for more advanced middleboxes, Zhang et al. [50], [51] considered outsourced virtualized service function chaining (vSFC) and proposed the first scheme that could verify the correctness of the path traversal in vSFC hop by hop. This work focuses on path verification, and the correctness of the execution on middleboxes is not considered. A recent work [62] further explored the verification of stateful middleboxes. To deal with dynamic states, it adopts stateful sampling to capture the packets of the same state and then replays the packet samples locally. It is worth noting that existing works have not covered all the middleboxes, e.g., middleboxes that perform in-network caching and content delivery.

V. SYSTEMATIC COMPARISONS FROM USABILITY, PERFORMANCE, AND SECURITY
In this section, we carry out detailed comparisons among the privacy preservation techniques and give metrics to evaluate the outsourcing network function designs. We particularly compare the techniques (e.g., cryptographic tools, such as HE and searchable encryption, and trusted hardware, such as SGX) in terms of usability, performance (including computational cost, communication cost, and storage cost), and security.

A. Usability
Functionality is one of the primary factors to be considered when designing a scheme. NFV includes many functions, such as firewall, DPI, NAT, and load balancing. When outsourcing a network function, middleboxes may operate on the header (for functions such as firewalls, routing, and NAT), the payload (for functions such as in-network caching), or both (for functions such as DPI and IDS). Interestingly, these functions all contain the matching process. Middleboxes match the data in some field of the packet with the outsourced rules and then take the corresponding action on the packets according to the matching result. Dividing the functions by the complexity of the rules, as introduced in Section IV, there are equality matching and more enriched matching, such as range matching and stateful matching.
Schemes introduced in Section IV cover these functions to different degrees. Here, we further introduce the capability of the tools. Intuitively, methods can use cryptographic tools, such as deterministic encryption, to encrypt sensitive information so that the middlebox can evaluate simple equality matching rules over the hidden information and the encrypted rules stored on the middlebox [49]. Using more complicated multilevel encryption on top of the basic deterministic AES encryption, the work of [56] can realize RE, i.e., equality matching on the packet level. Similar to deterministic encryption, index-based SSE, which is naturally suitable for searching on encrypted traffic, can be utilized to perform equality search, such as watermark and signature matching in DPI and in-network caching. However, it is difficult for AES or index-based SSE schemes to support more function-enriched matching (e.g., range matching), which is quite common in many network functions, such as port and IP address range matching in the firewall. The Bloom filter [163], [164] can check whether an element is included in a set, and it can be modified to judge whether the data in a specific field of a packet are in the ruleset, or whether the data in a field are within a range. PrefixMatch, OPE, and ORE preserve certain information about the plaintext and, thus, can be extended to range matching by encrypting the endpoints of the range and comparing the encrypted boundaries with the traffic. Besides matching, more complicated computations over the traffic and the rules are sometimes needed, e.g., in network routing. HE can directly perform calculations, such as addition and multiplication, on ciphertexts, so it can be used to compute the shortest path and the bandwidth allocation for the route decision. Also, SMPC can be utilized to calculate the routes or filtering results by multiple cooperating middleboxes that do not collude. GC and OT can provide secure information exchange among the middleboxes. The methods mentioned above are based on software, while, with the aid of the secure execution context provided by secure hardware, such as Intel SGX, more general functions can be performed securely. Intuitively, the packets can be decrypted and processed in the enclave. With SGX, almost all the functions can be achieved, including general DPI, NAT, and routing on L2 (data link layer) and L3 (IP layer).
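The set-membership use of a Bloom filter mentioned above can be sketched as follows. This is a plain (unencrypted) illustration, not one of the surveyed constructions [163], [164]: the filter size, the number of hash functions, SHA-256 as the hash, and the naive treatment of a range (inserting every value of a hypothetical port range) are all assumptions.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter backed by a Python integer used as a bit array."""

    def __init__(self, size_bits: int = 4096, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by prefixing the item with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: bytes) -> bool:
        # May return a false positive, but never a false negative.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def port_key(port: int) -> bytes:
    return port.to_bytes(2, "big")


# Naive range support: insert every port of the (hypothetical) range 8000-8080.
allowed_ports = BloomFilter()
for port in range(8000, 8081):
    allowed_ports.add(port_key(port))

print(port_key(8042) in allowed_ports)  # a port inside the range is always found
```

Inserting every value only works for small ranges; the prefix-based and order-preserving techniques named above avoid this enumeration by comparing against the encrypted range boundaries instead.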
Remark: We can evaluate a network function outsourcing design from a functional perspective, i.e., whether the design can achieve equality matching, more enriched matching, stateful matching, or general functions. We conclude that the techniques of trusted hardware, HE, and MPC are more suitable for general and complex network functions. Other lightweight primitives can only handle relatively simple network functions. Hence, when designing in-the-cloud middleboxes, we can choose appropriate techniques according to the complexity of the network function.

B. Performance
Generally speaking, the performance of a network function is twofold: computation cost and communication cost. Besides, in the context of NF outsourcing, the storage occupation on the middlebox also needs consideration. In this section, we compare the state of the art from the perspective of computation, communication, and storage cost, and summarize the results in Table 9.
1) Computation Cost: Computation cost mainly includes the precomputation time before any communication connection, the setup time, and the processing time during a connection. The middlebox needs to be configured in advance, e.g., the rules may be encrypted and uploaded at the beginning, which forms the precomputation time. The precomputation time does not affect the real-time performance of communication, while the latency strongly influences the communication efficiency and the user experience. Latency consists of the setup time to establish a secure connection or prepare for secure data exchange and the processing time, i.e., the inspection time in functions such as DPI and firewalls and the computation time in functions such as secure routing.
For different applications, such as long- or short-lived instant connections, the requirements for the setup time are different. Long-lived connections are more tolerant of a longer setup time, while short-lived ones are sensitive to the setup time. For example, the pioneering work on outsourcing DPI, BlindBox [24], suffers from a long setup time due to the preparation of the AES GC. Many works aim to achieve a lower setup time. Embark [26], for example, reduces the average setup time by preserving a long-lived connection between the gateways and middleboxes instead of changing the encryption keys for every connection between the clients, which would need a fresh setup for every connection. Besides, the proposed PrefixMatch in Embark is four orders of magnitude faster than OPE. In SGX-based methods [13], [17], there is a necessary remote attestation for enclave initialization, which accounts for approximately 10% of the latency in the setup phase. For example, according to the experiments in ShieldBox [17], the SGX remote attestation takes about 26.4 ms.
The computational complexity of the processing phase is generally related to the number of rules. In many crypto-based schemes, the middlebox needs to match the packets with the outsourced rules one by one, i.e., linear inspection complexity [29], [37]-[39]. Sublinear matching is an advantage of index-based searchable encryption methods. For example, in BlindBox, the encrypted rules are stored in a tree-based structure, thus leading to a search time logarithmic in the number of rules. Other index-based methods [25], [28], [30] utilize hash tables to store the rules to enable sublinear inspection. Besides the inspection time for a packet, the latency of each matching operation also affects the processing efficiency. Generally, the SSE-based method is much more efficient than the HE-based method because SSE is based on symmetric cryptographic primitives, such as PRF. For example, the latency of index-based methods [24], [25], [28] for processing one packet over thousands of rules is within milliseconds. The HE-based method [37] takes seconds to process a packet with five fields on ten policies, which is slower than the index-based methods. The SGX-based approach LightBox [20] takes about 20 μs to inspect a 1500-byte packet over a 5000-sized ruleset, which achieves near-native speed. Moreover, SGX-based approaches can scale linearly with the core count. The experiments of SEC-IDS [186] show that, with a limited number of flows, its dual-core performance is almost twice its single-core performance. Mastorakis et al. [18] demonstrated similar performance scalability with up to four enclaves running simultaneously.
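The difference between linear inspection and index-based inspection can be sketched as follows. This is a simplified illustration rather than the construction of any cited scheme: HMAC-SHA256 stands in for the PRF, the sliding-window tokenization, the key, and the rule keywords are hypothetical, and real designs add further protections.

```python
import hashlib
import hmac


def prf(key: bytes, keyword: bytes) -> bytes:
    """HMAC-SHA256 used as a stand-in PRF (an assumption of this sketch)."""
    return hmac.new(key, keyword, hashlib.sha256).digest()


def tokenize(payload: bytes, window: int) -> list:
    """Hypothetical sliding-window tokenization of a payload."""
    return [payload[i:i + window] for i in range(len(payload) - window + 1)]


def linear_match(token: bytes, encrypted_rules: list) -> bool:
    # One comparison per rule: the cost grows linearly with the ruleset.
    return any(hmac.compare_digest(token, rule) for rule in encrypted_rules)


def indexed_match(token: bytes, rule_index: set) -> bool:
    # A hash-based index answers membership in expected constant time.
    return token in rule_index


key = b"k" * 32  # hypothetical key shared by the endpoint and the rule generator
rules = [b"attack01", b"malware!", b"exploit9"]  # hypothetical 8-byte keywords
encrypted_rules = [prf(key, r) for r in rules]
rule_index = set(encrypted_rules)

payload = b"GET /index?q=malware! HTTP/1.1"
tokens = [prf(key, w) for w in tokenize(payload, window=8)]
hits_linear = [t for t in tokens if linear_match(t, encrypted_rules)]
hits_indexed = [t for t in tokens if indexed_match(t, rule_index)]
print(len(hits_linear), len(hits_indexed))
```

Both strategies report the same matches; only the cost per token differs as the ruleset grows.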
2) Communication Cost: Excessive communication costs will occupy network resources, thus affecting overall efficiency. For example, one disadvantage of the GC-based BlindBox is that it uses OT for data exchange. The size of an AES-128 circuit is about 500 kB, which consumes considerable bandwidth for a large ruleset. Compared to BlindBox, DPF-ET replaces the GC with XOR, which reduces the bandwidth of the AES circuit, but the OT still consumes more bandwidth than lightweight cryptographic primitives. SSE-based approaches require additional transfers of the tokens of the packets, whose volume depends on the size of the packets and the tokenization method. Tokens are generally encrypted by PRF, so the size of each token is relatively small. Besides the inherent communication costs brought by the primitives, some designs also introduce multiple servers to cooperatively perform the functions, which introduces extra communication [38], [48].
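A back-of-the-envelope estimate illustrates the two kinds of overhead discussed above. The 500 kB circuit size comes from the text; the assumption that one garbled circuit is transferred per rule, as well as the ruleset size, window length, and token length, are hypothetical values chosen for this sketch.

```python
CIRCUIT_KB = 500  # approximate size of one garbled AES-128 circuit, from the text


def gc_setup_bandwidth_mb(num_rules: int, circuit_kb: int = CIRCUIT_KB) -> float:
    """Setup traffic in MB, assuming one garbled circuit per rule (assumption)."""
    return num_rules * circuit_kb / 1000


def token_overhead_bytes(payload_len: int, window: int, token_len: int) -> int:
    """Extra bytes if every sliding window of the payload yields one token."""
    n_tokens = max(payload_len - window + 1, 0)
    return n_tokens * token_len


# Hypothetical workload: 3000 rules, 1500-byte payloads, 8-byte windows, 5-byte tokens.
gc_mb = gc_setup_bandwidth_mb(3000)
token_bytes = token_overhead_bytes(payload_len=1500, window=8, token_len=5)
print(f"GC setup traffic: {gc_mb:.0f} MB per connection")
print(f"token overhead:   {token_bytes} bytes per packet")
```

The estimate suggests why the setup traffic of GC-based designs grows quickly with the ruleset, whereas token overhead depends mainly on the tokenization method and the token length.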
3) Storage Cost: The storage cost on the middlebox is another performance indicator. In general, outsourced rules will be kept in the storage of the middlebox. For software-based schemes, the storage requirements are not very strict because the cost of memory is not high for cloud servers. Here, we discuss the storage problem of SGX, whose memory is limited. In SGX, the programs are split into public and secret parts, as adopted in S-NFV [187] and ShieldBox [17]. LightBox [20] considers how to reduce the overhead caused by EPC paging. In LightBox, the state of each flow is tracked, and the temporarily unnecessary parts are placed outside the enclave. Meanwhile, the authors utilized a dual lookup design with cuckoo hashing, which significantly reduces the lookup time consumed by EPC paging.
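To give an intuition for why cuckoo hashing suits flow-state lookup, the following sketch shows a generic two-table cuckoo hash map in which every lookup probes at most two slots. It is not LightBox's actual data structure: the class name, the SHA-256-based slot function, and the growth policy are assumptions made for this illustration.

```python
import hashlib


class CuckooTable:
    """Generic two-table cuckoo hash map; each key has two candidate slots."""

    def __init__(self, capacity: int = 16, max_kicks: int = 32):
        self.capacity = capacity
        self.max_kicks = max_kicks
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, which: int, key: bytes) -> int:
        digest = hashlib.sha256(bytes([which]) + key).digest()
        return int.from_bytes(digest[:8], "big") % self.capacity

    def get(self, key: bytes):
        # At most two probes, independent of how many flows are stored.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key: bytes, value) -> None:
        for which in (0, 1):  # update in place if the key already exists
            idx = self._slot(which, key)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return
        item, which = (key, value), 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, item[0])
            # Place the item and pick up whatever occupied the slot before.
            item, self.tables[which][idx] = self.tables[which][idx], item
            if item is None:
                return
            which = 1 - which  # move the evicted entry to its other table
        self._grow(item)

    def _grow(self, pending) -> None:
        # Too many evictions: double the capacity and reinsert everything.
        entries = [e for t in self.tables for e in t if e is not None] + [pending]
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k, v in entries:
            self.put(k, v)


flows = CuckooTable(capacity=4)
for i in range(50):
    flows.put(f"10.0.0.{i}:443".encode(), {"packets": i})
print(flows.get(b"10.0.0.7:443"))
```

Because a lookup touches a bounded number of slots, it also touches a bounded number of memory pages, which is presumably what makes the technique attractive when page accesses are expensive inside an enclave.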
Remark: Both computation cost and communication cost are important indicators for evaluating the performance of the designs. We can measure the computation cost from three parts: the precomputation time for rule generation, the setup time for connections, and the processing time. As for communication cost, we can measure the additional bandwidth caused by search tokens in SSE, exchange messages in GC, and configuration data in SGX. When designing in-the-cloud middleboxes, we should strive to minimize the computation and communication overhead. The ultimate goal is to achieve performance on par with local middleboxes operating on plaintext traffic.

C. Privacy and Security
The issues on privacy and security are twofold: 1) the protection of sensitive information, such as inspection rules and packets, and 2) the security problems of the techniques themselves.
1) Privacy: According to Sections III-A2 and III-B, the private information includes the rule privacy and the traffic privacy. The traffic privacy can be further subdivided into header privacy and payload privacy. Table 4 presents the protection coverage of the representative privacy-preserving network functions, i.e., the protection of header/payload and the protection of rules from the endpoints/middleboxes.
In the context of network function outsourcing, the rules are generally generated by a third-party rule provider. Hence, ideally, the inspection rules should be kept from both the middlebox and the clients to protect trade secrets. BlindBox [24] protects the rules from the clients by adopting an AES-GC. However, it reveals the rules to the middlebox. In the SSE-based schemes [25], [28]-[30], the rules are encrypted and kept in hash tables to hide the private information. Note that some designs, such as [25], [26], and [48], do not explicitly protect the rules from the endpoints because the rules are generated by one of the trusted endpoints. In the context of routing, the privacy issue of rules can be further refined into three aspects: the protection of the destination address [42], topological information [43], and the routing policies, including the ranking of routes [43], [45], the shortest path [43], and export policy [42], [44], [47]. The abovementioned designs all managed to protect the rules from the middleboxes.
As for packets, both the information in the header and the content in the payload should be protected. The payload can be easily protected by encryption without affecting the low-layer packet forwarding. However, the protection of the header and the metadata is complicated, especially in the lower layers, because encryption makes it hard for the middleboxes to forward the packets. Recall that there are two outsourcing models of network functions, i.e., the go-through model and the bounce model. In the bounce model, the encrypted header can be inspected in the cloud-side middlebox and then sent back to the client's gateway. Thus, the five-tuple (i.e., the source/destination IP address, the source/destination port number, and the protocol) of the original packets can be freely encrypted and filtered. The designs using the bounce model can protect the full information in the header from the cloud [36], [41]. However, in the go-through model, the packets need to be forwarded by real-world network routers. Encrypting the destination address will make it hard for the routers to forward the packets. Nonetheless, many works have managed to protect other fields in the header in different scenarios. For example, Embark [26] hides the source IP for NAT middleboxes and the URIs for HTTP headers. The HTTP header protection has a weaker security guarantee because the comparison information between the fields is revealed. SHVE+ [35] hides specific fields in the header, for example, the HTTP method for HTTP headers. In some SGX-based methods, the MAC and IP addresses are replaced with the addresses of the middlebox [18], [20]. Besides, to hide the metadata (e.g., packet size), LightBox [20] reorganizes the packets, fixes the size of each packet, and forwards the packets in a streaming manner.
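The idea of hiding packet sizes by repacking traffic into a stream of equal-sized units can be sketched as follows. This is only an illustration of the general principle, not LightBox's actual format: the frame size, the 2-byte length prefix, and the zero-length padding marker are assumptions, packets are assumed to be non-empty and shorter than 64 KiB, and the encryption of each frame is omitted.

```python
import struct

FRAME_SIZE = 64  # hypothetical fixed frame size in bytes


def pack_stream(packets: list) -> list:
    """Concatenate length-prefixed packets and cut the stream into equal frames."""
    stream = b"".join(struct.pack("!H", len(p)) + p for p in packets)
    stream += b"\x00" * (-len(stream) % FRAME_SIZE)  # pad the final frame
    return [stream[i:i + FRAME_SIZE] for i in range(0, len(stream), FRAME_SIZE)]


def unpack_stream(frames: list) -> list:
    """Recover the original packets; a zero length prefix marks the padding."""
    stream, packets, pos = b"".join(frames), [], 0
    while pos + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, pos)
        if length == 0:
            break
        packets.append(stream[pos + 2:pos + 2 + length])
        pos += 2 + length
    return packets


original = [b"a" * 10, b"b" * 100, b"c" * 3]
frames = pack_stream(original)
print([len(f) for f in frames])
```

An observer of the (encrypted) frames would see only units of identical size, while the receiving side can still recover the packet boundaries.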
2) Security: The security of the cryptographic primitives or the trusted hardware is another factor affecting the security of the whole system. For example, OPE has been shown to be insecure because it discloses the order of the encrypted fields, and it is vulnerable to brute-force attacks [26]. SSE is a promising lightweight cryptographic technique with high efficiency. However, efficient SSE schemes generally leak access patterns and query patterns, which may disclose the linkage between the tokens in the packets and the frequency of the words in the packets and the inspection rules. Due to such leakage, SSE schemes are vulnerable to leakage-abuse attacks [146]-[150]. It is also reported that PrivDPI [32] is vulnerable to brute-force attacks when the space of the rule set is small [34].
Different from the cryptographic primitives, SGX-based approaches are mainly faced with the problems that the system calls may be issued from an untrusted host and that the system clock may cause security flaws. As a result of the design of SGX, the attacker can easily get the enclave memory mapping. SGX-Box [15] proposed a new programming abstraction called SB lang, which encapsulates the underlying C and C++ languages, and ensures that the API does not contain any insecure pointers that are vulnerable to cache overflow attacks. As for the problem of the system clock, LightBox [20] and ShieldBox [17] adopt the etap clock and the NIC clock, respectively, to avoid the attacks brought by the unsafe clock.
Remark: According to Sections III-A2 and III-B, we can evaluate the privacy-protection mechanisms by analyzing how much private information they have protected. The private information can be detailed into three parts: the L4 payload, the L2-L4 headers, and the inspection rules. The protection of inspection rules can be further divided into protection from the endpoints and protection from the middlebox. The more private information a design protects, the more secure it is.

VI. OPEN RESEARCH PROBLEMS
Although existing works have shown the potential of network function outsourcing, challenges remain in designing an efficient, secure, and functional outsourcing architecture. Here, we list several research challenges and potential future research directions.

A. Limitations of the Cryptographic Tools
The cryptographic tools, such as SSE, HE, and multilinear maps, can well solve the privacy problems in the outsourcing process. However, when applied to NFV, these cryptographic tools have limitations on efficiency or functionality to different degrees.
The SSE-based NF outsourcing designs have the advantage of high efficiency and a lightweight computation and communication burden. Nonetheless, existing efficient SSE-based methods only enable the most straightforward equality matching rules, such as signature inspection. There is potential to combine more functional SSE techniques, such as tree-based range search [115], [188], [189] and Boolean search [190]-[192]. Interestingly, structured encryption [193] can encrypt and query structured data, such as graphs, which may be adopted to solve secure routing problems, such as shortest distance computation [194]-[196].
A drawback of HE is its poor efficiency. The nature of HE is very suitable for solving the privacy problem of NFV, i.e., computations on encrypted data. However, the FHE-based methods are not ready to be applied in practice due to their low efficiency. From the perspective of functionality, HE-based methods can only be adopted for network functions, such as firewalls and routing, where the data to be computed are relatively small. Applications that cannot tolerate such low efficiency cannot employ HE-based designs. To improve this situation, researchers can explore more efficient HE schemes in the future. As an alternative direction, it is possible to limit the NFV operations to specific operations by meticulously designing the privacy-preserving network function computation algorithms, so as to replace FHE with efficient partial HE.
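As an illustration of what a partially homomorphic scheme offers, the following toy implementation of the well-known Paillier cryptosystem supports addition of ciphertexts and multiplication by a plaintext constant, but not multiplication of two ciphertexts. It is a sketch under loud assumptions: the primes are far too small to be secure, the generator is fixed to g = n + 1, and none of the surveyed designs is claimed to use exactly this construction.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Toy Paillier key generation with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # for g = n + 1, L(g^lam mod n^2) equals lam mod n
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Enc(m1) * Enc(m2) mod n^2 decrypts to m1 + m2 mod n."""
    return (c1 * c2) % (n * n)


def scalar_mul(n: int, c: int, k: int) -> int:
    """Enc(m)^k mod n^2 decrypts to k * m mod n."""
    return pow(c, k, n * n)


n, priv = keygen(1009, 1013)  # insecure toy primes, for illustration only
total = decrypt(n, priv, add(n, encrypt(n, 20), encrypt(n, 22)))
scaled = decrypt(n, priv, scalar_mul(n, encrypt(n, 7), 3))
print(total, scaled)
```

A network function whose computation can be expressed with these two operations alone, for example summing encrypted path costs, would not need the full generality of FHE.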
The multilinear map is a potential cryptographic tool that can be applied to arbitrary polynomial circuits, thus enabling the obfuscation of the bit vector of the rules in firewalls in an efficient way. However, the obfuscation methods based on the multilinear map [129], [130] have been proven insecure [197]. Besides, in multilinear map-based schemes, noise is often introduced to interfere with man-in-the-middle attacks and ensure security. However, the noise increases rapidly with the coding level, which may affect the efficiency. How to balance the security and efficiency of the multilinear map is also a key point that needs to be considered.

B. Chaining of Outsourced Network Functions
A complete network service requires the packet to go through a set of network functions [198] (a.k.a. the service chain). For example, an HTTP packet should go through an IDS → proxy chain, and packets from the internal sites should be processed through a NAT → firewall chain. When the functions are chained, the communication between the middleboxes may leak more private information than in the single-function setting. SICS [49], [153] uses deterministic encryption, such as AES, to encrypt the header, which contains the destination and the processing action, as a label, and the middleboxes learn the behavior (the destination and processing action) according to the label and a prebuilt label-behavior dictionary. However, the rule generator needs to enumerate all the destinations and actions, and the analysis of the header is not actually outsourced. SGX-based methods, such as ShieldBox [17] and LightBox [20], can mount the functions on different virtual machines in a physical machine. So far, research on outsourcing service chains remains underexplored. There is no formal definition of security (e.g., the leakage definition) for service chaining. More tools can be used to ensure minimum information leakage when transferring traffic among different middleboxes. It is also important to consider how to improve the flexibility and availability, i.e., how to update the original chain topology without affecting current business.
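The label-based approach described above can be sketched as follows. This is a simplified illustration, not the SICS construction itself: since the Python standard library has no AES, a truncated HMAC-SHA256 is swapped in as the deterministic encoding, and the key, the destination names, and the action names are hypothetical.

```python
import hashlib
import hmac


def label(key: bytes, destination: str, action: str) -> bytes:
    """Deterministic label for a (destination, action) pair (HMAC stand-in)."""
    msg = destination.encode() + b"|" + action.encode()
    return hmac.new(key, msg, hashlib.sha256).digest()[:16]


def build_dictionary(key: bytes, destinations: list, actions: list) -> dict:
    # The rule generator has to enumerate every (destination, action) pair.
    return {label(key, d, a): (d, a) for d in destinations for a in actions}


key = b"s" * 32  # hypothetical key held by the rule generator and the gateway
dictionary = build_dictionary(key, ["proxy", "firewall"], ["forward", "drop"])

# The gateway attaches a label; the middlebox only performs a dictionary lookup.
packet_label = label(key, "proxy", "forward")
behavior = dictionary[packet_label]
print(behavior, len(dictionary))
```

The dictionary size equals the number of destinations times the number of actions, which reflects the enumeration burden noted above, and identical labels on different packets remain linkable because the encoding is deterministic.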

C. Verifiable Outsourced Network Functions
Verifiability [199] is another major research issue in the field of outsourced NFV. Outsourcing increases the cost-efficiency of NFV. However, since the network function providers cede control of the computing infrastructure to the cloud, the consistent provision of the service can be compromised by cloud vendors seeking extra profit. Oversight of such behaviors can be computationally intensive and place extra overhead on performance. Finding a proper and efficient way to detect malicious vendors remains a major problem.
Verifiability, though important, is not discussed in most of the schemes introduced before. In a privacy-preserving middlebox, the main focus is to prevent the third party from stealing precious information. The work of [13] leverages trusted hardware to verify the correctness of the network function but is vulnerable to denial-of-service attacks and, thus, can hardly ensure the quality of service. Besides, especially in SSE-based methods, the validation of the search tokens is indispensable if one of the endpoints is malicious. In addition to the correctness of the functions performed on the server side, the integrity and consistency of the tokens and traffic should also be checked, which has not been widely studied. How to develop a middlebox that is both private and verifiable would be a promising yet challenging future direction.
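One simple form of token validation is for the receiver, who can decrypt the traffic, to recompute the tokens and compare them with those the sender submitted for inspection. The sketch below illustrates this consistency check only; HMAC-SHA256 as the PRF, the sliding-window tokenization, the key, and the function names are assumptions of this example rather than the procedure of a specific scheme.

```python
import hashlib
import hmac


def prf(key: bytes, keyword: bytes) -> bytes:
    """HMAC-SHA256 used as a stand-in PRF (an assumption of this sketch)."""
    return hmac.new(key, keyword, hashlib.sha256).digest()


def make_tokens(key: bytes, payload: bytes, window: int = 8) -> list:
    """Hypothetical sliding-window tokenization followed by the PRF."""
    return [prf(key, payload[i:i + window]) for i in range(len(payload) - window + 1)]


def receiver_validates(key: bytes, payload: bytes, received_tokens: list) -> bool:
    """Recompute the tokens from the decrypted payload and compare them."""
    expected = make_tokens(key, payload)
    return len(expected) == len(received_tokens) and all(
        hmac.compare_digest(a, b) for a, b in zip(expected, received_tokens)
    )


key = b"k" * 32  # hypothetical key shared by the two endpoints
payload = b"payload carrying malware!"

honest_tokens = make_tokens(key, payload)
# A malicious sender submits tokens of an innocuous payload to evade inspection.
forged_tokens = make_tokens(key, b"payload carrying nothing!")

print(receiver_validates(key, payload, honest_tokens))
print(receiver_validates(key, payload, forged_tokens))
```

Such a check detects a sender whose tokens do not match the traffic, but it does not by itself verify that the middlebox executed the function correctly.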

D. Attacks on Outsourced Network Functions
To enable detection, analysis, or processing of encrypted traffic with outsourced (encrypted) network functions, more information will be disclosed than with traditional unmanageable encrypted packets (e.g., HTTPS traffic). For example, the SSE-based methods may be vulnerable to leakage-abuse attacks [141], [147], [149], [150]. The deterministic search tokens reveal the statistical distribution of the traffic. According to existing weaknesses, possible attacks can be designed to infer the content of the traffic or rules. Besides cryptography-based attacks, traditional attacks, such as DoS attacks and man-in-the-middle attacks, can also be mounted on the middleboxes. Encryption makes it easier to send spam messages and occupy resources. However, very few network function outsourcing schemes consider DoS attacks, which remain a security threat to the practical application of the schemes [200].
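How deterministic tokens expose the statistical distribution of traffic can be illustrated with a basic frequency-analysis sketch. It is not one of the cited attacks: the adversary's auxiliary knowledge, the word list, and the truncated HMAC used as the deterministic token function are all hypothetical, and the adversary is assumed never to learn the key.

```python
import hashlib
import hmac
from collections import Counter


def det_token(key: bytes, word: bytes) -> bytes:
    """Deterministic token: equal words always yield equal tokens."""
    return hmac.new(key, word, hashlib.sha256).digest()[:8]


key = b"k" * 32  # secret; the adversary never uses it
traffic_words = [b"GET"] * 6 + [b"POST"] * 3 + [b"DELETE"] * 1
observed = [det_token(key, w) for w in traffic_words]  # what the middlebox sees

# Hypothetical auxiliary knowledge: the expected popularity ranking of the words.
public_ranking = [b"GET", b"POST", b"DELETE"]

# The adversary ranks tokens by frequency and aligns the two rankings.
token_ranking = [t for t, _ in Counter(observed).most_common()]
guess = dict(zip(token_ranking, public_ranking))

recovered = [guess[t] for t in observed]
print(recovered == traffic_words)
```

In this contrived setting the alignment recovers every word, which is why the frequency leakage of deterministic tokens deserves explicit treatment in the security analysis.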

E. Side-Channel Attacks on SGX-Based Middleboxes
The side-channel attacks against SGX have largely not been considered in the SGX-based middleboxes. Although these schemes claim that there is a general method to respond to side-channel attacks [17], [19], [20], the general countermeasures may cause problems for the performance-sensitive virtualized network functions and invalidate the functions that have been realized.
The side-channel attacks on SGX can be conducted by leveraging TLB flushing, page faults, exceptions, and so on. Take TLB flushing as an example. Since the operating system has direct access to memory management, a malicious OS can decide whether to flush the TLB. Then, the malicious OS can locate the attack address by analyzing the code of SGX applications and learn what the enclave has accessed by flushing the TLB and recording the memory footprint. To defend against such side-channel attacks, Xu et al. [201] proposed four general solutions: 1) disable the operating system paging, which limits the functions of the enclave application; 2) enable self-paging, which requires extensive changes to the configurations, such as new hardware or new paging interfaces; 3) use the techniques of oblivious RAM or noise injection, both of which may produce extra computation and communication overhead; and 4) check the execution time. Since side-channel attacks affect the execution efficiency, it is reasonable to check the time to find out whether an attack has occurred.
It is not hard to see that, while existing SGX-based schemes assume that conventional defense methods can thwart the side-channel attacks, the countermeasures either require case-by-case constructions, incur overhead, require new hardware support and program modifications, or are not secure enough. Therefore, it is necessary to take the side-channel attacks into consideration for the SGX-based middleboxes. LightBox [20] emphasizes that a stateful middlebox may often support millions of flows concurrently, which shows the necessity of low overhead in the enclave. However, the advantage of LightBox might be whittled down if the general defense is adopted. Hence, it is essential to design specialized countermeasures to side-channel attacks for SGX-based solutions.

F. Virtual Machine Isolation for NFV
In addition to the threats from the cloud server and external adversaries, it is also important to consider potential attacks from the tenants in the same cloud. The physical hosts of a cloud service provider are inevitably shared by the virtual machines of multiple users [202]. A malicious tenant can make use of the loopholes in virtual machine isolation, such as side-channel attacks, to steal the private data of her neighbors [73]. Currently, this problem has not been discussed in the field of network function outsourcing. The environment of virtual machines on the cloud service is much more complicated than that of enterprise or individual-level virtual machines. Therefore, how to prevent the attacks of malicious neighbor tenants in NFV by enhancing the trustworthiness of the hypervisor is another interesting topic in the field of network function outsourcing.

VII. CONCLUSION
In this article, we have conducted a comprehensive survey of the latest literature on NFV and the related security and privacy issues when moving network functions into the cloud. To better demonstrate the issue, we raised the challenges and goals of the outsourcing model. We categorized the outsourced network functions into exact matching, function-enriched matching, and very general functions, and introduced the existing solutions for each category. Cryptographic tools, such as searchable encryption, GC, and HE, and trusted hardware, such as SGX, can be utilized to design an outsourced network middlebox in a privacy-preserving way. Furthermore, we carefully compared the privacy preservation technologies from the perspective of functionality, efficiency, and security, and summarized the metrics to evaluate the solutions. Finally, we put forward several open research problems for future investigation.

Fig. 3. Two types of network function outsourcing models [26]: (a) bounce model and (b) go-through model. In the bounce model, the gateway sends the traffic to the service provider, and the traffic will be sent back to the gateway after detection. In the go-through model, the traffic is processed in the middlebox between the sender and the receiver.

Fig. 4. Comparison between TCB with SGX and without SGX. The gray-blue area stands for TCB.
message token, thus leading to a longer detection time compared with BlindBox.

Fig. 5. Architecture of BlindBox [24]. In the setup phase, the endpoint prepares a garbled AES embedded with the encryption key on the tokens. With the garbled AES, the middlebox can encrypt the rules with the secret key of the tokens. The traffic is then tokenized, encrypted, and transmitted to the middlebox for inspection. In addition, for correctness assurance, the encrypted tokens and traffic will then be sent to the receiver to validate the tokens in case the sender may be malicious.

Table 1
Development on Different Categories of Secure In-the-Cloud Network Functions

Table 2
Examples of Different Matching Types in the Style of Snort Rules

Table 3
Abbreviations of the Security Primitives

Table 4
Properties of the Representative Privacy-Preserving Network Function Designs

Table 5
Approaches of Equality Matching

Table 6
Approaches of More Enriched Matching

Table 7
Approaches of Secure Routing

Table 9
Estimated Performance of Different Privacy Preservation Technologies