Privacy Preserving and Serverless Homomorphic-Based Searchable Encryption as a Service (SEaaS)

Serverless computing has seen rapid growth, thanks to its adaptability, elasticity, and deployment agility, embraced by both cloud providers and users. However, this surge in serverless adoption has prompted a reevaluation of security concerns and thus, searchable encryption has emerged as a crucial technology. This paper explores the Searchable Encryption as a Service (SEaaS) and introduces an innovative privacy-preserving Multiple Keyword Searchable Encryption (MKSE) scheme within a serverless cloud environment, addressing previously unmet security goals. The proposed scheme employs probabilistic encryption and leverages fully homomorphic encryption to enable operations on ciphertext, facilitating searches on encrypted data. Its core innovation lies in the use of probabilistic encryption for private multi-keyword searches. To validate its practicality, we deploy the scheme on the public cloud infrastructure, “Contabo,” and conduct rigorous testing on a real-world dataset. The results demonstrate that our novel scheme successfully preserves the privacy of search queries and access patterns, achieving robust security. This research contributes to the field of serverless cloud security, particularly in the context of searchable encryption, by providing a refined solution for safeguarding data while maintaining usability in a serverless computing landscape.


I. INTRODUCTION
Cloud computing has enabled convenient usage of computational and network resources [1]. The cloud provisions multiple services for data storage and management. The cloud is inexpensive [2], but storing data on it makes the data vulnerable to advanced persistent threats (APT) [3]. Serverless computing is rapidly expanding as a result of its widespread adoption by cloud providers and tenants due to its adaptability, elasticity, and deployment agility. With the boom in the serverless paradigm, security issues are being reevaluated as well [4], [5]. One way to protect data from these vulnerabilities is to encrypt it prior to its storage on the cloud. However, traditional encryption techniques limit the usability of the data, i.e. keyword search cannot be performed on the ciphertext. The solution to this lies in searchable encryption (SE).
The associate editor coordinating the review of this manuscript and approving it for publication was Nitin Gupta.
Homomorphic Encryption (HE) is a groundbreaking approach that has enabled cloud data owners/users not only to store but also to search data [6]. HE paired with SE has opened many pathways for the cloud computing industry. Consider the healthcare industry, which has shifted towards adopting serverless cloud solutions to cater to the high volumes of patient records. Patient data consists of sensitive health-related records, so its confidentiality and privacy are of utmost importance [7], [8], [9]. SE enables secure searching on the encrypted patient data [10], [11]. Similarly, in the aviation industry, SE plays an important role in protecting the passenger's personal details while the airline receives verifiable authentication of the passenger's credentials. The latest research [12] has shown the need for SE to solve a variety of real-world problems ranging from the health records of patients to highly critical records in the defense sector.
The first SE scheme with a keyword search on encrypted data was proposed in [13]. Further research on SE was described in [14], [15], [16], [17], and [18]. However, the focus of these schemes was single-keyword search with Boolean search results. Multiple schemes based on k-NN and fuzzy search are presented in [19], [20], [21], and [22]. Ranked multi-keyword search was discussed in [23], [24], [25], [26], and [27]; however, these schemes rely on a search index, which not only affects the efficiency of the search but also diminishes the security of the scheme. Formal definitions for SE schemes over cloud computing, such as search pattern, access pattern, and adaptive and non-adaptive indistinguishability, have been proposed in [15] and [28].
Recent research on SE can be characterized as either index-based search or linear search. Major research has been conducted on index-based SE schemes [29], [30], [31], [32]. Because the index structure in index-based SE schemes reveals details about the correspondence between a given keyword and its associated documents, these techniques are intrinsically vulnerable in terms of security. Therefore, the probability of data security breaches is higher in such schemes.
In addition to the security challenges presented by the introduction of index tables, existing research on SE does not provide search pattern and access pattern privacy. Search pattern privacy is concerned with the identification of keywords given an encrypted trapdoor. A search pattern attack entails the adversary being able to identify which keywords are being searched by the client. Access pattern privacy, on the other hand, is related to the mapping of an encrypted trapdoor consisting of keywords to the set of documents that contain the respective keywords. If the adversary is unable to map a trapdoor to the set of documents that match it, access pattern privacy is preserved.
In today's digital landscape, safeguarding sensitive information from ever-evolving security threats is paramount. The need for a cloud security solution that achieves high levels of security and performance forms the motivation of this research. This research seeks to address this challenge by developing a robust searchable encryption solution. By incorporating probabilistic trapdoors, elevating security levels, and optimizing performance efficiency, this work aims to provide search pattern and access pattern security, ensuring the confidentiality and integrity of critical data in an increasingly vulnerable digital world.
This research proposes a novel multi-keyword SE (MKSE) scheme that preserves privacy while enhancing the usability of data stored over a serverless cloud. The proposed MKSE scheme is based on HE to enable search operations on encrypted text. It provides a high level of security by using probabilistic encryption to generate encrypted data and queries. The proposed scheme has been deployed and tested on the “Contabo” public cloud. The MKSE scheme can have a profound impact across different sectors, including healthcare, aviation, IoT, banking and finance, law enforcement, and education.

This research contributes the following:
• A novel privacy-preserving multi-keyword SE (MKSE) scheme based on fully HE is presented, which provisions data confidentiality and privacy over a serverless cloud environment. The scheme is based on probabilistic trapdoors, thus hiding search patterns.
• The proposed homomorphic capability reduces the client-side computations and the leakages associated with conventional SE schemes. A thorough security analysis verifies the security definitions. The proof-of-concept prototype is deployed and tested on the cloud platform “Contabo” to analyze its performance.

B. ORGANIZATION
The rest of this paper is organized as follows: Section II reviews existing research on multi-keyword SE. Section III presents the cryptographic primitive (DGHV) used in the MKSE scheme. Section IV gives the system overview, elaborating on the system model, the threat model, and the security goals. Section V revisits the security definitions associated with the proposed scheme. The proposed MKSE scheme is presented in detail in Section VI. The security analysis highlighting leakage profiling is carried out in Section VII. Implementation aspects are described in Section VIII, while the performance evaluation, presenting the computational analysis and discussing the limitations, is carried out in Section IX. Section X summarises the research and concludes with possible future directions.

II. RELATED WORK
SE is an important cryptographic primitive that provisions search on encrypted data stored on the cloud. It lets entities perform search queries directly on the ciphertext. This section discusses existing literature on multi-keyword SE as well as some of the latest works in serverless cloud computing.
A secure and dynamic multi-keyword SE scheme is proposed in [31]. The authors have used tf-idf and the vector space model to generate an index table and queries. The secure k-NN approach is used to encrypt documents and queries. They have employed relevance scores between the index and query to provide accurate ranked results to the cloud user. Using transformation matrices, they have provided resistance against statistical attacks.
In [30], the authors have utilized the sparsity of matrices to propose a k-NN based MKSE scheme. They have also reviewed previously proposed SE schemes based on k-NN and Bloom filters. Based on the k-NN approach, they have chosen sparse matrix pairs to efficiently encrypt the indices. Further, they have used Bloom filters to solve the dictionary update problem. Their proposed scheme provides data privacy, keyword privacy, trapdoor privacy, and access pattern privacy.
Ahamed et al. proposed a lightweight searching protocol for RFID tags in a serverless cloud environment [33]. This protocol can effectively carry out the search for a specific tag without the need to use and maintain server(s). A privacy-preserving MKSE scheme is proposed in [34] using Ciphertext Policy Attribute-Based Encryption (CP-ABE). The authors have addressed the privacy concerns present in the healthcare industry regarding private patient records in multi-owner settings, i.e. multiple patients. Their scheme also provides trapdoor/keyword privacy. Using real-world datasets, they have shown that their scheme is practical and feasible for implementation.
Secure k-NN is also used to provide search security in [24]. The scheme provides authorization and ranking for cloud users. Trapdoor unlinkability, confidentiality, and collusion resistance are also enabled for the search on the encrypted data. Bilinear mapping is used in [25] to provide privacy-preserving MKSE for multiple data owners. The scheme employs the tf-idf model to provide ranked results to cloud users. A tree-based index is used together with a privacy-preserving function to provide efficient search results for every data owner. These indexes are merged by the cloud server. Depth-first search is then used to find the required top-k files to return to the data user.
In mobile cloud computing settings, a new method, SEED, was suggested [35] for serverless efficient encrypted deduplication. The authors claimed that SEED guaranteed the security features of data consistency, confidentiality, and collusion resistance for the data in the cloud without the use of extra servers. The lack of specialized servers improved SEED's efficacy for the mobile cloud, where user movement is deemed critical.
The authors in [26] have proposed an efficient SE scheme for multiple keywords in mobile cloud storage. Their proposed scheme incorporates relevance scores and the k-NN technique to provide accurate and ranked search results. Furthermore, they have used blind storage to ensure that the access pattern of cloud users is concealed. An efficient index is generated to enable the practical functioning of the scheme. Their scheme also provides data and index confidentiality, trapdoor privacy, and search pattern concealment.
A minhashing-based scheme is presented in [29]. An encrypted index is constructed in multiple steps, including feature extraction, index generation using a minhash function, and index encryption using HMAC. The encrypted query/trapdoor is generated based on the signatures present in the encrypted index, providing keyword privacy. Ranked results are returned to the cloud users based on relevance score and tf-idf. The authors have further modified this scheme to provide access pattern privacy using two separate servers for searching and retrieval. In the modified scheme, they have used Paillier encryption to encrypt documents and their corresponding relevance scores.
Two Round SE (TRSE) is proposed in [36] using HE and the vector space model. Multi-keyword ranked retrieval is performed using relevance score and tf-idf. The scheme provides privacy-preserving SE by performing ranking at the user end. The search pattern is hidden in [37] to provide trapdoor privacy and keyword privacy. The authors have achieved conjunctive keyword SE using a special variant of additive HE. They have considered two servers, i.e. a cloud server and an auxiliary server, to achieve the privacy goals of their scheme. To augment security from the user's perspective, they have employed random polynomials to guarantee that only the desired outcomes can be obtained by the user. Their scheme provides a stronger security guarantee to the cloud user and performs an efficient search in parallel, independent of the search index.
HE is used in [38] to enable efficient multi-keyword retrieval through the use of correlation scores to return accurate and ranked results to the cloud user. The authors have modified an HE scheme to provide secure retrieval of documents. They have shown that their scheme provides keyword privacy and efficient and accurate retrieval of the top-k documents.
An ABE-based secure and efficient access control system [39] is developed for serverless security computing for resource and knowledge sharing. The data is first secured using user characteristics before being divided into ciphertext. Finally, it is decoded using a decryption method, and the ciphertext shares are spread across the network, while the encapsulated texts are kept in the serverless system. The authors indicated that the suggested method outperforms the current methods in terms of data security in a serverless system.
Verifiable PKE with keyword search is performed in [40]. The authors have considered that multiple users are accessing the encrypted cloud data through an inverted index. The Dual Embedding Space Model (DESM) is presented in [41]. The authors present a lightweight construction aimed at achieving accurate ranked search results. DESM index generation ensures that retrieval of the ranked results is efficient. The authors have proposed a ranked search mechanism enabling multi-keyword search using improved k-NN. They have solved the issue of index updates using dimension reduction in DESM.
A verifiable privacy-preserving MKSE scheme is proposed in [42] to provide verifiable search results to the cloud user. The scheme is based on the adaptive homomorphic MAC technique to enable verified search results. In this way, the ranked search results returned to the cloud user can be checked and incorrect search results can be detected. Using real-world datasets, the authors have performed security and performance analysis of their scheme.
The scheme in [43] puts forward threshold access control for cloud sharing based on groups of data users. A multi-user MKSE scheme is proposed in [44]. The authors have identified the flaws of the commonly used k-NN scheme and developed a new scheme to eradicate those flaws. Their scheme supports keywords in arbitrary languages and provides flexible authorization and time-controlled revocation of access for cloud users. Another feature provided by their scheme is data privacy protection.
The authors of [46] generated a multi-keyword vector to provide searching on encrypted data through probabilistic trapdoors. Their proposed scheme provides data privacy and trapdoor unlinkability and is resistant to indistinguishability attacks, providing enhanced security and functionality to the cloud user. The performance analysis carried out by the authors has shown that their scheme has unique search functionality advantages over other existing multi-keyword schemes.
The paper [47] presents a Single Keyword Searchable Encryption (SKSE) scheme implemented using the Paillier cryptosystem, with two variants, Secure SKSE and Efficient SKSE. The Secure SKSE scheme prioritizes security through probabilistic encryption and trapdoors, while the Efficient SKSE variant offers significant performance gains, being 84 times faster. The research evaluates both schemes on real-world aviation data in a use-case scenario of airport security, deployed on the Contabo public cloud platform, providing insights into their effectiveness in achieving security and performance objectives.
The authors introduced an approach to multi-keyword ranked searchable encryption (MRSE) in [45], addressing a crucial privacy concern. Unlike previous MRSE methods that solely perform complete keyword searches and ranking on the server side, MRSW allows users to include a wildcard keyword in their queries, enhancing search flexibility while maintaining data security through a Bloom filter-based approach. The proposed MRSW system is analyzed for security under the adaptive chosen-keyword attack (CKA2) model and demonstrates efficiency and practicality through experiments on real Web of Science data.
The article [48] addresses critical security concerns in the context of Public Key Encryption with Keyword Search (PEKS) for cloud data storage. It highlights vulnerabilities such as keyword guessing attacks, incorrect results from untrusted cloud servers, and the looming threat of quantum attacks. To counter these challenges, the paper introduces VR-PEKS, a novel ciphertext retrieval scheme based on fully homomorphic encryption (FHE). VR-PEKS not only enables verifiable searches but also mitigates risks through the use of an oblivious pseudorandom function to randomize keywords and FHE for encryption. Moreover, the article demonstrates the scheme's security and effectiveness, proving its resilience against adaptive keyword selection attacks.
Another scheme introduces the Verifiable SE Framework (VSEF) [49] as a foundational solution that can withstand insider keyword guessing attacks (KGA) and enable verifiable searches. Building upon this framework, the enhanced VSEF is presented, designed to support multi-keyword search, multi-key encryption, and dynamic data updates. The research underscores the importance of practicality and scalability in real-world SE applications. Extensive experiments with the Enron email dataset demonstrate that the enhanced VSEF achieves both high efficiency and robust resistance to insider KGA while ensuring the verifiability of search results.
The presented studies confirm that there is a need to address privacy-preserving goals such as search pattern and access pattern privacy, along with trapdoor unlinkability. The presented scheme is proposed to address these privacy concerns using probabilistic encryption. Table 1 presents the literature review in a summarized form and emphasizes that the MKSE scheme uses HE to provide security goals such as search pattern privacy, access pattern privacy, and trapdoor unlinkability. In comparison to other schemes, the MKSE scheme provides the most accurate results but does not support ranking since it is not index-based. The security goals presented in the table are briefly described below:
• Search pattern privacy is concerned with identifying keywords given an encrypted trapdoor. If the adversary can identify which keywords are being searched by the client, the search pattern is leaked.
• Access pattern privacy is concerned with the mapping of an encrypted trapdoor consisting of keywords to the set of documents that contain the respective keywords. If the adversary is unable to map a trapdoor to the set of documents, access pattern privacy is preserved.
• Trapdoor unlinkability is concerned with the link between a trapdoor and its corresponding keywords. The adversary should not be capable of deducing any link between a keyword and the trapdoor generated from it.

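III. PRELIMINARIES
A. DGHV
DGHV [50] is an FHE scheme over the integers. Its parameters are $\gamma$, the bit length of the integers in the public key; $\eta$, the bit length of the private key; $\rho$, the bit length of the noise; and $\tau$, the number of integers in the public key. The basic phases of DGHV [50] are described as follows:
1) KeyGen($1^\lambda$)
The public/private key pair is generated in this phase. For the private key $S_k$, a random prime integer $p$ of size $\eta$ bits is generated. For the public key $P_k$, a random odd integer $q_0$ is picked such that $q_0 \in [0, 2^\gamma/p)$. Then $x_0$ is calculated using $x_0 = q_0 \cdot p$. Further, a PRNG $f$ is initialized with a random seed value $se$. Using $f(se)$, a set of integers $\chi_i \in [0, 2^\gamma)$, $1 \le i \le \tau$, is generated, and the corrections $\delta_i$ are computed such that
$$\delta_i = (\chi_i \bmod p) + \xi_i \cdot p - r_i ,$$
where $r_i$ is a random $\rho$-bit noise term and $\xi_i$ is a random integer. The public key is set as
$$P_k = (se, x_0, \delta_1, \ldots, \delta_\tau).$$
2) Encrypt($P_k$, $m \in \{0, 1\}$)
Using the pseudorandom generator $f(se)$, the integers $\chi_i$ are recovered and $x_i$ is calculated such that
$$x_i = \chi_i - \delta_i , \quad 1 \le i \le \tau .$$
Then, for a random integer vector $b = (b_1, \ldots, b_\tau)$ and a random noise integer $r$, the ciphertext is calculated using:
$$c = \Bigl( m + 2r + 2 \sum_{i=1}^{\tau} b_i \cdot x_i \Bigr) \bmod x_0 .$$
3) Decrypt($S_k$, $c$)
Calculate the original message $m$ by taking modulus $p$ and then modulus 2, i.e.
$$m = (c \bmod p) \bmod 2 .$$
4) Evaluate($P_k$, $C$, $c_1, \ldots, c_t$)
Binary addition and multiplication are performed on $t$ ciphertexts using a circuit $C$ with $t$ input bits. Addition and multiplication gates are evaluated and the resulting integer is returned.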
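The four phases above can be illustrated with a small Python sketch. It is a toy under stated assumptions rather than the implementation used in this work: the parameter sizes are far too small to be secure, a fixed Mersenne prime stands in for the random $\eta$-bit prime, Python's `random.Random` plays the role of the PRNG $f$, and the coefficients $b_i$ are restricted to 0/1. The helper names (`keygen`, `encrypt`, `decrypt`, `he_add`, `he_mul`) are hypothetical.

```python
import random

# Toy sizes (assumptions for illustration only; not secure).
GAMMA, RHO, TAU = 256, 8, 10
P = 2**61 - 1  # fixed Mersenne prime standing in for the random eta-bit prime p


def keygen(rng):
    """Return (S_k, P_k) with P_k = (se, x0, [delta_1, ..., delta_tau])."""
    q0 = rng.randrange(1, 2**GAMMA // P, 2)        # random odd q0 in [0, 2^gamma / p)
    x0 = q0 * P
    se = rng.getrandbits(64)                       # seed of the PRNG f
    f = random.Random(se)
    deltas = []
    for _ in range(TAU):
        chi = f.randrange(2**GAMMA)                # chi_i produced by f(se)
        xi = rng.randrange(2**GAMMA // P)
        r = rng.randrange(-(2**RHO) + 1, 2**RHO)   # small noise r_i
        deltas.append(chi % P + xi * P - r)        # delta_i = (chi_i mod p) + xi_i*p - r_i
    return P, (se, x0, deltas)


def encrypt(pk, m, rng):
    """Encrypt one bit m under the compressed public key."""
    se, x0, deltas = pk
    f = random.Random(se)                          # re-derive chi_i from the seed
    xs = [f.randrange(2**GAMMA) - d for d in deltas]   # x_i = chi_i - delta_i
    r = rng.randrange(-(2**RHO) + 1, 2**RHO)
    b = [rng.randrange(2) for _ in xs]             # random 0/1 coefficients b_i
    return (m + 2 * r + 2 * sum(bi * xi for bi, xi in zip(b, xs))) % x0


def decrypt(sk, c):
    """m = (c mod p) mod 2, using the residue centred around zero."""
    r = c % sk
    if r > sk // 2:
        r -= sk
    return r % 2


def he_add(pk, c1, c2):
    """Addition gate: decrypts to the XOR of the two plaintext bits."""
    return (c1 + c2) % pk[1]


def he_mul(pk, c1, c2):
    """Multiplication gate: decrypts to the AND of the two plaintext bits."""
    return (c1 * c2) % pk[1]
```

Decryption stays correct only while the accumulated noise is smaller than $p/2$; with these toy sizes a single multiplication is well within that budget.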

IV. SYSTEM MODEL
A. NETWORK MODEL
A single data owner/data user (DO/DU) model with asymmetric SE is considered for ease of implementation and understanding. The single DO/DU is synonymous with the client in the client-server model, with the cloud server (CS) as the server. DGHV over the integers [50] is used for encrypting documents/files and search queries. The scheme is designed for a serverless cloud environment such that, after the initial outsourcing of encrypted data, the function is triggered only when a search query is generated.
There are six phases in this system model, shown in Figure 1.
The DO generates the public and private key pair. It is assumed that the public key of the DO is not broadcast and is used for the encryption of files/documents and trapdoors. These can only be decrypted using the private key of the DO, hence providing data confidentiality. After the key pair is generated, the DO encrypts the files/documents with AES using a secret key. The encrypted files/documents are then uploaded to the CS. The CS allows the DO to upload encrypted data at the DO's discretion. The files/documents of the DO are pre-processed before performing HE, i.e. distinct words are picked and then stored in the corresponding file. The chosen words are hashed, homomorphically encrypted, and then uploaded to the cloud. Once the DO has uploaded the files to the CS, the DO can perform SE using encrypted trapdoors. The DO is able to issue a multi-keyword trapdoor to perform a search on the encrypted files stored on the CS. The trapdoor is encrypted and then sent to the CS.
When the CS receives the trapdoor, it performs homomorphic operations on the stored files and the trapdoor to obtain the encrypted resultant vector. The CS sends this encrypted resultant vector to the DO. The DO uses the private key to decrypt the encrypted resultant vector and obtain the required encrypted file names. The DO then downloads these encrypted files and decrypts them with the secret key to obtain the search results.

B. THREAT MODEL
The threat model assumes a passive adversary, whereby the CS is honest-but-curious and uses every piece of information it can learn to make inferences, deduce results, and exploit such information for its own benefit. The CS has trivial information related to the data stored on the cloud, such as data size and description. The CS also has the search queries and their corresponding documents/files [19], [51]. The CS can use the available information to undermine data security and obtain sensitive information.

D. CORRECTNESS
The correctness of an MKSE scheme can be verified if, for the security parameter $\lambda$, the key pair $(S_k, P_k)$ generated by Setup($1^\lambda$), and the encrypted documents $C_{AES}$ output by DocEnc($P_k$, $D$, $N$), the search using the trapdoor $T$ returns the corresponding keywords $kw$ present in the documents. The correctness of the MKSE scheme can be achieved if the following holds true:
a. For $kw \in D$;

E. SOUNDNESS
The soundness of an MKSE scheme can be verified if, for the security parameter $\lambda$, the public-private key pair $(S_k, P_k)$ generated by Setup($1^\lambda$), and the encrypted documents $C_{AES}$ output by DocEnc($P_k$, $D$, $N$), the search outcome based on the trapdoor $T$ returns accurate results, i.e. the resultant response $R$ does not contain false positives. The soundness of the MKSE scheme can be achieved if the following holds true:
a. For $kw \in D$;

V. SECURITY DEFINITIONS
The security definitions for the MKSE scheme are discussed as follows:

A. KEYWORD-TRAPDOOR INDISTINGUISHABILITY
Let MKSE = (Setup, DocEnc, DWE, TrapGen, MultiKS, Decrypt) be an MKSE scheme over a set of documents $D$, with security parameter $\lambda$ and a polynomial-time adversary $A = \{A_1, A_2, \ldots, A_N\}$ where $N \in \mathbb{N}$. Consider the following experiment, where the state of the adversary $A$ is represented by $S_A$. Keyword-Trapdoor Indistinguishability is achieved by the scheme if the following holds true:

1) DESCRIPTION
The experiment comprises three phases. It begins with the challenger generating a set of encrypted documents.
• Initial Phase: The challenger receives a query $Q$ from the adversary $A$ and returns the corresponding trapdoor $T$. This continues until the adversary has collected a set of queries with their corresponding trapdoors.
• Challenge Phase: During this phase, the adversary $A$ is asked to select and submit two queries $Q_0$ and $Q_1$ of its choice. The challenger generates the corresponding trapdoors $T_0$ and $T_1$. The challenger then tosses a fair coin $C_{toss} \in \{0, 1\}$ and, based on the result, sends the corresponding trapdoors to the adversary $A$.
• Outcome Phase: The adversary $A$ receives the two trapdoors $T_0$ and $T_1$ and has to guess correctly which query $Q \in \{Q_0, Q_1\}$ each trapdoor belongs to. If the adversary is able to distinguish correctly, it wins the game. If it cannot do so with a probability meaningfully better than a random guess, MKSE provides Keyword-Trapdoor Indistinguishability.
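In the usual game-based style, the winning condition of this experiment can be written as a bound on the adversary's advantage. The form below is the standard one and is an assumption about what the definition intends; the guess $b'$, the advantage symbol $\mathrm{Adv}$, and the negligible function $\mathrm{negl}$ are notation introduced here for illustration.

```latex
\mathrm{Adv}^{\mathrm{KT}}_{A}(\lambda)
  = \left| \Pr\left[\, b' = C_{toss} \,\right] - \frac{1}{2} \right|
  \le \mathrm{negl}(\lambda)
```

Here $b'$ is the adversary's guess in the Outcome Phase. An adversary that learns nothing from the trapdoors guesses correctly with probability $1/2$, so its advantage is zero.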

B. TRAPDOOR-DOCUMENT INDISTINGUISHABILITY
Let MKSE = (Setup, DocEnc, DWE, TrapGen, MultiKS, Decrypt) be a scheme over a set of $N$ documents, i.e. $D = \{D_1, D_2, D_3, \ldots, D_N\}$, with security parameter $\lambda$, maximum number of keywords $M$, and an adversary $A = \{A_1, A_2, \ldots, A_N\}$ where $N \in \mathbb{N}$. Consider the following experiment, where $S_A$ denotes the adversary's state. Trapdoor-Document Indistinguishability is achieved by the scheme if the following holds true:
The experiment comprises three phases, where the challenger initiates by generating the set of queries.
• Initial Phase: The adversary $A$ sends a set of queries $Q_s$ to the challenger, which generates the trapdoors $T$ and returns the corresponding encrypted documents, until the adversary has a set of trapdoors with corresponding documents.
• Challenge Phase: The adversary $A$ is asked to choose two queries, i.e. $Q = \{Q_0, Q_1\}$, such that:
a. $Q_0$ and $Q_1$ must be unique and distinct;
b. $Q_0$ and $Q_1$ must be present in unique documents, i.e. $D_1$ and $D_2$;
c. $Q_0$ and $Q_1$ must be present in the documents' set.
The challenger generates the trapdoors for the queries in the set $Q$ and returns the corresponding encrypted documents $D_1$ and $D_2$.
• Outcome Phase: The adversary $A$ is then asked to match the trapdoors to the corresponding documents. If the adversary $A$ guesses correctly, it wins; otherwise, it loses. Since both trapdoors and documents are probabilistically encrypted, the adversary can do no better than guess, so the probability of winning the game is 50%.

C. SEARCH PATTERN PRIVACY
The proposed MKSE scheme is based on probabilistic trapdoors, which ensure that the scheme provides search pattern privacy. Search pattern privacy concerns whether the adversary $A$ is able to identify that two searches were conducted with the same keyword. Since the scheme relies on probabilistic trapdoors, the trapdoor generated for a particular keyword differs each time it is generated. This makes it difficult for an adversary to track and identify keywords that have been searched multiple times.

D. ACCESS PATTERN PRIVACY
Access pattern privacy is preserved if the adversary is not able to identify that, for example, the trapdoor of a query $Q_1$ corresponds to documents $D_1$ and $D_3$. If the adversary cannot determine the documents corresponding to a given trapdoor, access pattern privacy is preserved. The MKSE scheme relies on probabilistic trapdoors; hence the encrypted trapdoors will be different for the same keyword across all queries.

VI. PROPOSED WORK
This section discusses the proposed MKSE scheme.

1) Setup:
Algorithm 1 Setup
Output: private key (HE) $S_k$, public key $P_k$, secret key (AES) $K_p$
begin
    choose a random prime $\eta$-bit integer $p$
    choose a random odd $q_0 \in [0, 2^\gamma/p)$ and let $x_0 = q_0 \cdot p$
    generate the set of integers $\delta_1, \ldots, \delta_\tau$
    return $S_k = p$, $P_k = (se, x_0, \delta_1, \ldots, \delta_\tau)$, $K_p$
end

2) Document Encryption:
This algorithm takes as input $K_p$, $D$, and $N$. It performs AES encryption and outputs $C_{AES}$. This algorithm is executed at the client end.

3) Distinct Words Encryption:
The distinct words $W_i$ are picked from the plaintext documents and then hashed using the SHA3 algorithm to optimize the number of homomorphic operations performed in the search phase. The hashes are then encrypted using the DGHV encryption function. This is a probabilistic algorithm that performs fully HE on the set of documents to generate an encrypted set of documents $C_{HE}$. It is performed at the DO end. It takes as input the public key $P_k$ to encrypt the set of documents $D$.
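The pre-processing part of this step (picking distinct words and hashing them) can be sketched with the Python standard library. The tokenization rule below is an assumption, since the text does not state how distinct words are picked, and the choice of SHA3-224 follows the variant named for trapdoor generation. The function names `distinct_words` and `hash_bits` are hypothetical; the resulting bits are what would be passed to the DGHV encryption function.

```python
import hashlib
import re


def distinct_words(text):
    """Return the sorted list of distinct lower-cased words in a document."""
    return sorted(set(re.findall(r"[a-z0-9']+", text.lower())))


def hash_bits(word):
    """SHA3-224 digest of a word as a list of 224 bits, most significant bit first."""
    digest = hashlib.sha3_224(word.encode("utf-8")).digest()
    return [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]


# Example document (made up for illustration).
doc = "The patient reports fever and the fever persists"
words = distinct_words(doc)
hashed = {w: hash_bits(w) for w in words}  # input to bitwise homomorphic encryption
```

Hashing gives every word the same fixed length, so the number of homomorphic operations per comparison does not depend on the length of the word.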

4) Trapdoor Generation:
Trapdoor generation is a probabilistic algorithm that generates trapdoors for the DO. The multi-keyword query $Q$, i.e. $Q = \{kw_1, kw_2, \ldots, kw_n\}$, provided by the DO is first hashed using the SHA3-224 algorithm. DGHV fully HE is then used to encrypt the hashed multi-keyword query. It uses the public key to encrypt the trapdoor.
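The probabilistic nature of the trapdoor can be seen in a short sketch. For brevity it assumes a simplified symmetric variant of integer-based encryption, $c = m + 2r + p \cdot q$, as a stand-in for the public-key DGHV encryption described above; the parameter sizes are toy values and the names `enc_bit`, `trapdoor`, and `open_trapdoor` are hypothetical.

```python
import hashlib
import random

ETA, GAMMA, RHO = 64, 128, 8   # toy sizes, assumptions for illustration only
rng = random.Random()


def keygen():
    """Odd ETA-bit secret p (the scheme in the text uses a random prime)."""
    return rng.getrandbits(ETA) | (1 << (ETA - 1)) | 1


def enc_bit(p, m):
    """c = m + 2r + p*q with fresh randomness r and q on every call."""
    return m + 2 * rng.randrange(2**RHO) + p * rng.randrange(1, 2**(GAMMA - ETA))


def dec_bit(p, c):
    return (c % p) % 2


def hash_bits(keyword):
    digest = hashlib.sha3_224(keyword.encode("utf-8")).digest()
    return [(byte >> s) & 1 for byte in digest for s in range(7, -1, -1)]


def trapdoor(p, query):
    """Encrypt the SHA3-224 bits of every keyword of a multi-keyword query."""
    return [[enc_bit(p, b) for b in hash_bits(kw)] for kw in query]


def open_trapdoor(p, t):
    """Decrypt a trapdoor back to the hashed keyword bits (key holder only)."""
    return [[dec_bit(p, c) for c in kw] for kw in t]


p = keygen()
query = ["fever", "cough"]
t1 = trapdoor(p, query)
t2 = trapdoor(p, query)   # same query, fresh randomness
```

Both `t1` and `t2` decrypt to the same hashed keywords, yet their ciphertexts differ, which is the property the search pattern privacy argument relies on.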

5) Multi-keyword Search:
Using the multi-keyword trapdoor $T$ and the set of encrypted documents $C_{HE}$, multi-keyword search is performed. The ciphertexts $T_i$ and $C_{HE_i}$ are homomorphically subtracted from each other, and the result is saved in resultByte. This result is then homomorphically added by operator overloading. The resulting $K_c$ contains an encryption of either 0 or 1, i.e. whether the keyword is present or not. The flag $flag_{kw}$ is initialized with a set of encrypted ones. The size of $flag_{kw}$ is equal to the size of the trapdoor $T$, and for every keyword $kw$ in the trapdoor $T$, the flag is set to 1, i.e. keyword not found. $flag_{kw}$ is then homomorphically multiplied with $K_c$ by operator overloading. The final result represents whether the keywords present in the multi-keyword trapdoor $T$ are found in the set of encrypted documents $C_{HE}$.

6) Decrypt (Algorithm 6):
The resultant vector is a set of encrypted document names that contain the keywords provided by the DO. The resultant vector is decrypted using the private key (HE) $S_k$. The retrieved documents are then decrypted using $K_p$. These are the documents that encompass all the keywords included in the multi-keyword trapdoor $T$. The DO has the option to download and decrypt any of the individual documents/files at their convenience.

VII. SECURITY ANALYSIS
The MKSE scheme provides probabilistic trapdoors and documents, preserving the privacy of the client. Some trivial information is leaked during the communication between the DO and the CS. These leakages, $L_1$, $L_2$, $L_3$, and $L_4$, are described below:
• Leakage L 1 : The leakage L 1 is related to the trapdoor generation phase.Trapdoor generated in MKSE is unique for every multi-keyword query.It is assumed that every entity i.e. cloud server CS, adversary/opponent A has passive access to the trapdoor generated by the data owner/user.Leakage L 1 is defined as: Since the trapdoor is generated with probabilistic encryption, even if the adversary gets access to the encrypted trapdoor, it cannot deduce the corresponding keyword just by looking at the encrypted trapdoor.MKSE scheme provides trapdoor privacy i.e. change in one bit of the plaintext (query Q) will generate a completely different ciphertext (trapdoor T ).
• Leakage L 2 : The leakage L 2 is associated with the search result returned by the cloud server.The cloud server and any adversary/opponent A have access to the search results.Leakage L 2 is defined as: flags found represent the encrypted vector returned by the cloud server to the data owner/user.It contains encrypted file names that had the multi-keyword trapdoor present in them.Because the flags found is encrypted, the adversary can collect multiple responses, and try to guess which file names correspond to which trapdoor.But even with all of the accumulated information, the adversary will not be able to guess the correct relationship between the trapdoor and the set of documents since both are generated using probabilistic encryption, hence the ciphertext is different for the same plaintext every time it is encrypted.
• Leakage L_3: The leakage L_3 is associated with the trapdoor T and the number of keywords present in it.
• Leakage L_4: The leakage L_4 is associated with the size of the encrypted documents/files that are uploaded on the CS. Homomorphically encrypted and standard encrypted documents/files are stored on the cloud. The CS is aware of the documents/files being uploaded by the data owner DO. It also has knowledge of the storage resources consumed by the DO. Leakage L_4 is defined as:

Considering all the leakages, the scheme provides data privacy to the DO. The leakages are insignificant in terms of the content that is leaked to an adversary. Since trapdoors and files/documents are encrypted, the information gained by the adversary is inconsequential.
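The role of probabilistic encryption in the leakage discussion can be illustrated with a toy version of the symmetric DGHV scheme over the integers. This is a minimal sketch with deliberately tiny, insecure parameters; it is not the CNT variant or the DGHVlib API used in the paper, and the function names and parameter sizes are assumptions chosen for illustration.

```python
import secrets

def keygen(bits=64):
    """Return a random odd secret integer p (toy size, NOT secure)."""
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def encrypt(p, m, noise_bits=8, q_bits=96):
    """Encrypt one bit m as c = p*q + 2*r + m with fresh random q and r."""
    q = secrets.randbits(q_bits)
    r = secrets.randbits(noise_bits)
    return p * q + 2 * r + m

def decrypt(p, c):
    """Recover the bit as (c mod p) mod 2; valid while the noise stays below p."""
    return (c % p) % 2

p = keygen()

# Two encryptions of the same bit give different ciphertexts.
c1, c2 = encrypt(p, 1), encrypt(p, 1)
same_plaintext_different_ciphertext = (c1 != c2)

# Ciphertext addition acts as XOR, multiplication as AND, on the hidden bits.
xor_bit = decrypt(p, encrypt(p, 0) + encrypt(p, 1))
and_bit = decrypt(p, encrypt(p, 1) * encrypt(p, 0))
```

Because q and r are drawn fresh for every encryption, an observer who sees two ciphertexts cannot tell from their values alone whether they hide the same bit, which is the property the trapdoor and search-result arguments rely on.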

VIII. IMPLEMENTATION
The library used for DGHV homomorphic operations is DGHVlib-v1.1 [52], [53]. It includes several implementations of DGHV, namely DGHV itself [50], CMNT [54], and CNT [55]. CNT [55] is chosen for the DGHV implementation as it provides public-key compression and modulus switching. The MKSE scheme relies on hashing before HE to enable secure search. For that reason, multiple hashing schemes are discussed in the next section to analyze their security.

A. COMPARATIVE ANALYSIS OF HASHING ALGORITHMS
Various secure hash algorithms are compared to determine the most efficient and secure hashing algorithm for the MKSE scheme. A brief comparison based on block size, output size, number of rounds, and cryptographic weaknesses is given in Table 4. The SHA-3 hashing algorithm is chosen for the MKSE scheme as it is secure against collision attacks and length extension attacks, as seen in Table 4.

B. DATASET DESCRIPTION
The dataset used in this research is the Switchboard-1 Telephone Speech Corpus (LDC97S62) [56], which was collected in 1990-91 under DARPA sponsorship. The dataset consists of around 2400 spontaneous conversations with an average length of 6 minutes. It covers around 240 hours of recorded speech in every major dialect of US English. It contains over 120,000 unique keywords. Over 1000 files are used for the demonstration of the multi-keyword SE (MKSE) scheme, with an average file size of 5.2 KB.

C. SYSTEM SPECIFICATIONS
The system specifications for the public cloud server CS (Contabo has been used in this research) and the client side are given in Table 5.

IX. PERFORMANCE ANALYSIS
The MKSE scheme is implemented in C++ using the DGHV library [52], and the results are plotted using the "matplotlib" library in Python 3.10. In this section, we first discuss the computational overhead caused by each phase. Then, we discuss the network latency in the communication overhead subsection. Finally, the storage consumed by plaintext data is compared with that consumed by encrypted data.

A. COMPUTATIONAL OVERHEAD
The computational overhead is discussed in terms of the phases of the MKSE scheme in Table 6, along with a comparison with some other SE schemes. The asymptotic notation is given for every algorithm. The time complexity of the Setup phase is O(1), as it only generates key pairs. The standard encryption algorithm takes the total number of documents and encrypts them using AES. The trapdoor generation algorithm depends on b, the number of keywords in the query Q. The multi-keyword search algorithm takes as input the number of encrypted files N, the maximum number of distinct words in a file/document m, and the number of keywords b in the query. The time complexity of the search algorithm is given by O(Nbm). The decryption algorithm depends on the total number of encrypted documents N; hence its time complexity is O(N).
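To make the O(Nbm) search cost concrete, the following sketch counts the keyword-versus-word comparisons implied by that bound. The value used for m is a made-up placeholder, since the maximum number of distinct words per file is not reported here; only N = 1000 and queries of two to four keywords come from the experiments. The function name is hypothetical.

```python
def search_comparisons(n_docs, n_keywords, max_distinct_words):
    """Upper bound on keyword-vs-word comparisons: N * b * m."""
    return n_docs * n_keywords * max_distinct_words

N = 1000          # files in the dataset (from the paper)
M_ASSUMED = 500   # hypothetical distinct words per file, not a reported value

# Comparison counts for queries with two, three, and four keywords.
counts = {b: search_comparisons(N, b, M_ASSUMED) for b in (2, 3, 4)}
```

For fixed N and m the count grows linearly in b, so each additional keyword adds the same amount of work under this bound.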

1) DOCUMENT ENCRYPTION
The files/documents were encrypted via AES in approximately 6 seconds for a dataset of 1000 files as illustrated in Figure 2.

2) DISTINCT WORDS ENCRYPTION
The files are pre-processed so that the unique words are extracted from each file and then hashed using the SHA-3 algorithm. These hashes are then encrypted using DGHV-HE so that the search can be performed in phase 5. The time taken for the files to go through all of these steps is shown in Figure 3, which shows that the time taken by this algorithm increases linearly with the number of files.
For encrypting the 1000 files in the dataset, the DGHV encryption takes approximately 120 minutes.
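The pre-processing stage of this phase (distinct-word extraction followed by SHA-3 hashing) could look roughly like the following. This is a minimal sketch, not the authors' C++ code: the tokenization rule, the lowercase normalization, the choice of SHA3-256 as the SHA-3 variant, and all function names are assumptions, and the subsequent DGHV encryption of the hashes is omitted.

```python
import hashlib
import re

def distinct_words(text):
    """Return the sorted list of distinct lowercase alphabetic tokens."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))

def hash_word(word):
    """Hash a keyword with SHA3-256 and return the digest as an integer."""
    digest = hashlib.sha3_256(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def preprocess(text):
    """Map each distinct word of a document to its SHA-3 hash value."""
    return {word: hash_word(word) for word in distinct_words(text)}

# Hypothetical document content, for illustration only.
index = preprocess("Hello world, hello again")
```

Hashing is deterministic, so the same keyword always maps to the same value before encryption; it is the later homomorphic encryption step that makes the stored words and trapdoors probabilistic.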

3) TRAPDOOR GENERATION
The time taken to generate multi-keyword trapdoors is shown in Figure 4. Trapdoors are generated for queries of two to ten keywords. The graph shows that it takes 0.40 seconds to generate a 10-word trapdoor. The plaintext query is 'one two' for two keywords, 'one two three' for three keywords, and so on.

4) MULTI-KEYWORD SEARCH
The MKSE algorithm is executed for multi-keyword trapdoors containing two, three, and four keywords, as illustrated in Figure 5. The search is performed over the public cloud "Contabo". The plaintext query containing two keywords is 'hello world'. For three keywords, the plaintext query used to search the dataset is 'I Am Here', and similarly, for four keywords, the plaintext query is 'I Am Here Now'. It is worth noting that as the number of keywords to be searched for increases, the search time also increases. However, the search time grows linearly across these trapdoors.

5) DECRYPTION
The file/document decryption time is presented in Figure 6 which illustrates that the decryption algorithm takes 6 seconds to decrypt a dataset of 1000 files.The client may download a subset of these files depending on the query, which will take a negligible time.

B. COMMUNICATION OVERHEAD
The communication overhead is computed from the time taken by the client to send a trapdoor to the cloud. The cloud performs the multi-keyword search and sends the results to the client end. This time is illustrated in Figure 7 as a comparison between the search time with and without latency. The difference between the two plotted lines is minor, which shows that the latency is insignificant in comparison to the search time.

C. STORAGE OVERHEAD
This section discusses and analyzes the storage overhead of the MKSE scheme at the client and cloud server ends. The storage consumption at the cloud server consists of AES-encrypted files/documents (6.9 MB) and homomorphically encrypted (DWE) files/documents (24.6 GB).
At the client end, storage is consumed by the DGHV public-private key pair (143.2 kB), security parameters (220 bytes), and the AES secret key (64 bytes). Table 7 presents the size of the dataset before and after encryption. The increase in the dataset size from MBs to GBs emphasizes the need for a cloud platform for data storage services.
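The scale of this overhead can be checked with a few lines of arithmetic on the figures quoted above. This is a rough sketch that assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) and a round count of 1000 files; both are assumptions, and the variable names are illustrative.

```python
MB = 10**6  # assumed decimal megabyte
GB = 10**9  # assumed decimal gigabyte

aes_bytes = 6.9 * MB    # AES-encrypted documents on the cloud server
he_bytes = 24.6 * GB    # homomorphically encrypted (DWE) documents
n_files = 1000          # assumed round file count

# Ratio of homomorphically encrypted storage to AES-encrypted storage.
expansion = he_bytes / aes_bytes

# Average homomorphically encrypted size per file, in MB.
per_file_he_mb = he_bytes / n_files / MB

# Total client-side storage: key pair + security parameters + AES key.
client_bytes = 143.2e3 + 220 + 64
```

Under these assumptions the homomorphically encrypted data is a few thousand times larger than the AES-encrypted data, while the client keeps well under 1 MB of key material.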

X. CONCLUSION AND FUTURE WORK
This study presents a privacy-preserving HE-based solution that enables MKSE in a serverless cloud computing environment. The security and privacy issues of cloud data are addressed by utilizing the probabilistic encryption of DGHV. This research addresses search through probabilistic trapdoors over probabilistically encrypted documents, which preserves the privacy of the search and access patterns. The security analysis of the MKSE scheme presents the possible leakages in the proposed scheme. Further, the performance analysis of the MKSE scheme is discussed in terms of the computational, communication, and storage overheads incurred during the execution of the scheme. The computational overhead is discussed in terms of the time complexity of every algorithm in the proposed MKSE scheme. The results show that the MKSE scheme is practical and provides a higher level of security in terms of privacy goals. It also gives the data owner/user the option to provide multiple words in the query to find the desired files/documents. The MKSE scheme is tested on a real-world dataset by deploying it on the public cloud "Contabo" to analyze its performance and security. The scheme provides search pattern as well as access pattern privacy while providing the data owner/user with accurate results. In the future, we plan to introduce parallel processing to improve performance.

Algorithm 2 DocEnc
Input: Secret key (AES) K_p, set of documents D, number of documents N
Output: AES-encrypted documents C_AES
for i ← 0 to N do
    C_AES_i = Encrypt_AES(D_i, K_p);
end
Return C_AES

Algorithm 3 DWE
Input: Public key P_k, set of documents D, number of documents N
Output: Set of homomorphically encrypted documents C_HE
while

Algorithm 5 MultiKS
Input: Encrypted multi-keyword query T, set of encrypted documents C_HE, public key P_k, number of documents N
Output: Encrypted response R
for i ← 0 to b (b is the number of keywords in T) do
    flag_kw = Encrypt_HE(One, P_k);
end
Return flag_kw
for i ← 0 to N do
    for j ← 0 to b do
        initialize K_c ← Encrypt_HE(Zero)
        resultByte = Subtract_HE(T[j], C_HE[i]);
        K_c = add_HE(K_c, resultByte);
    end
    flag_kw[i] = multiply_HE(K_c, flag_kw[i])
    for k ← 0 to b (b is the number of keywords in T) do
        flags_found += flag_kw;
    end
    R ← flags_found
end
Return R

6) Decryption:
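The conjunctive matching logic of the search phase can be modeled on plaintext values, which may help in reading Algorithm 5. This is only a sketch of the intended semantics: integers stand in for hashed keywords, ordinary subtraction and multiplication stand in for their homomorphic counterparts, and the function name and sample data are hypothetical. How the implementation turns an encrypted difference into an encrypted 0/1 bit is not modeled here.

```python
def multi_keyword_search(trapdoor, documents):
    """Plaintext model: return one 0/1 flag per document (1 = all keywords found)."""
    flags = []
    for doc_words in documents:
        flag = 1  # starts at one, mirroring the initialization of flag_kw
        for kw in trapdoor:
            # A zero difference means the keyword equals a word in the document.
            present = int(any(kw - word == 0 for word in doc_words))
            flag *= present  # multiplication acts as a logical AND
        flags.append(flag)
    return flags

# Hypothetical data: each set holds the hashed distinct words of one document.
docs = [{11, 22, 33}, {11, 44}, {22, 33}]
result = multi_keyword_search([11, 22], docs)
```

Only the first document contains both query values, so it is the only one flagged; in the scheme itself this flag vector stays encrypted until the client decrypts it with S_k.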

TABLE 1 .
Comparative analysis of multi-keyword searching schemes.
1) Setup (1^λ) → (S_k, P_k, K_p): A probabilistic algorithm that generates a private HE key S_k, a secret AES key K_p, and a public HE key P_k on the client end. It takes a security parameter λ as input and outputs S_k, P_k, and K_p to enable encryption and decryption on the client side.
2) DocEnc (K_p, D, N) → C_AES: Document encryption takes as input K_p, the set of documents D, and the number of documents N, and outputs the corresponding encrypted documents C_AES. The encryption algorithm is executed at the client end.
3) DWE (P_k, D, N) → C_HE: Words encryption is a probabilistic algorithm, executed at the client end, that takes as input P_k, D, and the number of documents N, and outputs the corresponding encrypted documents C_HE.
4) TrapGen (Q, P_k) → T: Trapdoor generation takes as input the multi-keyword query Q, consisting of keywords kw, and the public key P_k to generate T. This algorithm is run at the client end to search over a set of encrypted data.
5) MultiKS (T, C_HE, N) → R: The multi-keyword search algorithm takes as input the encrypted multi-keyword query T, the set of encrypted documents C_HE, and the number of documents N to perform the search, and returns the encrypted response R to the client. This algorithm is executed at the cloud server CS.
6) Dec (K_p, S_k, C_AES, R, N) → F: This phase takes place at the user's end. The decryption algorithm takes the secret key (AES) K_p, the private key (HE) S_k, the set of encrypted documents C_AES, the encrypted vector R returned to the client, and the number of documents N, and outputs the decrypted set of documents F.

Table 2
k, S_k, and K_p.
2) Document Encryption: Documents/files of the DO are encrypted using the secret key K_p generated in the Setup phase. Using standard AES encryption, the documents/files of the DO are encrypted and uploaded on the CS. The document encryption
Algorithm 1 Setup
Input: Security parameter λ
Output: Private key (HE) S_k, Public key (HE) P_k,

TABLE 3 .
Comparative analysis of hashing algorithms.

TABLE 4 .
Comparative analysis of hashing algorithms.

TABLE 6 .
Comparison of computational complexity.

TABLE 7 .
Data size before & after HE.