Using Granule to Search Privacy Preserving Voice in Home IoT Systems

Home IoT Voice Systems (HIVS) such as Amazon Alexa or Apple Siri provide voice-based interfaces that let people carry out search tasks by voice. However, protecting privacy remains a major challenge. This paper proposes a novel personalized, privacy-preserving search scheme over encrypted voice based on granular computing. Firstly, Mel-Frequency Cepstral Coefficients (MFCC) are used to extract voice features. These features are obfuscated by an obfuscation function to protect them from being disclosed to the server. Secondly, a series of definitions are presented, including fuzzy granule, fuzzy granule vector, ciphertext granule, operators and metrics. Thirdly, AES is used to encrypt voices. A searchable encrypted voice scheme is designed by creating the fuzzy granules of the obfuscated voice features and the ciphertext granules of the voices. Experiments are conducted on a corpus including English, Chinese and Arabic. The results show the feasibility and good performance of the proposed scheme.


I. INTRODUCTION
Voice-activated assistants such as Amazon Alexa, Apple Siri, Google Assistant and Microsoft Cortana were used on over 2 billion smartphones in 2018. Moreover, as the demand for smart home devices continues to grow, voice interaction devices such as Amazon Echo, Apple HomePod and Google Home are also widely deployed. While people enjoy using these devices, personal privacy may be revealed if the data is stored on the cloud server in plaintext. Therefore, data owners tend to encrypt the data and then outsource the ciphertext to the cloud server. However, as the data volume and the number of users grow, cloud servers may become the performance bottleneck of cloud services. This results in long waiting times and seriously affects the user's search experience. Hence, quickly obtaining search results from vast amounts of ciphertext is a challenge for personalized search technology.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiaochun Cheng .

A. CIPHERTEXT SEARCH SCHEMES
Existing ciphertext search schemes can be classified into the searchable symmetric encryption (SSE) framework and the public key encryption with keyword search (PEKS) framework. According to their technical details and inherent nature, SSE schemes are further divided into sequential scan schemes and secure index schemes. Song et al. [1] first proposed an SSE scheme based on sequential scanning, which splits the plaintext into ''words'' and then encrypts them. When the user submits a search request, the server sequentially scans the ciphertext, compares each ciphertext word with the keyword to be retrieved, and returns the ciphertext files containing the keyword. The scheme supports searching for any word in a file. However, its efficiency is extremely low because the server has to traverse the entire file during the search. Moreover, the scheme cannot resist frequency analysis attacks on the ciphertext.
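To make the cost of sequential scanning concrete, the following is a deliberately simplified sketch and not the construction of Song et al. [1]: each word is replaced by a keyed tag, and the server compares the query tag against every stored tag. The function names (`word_tag`, `encrypt_file`, `sequential_scan`), the use of HMAC-SHA256 and the sample files are assumptions made for illustration.

```python
import hashlib
import hmac

def word_tag(key: bytes, word: str) -> bytes:
    """Deterministic keyed tag for one word (stand-in for a word ciphertext)."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).digest()

def encrypt_file(key: bytes, text: str) -> list:
    """Split the plaintext into words and tag each word separately."""
    return [word_tag(key, w) for w in text.split()]

def sequential_scan(store: dict, query_tag: bytes) -> list:
    """Server side: visit the tags of every file, so cost grows with total size."""
    hits = []
    for name, tags in store.items():
        if any(hmac.compare_digest(t, query_tag) for t in tags):
            hits.append(name)
    return sorted(hits)

key = b"demo-key-known-only-to-the-user"  # hypothetical user key
store = {
    "a.txt": encrypt_file(key, "what holiday is today"),
    "b.txt": encrypt_file(key, "play some music"),
}
matches = sequential_scan(store, word_tag(key, "holiday"))
```

Because equal words always map to equal tags in this toy layout, an observer can count repeated tags, which is the kind of frequency analysis mentioned above.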
Goh [2] proposed an improved SSE scheme based on a secure forward index. The server matches the secure index of each file against the keywords, so the user's keyword search can be supported with high efficiency. However, the search results are not completely correct because the Bloom filter used to build the index has a nonzero false-positive probability. This may bring some additional bandwidth and computation overhead to users. The scheme achieves indistinguishability against chosen keyword attacks. In contrast, the scheme presented by Chang et al. [3] avoids the false positives of Goh's scheme. Furthermore, by adopting the inverted index construction method, its resistance to chosen keyword attacks is stronger than that of Goh's scheme: it can resist adaptive chosen keyword attacks. Curtmola et al. [4] further improved and clearly defined the security of SSE schemes, proposing the SSE-1 and SSE-2 solutions to achieve indistinguishable security under the non-adaptive and adaptive models, respectively. On the PEKS side, Boneh et al. [5] proposed the PEKS scheme and presented several constructions based on bilinear pairings. Abdalla et al. [6] further gave a complete definition of PEKS and presented a process for constructing PEKS from anonymous identity-based encryption. In [7]-[9], researchers designed PEKS schemes that do not require a secure channel, under the random oracle model and the standard model.
VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
Subsequently, some improved searchable encryption schemes were proposed for various scenarios that promoted the development of searchable encryption technology [10]- [16], [20]. Zhao et al. [17] combined content filtering and collaborative filtering to provide users with personalized search results. Experimental results showed that the method can provide accurate search results and improve the user's search experience. Leung et al. [18] obtained the user's interest preference by mining the user's click data, and introduced the user's location information, and adopted entropy to balance the weight between the user's preference and the location information. This method improved the search accuracy and promoted the user's search experience.
However, it is still a challenging task to achieve personalized search in a ciphertext environment and improve the user's search experience. Fu et al. [19] constructed user models based on the user's search history and integrated the users' interests into the query keywords through keyword priorities derived from WordNet. They then searched the ciphertext stored on the cloud server and returned the top K search results with the highest relevance scores for the user, achieving personalized search in the ciphertext environment. The searchable encryption schemes described above all work from the perspective of numerical calculation.
In summary, the schemes mentioned above did not pay much attention to the hierarchy of data. To improve the performance, we will design a new scheme based on the hierarchy of data from the granular computing viewpoint.

B. GRANULAR COMPUTING
Information granules are ubiquitous around us and are a basic concept through which humans come to know the world. Humans tend to group similar things together as a whole in order to study their nature or characteristics. This way of dealing with things is information granulation, and the ''whole'' being studied is called an information granule. In granular computing, the information granule is used as the basic unit of operation instead of the sample, and the exact solution is replaced by an approximate solution, which makes it possible to design high-performance algorithms.
As a methodology, granular computing aims to effectively establish an external world-based, user-centric concept that simplifies the understanding of the physical world and the virtual world. In the process of solving the problem, the ''granule'' with the appropriate level of granularity is used as the processing object, so as to improve the efficiency of solving the problem under the premise of ensuring satisfactory solution. Since Zadeh published the first paper on information granularity in 1979, researchers have made in-depth research on granular computing theory and models, and combined them with computational intelligence and machine learning techniques. A lot of research results have been achieved.
The appropriate granularity is often determined by the problem itself and its context, which is important for designing data processing framework based on granular computing. For example, someone asked his or her friend, ''When did you return home in China?''. The time granularity chosen to answer this question is actually determined by how long his or her friend has been back to China. If it was not more than one day, then the answer could be ''Yesterday afternoon''. If it was more than one week, the answer can be ''Last week''. Note that the above answers have different granularities, namely afternoon and week. If you do not use the appropriate granularity but the unified time stamp format to answer, such as: ''at 1:00 am yesterday'', it might make people feel awkward.
As early as 1979, the American cybernetics expert Zadeh [21] first presented the problem of fuzzy information granulation. He believed that human cognition can be summarized into three main characteristics: granulation, organization, and causation. In 1985, Hobbs [22] proposed the concept of granularity. In the early 1990s, Zhang et al. [23] pointed out in their monograph ''Question Theory and Application'' that ''a recognized characteristic of human intelligence is that people can observe and analyze the same problem from very different granularities. People can not only solve problems in different granular worlds, but also quickly jump from one granular world to another, freely and easily, without difficulty.'' This ability to deal with different granular spaces is a powerful manifestation of human problem solving.
Yager and Filev [24] further pointed out that ''people have formed a granular view of the world, in which human observation, measurement, conceptualization and reasoning are carried out.'' These views all hold that granulation, as one of the important characteristics of human cognition, plays an important role in knowledge discovery from complex data. The concept of granular computing was first proposed by Zadeh [25] in 1997, and its principles were identified by Pedrycz [26]. Pedrycz showed how information granules are constructed and subsequently used to describe relationships among data items. Later, many scholars in different fields worldwide began to pay attention to this problem, which gradually formed a new research direction in intelligent information processing.
In addition, granular computing has promoted the development of many concepts, such as diagrams [51], information tables [52], knowledge representations [53] and so on. Granular computing is also widely used in time series forecasting [54], manufacturing [55], mission forecasting [56] and information fusion [58].
Data granulation is the process of decomposing complex data into information granules according to a given granulation strategy. According to different data modeling goals and user needs, a variety of granulation strategies can be adopted. Most of the common granulation strategies relying solely on data can be attributed to a granulation scheme based on data binary relations, which essentially distributes two data samples that satisfy a predefined binary relationship into the same granule. In many granulation strategies, data can be granulated into corresponding binary structure by using equivalence relations, similarity relations, maximal similarity relations, fuzzy equivalence relations, fuzzy similarity relations, neighborhood relations, and dominant relations [27]- [34]. The current data granulation strategies and methods are mostly based on single modal characteristics, setting weight parameters between different modal features or simply integrating results, which can not effectively solve the problem of data co-granulation with multi-modal features.
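As a small illustration of granulation by a binary relation, the sketch below uses a neighborhood relation: two samples fall into the same granule when their distance is at most a threshold. The function name `neighborhood_granules`, the threshold `delta` and the one-dimensional sample values are assumptions chosen for illustration, not part of any cited method.

```python
def neighborhood_granules(samples, delta):
    """Return, for each sample index, the indices of samples within distance delta."""
    granules = {}
    for i, x in enumerate(samples):
        granules[i] = [j for j, y in enumerate(samples) if abs(x - y) <= delta]
    return granules

samples = [0.10, 0.15, 0.50, 0.55, 0.90]  # made-up one-dimensional data
granules = neighborhood_granules(samples, delta=0.1)
```

Every sample belongs to its own granule because the relation is reflexive, and in general granules built this way may overlap.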

2) MULTI-GRANULARITY PATTERN DISCOVERY AND FUSION
Multi-granularity pattern discovery and fusion are inherent logical requirements for solving complex problems under the granular computing framework. Here, multi-granularity covers multiple data subsets, multiple subspaces representing a space, multiple sets of variables from different modalities, and multiple local or intermediate results in a problem-solving process. These correspond to multiple perspectives on a problem and to multiple local parts or levels of it. In order to obtain a global solution to the overall data set or problem, it is necessary to fuse the multiple patterns found at single granularities. Although the term multi-granularity has not been widely used, scholars have conducted research on multi-modality in the fields of medical image analysis, networks, video semantic analysis, annotation and retrieval, and emotion recognition, mainly considering data from different modalities. In these situations, features are extracted separately to form a multimodal feature space, on which pattern discovery methods with multimodal features are developed. Current research focuses on three aspects: multimodal data classification based on multicore learning [35], multimodal data modeling based on multidictionary collaborative expression [36] and multimodal data fusion based on deep learning [37].

3) GRANULAR COMPUTING REASONING
Reasoning is one of the important abilities in human intelligence. It is a formal logic, a science used to study people's forms of thinking, laws, and logical methods. The role of reasoning is to obtain unknown knowledge from known knowledge. The reasoning of Granular Computing refers to the logical method of deducting using known information granules or granule spaces. In the field of Granular Computing, there have been some studies on reasoning [21], [38]- [43].

4) HIGH PERFORMANCE ALGORITHMS
In recent years, there have been some preliminary explorations of the use of granular computing to solve big data problems. Ye et al. [44] achieved clustering analysis of large-scale data by granulating the data space and feature space using integrated learning technology. Chang et al. [45] proposed a big data decomposition method using decision trees and then separately trained a support vector machine classifier on each decomposed data granule, which greatly improved the learning efficiency of the support vector machine. Gopal et al. [46] exploited the hierarchical relationship between data categories and gave a corresponding Bayesian model to increase generalization performance. Miao et al. [47] proposed a property reduction method that can be computed in parallel by adopting the data decomposition principle of MapReduce. By splitting the original big data set into multiple easy-to-process information granules, Liang et al. [48] proposed an efficient big data feature selection algorithm that solves and merges the feature selection results on each information granule. Qian et al. [49] employed information granularity to construct the forward approximation of the rough set and proposed a feature selection accelerator that speeds up a family of forward greedy search feature selection algorithms. Chen et al. [50] pointed out that different information granules imply different characteristics and patterns, which can be used to design machine learning and data mining algorithms effectively. The challenges are mainly reflected in two aspects: firstly, how to granulate the information reasonably while still ensuring an effective solution; secondly, how to obtain an approximate solution efficiently by balancing algorithm efficiency and solution accuracy.
This paper proposes a novel scheme for searchable symmetric encrypted-voice from the new perspective of granular computing. The rest of this paper is organized as follows. Section II presents the construction of system model for voice retrievals. In Section III, we will discuss how to extract the voice feature, transform the raw data into information granule, encrypt data, and search over encrypted data via granule computing. The system evaluation is given in Section IV to demonstrate the feasibility and performance of the proposed approach. Finally, a brief conclusion and future work are described in Section V.

II. SYSTEM MODEL

A. OVERVIEW
The system model has three types of entities: user, HIVS and servers. It is composed of two phases, namely voice uploading phase and voice retrieving phase. During the uploading phase, users upload voice and the keywords to HIVS; the voices are encrypted and the features of voices are extracted and obfuscated by HIVS. Then the features and encrypted voice are submitted to the server for storage. During the retrieving phase, users send the voice query to HIVS; the features of query voice are extracted, obfuscated and uploaded by HIVS to the server; the features matching is done by the server using the scheme proposed in this paper. The answers (encrypted voices) are returned to HIVS. Then, these answers are decrypted by HIVS and sent to the user (See Fig. 1).

B. SCHEME CONSTRUCTION
Our solution consists of two parts: (1) voice pre-processing and uploading to the server; (2) retrieving data using voice commands (see Fig. 1).

1) UPLOADING VOICE
In the voice uploading phase, the data structure is made up of two parts, objects and keywords, each of which is stored uniquely on the server. The object is the data that the user wants to store on the server. The keyword represents a category or an attribute of the object. More specifically, the objects are stored in encrypted form on the server, and the keywords are stored in the form of features. The relationship between objects and keywords can be many-to-many, i.e., one keyword can be associated with multiple objects, and one object can be associated with multiple keywords. For example, in ''What holiday is today? holiday, today'', the first element ''What holiday is today?'' is used as a query, and the second and third elements ''holiday, today'' are the keywords for the query. If the server receives another voice for ''What holiday is today?, New Year'', it will add the new keyword ''New Year'' to the query ''What holiday is today?''. During the uploading process, the object is encrypted into a ciphertext by AES. The features of the keyword are extracted by MFCC and then obfuscated. Thus, only the obfuscated features and the ciphertext are transmitted to the server, which protects privacy.
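The many-to-many relationship between objects and keywords can be sketched with two dictionaries. The class name `VoiceStore` and its method are hypothetical, the weather query is an invented extra record, and plain strings stand in for the AES ciphertexts and obfuscated MFCC features that the server would actually hold.

```python
from collections import defaultdict

class VoiceStore:
    """Toy server-side store linking objects and keywords many-to-many."""

    def __init__(self):
        self.keywords_of = defaultdict(set)  # object id -> keyword ids
        self.objects_of = defaultdict(set)   # keyword id -> object ids

    def upload(self, obj, keywords):
        """Register an object with its keywords; repeated uploads add keywords."""
        for kw in keywords:
            self.keywords_of[obj].add(kw)
            self.objects_of[kw].add(obj)

store = VoiceStore()
store.upload("What holiday is today?", ["holiday", "today"])
store.upload("What holiday is today?", ["New Year"])       # adds a keyword
store.upload("What is the weather today?", ["today"])      # invented second object
```

Each object and each keyword appears once as a dictionary key, which mirrors the requirement that both are stored uniquely on the server.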

2) RETRIEVING VOICE
In the voice retrieval stage, when a user issues a query command to seek an answer, the query command, which includes a keyword, is first sent to HIVS. HIVS performs feature extraction and obfuscation and then sends the obfuscated feature to the server. The server runs the k-nearest neighbors ciphertext granule search (KNNCGS) algorithm proposed in this paper and returns the encrypted answer to HIVS. HIVS decrypts it with AES and sends the plaintext to the user. In the ciphertext retrieval process, the returned ciphertext may contain multiple related answers, and the number of answers can be set by the user to improve performance.

III. SCHEME IMPLEMENTATION
In this section, we will discuss how to extract the voice feature, transform the raw data into information granule, encrypt data, and search over encrypted data via granule computing.

A. EXTRACTING VOICE FEATURE
Mel-Frequency Cepstral Coefficients (MFCC) are the set of coefficients that make up the Mel-frequency cepstrum. From the segments of a voice signal, we can obtain a set of cepstra that is sufficient to represent the signal. The Mel-frequency cepstral coefficients are derived from this cepstrum (namely, the spectrum of the spectrum). Unlike in the general cepstrum, the frequency bands of the Mel-frequency cepstrum are evenly spaced on the Mel scale, which approximates the nonlinear human auditory system more closely. MFCC is a discriminative feature in speech signal processing.
Let $v$ be a voice signal. The $m$-order MFCC of the $i$-th frame can be represented as $\{v_{i1}, v_{i2}, \ldots, v_{im}\}$. After feature extraction, the frames form a matrix $(v_{ij})$ whose $i$-th row holds the coefficients of the $i$-th frame. To reduce the complexity, we represent the signal by a single vector $v = (v_1, v_2, \ldots, v_m)$, where $v_j$ is the average of the $j$-th MFCC over all frames of the voice.
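The frame-averaging step can be sketched as follows. The sketch assumes the per-frame MFCC matrix has already been produced by some speech front end; the function name `mean_feature_vector` and the numeric values are invented for illustration.

```python
def mean_feature_vector(frames):
    """Average an (n_frames x m) MFCC matrix column-wise into one m-vector."""
    if not frames:
        raise ValueError("at least one frame is required")
    m = len(frames[0])
    if any(len(row) != m for row in frames):
        raise ValueError("all frames must have the same number of coefficients")
    n = len(frames)
    return [sum(row[j] for row in frames) / n for j in range(m)]

frames = [
    [1.0, 2.0, 3.0],  # coefficients of frame 1 (made-up values)
    [3.0, 4.0, 5.0],  # coefficients of frame 2 (made-up values)
]
v = mean_feature_vector(frames)
```

Averaging discards the temporal order of the frames, which is the price paid for a fixed-length vector that is cheap to compare.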
In this section, we design an approach that adds noise to the voice features before matching. That is, an invertible $(m+3) \times (m+3)$ confusion matrix $A$ and three random numbers $\alpha$, $\beta$ and $\gamma$ are introduced in order to hide the features and prevent them from being revealed to the server. In other words, a voice feature is not uploaded to the server directly; it is first combined with the obfuscation matrix, and only the result is uploaded. Specifically, $A g_i^T$ is uploaded to the server first, where $g_i = (\sum_{j=1}^{m} a_{ij}^2 + \alpha - \beta, a_{i1}, \ldots, a_{im}, 1, \beta)$. When searching, $\sum_{j=1}^{m} (a_{ij} - v_j)^2$ is used to measure the similarity between the voice features $v = (v_1, v_2, \ldots, v_m)$ and $a_i = (a_{i1}, a_{i2}, \ldots, a_{im})$, and ranking by this metric is equivalent to ranking by $f g_i^T$, where $f = (1, -2v_1, \ldots, -2v_m, \gamma, 1)$. The proof of the approach is given as follows.
Lemma 1: Given two voice features $v$ and $a_i$, an invertible random matrix $A$ and random numbers $\alpha$, $\beta$, $\gamma$, let $g_i = (\sum_{j=1}^{m} a_{ij}^2 + \alpha - \beta, a_{i1}, \ldots, a_{im}, 1, \beta)$ and $f = (1, -2v_1, \ldots, -2v_m, \gamma, 1)$. Then $f g_i^T$ can be used as a metric of the similarity between $v$ and $a_i$.
Proof: Expanding the product gives $f g_i^T = \sum_{j=1}^{m} a_{ij}^2 - 2\sum_{j=1}^{m} a_{ij} v_j + \alpha + \gamma = \sum_{j=1}^{m} (a_{ij} - v_j)^2 - \sum_{j=1}^{m} v_j^2 + \alpha + \gamma$. For a fixed query $v$ and fixed $\alpha$ and $\gamma$, the last three terms do not depend on $i$, so $f g_i^T$ orders the stored features $a_i$ exactly as the squared distance does. Hence the server can adopt $f g_i^T$ as a metric of distance between $v$ and $a_i$.
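The obfuscated matching of Lemma 1 can be checked numerically with a short sketch. It assumes that the query side sends $f A^{-1}$, following the trapdoor form $t_i = f_i Q^{-1}$ used in the security analysis, so that the server computes $(f A^{-1})(A g_i^T) = f g_i^T$. All helper names, the integer features and the values of $\alpha$, $\beta$, $\gamma$ are hypothetical, and exact rational arithmetic is used only so that equality holds exactly.

```python
import random
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse over exact fractions; raises ValueError if singular."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def random_invertible(n, rng):
    """Draw small integer matrices until one is invertible."""
    while True:
        a = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
        try:
            return a, invert(a)
        except ValueError:
            continue

def stored_record(a_mat, feature, alpha, beta):
    """Owner side: build g_i and return A g_i^T, the value kept by the server."""
    g = [sum(x * x for x in feature) + alpha - beta, *feature, 1, beta]
    return [sum(row[k] * g[k] for k in range(len(g))) for row in a_mat]

def trapdoor(a_inv, query, gamma):
    """Query side: build f and return the row vector f A^{-1}."""
    f = [1, *[-2 * x for x in query], gamma, 1]
    n = len(f)
    return [sum(f[k] * a_inv[k][c] for k in range(n)) for c in range(n)]

def score(trap, record):
    """Server side: (f A^{-1})(A g_i^T) = f g_i^T, without seeing v or a_i."""
    return sum(t * r for t, r in zip(trap, record))

rng = random.Random(0)
features = [[1, 2], [4, 6], [0, 0]]  # made-up stored features a_i, m = 2
query = [1, 1]                       # made-up query feature v
alpha, beta, gamma = 7, 3, 11        # fixed random numbers (arbitrary values)
a_mat, a_inv = random_invertible(len(query) + 3, rng)

records = [stored_record(a_mat, a, alpha, beta) for a in features]
trap = trapdoor(a_inv, query, gamma)
scores = [score(trap, r) for r in records]
ranking = sorted(range(len(features)), key=lambda i: scores[i])
```

Each score equals the squared distance minus $\sum_j v_j^2$ plus $\alpha + \gamma$, so the ranking matches the one obtained from the plain squared distances (1, 34 and 2 for the three stored features).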

C. FROM RAW DATA TO FUZZY GRANULE
Fuzzy granulation is inspired by the way humans granulate and process information, and it rests on a mathematical basis. It combines two operations, fuzzification and granulation. Fuzzification replaces a crisp set with a fuzzy set. Granulation divides a collection into granules. In our scheme, fuzzy granulation is composed of two phases: (1) The fuzzy granulation method is used to transform a keyword into fuzzy granules. In this process, the fuzzy granule, the fuzzy granule vector and operators are defined to represent the feature. (2) The δ-neighborhood of the fuzzy granule vector is employed to cluster the encrypted data (namely, the ciphertext granule). Concepts such as the δ-neighborhood ciphertext granule, the ciphertext granule vector and related operators are defined to denote the encrypted data.
Definition 2: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall p_i, p_j \in P$ and $\forall r \in R$, the distance on $r$ between $p_i$ and $p_j$ is defined by: where $d_r(p_i, p_j) \in [0, 1]$. According to Lemma 1, $d_r(p_i, p_j)$ can be a metric between $p_i$ and $p_j$ on the feature $r$.
Definition 3: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall p \in P$ and $\forall r \in R$, the fuzzy granule of the plaintext $p$ on an atom feature $r$ can be defined by: The former element of each ordered pair is the plaintext, and the latter is the distance between $p_i$ and $p_j$ on the feature $r$; in short, $d_{ij} = d_r(p_i, p_j)$.
Definition 4: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall p \in P$ and $\forall r \in R$, the module of the fuzzy granule $N_r(p)$ can be defined by: It is easy to get $1 \le |N_r(p)| \le |P|$, where $|P|$ denotes the number of elements in $P$.

2) CIPHERTEXT GRANULATION
We give some definitions of fuzzy granule, fuzzy granule vector, metrics and operators based on fuzzy set in the last section. In this section, on the basis of ciphertext, key and fuzzy granule of plaintext, we define ciphertext granule, ciphertext granule vector, operators and metrics to prepare the presentation of KNNCGS. As shown in the definition 10, we decrypt the ciphertext by the key to get the plaintext. On the basis of fuzzy granule of plaintext, we can define ciphertext granule of δ-neighborhood. After that, ciphertext granule vector, operators and metrics are also defined by ciphertext granule.
Definition 10: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall e \in E$ and $\forall r \in R$, the ciphertext granule of $e$ on the feature $r$ in the δ-neighborhood ($\delta > 0$) can be defined by:
Definition 12: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall e \in E$ and $\forall r \in R$, the cardinal number associated with the ciphertext granule $M^{\delta}_r(e)$ in the δ-neighborhood can be defined by $|M^{\delta}_r(e)|$, which denotes the number of elements. It is easy to get: $1 \le |M^{\delta}_r(e)| \le |E|$.
Definition 13: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall e \in E$ and any subset $Q \subseteq R$ (here, $Q = \{r_1, r_2, \ldots, r_k\}$, $k \le m$), the module of the ciphertext granule vector of $e$ on the feature subset $Q$ can be defined by:
Definition 14: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall e_i, e_j \in E$, $M^{\delta}_r(e_i)$ and $M^{\delta}_r(e_j)$ are ciphertext granules on $r \in R$ in the δ-neighborhood. We define four operators, $\cap$, $\cup$, $-$ and $\oplus$ below:
Definition 15: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext. For $\forall e_i, e_j \in E$, $M^{\delta}_R(e_i)$ and $M^{\delta}_R(e_j)$ are ciphertext granule vectors on the feature set $R$ in the δ-neighborhood. We define four operators $\cap$, $\cup$, $-$ and $\oplus$ below:
Definition 17: Let $SS = (P, E, R, V, K, \theta, f, g)$ be a searchable system over ciphertext, where $P = \{p_1, p_2, \ldots, p_n\}$ is a plaintext set, $R = \{r_1, r_2, \ldots, r_m\}$ is a feature set, and $E = \{e_1, e_2, \ldots, e_n\}$ is a ciphertext set. For $\forall p \in P$, $e_p \in E$, we can define a rule on $R$ as: , $e_p$ >. Furthermore, the rule library can be defined as: $LB_R = \{lb_R(p) \mid \forall p \in P\}$. Search over ciphertext can be converted into reasoning and matching in the rule library $LB_R$.

D. ENCRYPTING DATA
In this paper, we adopt AES for encryption and decryption because of its small computational overhead and its suitability for large blocks of data. The limitation is that a key has to be negotiated between the encryption side and the decryption side in advance and transmitted through a secure channel.

E. K-NEAREST NEIGHBORS CIPHERTEXT GRANULE SEARCH

1) k-NEAREST NEIGHBORS FUZZY GRANULE VECTOR
Definition 18: Given a searchable system over ciphertext $SS = (P, E, R, F, K)$, let $Z$ be a fuzzy granule vector group on $R$, where $k > 0$ and $k$ is an integer. For any fuzzy granule vector $z \in Z$, the k-nearest neighbors fuzzy granule vector of $z$ can be defined by:
A fuzzy granule vector group can be viewed as a set. The k-nearest neighbors fuzzy granule vector group is a subset of the fuzzy granule vector group: it contains the k granule vectors nearest to $z$ in the group. k-nearest neighbors ciphertext granule search (KNNCGS) is a decision algorithm based on fuzzy set operations, which is divided into granulation, matching and decision-making processes. The principle of KNNCGS is discussed below, and the algorithm is given.

2) PRINCIPLE OF KNNCGS
The KNNCGS includes granulation, matching, and making decision processes. The granulation process involves data pre-processing, dividing the training set and the test set. In the training set granulation, the feature fuzzy granulation and ciphertext granulation can form a rule library. Fuzzy granule vector matching process includes: Calculating the distance between test granule vector and all granule vectors in the rule library; Sorting by the distance; Selecting k-nearest rules. The decision process is to judge category of ciphertext granule according to fuzzy granule vector. The principle is as follows.
• Step 1. Pre-processing data -Delete the data with missing values and normalize the data set to the range [0, 1].
• Step 2. Divide 80% of data set as the training set and 20% of that as the test set.
• Step 3. Granulate data according to atom feature extracted by MFCC and obfuscated by obfuscation function and form fuzzy granule, fuzzy granule vector and ciphertext granule to build a rule library.
• Step 4. Searching and matching of fuzzy granule vectors. Take a test fuzzy granule vector and calculate the distance between it and the fuzzy granule vector of each rule, then sort the distances in ascending order and select the top k fuzzy granule vectors.
• Step 5. Decision. The class with the largest number of ciphertext granules associated with the k fuzzy granule vectors is selected as the final ciphertext granule (i.e., the decision ciphertext granule).
• Step 6. Go to Step 4 (Searching and matching of fuzzy granule vectors) and decide the next test granule, until all the test granules are finished. Get all decision ciphertext granules corresponding to all test fuzzy granule vectors.
• Step 7. Return the ciphertext with the corresponding ciphertext granule.
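The normalization in Step 1 and the matching and decision in Steps 4 and 5 can be sketched as follows. This is a simplified illustration and not the KNNCGS algorithm itself: plain feature vectors with Euclidean distance stand in for fuzzy granule vectors and their metric, string labels stand in for ciphertext granules, and the names `min_max_normalize` and `knn_decide` as well as the data are assumptions.

```python
import math
from collections import Counter

def min_max_normalize(rows):
    """Step 1: rescale each feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(x - lo) / s for x, lo, s in zip(row, lows, spans)] for row in rows]

def knn_decide(rules, query, k):
    """Steps 4-5: rank rules by distance to the query, then take a majority vote."""
    ranked = sorted(rules, key=lambda rule: math.dist(rule[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

raw = [[0.0, 10.0], [1.0, 12.0], [9.0, 30.0], [10.0, 28.0], [2.0, 11.0]]
labels = ["e1", "e1", "e2", "e2", "e1"]  # stand-ins for ciphertext granules
normalized = min_max_normalize(raw)
rules = list(zip(normalized, labels))    # toy rule library
decision = knn_decide(rules, query=[0.1, 0.05], k=3)
```

Choosing an odd k with two classes avoids ties in the vote; with more classes a tie-breaking rule would have to be chosen explicitly.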

3) k-NEAREST NEIGHBORS CIPHERTEXT GRANULE SEARCH
Based on the principle above, we design the corresponding algorithm, k-nearest neighbors ciphertext granule search (KNNCGS). One part of the algorithm runs on the server (see Table 2), and the other part runs on the client (see Table 3).

A. SECURITY ANALYSIS
In our scheme, each voice is encrypted with a unique key using AES, a symmetric encryption algorithm. Given the security of AES, the adversary cannot decrypt the voice. Since each voice is encrypted with a different key, the same content in different voices is encrypted into different ciphertexts, which helps the scheme resist chosen-plaintext attacks. When a voice is uploaded to, stored on, and downloaded from the server, it exists only in the form of obfuscated features and ciphertext. The features are computed by MFCC and processed by the obfuscation function, so they are hidden and almost irreversible. In other words, it is very difficult to recover the original voices from the obfuscated features. Therefore, the whole process is secure and reliable.
We analyse the security of our concrete scheme. The proposed scheme is adaptively secure (i.e. satisfies definition in [57]).
Proof: We need to construct a simulator $S = \{S_0, \ldots, S_q\}$ such that, for any adversary $A = (A_0, \ldots, A_q)$, the outputs of $\mathrm{Real}(k)$ and $\mathrm{Sim}(k)$ are computationally indistinguishable. We construct a simulator $S = \{S_0, \ldots, S_q\}$ that adaptively produces a vector $v' = (t', E') = (t'_1, \ldots, t'_n, M^{\delta}_R(e_1)', \ldots, M^{\delta}_R(e_n)')$, where $t_i$ denotes the trapdoor of $f_i$ and $t_i = f_i Q^{-1}$ (here $Q \xleftarrow{r} M_{m+3,m+3}(F)$, where $M_{m+3,m+3}(F)$ is a predetermined finite matrix group consisting of invertible $(m+3) \times (m+3)$ matrices over the field $F$), as follows:
1. $S_0(1^k, \tau(F))$: it constructs simulated $A'_i \xleftarrow{r} M_{m+3,1}(F)$ such that the matrix $A'_{|E| \times (m+3)} = (A'^T_1, \ldots, A'^T_{|E|})$ satisfies $\mathrm{Rank}(A') = \min(|E|, m+3)$, where $\mathrm{Rank}(A)$ denotes the rank of a matrix $A$. $S_0$ then includes $A'$ in the state $st_s$ and outputs $(E', st_s)$. We now claim that the $A'_i$ are indistinguishable from the $A_{g_i}$, where $A_{g_i} = Q g_i^T$. It is evident that the distributions over $A'_i$ and $A_{g_i}$ are identical. Furthermore, since the private-key encryption scheme is secure, each $M^{\delta}_R(e_i)'$ is indistinguishable from a real ciphertext granule vector.
2. $S_1(st_s, \tau(F, f_1))$: it solves the system of linear equations $A' f = b_1$, where $b_1$ indicates the closeness degree between the queried word $f_i$ and the noisy keyword $g_j$. Note that it knows $b_1$ from the trace of $(F, f_1)$. We denote a solution of $A' f = b_1$ by $t^*$ (if a solution exists). Let $t'_1 = t^{*T}$, which is indistinguishable from a real trapdoor $t_1$, since $t_1 \times A_{g_i} = t'_1 \times A'_i$ holds for $i \in [1, |E|]$. $S_1$ then includes $t'_1$ in $st_s$ and outputs $(t'_1, st_s)$.
3. $S_i(st_s, \tau(F, f_1, \ldots, f_i))$: $S_i$ generates a trapdoor $t'_i$ in the same way that $S_1$ does, i.e., by solving the system of linear equations $A' f = b_i$. $S_i$ then includes $t'_i$ in $st_s$ and outputs $(t'_i, st_s)$. It is evident that $t'_i$ is indistinguishable from a real trapdoor $t_i$. This completes the proof.

B. EXPERIMENTAL RESULTS
To measure how well the KNNCGS performed at encrypted voice, we used 300 words of voice as a corpus to experiment, involving English, Chinese and Arabic. Since the value range of the data set is different, the data set needs to be normalized and obfuscated (see Table 2). The features of voice can be fuzzy granulated and form a fuzzy granule vector. Then, we granulated the ciphertext with δ-neighborhood of the fuzzy granule vector to build ciphertext granules. In order to verify the performance of the scheme, we compared KNN adopted in raw data with KNNCGS used in granule form. And we took the accuracy and recall as metrics of performance.
We first explain the criteria of the performance evaluation. In the evaluation, we examined the relationship between the metrics and two parameters: the nearest-neighbor parameter K and the neighborhood parameter δ (see Figs. 2-9).
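The two metrics can be computed from true and predicted labels as in the sketch below. It assumes recall is averaged over classes (macro recall); the text does not state which averaging was used, and the function names and labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_recall(y_true, y_pred):
    """Average of per-class recall (assumed averaging; not stated in the paper)."""
    recalls = []
    for label in sorted(set(y_true)):
        # Predictions made for the samples that truly belong to this class.
        relevant = [p for t, p in zip(y_true, y_pred) if t == label]
        recalls.append(sum(p == label for p in relevant) / len(relevant))
    return sum(recalls) / len(recalls)

# Toy labels for six spoken words.
y_true = ["yes", "yes", "no", "no", "stop", "stop"]
y_pred = ["yes", "no", "no", "no", "stop", "yes"]

print(accuracy(y_true, y_pred), macro_recall(y_true, y_pred))
```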
As shown in Fig. 2, when K = 3 (the nearest-neighbor parameter), the accuracy of KNN was 0.93. In contrast, the accuracy of KNNCGS reached its peak value of 0.951 at δ = 0.55, an improvement of 2.26%. The accuracy of KNNCGS was almost always higher than that of KNN between δ = 0.05 and δ = 0.85. From δ = 0.85 to δ = 1, as δ rose, the accuracy of KNN was higher than that of KNNCGS. In most cases with K = 3, KNNCGS is more accurate than KNN.
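The percentages in this section appear to be relative changes with respect to a baseline value. The sketch below reproduces the K = 3 figure under that assumption; the helper name is hypothetical.

```python
def relative_change_percent(baseline, value):
    """Relative change of value with respect to baseline, in percent."""
    return (value - baseline) / baseline * 100.0

# K = 3: KNN accuracy 0.93 versus KNNCGS peak accuracy 0.951.
gain_k3 = relative_change_percent(0.93, 0.951)
print(round(gain_k3, 2))  # 2.26
```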
When K = 5, the results for different δ are shown in Fig. 3. Compared with Fig. 1, KNN ...
As demonstrated in Fig. 4, when K = 7, KNNCGS achieved its top value of 0.952 at δ = 0.6, an increase of 1.38% over KNN (whose accuracy was 0.939). KNN itself improved by 0.64% and 0.97% compared with K = 5 and K = 3 respectively. From δ = 0.15 to δ = 0.85, the accuracy of KNNCGS was always higher than that of KNN. However, when δ > 0.85, the accuracy of KNNCGS dropped quickly, falling by 2.84% from its top value.
As shown in Fig. 5, when K reached 11, the top accuracy values of both KNN and KNNCGS decreased. KNN reached only 0.91, a drop of 4.41% from its peak value. In contrast, the accuracy of KNNCGS was 0.932, a decrease of 2.10% from its top value. The accuracy of KNNCGS was still higher than that of KNN between δ = 0.25 and δ = 0.75.
The recall rate is another important performance metric. Figs. 6 to 9 compare the recall rates of KNN and KNNCGS. As shown in Fig. 6 (here, K = 3), KNNCGS achieved a peak recall rate of 0.961, whereas KNN achieved 0.95, so KNNCGS improved by 1.12%. KNN was lower than KNNCGS between δ = 0.4 and δ = 0.65. The valley value of KNNCGS was 0.937 (a decrease of 1.37%). At δ = 0.1, 0.2, 0.4 and 0.7, KNN and KNNCGS were almost the same.
When K = 7 (Fig. 8), KNNCGS achieved a recall rate of 0.958 at δ = 0.5, whereas KNN reached 0.949 (an improvement of 0.95%). When δ < 0.3, the recall rate of KNN was higher than that of KNNCGS. The recall rate of KNNCGS increased quickly between δ = 0.05 and δ = 0.3, growing by 19.2%. This shows that the neighborhood parameter δ has a strong influence on the results.
The recall rates of both KNN and KNNCGS decreased when K = 11, as shown in Fig. 9. KNN reached its valley value of 0.906, a drop of 4.63% from its highest value (at K = 5). Similarly, the maximum recall rate of KNNCGS was down by 3.12% from its highest historical value. KNNCGS improved on KNN by 2.87% at δ = 0.5. From δ = 0.25 to δ = 0.7, KNNCGS performed slightly better.
The space cost measures the space efficiency of the index data structure; it should be practical relative to the original data size. The search time measures how quickly a search query is answered over the encrypted similarity sample. It includes the time for feature extraction, clustering, fuzzy granulation, encryption and decryption. As shown in Table 4, when the dictionary size is 10, the average search time of KNNCGS is longer than that of KNN, and the average space cost of KNNCGS is slightly larger than that of KNN. When K = 25, the average search time of KNN is 0.12 seconds and that of KNNCGS is 0.16 seconds. KNN is thus superior to KNNCGS in both search time and space cost. The main reason is that KNNCGS involves the granulation process, which KNN does not.
When the dictionary size is 30, schemes I and II of [59] were compared with KNNCGS. As shown in Table 5, schemes I and II are both CPA-secure, whereas KNNCGS is adaptively secure. KNNCGS achieved an average accuracy of 95%, compared with 93% and 94% for schemes I and II, an improvement of 2.05% and 1.06% respectively. The average recall rates of the three algorithms were the same, at 93%. The average search time of KNNCGS is 0.17 seconds, an increase of 13.33% and 21.43% respectively; the extra time is spent in the granulation process. The average space cost of KNNCGS increased by 0.06 MB and 0.07 MB compared with schemes I and II respectively.
Overall, KNNCGS outperforms KNN when its neighborhood parameter δ is adjusted appropriately. There are two main reasons. On the one hand, fuzzy granulation is performed before searching, and it captures the collective structure of all voices. On the other hand, the equivalence class principle is taken into account, which clusters the encrypted voice according to the fuzzy granule vector, so KNNCGS can retrieve the clustered voices. In contrast, KNN obtains its solution only by calculating on raw features.
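For reference, the KNN baseline in this comparison reduces to a distance sort followed by a majority vote over the K nearest samples. The following is a minimal sketch, assuming Euclidean distance on raw feature vectors; the function name `knn_predict` and the toy data are hypothetical, and the sketch does not reproduce the paper's implementation or the granulation step of KNNCGS.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Label the query by majority vote among its k nearest training samples."""
    # Sort training indices by Euclidean distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    # Count the labels of the k closest samples and return the most common one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy raw feature vectors and their word labels.
train_x = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.0], [1.0, 1.0], [0.9, 1.0]]
train_y = ["on", "on", "on", "off", "off"]

print(knn_predict(train_x, train_y, [0.15, 0.05], k=3))
print(knn_predict(train_x, train_y, [0.95, 0.9], k=3))
```

Because the vote uses only the K nearest raw vectors, the result depends on K alone, whereas the granule-based scheme described in the text is additionally governed by δ.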

V. CONCLUSION
This paper has presented the design of a searchable scheme over encrypted voice using the granule computing technique. The obfuscated voice features and the AES-encrypted voices are stored on the server. To prevent the restoration of voice features, we also use the obfuscation function to further process the features of the voice. Security is greatly improved by binding the obfuscated features to the encrypted voice. In addition, a series of concepts has been defined, such as fuzzy granule, fuzzy granule vector, ciphertext granule, operators and metrics. Based on these concepts, both the neighbor fuzzy granule vector and the counting voting strategy were deployed to retrieve the ciphertext. The results are returned in the form of a ciphertext granule, i.e. a ciphertext equivalence class. The security of the scheme was analysed. The experimental results demonstrate that applying KNNCGS to encrypted voice is feasible and secure. Its performance is also superior to that of KNN for particular parameter settings.
The performance of KNNCGS depends heavily on the neighborhood parameter δ and on the balance of the dataset. In the future, we plan to consider localized granulation rather than global granulation, as well as parallel and distributed strategies, in order to improve the performance further and apply the scheme to big data research.
HUOSHENG HU (Senior Member, IEEE) is currently a Professor with the School of Computer Science and Electronic Engineering, University of Essex, U.K., leading the Robotics Research Group. His research interests include mobile robotics, human-robot interaction, embedded systems, mechatronics, learning algorithms, and cloud computing. He has published over 500 articles in journals, books, and conferences in these areas. He is a Fellow of the IET and InstMC. He currently serves as an Editor-in-Chief for the International Journal of Automation and Computing, and the Editor-in-Chief for the MDPI Robotics Journal.
CHAO TANG was born in Hefei, Anhui, China. He received the M.S. degree from Shanxi University, Taiyuan, China, in 2009, and the Ph.D. degree in artificial intelligence from Xiamen University, Xiamen, China, in 2014. He is currently an Associate Professor with the Department of Computer Science and Technology, Hefei University, China. His research and project works focus on machine learning, computer vision, and human action recognition.
VOLUME 8, 2020