Differential Privacy for Weighted Network Based on Probability Model

Weighted networks contain a lot of sensitive information, and releasing them may seriously jeopardize individual privacy. In this paper, we study the problem of differential privacy for weighted networks. We find that most existing methods add noise to edge weights directly and neglect the structural role of nodes, which leads to low accuracy. To address this issue, we propose two approaches. The first is a differential privacy method for the Stochastic Block Model (SBM). This private SBM reveals the structural role of each node while respecting its privacy. The second is a differential privacy method for weighted networks that constructs a private probability model. We use Variational Bayes to learn the private model parameters. It adds noise to the parameters of the probability model instead of the edge weights, and achieves high data utility. Experiments on real datasets illustrate that our algorithm privately releases weighted networks and achieves high accuracy.


I. INTRODUCTION
A social network is a form of dataset consisting of interactions between pairs of individuals. Network data is represented by a graph structure where vertices represent individuals and edges represent interactions. Recently, social networks have been studied by sociologists, economists and informatics researchers. Many networks contain highly sensitive personal information, and releasing them would pose serious threats to individuals' privacy. To respect the privacy of personal information, network data should be released to the public with ''sanitization''. Anonymization techniques (e.g., k-anonymity [1] and l-diversity [2]) are traditional methods to ensure network data privacy. Recently, differential privacy has been proposed as a way to address such privacy problems. Unlike the anonymization methods, differential privacy provides strong theoretical guarantees against adversaries with prior knowledge. The standard technique for ensuring differential privacy is to ''sanitize'' the presence or absence of an edge.
The associate editor coordinating the review of this manuscript and approving it for publication was Longxiang Gao.
Unlike relational data, the vertices of network data are pairwise related and play latent structural roles in generating the network's structure. A community is a common representation of such a structural role: it identifies a network partition that groups together vertices with similar structural roles. Such structural roles are important to the structure of the network, so it is necessary to respect their privacy. However, current work on differential privacy for networks focuses on topological structure and neglects the structural role. These methods cannot generate synthetic networks whose structural features are similar to those of the original network, and it is difficult to analyze cluster features using such private synthetic networks. In this paper, we first propose a differentially private SBM algorithm called SSN (Sufficient Statistic Noisy). The stochastic block model (SBM) [3], [4] is a popular generative model for learning the community structure of unweighted networks, and it gives the connection probability of pairwise interactions among n vertices. Each vertex belongs to one of K latent groups and the probability of each edge depends only on the group memberships of its vertices. Vertices in the same group play similar structural roles and are equivalent in generating the network's structure. Thus, we can respect the privacy of structural roles through the differentially private SBM algorithm.
For unweighted networks, the presence or absence of an edge is represented as a binary variable. However, most real-world networks are weighted networks, in which edges carry weights. Christopher Aicher et al. [5] introduced the Weighted Stochastic Block Model (WSBM), a generalization of the SBM for weighted networks. WSBM uses an efficient variational Bayes approach to learn the parameters, and it handles a technical difficulty in fitting edge-weight distributions, namely the degeneracy in the likelihood calculation. Moreover, WSBM is central to our differentially private weighted network releasing method.
Perturbing the edge weights directly is a common approach to releasing a differentially private weighted network. However, this kind of approach incurs excessive noise, because the sensitivity of direct perturbation is the maximum edge weight. When most edge weights are much smaller than the maximum edge weight, the sensitivity becomes prohibitively high and performance suffers. In this paper, we propose a differential privacy method for weighted networks, called VB-WNDP (i.e., Variational Bayes-Weighted Network Differential Privacy). Firstly, we use the idea of the SSN algorithm to protect the privacy of the structural role. Then, we construct a probability model of the weighted network and use Variational Bayes to learn the model parameters. We propose a method that adds noise to the model parameters so that the model satisfies differential privacy. At last, we generate the sanitized weighted network through synthetic network generation. VB-WNDP not only offers better data utility, but also protects the privacy of structural roles.
In summary, we present several contributions: (1) We introduce a differentially private SBM algorithm named SSN. This technique makes SBM satisfy differential privacy and protects the privacy of the structural role. (2) We develop a differential privacy method for weighted networks named VB-WNDP. This method uses the idea of SSN to protect the privacy of the structural role and constructs a private weighted network probability model to release the private weighted network. (3) Through formal privacy analysis, we prove that SSN and VB-WNDP both satisfy $\epsilon$-differential privacy. We run experiments on real datasets, and the results demonstrate that SSN and VB-WNDP perform with high accuracy.
Our paper is organized as follows: Section II provides a literature review on differential privacy for networks. Section III presents necessary background on differential privacy and SBM. Section IV describes the differentially private SBM algorithm SSN. Section V presents the weighted networks differential privacy method VB-WNDP. Section VI reports the comprehensive experimental results. Section VII concludes the paper.

II. RELATED WORK
Many existing works about social network differential privacy focus on social network analysis. These methods output some network statistics under differential privacy such as degree distribution, subgraph number and clustering coefficient. Dwork et al. [6] added noise to outcome directly and answered the queries under differential privacy. Hay et al. [7] proposed a differentially private method in a post-processing phase to compute the consistent input most likely to have produced the noisy output. They used this to estimate the private degree distribution. Karwa et al. [8] expanded this concept to calculate the k-star count of network. Zhang et al. [9] analysed the statistics through a ladder function and reduced the sensitivity effectively. Cheng et al. [10] presented a two-phase differentially private frequent subgraph mining algorithm called DFG. In DFG, frequent subgraphs are privately identified in the first phase, and the noisy support of each identified frequent subgraph is calculated in the second phase. Ding et al. [11] published the triangle counts satisfying the node-differential privacy with two kinds of histograms: the triangle count distribution and the cumulative distribution. Sun et al. [12] studied fundamental problems related to extended local view. They formulated a decentralized differential privacy scheme named DDP, which requires that each participant consider not only her own privacy, but also that of her neighbors involved in her ELV. They also designed a multi-phase framework under DDP that enables an analyst to accurately estimate subgraph counts.
Differentially private social network releasing also draws attention. Sala et al. [13] introduced a differentially private graph model called Pygmalion for publishing social networks. Pygmalion extracts a graph structure into a private dK-graph and generates a synthetic graph. Mir and Wright [14] used maximum likelihood estimation to privately estimate the parameters of the stochastic Kronecker graph model. Xiao et al. [15] proposed a differentially private network publishing method, HRG-MCMC. They computed an estimator of the graph in the hierarchical random graph (HRG) model under differential privacy, and sampled possible HRG structures in the model space via Markov chain Monte Carlo (MCMC), which satisfies the exponential mechanism. Qin et al. [16] investigated techniques to ensure local differential privacy of individuals while collecting structural information and generating representative synthetic social graphs. They proposed LDPGen, which incrementally clusters users based on their connections to different partitions of the whole population, and adapted existing social graph generation models to construct a synthetic social graph. Chen et al. [17] presented a method for publishing differentially private synthetic attributed graphs, which is able to preserve the community structure of the original graph without sacrificing the ability to capture global structural properties.
Many existing works also focus on weighted network privacy. Liu et al. [18] identified weighted 1*-neighborhood attacks and defined probabilistic indistinguishability to resist this attack. They proposed the HIGA scheme to generate a probabilistically indistinguishable social network. Maria Skarkala et al. [19] presented a clustering-based k-anonymization technique for weighted networks. This method groups nodes with similar sets of neighbors and their connections into supernodes and superedges, respectively. Chen et al. [20] proposed k-histogram-inverse-l diversity to investigate the sensitive label privacy disclosure problem in weighted graphs. Liu et al. [21] proposed privacy preserving methods using centrality based on complex network theory to protect the privacy of virtual assets.
Several works focus on weighted network differential privacy. Li et al. [22] proposed the Merging Barrels and Consistency Inference strategy to protect weighted social graphs. They merged the barrels with the same count into one group to reduce the noise required. They also did consistency inference according to original order of the sequence as an important postprocessing step to keep most of the shortest paths unchanged. Wang and Long [23] proposed a modified algorithm LMBCI to reduce the more substantial error MBCI generated. Qian et al. [24] investigated the problem of publishing the topological information with the weight distribution of the weighted graph. They proposed two clustering approaches based on sequence-aware and local density to aggregate histogram.
Based on the works above, we find that only a few focus on weighted network differential privacy. These works neglect the structural role and incur excessive noise because they add noise to the weights. Hence, we introduce a differential privacy method for weighted networks that addresses these problems.

A. DIFFERENTIAL PRIVACY
We model an input network dataset as a graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges. Given a graph $G$, differential privacy [6] ensures that the outputs are approximately the same even if any edge is arbitrarily added or deleted in the graph. Thus, the presence or absence of any edge has a negligible effect on the outputs. We call two graphs $G_1$ and $G_2$ neighbors if they differ by such a change. The degree of a node is denoted as $d(\cdot)$. $\epsilon$-differential privacy is defined as follows:

Definition 1 ($\epsilon$-differential privacy): A randomized algorithm $\mathcal{A}$ satisfies $\epsilon$-differential privacy if for any two neighboring graphs $G_1$ and $G_2$, and for any output $O \in Range(\mathcal{A})$,
$$\Pr[\mathcal{A}(G_1) = O] \le e^{\epsilon} \cdot \Pr[\mathcal{A}(G_2) = O].$$

Differential privacy is based on the concept of the global sensitivity of a function $f$, which measures the maximum change in the outputs of $f$ when any edge in the graph is changed. The global sensitivity of $f$ is defined as
$$\Delta f = \max_{G_1, G_2} \| f(G_1) - f(G_2) \|_1,$$
where the maximum is taken over all pairs of neighboring graphs.

Differential privacy can be achieved by the Laplace mechanism and the exponential mechanism. The Laplace mechanism is mainly used for functions whose outputs are real values. It achieves differential privacy by adding properly calibrated noise, drawn randomly from the Laplace distribution, to the true answer.
Theorem 1 (Laplace Mechanism): [6] For any function $f : G \to \mathbb{R}^d$, the mechanism $\mathcal{A}(G) = f(G) + (Y_1, \ldots, Y_d)$ satisfies $\epsilon$-differential privacy, where the $Y_i$ are i.i.d. Laplace variables with scale parameter $\Delta f / \epsilon$.

The exponential mechanism is mainly used for functions whose outputs are not real numbers. The main idea is to sample the output $O$ from the output space $\mathcal{O}$ according to the utility function $u$. The global sensitivity of $u$ is $\Delta u = \max_{O, G_1, G_2} |u(G_1, O) - u(G_2, O)|$.

Theorem 2 (Exponential Mechanism): [25] Given a graph $G$ and a utility function $u : (G \times \mathcal{O}) \to \mathbb{R}$, the algorithm $\mathcal{A}$ that outputs $O$ with probability proportional to $\exp\left(\frac{\epsilon \cdot u(G, O)}{2 \Delta u}\right)$ satisfies $\epsilon$-differential privacy.

Theorem 3 (Sequential Composition 1): [26] If each algorithm $\mathcal{A}_i$ provides $\epsilon_i$-differential privacy, the sequence $(\mathcal{A}_1(D), \mathcal{A}_2(D), \ldots, \mathcal{A}_n(D))$ over the same database $D$ provides $\sum_{i=1}^{n} \epsilon_i$-differential privacy.

Theorem 4 (Sequential Composition 2): [27] Let $D_{iter}$ be a subset sampled from $D$ such that each data point is included independently with probability $p$. If the algorithm $\mathcal{A}(D_{iter})$ satisfies $\epsilon_{iter}$-differential privacy, then $\mathcal{A}(D)$ satisfies $\log(1 + p(e^{\epsilon_{iter}} - 1))$-differential privacy.
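The Laplace mechanism and the two composition rules above can be illustrated with a minimal sketch. It assumes only Python's standard library; the function names (`laplace_noise`, `laplace_mechanism`, `sequential_budget`, `subsampled_budget`) are illustrative and do not come from the paper.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    # expovariate takes a rate, so rate = 1 / scale gives mean = scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(values, sensitivity, epsilon, rng):
    """Theorem 1: add i.i.d. Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


def sequential_budget(epsilons):
    """Theorem 3: budgets of mechanisms run on the same data add up."""
    return sum(epsilons)


def subsampled_budget(eps_iter, p):
    """Theorem 4: budget after sampling each data point with probability p."""
    return math.log(1.0 + p * (math.exp(eps_iter) - 1.0))


rng = random.Random(0)
noisy_answer = laplace_mechanism([10.0, 3.0], sensitivity=1.0, epsilon=0.5, rng=rng)
```

A smaller $\epsilon$ or a larger sensitivity both enlarge the noise scale, which is why the later sections try to perturb quantities with small sensitivity.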

B. STOCHASTIC BLOCK MODEL
The adjacency matrix of a social network contains binary values $A_{ij}$ which represent edge existence, i.e., $A_{ij} \in \{0, 1\}$. $K$ denotes a fixed number of latent groups and each vertex belongs to one of the $K$ groups. The vector $z$ represents the group label of each vertex, i.e., $z_i \in \{1, 2, \ldots, K\}$. Vertices in the same group play similar structural roles and connect with vertices from other groups according to the same distribution. The variable $l_r$ represents the probability that a vertex belongs to group $r$, and the element $\theta_{z_i z_j}$ of the $K$-by-$K$ matrix $\theta$ represents the connection probability between the groups to which vertex $i$ and vertex $j$ belong. $\pi_r$ represents the prior distribution of $l_r$ and satisfies $\sum_{r=1}^{K} \pi_r = 1$. In SBM, $\{A_{ij}\}$ represents the observed data, $\{z_i\}$ represents the latent data which cannot be observed directly, and $\Theta = \{\theta, \pi\}$ represents the parameters of the model. The likelihood function of SBM is
$$\Pr(A \mid z, \theta) = \prod_{ij} \theta_{z_i z_j}^{A_{ij}} (1 - \theta_{z_i z_j})^{1 - A_{ij}} = \exp\Big(\sum_{ij} s(A_{ij}) \cdot n(\theta_{z_i z_j})\Big),$$
where $s(\cdot)$ are sufficient statistics and $s(x) = (x, 1)$; $n(\cdot)$ are natural parameters and $n(x) = \left(\log \frac{x}{1-x}, \log(1-x)\right)$.
80794 VOLUME 8, 2020
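As a small check of the exponential-family form of the SBM likelihood, the following sketch evaluates $\sum_{ij} s(A_{ij}) \cdot n(\theta_{z_i z_j})$ and samples a network from the model. It is an illustration under stated assumptions: group labels are 0-based, the sum runs over ordered pairs $i \neq j$ (a directed network), and all function names are hypothetical.

```python
import math
import random


def suff_stat(x):
    """Sufficient statistic s(x) = (x, 1) of a Bernoulli edge."""
    return (x, 1.0)


def natural_param(theta):
    """Natural parameter n(theta) = (log(theta / (1 - theta)), log(1 - theta))."""
    return (math.log(theta / (1.0 - theta)), math.log(1.0 - theta))


def log_likelihood(adj, z, theta):
    """Sum of s(A_ij) . n(theta_{z_i z_j}) over ordered pairs i != j."""
    total = 0.0
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            s = suff_stat(adj[i][j])
            eta = natural_param(theta[z[i]][z[j]])
            total += s[0] * eta[0] + s[1] * eta[1]
    return total


def sample_sbm(z, theta, rng):
    """Draw each edge independently with probability theta[z_i][z_j]."""
    n = len(z)
    return [[0 if i == j else int(rng.random() < theta[z[i]][z[j]])
             for j in range(n)] for i in range(n)]


theta = [[0.8, 0.1], [0.1, 0.7]]   # illustrative connection probabilities
z = [0, 0, 1, 1]                   # illustrative group labels
adj = sample_sbm(z, theta, random.Random(0))
value = log_likelihood(adj, z, theta)
```

For a single pair, $s(1) \cdot n(\theta) = \log\theta$ and $s(0) \cdot n(\theta) = \log(1-\theta)$, which recovers the Bernoulli log-likelihood term by term.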

IV. DIFFERENTIALLY PRIVATE SBM

A. DIFFERENTIALLY PRIVATE SBM DESCRIPTION
This section gives the details of the differentially private SBM algorithm. Parameter learning normally uses maximum likelihood estimation. The EM algorithm introduces a probability distribution over the latent variables, which gives rise to a lower bound on the log likelihood. It iteratively alternates between updating the probability distribution over the latent variables and updating the parameters. Its iterative process is: E-step: Given the parameters {θ, π}, output the latent variables {z}. M-step: Given the latent variables {z}, update the parameters {θ, π}.
To satisfy differential privacy, a straightforward approach is to add perturbation noise to both the parameters and the latent variables directly in each iteration. However, this approach may accumulate a large amount of noise and perform poorly. The root causes are: (1) The parameters of each vertex require separate noise in each iteration. (2) The latent variables require noise in each iteration. Thus, we propose a differentially private SBM algorithm called SSN, which uses Variational Bayesian EM (VBEM) to compute the model parameters and obtain the differentially private SBM.
In VBEM, latent variables and model parameters are both treated as random variables, and their posterior distribution $\Pr(z, \theta, \pi \mid A)$ is learned. However, the posterior distribution is generally difficult to calculate. Instead, we use a factorizable distribution $q(z, \theta, \pi) = q(\theta, \pi) \cdot \prod_{i=1}^{N} q(z_i)$ to approximate the posterior distribution. As SBM falls in the conjugate-exponential (CE) family, the iterative process of VBEM consists of updating the parameters of the CE family:

VBE: $q(z_i) \propto \exp\left(\bar{n} \cdot s(A_{ij}, z_i)\right)$, where the expected natural parameters are $\bar{n} = \langle n(\theta, \pi) \rangle_{q(\theta, \pi)}$.

VBM: As the prior over the parameters is conjugate, $q(\theta, \pi)$ can be expressed as
$$q(\theta, \pi) = h(\eta', \nu') \, g(\theta, \pi)^{\eta'} \exp\left(\nu' \cdot n(\theta, \pi)\right),$$
where $\eta' = \eta + N$ and $\nu' = \nu + N \bar{s}(A_{ij})$ are the updated hyperparameters, with $\eta$ and $\nu$ the hyperparameters of the prior; $g(\theta, \pi) = \prod_i \pi_{z_i}$; and $h(\eta', \nu')$ is a normalizing constant. The expected sufficient statistics are $\bar{s}(A_{ij}) = \langle s(A_{ij}, z_i) \rangle_{q(z_i)}$.

To satisfy differential privacy in each iteration, we would need to add noise to both $q(z_i)$ and $q(\theta, \pi)$. For $q(\theta, \pi)$, we update $\nu$ by calculating the expected sufficient statistics $\bar{s}(A_{ij})$, and then compute $q(\theta, \pi)$. As the algorithm needs to look at the original data $A_{ij}$ when computing $\bar{s}(A_{ij})$, we need to add noise to $\bar{s}(A_{ij})$. When we compute $q(z_i)$, the algorithm also needs to look at the original data $A_{ij}$ directly, so noise would have to be added to $q(z_i)$ as well. However, adding noise to $q(z_i)$ directly produces an excessive amount of additive noise, because noise must be added to the latent variables of every vertex, even though we do not need to output the latent variables during the iteration.
To this end, we introduce our SSN algorithm. During the iteration process, we only need to add noise to $\bar{s}(A_{ij})$. The output of computing $q(z_i)$ is only used as the input for computing $\bar{s}(A_{ij})$, not for other variables. Moreover, it is not necessary to look at the original data $A_{ij}$ when computing the natural parameters $n(\theta, \pi)$. Thus, it is not necessary to add noise to $n(\theta, \pi)$. In Fig. 1, we show the framework of SSN.

B. THE GLOBAL SENSITIVITY OF SSN ALGORITHM
As discussed above, the SSN algorithm satisfies differential privacy by adding noise to the sufficient statistics during each iteration. The global sensitivity of the sufficient statistics is the maximum difference of the sufficient statistics when any vertex and its adjacent edges both change. More specifically, it equals $\max_{A, A'} \|\bar{s}(A) - \bar{s}(A')\|$, where $A'$ is the neighbor network. The $K \times K$ expected sufficient statistics of SBM correspond to the $K \times K$ bundles.
We assume that $P$ is the edge set between arbitrary groups $a$ and $b$, and the size of $P$ is $p$. $Q$ is the set of edges of $P$ that change when any vertex changes, and the size of $Q$ is $q$. We know that $q$ is less than the number of vertices in groups $a$ and $b$. The global sensitivity of the sufficient statistic between group $a$ and group $b$ is expressed in terms of $l_{i,z_i}$, the probability that vertex $i$ belongs to group $z_i$, which satisfies $l_{i,z_i} \le 1$. As a result, the global sensitivity satisfies $\max_{A, A'} \|\bar{s}(A) - \bar{s}(A')\| \le \frac{2q}{p}$.
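To make the arithmetic of the bound concrete, the sketch below turns $2q/p$ into a Laplace noise scale for one block-pair statistic. It is only an illustration: the helper names and the example values of $q$, $p$ and $\epsilon$ are assumptions, and the statistic is treated as a single scalar for simplicity.

```python
import random


def ssn_sensitivity_bound(q, p):
    """Upper bound 2q / p on the change of a block-pair statistic."""
    return 2.0 * q / p


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_statistic(stat, q, p, epsilon, rng):
    """Add Laplace noise with scale (2q / p) / epsilon to one statistic."""
    scale = ssn_sensitivity_bound(q, p) / epsilon
    return stat + laplace_noise(scale, rng)


# Hypothetical block pair: 200 edges, at most 10 of them touch one vertex.
noisy = perturb_statistic(0.35, q=10, p=200, epsilon=0.5, rng=random.Random(0))
```

Because the bound is divided by the number of edges $p$ in the block pair, the noise scale shrinks as the block pair grows, in contrast to perturbing every edge or every latent variable separately.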

V. DIFFERENTIAL PRIVACY FOR WEIGHTED NETWORK
We now formally describe the differentially private weighted network publishing method VB-WNDP. Unlike the methods which add noise directly to the weights, VB-WNDP uses the idea of partition in WSBM. The weights in the same group-group relationship obey the same distribution and have the same parameters. In this paper, we model the edge weights with the normal distribution, so the edge weights are real-valued. The parameter $\theta_{z_i z_j}$ governs the edges between group $z_i$ and group $z_j$; it depends only on the group memberships of vertices $i$ and $j$. It is parameterized by a mean and a variance, $\theta_{z_i z_j} = (\mu_{z_i z_j}, \sigma^2_{z_i z_j})$, and the likelihood is
$$\Pr(A \mid z, \theta) = \prod_{ij} \mathcal{N}(A_{ij} \mid \mu_{z_i z_j}, \sigma^2_{z_i z_j}) \propto \exp\Big(\sum_{ij} s(A_{ij}) \cdot n(\theta_{z_i z_j})\Big),$$
where the sufficient statistic is $s = (x, x^2, 1)$ and the natural parameter is $n = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}, -\frac{\mu^2}{2\sigma^2}\right)$. We define $r$ as the index over the $K \times K$ group pairs. Thus, the parameters $\Theta = \{\theta, \pi\}$ can be represented as $\Theta = \{\Theta_1, \Theta_2, \ldots, \Theta_r\}$. The sufficient statistic and the natural parameter of bundle $r$ are $s_r$ and $n_r$ respectively. The expected sufficient statistic can be represented as
$$\bar{s}_r = \frac{1}{p_r} \sum_{ij : (z_i, z_j) = r} s(A_{ij}) \cdot l_{i,z_i} \cdot l_{j,z_j},$$
where $p_r$ is the number of edges in $r$.
We add noise to satisfy differential privacy by using the idea of the SSN algorithm. The only variables that need to be perturbed are the expected sufficient statistics. We add Laplace noise to the expected sufficient statistics as $\tilde{s}_r = \bar{s}_r + Y_r$, where $Y_r \sim Lap(\Delta \bar{s}_r / \epsilon_r)$.

A. SENSITIVITY COMPUTATION OF VB-WNDP
We separate the edge into the edge existence and the edge weight. For the edge existence, 1 represents the existence of an edge and 0 represents its absence, so the linear summation of edge weights equals the number of edges. Under node differential privacy, we could then use the linear summation of edge weights to compute the sensitivity. However, for integer-valued edge weights, the linear summation of edge weights represents neither the number of edges nor the maximum change in the number of edges caused by the presence or absence of any vertex. Moreover, it does not represent the maximum change of a single edge weight either. As a result, we cannot use the linear summation of edge weights to compute the sensitivity under node differential privacy. So we use edge differential privacy, in which the neighbour network differs in only one edge, and use the change of an edge weight to compute the global sensitivity. The global sensitivity of $\bar{s}_r$ is $\Delta \bar{s}_r = \max_{A, A'} \|\bar{s}_r(A) - \bar{s}_r(A')\|$. We assume that the neighbour network $A'$ changes the maximum edge weight $A_0^r$ in $r$. As $s = (x, x^2, 1)$ is non-decreasing in $x$ for non-negative weights, we get $s(A_0^r) \ge s(A_{ij})$ for every other edge with $(z_i, z_j) = r$. So the global sensitivity is
$$\Delta \bar{s}_r = \frac{2}{p_r} s(A_0^r).$$
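The expected sufficient statistic of a bundle, its sensitivity $\frac{2}{p_r} s(A_0^r)$, and the Laplace perturbation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the data layout (`edges` as `(i, j, weight)` triples already assigned to bundle `r`, `memb[i][g]` standing for $l_{i,g}$) and all function names are assumptions, and the noise scale is applied componentwise because the text does not say how $\epsilon_r$ is shared across the three components of $s$.

```python
import random


def suff_stat(x):
    """Sufficient statistic s(x) = (x, x^2, 1) of a normal edge weight."""
    return (x, x * x, 1.0)


def expected_suff_stat(edges, memb, r):
    """(1 / p_r) * sum of s(A_ij) * l_{i,a} * l_{j,b} over edges in bundle r = (a, b)."""
    a, b = r
    p_r = len(edges)
    total = [0.0, 0.0, 0.0]
    for i, j, w in edges:
        coeff = memb[i][a] * memb[j][b]
        for k, s_k in enumerate(suff_stat(w)):
            total[k] += s_k * coeff
    return [t / p_r for t in total]


def sensitivity(edges):
    """(2 / p_r) * s(A_0^r), with A_0^r the maximum edge weight in the bundle."""
    a0 = max(w for _, _, w in edges)
    p_r = len(edges)
    return [2.0 * s_k / p_r for s_k in suff_stat(a0)]


def noisy_suff_stat(edges, memb, r, eps_r, rng):
    """Perturb each component with Laplace noise of scale sensitivity / eps_r."""
    s_bar = expected_suff_stat(edges, memb, r)
    noisy = []
    for value, delta in zip(s_bar, sensitivity(edges)):
        scale = delta / eps_r
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        noisy.append(value + noise)
    return noisy


edges = [(0, 1, 2.0), (0, 2, 4.0)]   # illustrative bundle with two edges
memb = [[1.0], [1.0], [1.0]]         # every vertex fully in group 0
s_tilde = noisy_suff_stat(edges, memb, (0, 0), eps_r=1.0, rng=random.Random(0))
```

With these two edges the unperturbed statistic is `[3.0, 10.0, 1.0]` and the sensitivity is `[4.0, 16.0, 1.0]`. The second component shows how a single large weight dominates the noise added to the squared-weight term.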

B. VB-WNDP ALGORITHM DESCRIPTION
The weighted network differential privacy algorithm VB-WNDP is shown in Algorithm 1. The likelihood has the form of an exponential family, so we can compute the sufficient statistics and the natural parameters. We use the idea of the SSN algorithm to satisfy differential privacy. In VBEM, we aim to compute an approximation to the posterior distribution. In the conjugate-exponential (CE) class, we can update the hyperparameters $(\eta, \nu)$ during the iteration process to obtain this approximation. During the iteration process, we add noise to the expected sufficient statistics and use them to update the hyperparameter $\nu$. The iteration continues until the hyperparameters converge. At last, for each pair of vertices, we sample the edge weight through the noisy parameters of the exponential family, and thus obtain a sanitized synthetic network. The process of parameter learning is shown in Algorithm 2. We first initialize the latent variables $l$. Vertices are assigned to each group with the same probability, so we set the initial value of the latent variables as $l = 1/K$ (line 1). As the model parameters are divided into $K \times K$ bundles, the privacy parameter $\epsilon$ is divided into $\epsilon_r = \log\left(K^2 \cdot (e^{\epsilon} - 1) + 1\right)$ by the sequential composition property of differential privacy (line 2). To compute the expected sufficient statistics of each edge bundle, we need to compute the sufficient statistic of each edge in $r$ (line5-7). To satisfy differential privacy, we need to add Laplacian noise to $\bar{s}_r$. The global sensitivity of $\bar{s}_r$ is $\Delta \bar{s}_r = \frac{2}{p_r} s(A_0^r)$ by (10) (line8-9). Based on the conjugate property of the exponential family, we update the hyperparameter $\nu_r$ by the perturbed statistic $\tilde{s}_r$ and then compute the posterior distribution of the parameters $(\theta, \pi)$ (line10-11). To compute the latent variables, we need to compute the expected natural parameters $\bar{n}_r$ (line 12).
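The final sampling step can be illustrated by inverting the natural parameters of the normal distribution, $n = (\mu/\sigma^2, -1/(2\sigma^2), \ldots)$, back to a mean and variance and then drawing one weight per vertex pair. This is a hedged sketch: it assumes hard 0-based group labels, a directed network without self-loops, and a table `natural[a][b]` holding the first two natural parameters of bundle $(a, b)$; none of these names come from the paper.

```python
import math
import random


def normal_from_natural(n1, n2):
    """Invert n1 = mu / sigma^2 and n2 = -1 / (2 sigma^2) to (mu, sigma^2)."""
    if n2 >= 0.0:
        raise ValueError("n2 must be negative to give a valid variance")
    variance = -1.0 / (2.0 * n2)
    mean = n1 * variance
    return mean, variance


def sample_synthetic_weights(labels, natural, rng):
    """Sample a weight for every ordered pair i != j from its bundle's normal."""
    n = len(labels)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            mean, variance = normal_from_natural(*natural[labels[i]][labels[j]])
            weights[i][j] = rng.gauss(mean, math.sqrt(variance))
    return weights


# Illustrative parameters: every bundle has mu = 2 and sigma^2 = 4.
natural = [[(0.5, -0.125), (0.5, -0.125)], [(0.5, -0.125), (0.5, -0.125)]]
synthetic = sample_synthetic_weights([0, 0, 1], natural, random.Random(0))
```

Noise can push the second natural parameter to a non-negative value, in which case no valid variance exists; the sketch raises an error there, since the text does not describe how such cases are post-processed.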
As it is unnecessary to look at the data $A$ directly when computing $\nu_r$ and $\bar{n}_r$, we only need to add noise to the result of computing $\bar{s}_r$. Suppose updating $\nu$ until convergence takes $J$ iterations; then the whole iterative process takes $O(JK^2)$ time. When computing the latent variable $l_{i,z_i}$ of each vertex, vertex $i$ must sum over its connected vertices in $r$. Then we get $l_{i,z_i}$ in exponential family form (line14-16). Moreover, we do not need to add noise to $l_{i,z_i}$ directly. The reason is that it is unnecessary to output $l_{i,z_i}$ most of the time. Computing the latent variables of all $N$ vertices over the whole iterative process takes $O(JN)$ time.

Algorithm 2 The Process of Parameter Learning
Input: Input network $A$, group number $K$, privacy parameter $\epsilon$
Output: Sanitized model parameters $(\theta, \pi)$
1. Initialize $l = 1/K$.
2. Divide privacy parameter $\epsilon$ into $\epsilon_1, \epsilon_2, \ldots, \epsilon_{K^2}$, where $\epsilon_r = \log\left(K^2 \cdot (e^{\epsilon} - 1) + 1\right)$.
3. repeat
4.   for $r = 1, 2, \ldots, K^2$ do
5.     Compute the edge number $p_r$ in bundle $r$.
6.     Compute the sufficient statistic $s(A_{ij})$ of each edge in $r$.
7.     Compute the expected sufficient statistic $\bar{s}_r = \frac{1}{p_r} \sum_{ij : (z_i, z_j) = r} s(A_{ij}) \cdot l_{i,z_i} \cdot l_{j,z_j}$.
8.     $A_0^r \leftarrow$ maximum edge in $r$, $\Delta \bar{s}_r = \frac{2}{p_r} s(A_0^r)$.
9.     Add Laplacian noise to $\bar{s}_r$ and get $\tilde{s}_r = \bar{s}_r + Y_r$, where $Y_r \sim Lap(\Delta \bar{s}_r / \epsilon_r)$.

C. PRIVACY ANALYSIS OF VB-WNDP
Using the sequential composition theorem of differential privacy, we can prove that VB-WNDP ensures $\epsilon$-differential privacy.
Theorem 5: VB-WNDP satisfies $\epsilon$-differential privacy. Proof: Suppose the hyperparameter $\nu$ and the latent variable $l$ converge after $J$ iterations. In Algorithm 2, the iterative process of each bundle satisfies $\epsilon_{iter}$-differential privacy. Based on Theorem 4, each iteration satisfies $\log\left(1 + \frac{1}{K \times K}(e^{\epsilon_{iter}} - 1)\right)$-differential privacy, so the whole iterative process satisfies $J \cdot \log\left(1 + \frac{1}{K \times K}(e^{\epsilon_{iter}} - 1)\right)$-differential privacy. After the model parameters are obtained, outputting the synthetic network does not consume any privacy budget. Hence, VB-WNDP satisfies $\epsilon$-differential privacy with $\epsilon = J \cdot \log\left(1 + \frac{1}{K \times K}(e^{\epsilon_{iter}} - 1)\right)$.
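The budget formula of Theorem 5 and its inverse can be checked numerically with a short sketch. The function names are illustrative, and splitting the target budget evenly over the $J$ iterations is an assumption made for the inverse direction.

```python
import math


def total_epsilon(eps_iter, k, num_iters):
    """epsilon = J * log(1 + (exp(eps_iter) - 1) / (K * K))."""
    return num_iters * math.log(1.0 + (math.exp(eps_iter) - 1.0) / (k * k))


def eps_iter_for_target(epsilon, k, num_iters):
    """Per-bundle budget that yields a total of epsilon after J iterations."""
    per_iteration = epsilon / num_iters
    return math.log(k * k * (math.exp(per_iteration) - 1.0) + 1.0)


eps_iter = eps_iter_for_target(epsilon=1.0, k=4, num_iters=10)
recovered = total_epsilon(eps_iter, k=4, num_iters=10)
```

With a single iteration the inverse reduces to $\epsilon_r = \log(K^2 (e^{\epsilon} - 1) + 1)$, the split used in line 2 of Algorithm 2. A larger $K$ gives each bundle a larger per-bundle budget for the same total $\epsilon$.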

VI. EXPERIMENTAL RESULTS
In this section, we evaluate the performance of the proposed algorithms on several real-world networks. Since differential privacy relies on random noise, we measure the accuracy of the result by the median relative error over 10 runs of the Laplace mechanism. (2) We evaluate the utility of the VB-WNDP algorithm over three real-life weighted network datasets, namely Bison [28], Macaque [29] and Residence hall [30]. Bison records the usual aggressive behaviors (fighting, nod-threats, broadside threats, head-on threats, rush threats and supplanting) among 26 males in a herd of American bison. Observations were recorded for 12 hours per day from July 25 through August 14, 1972 on the National Bison Range in Moiese, Montana. A node represents a bison and an edge represents dominance of the left bison over the right bison. Macaque records dominance of the row animal over the column animal in a colony of 62 adult female Japanese macaques (Macaca fuscata fuscata). They are known as the "Arashiyama B group". Records were made during the non-mating season, April to early October, 1976. A node represents a macaque and a directed edge from A to B represents dominance of macaque A over macaque B. Residence hall collects friendship data among the 217 residents living at a residence hall located on the Australian National University campus. A node represents a person and an edge represents a friendship. The statistics of these datasets are shown in Table 2.

B. EVALUATION OF SSN
To show the utility of the SSN algorithm, we compare the Normalized Mutual Information (NMI). NMI is a measure that scores the accuracy of community detection. The NMI is defined in terms of $H(X)$, the Shannon entropy of $X$, and the mutual information $I(X, Y) = H(X) - H(X \mid Y)$. It takes a value close to one if the assignments are identical and zero if they are uncorrelated. We compare with a straightforward method which adds noise to the model parameters {θ, π} directly during the iterative process of SBM. We set the privacy budget $\epsilon$ to 0.5, 1.0, 1.5, 2.0 and 2.5. From Fig. 2, we can see that SSN outperforms the straightforward method. When the privacy budget is relatively large, its NMI always stays high. The NMI rises with $\epsilon$, because the scale of the noise decreases as $\epsilon$ increases. From Figs. 2(a) and 2(b), we can see that the NMI increases gradually as $\epsilon$ increases. However, in Fig. 2(c), the NMI increases sharply as $\epsilon$ increases. The reason is that Karate is smaller than Adjnoun and Football, so adding more noise has a greater influence on its NMI.
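A minimal sketch of an NMI computation between two community assignments is given below. The exact normalization used in the paper is not recoverable from the text, so the sketch assumes the common form $2 I(X, Y) / (H(X) + H(Y))$; the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(X) of a label sequence, in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """I(X, Y) = sum over (a, b) of p(a, b) * log(p(a, b) / (p(a) * p(b)))."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def nmi(x, y):
    """Assumed normalization: 2 * I(X, Y) / (H(X) + H(Y))."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 1.0  # both assignments put every vertex in one group
    return 2.0 * mutual_information(x, y) / (hx + hy)


true_groups = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]    # same partition with swapped labels
unrelated = [0, 1, 0, 1]     # partition independent of the true one
```

Here `nmi(true_groups, relabelled)` evaluates to one up to rounding, because NMI ignores label names, while `nmi(true_groups, unrelated)` is zero.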

C. EVALUATION OF VB-WNDP
To show the utility of the VB-WNDP algorithm, we compare the relative error of edge weights and evaluate the relative error of VB-WNDP under different group numbers K. We compute the relative error of edge weights over the $m$ edges of the network, where $\tilde{A}_{ij}$ is the differentially private output for the edge weight $A_{ij}$. In Fig. 3, we evaluate the relative error of edge weights against two methods. The first adds noise to the edge weights directly; we name it Lap-edge. The second is MB-CI. Moreover, we use the method that generates synthetic networks by SBM without differential privacy as a baseline. We set the privacy budget $\epsilon$ to 0.5, 1.0, 1.5, 2.0 and 2.5.
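The evaluation metric can be sketched as follows. The exact formula is not recoverable from the text, so this assumes the relative error is the mean of $|\tilde{A}_{ij} - A_{ij}| / |A_{ij}|$ over the $m$ edges, and that the reported value is the median over repeated runs (10 runs in the experimental setup); the function names are illustrative.

```python
import statistics


def relative_error(original, private):
    """Assumed form: mean of |A~_ij - A_ij| / |A_ij| over the m edges."""
    m = len(original)
    return sum(abs(p - a) / abs(a) for a, p in zip(original, private)) / m


def median_relative_error(original, private_runs):
    """Median of the relative error over repeated private releases."""
    return statistics.median(relative_error(original, run) for run in private_runs)


weights = [2.0, 4.0]                              # illustrative true edge weights
runs = [[3.0, 4.0], [2.0, 4.0], [4.0, 4.0]]       # illustrative private outputs
score = median_relative_error(weights, runs)
```

The three runs have relative errors 0.25, 0.0 and 0.5, so `score` is 0.25. Edges are assumed to have non-zero weight, otherwise the ratio is undefined.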
From Fig. 3, we can see that VB-WNDP outperforms both Lap-edge and MB-CI. When the privacy budget is relatively large, its relative error always stays low. The relative error decreases as $\epsilon$ increases, because the scale of the noise decreases as $\epsilon$ increases.
In Fig. 4, we show how the group number K affects the relative error of edge weights. In this experiment, we set the privacy budget to 0.5, and we use the method which adds noise to the edge weights directly as a baseline. We set the group number K to 2, 4, 6 and 8. From Fig. 4, we can see that adding noise to the edge weights directly generates poor results. The relative error decreases as K increases. When K increases, the vertices are divided at a finer granularity. As each group-group partition has independent parameters, the noise added to each partition is independent of the others. By dividing the vertices into a greater number of groups, the parameters become more accurate.

VII. CONCLUSION
In this paper, we investigate the problem of differential privacy for weighted networks. We observe that the structural role of a node affects the topological structure of the network, and that it is necessary to take it into account in differential privacy for weighted networks. We introduce a differential privacy algorithm for the stochastic block model, named SSN, to solve the problem. By leveraging this technique, we also design a differential privacy method for weighted networks named VB-WNDP. It improves on the utility of the method which adds noise to edges directly. In particular, VB-WNDP establishes a probability model of the weighted network through Variational Bayes. To improve the accuracy, we add noise to the sufficient statistics instead of the model parameters during the iteration process. Privacy analysis and the results of extensive experiments on real datasets show that our algorithm can achieve high data utility.