Trust Computation in Online Social Networks Using Co-Citation and Transpose Trust Propagation

Evaluating trust and distrust between users in online social networks is an important research problem. To address this problem, we provide a method for estimating a continuous trust/distrust value between unconnected users. Our method is based on co-citation and transpose trust propagation. We determine, on average, how differently two users trust or are trusted by other users, and how differently a user trusts another user from how it is trusted by that user. Using these differences, we compute four partial trust estimates and obtain the final trust value from trustor to trustee as their weighted average. We propose a basic framework that maximizes accuracy, robustness and coverage, and show how accuracy can be further improved at a lower coverage. Experiments on real-world trust-related networks show that our proposed method outperforms recent state-of-the-art trust computation methods in terms of accuracy and robustness on commonly used datasets.


I. INTRODUCTION
In online social networks (OSNs), trust plays an important role in user activities. It allows users to distinguish between reliable and malicious users and the content they produce. Knowing the level of trust or distrust between users is also valuable for OSN service providers, as it can be used for tasks such as suggesting friends, detecting malicious or spam users, and community detection. Several e-commerce platforms such as Epinions, eBay and Amazon also have a social network component, where trust helps users make decisions regarding the reliability of reviews and product purchases [1]. Trust has also been used to improve the performance of recommender systems [2] and collaborative filtering algorithms [3]. For users that are directly connected by friendship or like/dislike links, trust can be estimated by analyzing their past interactions or profile similarity, or by obtaining explicit trust ratings given by the users. However, since most users do not have a direct link or previous interaction with each other, estimating trust between these unconnected users is an important research problem.
The associate editor coordinating the review of this manuscript and approving it for publication was Xiao Liu .
To evaluate trust between unconnected users, most earlier algorithms use transitive trust propagation, i.e. if user A trusts user B and user B trusts user C, then A should have some degree of trust in C. The level of trust is computed by following chains of trust links (paths) in the trust graph. These algorithms need to extract all, or a very large number of, paths and are therefore not efficient on large networks. They can also lack robustness, i.e. accuracy may decrease when parts of the network are not visible, since missing edges make it harder to find paths. Moreover, since distrust is not transitive, it is not straightforward to use, and hence most trust computation algorithms choose to ignore it.
Recently, with the availability of more datasets containing both trust and distrust information, a few algorithms [15]-[17] have been developed that estimate both trust and distrust as a continuous value. However, some of these algorithms are not efficient enough to be applied to very large networks or are not based on trust propagation, and their accuracy and robustness can be further improved. For this purpose, we use co-citation and transpose trust propagation to develop a method for estimating a continuous trust/distrust value that is more accurate and robust than other recent algorithms and is efficient enough to be applied to large networks. Furthermore, our method demonstrates the use of the co-citation and transpose trust propagation operations (originally proposed for binary trust) for continuous trust/distrust estimation. To estimate the trust/distrust value, we combine information from the neighboring users of the trustor (the evaluating user) and the trustee (the user being evaluated), and the trust from the trustee to the trustor (if available). More specifically, we use information from four sources:
1. Trustee's in-neighbors: We find how differently the trustee's in-neighbors and the trustor trust some of the other users, and use this difference along with the in-neighbors' trust of the trustee to find a partial trust estimate based on the trustee's in-neighbors.
2. Trustor's out-neighbors: We find how the trustor's out-neighbors and the trustee differ in being trusted by some other users. This difference and the trustor's trust in its out-neighbors are used to compute a partial trust estimate based on the trustor's out-neighbors.
3. Trustor's reciprocal neighbors: We find how differently the trustor trusts some of its reciprocal neighbors from the trust it receives from them. We then use this difference and the trust from trustee to trustor to get another partial estimate.
4. Trustee's reciprocal neighbors: We find how much the trust the trustee receives from some of its reciprocal neighbors differs from the trust it assigns to them. We use this difference and the trust from trustee to trustor to find a partial trust estimate.
The final trust is the aggregation of the four partial trust estimates. To verify our method, we perform a series of experiments on multiple real-world trust network datasets, which show that our framework performs better in terms of accuracy and robustness than other recent trust computation methods.
The rest of the paper is organized as follows. Section 2 reviews some related works regarding trust computation in OSNs. Section 3 defines the problem and notations used. In Section 4 we describe the details of our trust estimation framework and algorithms. The experimental results are presented in Section 5. Section 6 concludes the paper.

II. RELATED WORKS
Considering the importance of trust in online social networks, a large amount of research has been done on evaluating trust between users. References [4], [5] survey different trust evaluation methods for online social networks. Trust computation methods can be categorized based on whether they include distrust or not. Early works that do not include distrust include TidalTrust [6], MoleTrust [7] and EigenTrust [8]. TidalTrust propagates trust recursively from trustee to trustor along the strongest shortest path. MoleTrust first removes cycles from the graph and then calculates the trust of nodes at distance 1, 2 and so on, up to a maximum depth. EigenTrust works in a similar way to the PageRank algorithm and assigns a global trust value to each node. In SWTrust [9] the authors use adjustable breadth-first search and the small-world characteristics of social networks to extract a small trusted graph and make existing trust evaluation methods more efficient. Kim and Song [10] evaluate min-max and weighted mean strategies using reinforcement learning for predicting trust. In [11] we use a landmark-based method to make min-max trust computation strategies efficient for large graphs. Jiang et al. [12] convert the trust computation problem into a generalized network flow problem and present a modified flow-based algorithm.
Methods that include distrust include [13], where Mishra et al. propose the global metrics bias and deserve. The bias of a trustor represents its truthfulness or propensity to trust/distrust, and deserve (or prestige) represents how much trust/distrust a trustee would receive from an unbiased trustor. Yao et al. [14] propose a trust inference model that integrates transitivity, trust bias and the multi-aspect property of trust for inferring binary and continuous trust scores and trust/distrust signs. More recently, Kumar et al. [15] present the global metrics Fairness and Goodness in weighted signed networks, which can be used to compute local trust by multiplying the Fairness of the trustor and the Goodness of the trustee. In [16] the authors present a semiring-based trust aggregation method that handles both trust and distrust to infer trust for trust-based recommender systems. Akilal et al. [17] present a fast and robust trust computation method based on controversy, eclecticism and reciprocity that handles both trust and distrust. They use the trusting and being-trusted patterns of trustor and trustee to compute trust from trustor to trustee. Our method improves on the accuracy and robustness of these recent methods and is also based on trust propagation operations, i.e. co-citation and transpose trust propagation.

III. PROBLEM DEFINITION AND NOTATIONS
We represent the trust network as a weighted directed graph G(V, E, W). V is the set of nodes representing users in the trust network. E is the set of directed edges representing trust relationships between users. W is the weight function that assigns a weight to each edge denoting the level of trust. The edge weights for networks having both trust and distrust links are from the interval [-1,1], with positive values representing trust and negative values representing distrust. For trust, larger values (closer to 1) represent stronger trust. For distrust, smaller values (closer to -1) represent stronger distrust. Networks having only trust links have edge weights from the interval [0,1], where larger values represent stronger trust. Throughout the paper, trust estimation refers to trust/distrust, i.e. a positive estimated value is trust and a negative one is distrust. Given a trust network and two users having no direct edge, a trustor user S and a trustee user T, the trust computation problem is to find the most accurate level of trust from S to T. In graph theory terms, we have to predict the weight and sign of the edge from the trustor node to the trustee node, i.e. w(S,T). This is the continuous trust prediction problem and is different from binary trust prediction, where the objective is to determine whether S should trust T or not.
To compute trust between unconnected users, trust computation methods use trust propagation operations. These operations are transitive (direct) propagation, transpose trust, co-citation and trust coupling [18]. Figure 1 shows these propagation operations. Most methods rely on transitive propagation, and very few are based on the other three operations. Our method is based on transpose and co-citation trust propagation. Table 1 describes some of the notations used in the paper.
VOLUME 8, 2020

IV. TRUST ESTIMATION FRAMEWORK
To compute trust, our basic idea is that for two users A and B we can find how differently they trust, or are trusted by, other users. We can also find how differently a user trusts its reciprocal neighbors from the trust it receives from them. This information about differences can be used to estimate unknown trust values.
To estimate trust from trustor to trustee we use four sources of information. From each source we compute a partial trust estimate (PTE) and then use these PTEs to compute the final trust. Our trust estimation framework has two versions. The basic version maximizes the coverage, i.e. the number of pairs of users for which trust can be estimated; it is robust to missing edges and has the maximum aggregate accuracy, as shown in the experimental results section. The increased accuracy version further improves accuracy at a lower coverage by applying some restrictions.

1) PTE 1 : TRUSTEE'S IN-NEIGHBORS
To compute the trust level from trustor to trustee, the most important information comes from the in-neighbors of the trustee, as they have direct knowledge about the trustworthiness of the trustee. But since trust is subjective, i.e. different users may have different levels of trust in the same user, we will not get an accurate estimate if we directly use the in-neighbors' trust ratings of the trustee. However, if we knew how much each in-neighbor of the trustee and the trustor differ about trusting the trustee, we could use that in-neighbor's trust in a more accurate way. We can find an approximate estimate of this difference by using the in-neighbor's and the trustor's trust ratings of some other common users.
Consider Figure 2, which shows a section of the network containing trustor S, trustee T and one in-neighbor of T, namely B. Only one in-neighbor is shown for clarity; in general a trustee has several in-neighbors. To use B's trust of T in our framework, we find the difference between B and S in trusting other users. For this we observe the trust of B and S for common users whose trust rating from B is close to B's trust rating of T, i.e. the absolute difference between their trust from B and B's trust in T is less than or equal to a constant ε. These nodes are chosen because they can be considered similar to T from B's perspective, so the difference of trust in these similar nodes is more relevant to estimating the difference of trust about T. Let the set of common nodes between S and B for estimating S's trust of T using B be denoted as C_1(B):
$$C_1(B) = \{\, u : (S,u) \in E,\ (B,u) \in E,\ |w(B,u) - w(B,T)| \le \varepsilon \,\} \tag{1}$$
We use ε = 0.5 for all our datasets. In Figure 2 the set C_1(B) includes D, E and F but not C, since the absolute difference between B's trust in C and B's trust in T is greater than 0.5. This scenario is similar to the collaborative filtering problem; however, rather than using a correlation metric to find the similarity between S and B, we find the average difference between S's and B's trust values for the nodes in C_1(B), which we denote as D_1(B):
$$D_1(B) = \frac{1}{|C_1(B)|} \sum_{u \in C_1(B)} \bigl( w(S,u) - w(B,u) \bigr) \tag{2}$$
D_1(B) represents on average how much higher or lower than B user S trusts users that are similar to T, and can therefore be considered an approximate estimate of how much higher or lower than B, S should trust T. We add D_1(B) to B's trust rating of T to get the estimate of S's trust for T that uses B, denoted E_1(B):
$$E_1(B) = w(B,T) + D_1(B) \tag{3}$$
Some of the in-neighbors of the trustee will not have any similar common user with the trustor, i.e. for some in-neighbors x, C_1(x) is empty. We could ignore these in-neighbors and only use those where |C_1(x)| > 0; however, this would increase the number of trustees for which no in-neighbor x with |C_1(x)| > 0 exists, and in turn increase the number of pairs of users for which trust cannot be estimated. This effect is stronger when more edges are missing, so robustness would decrease. Therefore, in our basic framework we also include in-neighbors x for which C_1(x) is empty, by defining D_1(x) = 0 in that case so that E_1(x) = w(x,T). From the trust propagation perspective this estimate can be considered a form of co-citation trust propagation. As shown in Figure 2, according to co-citation propagation, if A trusts B and C, and D trusts B, then D may also trust C because both A and D have similar views on B. In our method for computing E_1(B), the trustee's in-neighbor B trusts D, E, F and T, whereas the trustor S trusts D, E and F, and so S may also trust T. The partial trust estimate based on the in-neighbors of the trustee is the average of the trust values estimated from each in-neighbor:
$$PTE_1 = \frac{1}{|in(T)|} \sum_{x \in in(T)} E_1(x) \tag{4}$$
where in(T) is the set of in-neighbors of T.
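The computation of the first partial estimate in the basic framework can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dict-of-dicts graph `w` (with `w[u][v]` the weight of edge (u,v)), the function name `pte1` and the example weights are all hypothetical and are not read from Figure 2.

```python
from statistics import mean


def pte1(w, s, t, eps=0.5):
    """Partial trust estimate from the trustee's in-neighbors (basic framework).

    w is a dict of dicts: w[u][v] is the trust weight of edge (u, v).
    Returns None when t has no in-neighbor other than s.
    """
    out_s = w.get(s, {})
    estimates = []
    for b, out_b in w.items():
        if b == s or t not in out_b:
            continue  # b must be an in-neighbor of t other than the trustor
        # Common users rated by both s and b whose rating from b is
        # within eps of b's rating of t (the set C_1(b) in the text).
        common = [u for u in out_b
                  if u != t and u in out_s and abs(out_b[u] - out_b[t]) <= eps]
        # Average of w(s, u) - w(b, u); defined as 0 when no common user exists.
        d1 = mean(out_s[u] - out_b[u] for u in common) if common else 0.0
        estimates.append(out_b[t] + d1)  # estimate of w(s, t) that uses b
    return mean(estimates) if estimates else None


# Hypothetical weights: C is excluded because |w(B,C) - w(B,T)| = 0.7 > 0.5.
example = {
    "S": {"D": 0.6, "E": 0.8, "F": 0.4, "C": 0.9},
    "B": {"T": 0.8, "D": 0.7, "E": 0.9, "F": 0.5, "C": 0.1},
}
print(pte1(example, "S", "T"))
```

In this example S rates each similar user 0.1 lower than B does, so the estimate is B's rating of T shifted down by about 0.1.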
2) PTE 2 : TRUSTOR'S OUT-NEIGHBORS
Apart from the in-neighbors of the trustee, another important source of information is the direct out-neighbors of the trustor, as they indicate the trusting behavior of the trustor, i.e. how much trust/distrust the trustor assigns to different types of users. To use the trustor's trust in its out-neighbors to estimate its trust in the trustee, we need to know how differently the out-neighbors and the trustee would be trusted by the trustor. We estimate this difference by observing the trust assigned to the trustor's out-neighbor and to the trustee by some other common nodes. Consider Figure 3, which shows a section of the network with one out-neighbor of the trustor, namely A. To use the trustor's trust of A in our framework, we find the difference in trust received by A and T from common users, i.e. C, D, E and F. As in the trustee's in-neighbors case, we use only those users that show trusting behavior similar to S, i.e. those whose trust in A is close to S's trust in A. So, for estimating S's trust in T using the trustor's out-neighbor A, the set of common nodes is given as
$$C_2(A) = \{\, u : (u,A) \in E,\ (u,T) \in E,\ |w(u,A) - w(S,A)| \le \varepsilon \,\} \tag{5}$$
In Figure 3, C_2(A) includes C, D and E but not F, since the difference between F's trust in A and S's trust in A is greater than 0.5. To find the difference between A and T in being trusted by other users, we find the average difference in the trust ratings of A and T by the users in C_2(A):
$$D_2(A) = \frac{1}{|C_2(A)|} \sum_{u \in C_2(A)} \bigl( w(u,T) - w(u,A) \bigr) \tag{6}$$
D_2(A) estimates on average how much higher or lower the nodes in C_2(A) trust T than A. This can be considered an approximate estimate of how much higher or lower S should trust T compared to A. The estimate based on the trustor's out-neighbor A, denoted by E_2(A), is given as
$$E_2(A) = w(S,A) + D_2(A) \tag{7}$$
To keep our framework robust, we also include those out-neighbors x of the trustor for which C_2(x) is empty: if the difference cannot be obtained, we define D_2(x) = 0. The estimate E_2(x) is also a form of co-citation trust propagation, since the users in C_2(x), i.e. C, D and E, each trust both A and T whereas S only trusts A, so according to co-citation propagation S may also trust T.
The partial trust estimate based on the out-neighbors of the trustor is the average of the trust values estimated using each out-neighbor:
$$PTE_2 = \frac{1}{|out(S)|} \sum_{x \in out(S)} E_2(x) \tag{8}$$
where out(S) is the set of out-neighbors of S.
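The out-neighbor estimate mirrors the in-neighbor one with the edge directions swapped. The sketch below is an illustration only: the dict-of-dicts graph `w`, the function name `pte2` and the example weights are assumptions introduced here and are not taken from Figure 3.

```python
from statistics import mean


def pte2(w, s, t, eps=0.5):
    """Partial trust estimate from the trustor's out-neighbors (basic framework).

    w is a dict of dicts: w[u][v] is the trust weight of edge (u, v).
    Returns None when s has no out-neighbor other than t.
    """
    estimates = []
    for a, w_sa in w.get(s, {}).items():
        if a == t:
            continue  # the edge (s, t) itself is what we are predicting
        # Users who rate both a and t and whose rating of a is within
        # eps of s's rating of a (the set C_2(a) in the text).
        common = [u for u, out_u in w.items()
                  if u != s and a in out_u and t in out_u
                  and abs(out_u[a] - w_sa) <= eps]
        # Average of w(u, t) - w(u, a); defined as 0 when no common user exists.
        d2 = mean(w[u][t] - w[u][a] for u in common) if common else 0.0
        estimates.append(w_sa + d2)  # estimate of w(s, t) that uses a
    return mean(estimates) if estimates else None


# Hypothetical weights: F is excluded because |w(F,A) - w(S,A)| = 0.8 > 0.5.
example = {
    "S": {"A": 0.6},
    "C": {"A": 0.5, "T": 0.7},
    "D": {"A": 0.8, "T": 0.9},
    "E": {"A": 0.4, "T": 0.3},
    "F": {"A": -0.2, "T": 0.5},
}
print(pte2(example, "S", "T"))
```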

3) PTE 3 : TRUSTOR'S RECIPROCITY
Trust is not symmetric: if a user A trusts user B, it does not mean that B also trusts A, or that the level of trust is the same in both directions. However, trust is not random, and many trust-related networks exhibit a certain degree of reciprocity; e-commerce related trust networks in particular have a high degree of reciprocity. According to transpose trust propagation, if A trusts B then B may also trust A. We can use reciprocity in our framework to improve accuracy and robustness, but only if there is a reciprocal trust link from T to S. Consider Figure 4, which shows a section of the network containing S, T, the edge (T,S) and the reciprocal neighbors of S and T. To estimate S's trust of T based on the trustor's reciprocity and T's trust rating of S, we need to know how differently S would trust T from T's trust of S. We find an approximate difference by using S's trust to and from its other reciprocal neighbors. Here also we only use those reciprocal neighbors of S whose trust rating of S is close to T's trust rating of S, since those have trusting behavior more similar to T. The set of such reciprocal neighbors is denoted by C_3(S):
$$C_3(S) = \{\, u : (S,u) \in E,\ (u,S) \in E,\ |w(u,S) - w(T,S)| \le \varepsilon \,\} \tag{9}$$
The difference between the trust that S assigns to its reciprocal neighbors and the trust it receives from them is then given as
$$D_3(S) = \frac{1}{|C_3(S)|} \sum_{u \in C_3(S)} \bigl( w(S,u) - w(u,S) \bigr) \tag{10}$$
As in the case of D_1(x) and D_2(x), if there are no similar reciprocal neighbors we assign D_3(S) = 0, so that we can still use T's trust rating of S in our trust prediction framework. The partial trust estimate using the trustor's reciprocity is
$$PTE_3 = w(T,S) + D_3(S) \tag{11}$$

4) PTE 4 : TRUSTEE'S RECIPROCITY
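In a similar way to the trustor's reciprocity, we can use the trustee's reciprocity in our trust estimation framework if the edge from T to S exists. We use the reciprocal neighbors of the trustee that have a trust rating from T similar to T's trust rating of S to find the average difference between the trust received by T from them and the trust it assigns to them:
$$C_4(T) = \{\, u : (T,u) \in E,\ (u,T) \in E,\ |w(T,u) - w(T,S)| \le \varepsilon \,\} \tag{12}$$
$$D_4(T) = \frac{1}{|C_4(T)|} \sum_{u \in C_4(T)} \bigl( w(u,T) - w(T,u) \bigr) \tag{13}$$
If there are no similar reciprocal neighbors we assign D_4(T) = 0, so that we can still use the edge (T,S) in our trust prediction framework. The partial trust estimate using the trustee's reciprocity is
$$PTE_4 = w(T,S) + D_4(T) \tag{14}$$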
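Both reciprocity-based estimates start from the reverse edge (T,S) and shift it by an average give/receive difference. A minimal sketch, assuming the same hypothetical dict-of-dicts graph `w`; the helper `reciprocal_neighbors`, the function names and the example weights are illustrative and do not come from the paper's pseudocode.

```python
from statistics import mean


def reciprocal_neighbors(w, x):
    """Users u with both edges (x, u) and (u, x) present."""
    return [u for u in w.get(x, {}) if x in w.get(u, {})]


def pte3(w, s, t, eps=0.5):
    """Trustor's reciprocity: w(t, s) plus how much more s gives than it receives."""
    if s not in w.get(t, {}):
        return None  # no reverse edge (t, s), so the estimate is undefined
    w_ts = w[t][s]
    # Reciprocal neighbors of s whose rating of s is close to t's rating of s.
    common = [u for u in reciprocal_neighbors(w, s)
              if u != t and abs(w[u][s] - w_ts) <= eps]
    d3 = mean(w[s][u] - w[u][s] for u in common) if common else 0.0
    return w_ts + d3


def pte4(w, s, t, eps=0.5):
    """Trustee's reciprocity: w(t, s) plus how much more t receives than it gives."""
    if s not in w.get(t, {}):
        return None
    w_ts = w[t][s]
    # Reciprocal neighbors of t that t rates similarly to how it rates s.
    common = [u for u in reciprocal_neighbors(w, t)
              if u != s and abs(w[t][u] - w_ts) <= eps]
    d4 = mean(w[u][t] - w[t][u] for u in common) if common else 0.0
    return w_ts + d4


# Hypothetical weights; R is excluded from pte3 since |w(R,S) - w(T,S)| = 0.9.
example = {
    "T": {"S": 0.6, "M": 0.8, "N": 0.4},
    "S": {"P": 0.9, "Q": 0.5, "R": -0.4},
    "P": {"S": 0.7}, "Q": {"S": 0.5}, "R": {"S": -0.3},
    "M": {"T": 0.6}, "N": {"T": 0.5},
}
print(pte3(example, "S", "T"), pte4(example, "S", "T"))
```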

5) FINAL TRUST
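The final trust is the weighted average of the partial trust estimates that have defined values. If all partial estimates are undefined, the final trust assigned is 0 (or any other default value, such as the average trust rating). For our datasets we keep the weights of all defined partial estimates as 1. Let w_i be the weight for partial trust estimate PTE_i, for i = 1,2,3,4, with w_i = 0 when PTE_i is undefined, and let ŵ(S,T) denote the final estimate:
$$\bar{P} = \frac{\sum_{i=1}^{4} w_i \, PTE_i}{\sum_{i=1}^{4} w_i} \tag{15}$$
$$\hat{w}(S,T) = \begin{cases} \bar{P}, & \sum_{i=1}^{4} w_i > 0 \\ 0, & \text{otherwise} \end{cases} \tag{16}$$
where undefined * 0 = 0.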
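The aggregation step is small enough to sketch directly. In the snippet below, undefined estimates are represented by `None`; this representation, the function name `final_trust` and the `default` parameter are assumptions made for illustration.

```python
def final_trust(ptes, weights=(1.0, 1.0, 1.0, 1.0), default=0.0):
    """Weighted average of the defined partial trust estimates.

    ptes holds the four partial estimates, with None marking an undefined one.
    An undefined estimate contributes nothing, as if its weight were 0.
    """
    num = 0.0
    den = 0.0
    for p, wt in zip(ptes, weights):
        if p is not None:
            num += wt * p
            den += wt
    # Fall back to the default when no estimate is usable.
    return num / den if den > 0 else default


print(final_trust([0.7, None, 0.5, None]))
print(final_trust([None, None, None, None]))
```

Setting a weight to 0 in `weights` drops that source even when its estimate is defined, which is how the reciprocity estimates can be switched off for low-reciprocity networks.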

B. INCREASED ACCURACY
In the basic framework we maximize the accuracy while ensuring trust is computable for maximum pairs of users i.e. the coverage is also maximized. However, we can further increase the accuracy of trust prediction albeit with a lower coverage. We can apply conditions to restrict which of the partial trust estimates are used or not used for final trust computation.
As mentioned earlier, in the basic framework we define D_i(x) = 0 when |C_i(x)| = 0 in Eq. (2), (6), (10) and (13) to maximize the number of pairs for which trust can be computed. However, since we now want to increase accuracy, we treat D_i(x) as undefined if |C_i(x)| = 0 and therefore do not use the information associated with that neighbor node (or with the reciprocal edge, for D_3(x) and D_4(x)).
For trustee's in-neighbors and trustor's out-neighbors we only use the PTE if it is based on at least a minimum number of estimates (users) and the variation among the estimates is small.
For the trustee's in-neighbors case, let indef(T) be the set of in-neighbors x of T for which |C_1(x)| > 0 and therefore D_1(x) is defined. We apply conditions on |indef(T)| and on the standard deviation std_1 of the individual estimates E_1(x). With thresholds tn and ts, PTE_1 is defined as
$$PTE_1 = \begin{cases} \dfrac{1}{|indef(T)|} \sum_{x \in indef(T)} E_1(x), & |indef(T)| \ge tn \text{ and } std_1 \le ts \\ \text{undef}, & \text{otherwise} \end{cases}$$
Similarly, for the trustor's out-neighbors case, outdef(S) is the set of out-neighbors x of S for which |C_2(x)| > 0 and D_2(x) is defined. We apply conditions on |outdef(S)| and on the standard deviation std_2 of the individual estimates E_2(x):
$$PTE_2 = \begin{cases} \dfrac{1}{|outdef(S)|} \sum_{x \in outdef(S)} E_2(x), & |outdef(S)| \ge tn \text{ and } std_2 \le ts \\ \text{undef}, & \text{otherwise} \end{cases}$$
For the trustor reciprocal case we get the partial trust estimate from the difference D_3(S) and the reciprocal trust. D_3(S) is defined only if |C_3(S)| > 0, and since D_3(S) is the mean of individual differences, we also apply a restriction on the variation in those individual differences. Similarly, for the trustee's reciprocal neighbors, D_4(T) is defined only if |C_4(T)| > 0, and we apply a condition on the variation of the individual differences.
The final trust is computed in the same way as in the basic framework, using Eq. (15) and (16). Alg. 1-4 are used to compute the PTEs and weights. As can be seen in Alg. 1-4, if a PTE is defined its weight is 1; otherwise the weight retains its initial value of 0. As described in Eq. (16), in lines 7 and 8 of Alg. 5, if all weights are 0 the final trust is 0; otherwise it is the average of the defined PTEs. We only show the pseudocode for the basic framework, as it is straightforward to include the restrictions for the increased accuracy case.
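The restriction on the neighbor-based estimates can be sketched as a filter applied before averaging. The function name `restricted_pte` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form is used.

```python
from statistics import mean, pstdev


def restricted_pte(estimates, tn, ts):
    """Average the individual estimates only if they are numerous and consistent.

    estimates: individual estimates E_i(x) from neighbors x whose difference
               D_i(x) is defined (non-empty common set).
    tn: minimum number of estimates required.
    ts: maximum allowed standard deviation (population form, an assumption).
    Returns None when the partial estimate is left undefined.
    """
    if not estimates or len(estimates) < tn:
        return None  # too few neighbors contribute
    if pstdev(estimates) > ts:
        return None  # neighbors disagree too much
    return mean(estimates)


print(restricted_pte([0.6, 0.7, 0.65, 0.7, 0.6], tn=5, ts=0.2))
print(restricted_pte([0.6, 0.7], tn=5, ts=0.2))
```

Raising tn or lowering ts leaves more estimates undefined, which is the source of the accuracy and coverage tradeoff described above.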

V. EXPERIMENTAL RESULTS
To verify the accuracy of our framework in predicting trust, we performed a series of experiments on four real-world trust-related datasets. We compare the performance of our framework with other recent trust prediction methods. Our framework outperforms existing methods in terms of accuracy and robustness to missing edges. In experiments 1 and 2 we use our basic framework without any additional restrictions to compare our method with existing methods, whereas in experiment 3 we evaluate the increased accuracy version.

A. DATASET DESCRIPTION
We used four real-world trust-related network datasets that have been used by various researchers to evaluate their trust computation methods. The Bitcoin and Advogato networks have explicit trust values as edge weights, whereas the WikiRFA network has implicit trust values. Table 2 shows the size of each dataset.

1) BITCOIN OTC AND BITCOIN ALPHA
These datasets are obtained from two Bitcoin exchanges, Bitcoin OTC and Bitcoin Alpha. These websites allow users to rate others according to how much they trust them. The scale is from +10 to −10, where +10 indicates maximum trust and −10 maximum distrust. The data was downloaded and scaled to the interval [-1,1] by [15] and is available at [19].

2) ADVOGATO
Advogato is an online social network for open source software developers. It allows members to rate the ability of other members at four levels: Observer, Apprentice, Journeyer and Master. These ratings are taken as a level of trust in another member's ability. We map Observer, Apprentice, Journeyer and Master to 0.1, 0.4, 0.7 and 0.9 respectively. This dataset has been widely used in evaluating trust-related metrics and algorithms. We use the snapshot of the network taken on 2014-07-07 [20].

3) WIKIPEDIA RFA
This is the Wikipedia request for adminship (RfA) dataset, where an edge (u,v) represents a vote of u on v becoming an administrator. The vote itself is −1 for negative, 0 for neutral and +1 for positive. To obtain a continuous weight, the comments in the votes were analyzed by [15] using the VADER sentiment engine; the difference between the positive and negative sentiment scores is used as the weight, which lies in the interval [−1,1]. The dataset has been used to evaluate fairness and goodness [15] and TOW [17], and is available at [19].

B. EXISTING METHODS
We compare the performance of our method with following existing methods

1) RECIPROCAL
This is the simplest method in which the estimated trust from S to T is equal to weight of edge (T,S) if it exists or 0 otherwise.

2) MIN-MAX ALL
According to this method, the strength of a trust path is equal to the weight of its minimum weight edge, and the strongest path is the path with the maximum minimum weight edge. The in-neighbor of T having the strongest path from S is considered the most reliable in-neighbor, and its direct trust in T is compared with the strength of that strongest path. The minimum of the two is the estimated trust value. If there are multiple most reliable in-neighbors, we use the maximum of the estimated trust values from those in-neighbors as the final trust value. This aggregation strategy is used in [9]-[11].
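One way to sketch this baseline is a widest-path (maximum bottleneck) variant of Dijkstra's algorithm followed by the min/max aggregation. This is an illustration under assumptions, not the implementation used in [9]-[11]: the dict-of-dicts graph `w`, both function names and the example weights are hypothetical, and node labels are assumed to be mutually comparable (e.g. strings) so that heap ties can be broken.

```python
import heapq


def strongest_path_strengths(w, s):
    """For every node reachable from s, the maximum over paths of the
    minimum edge weight on the path (widest-path variant of Dijkstra)."""
    best = {s: float("inf")}
    heap = [(-best[s], s)]  # max-heap via negated strengths
    done = set()
    while heap:
        neg, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, x in w.get(u, {}).items():
            cand = min(-neg, x)  # bottleneck of the path extended by (u, v)
            if cand > best.get(v, float("-inf")):
                best[v] = cand
                heapq.heappush(heap, (-cand, v))
    return best


def min_max_all(w, s, t):
    """Estimate w(s, t) from the most reliable in-neighbor(s) of t."""
    strength = strongest_path_strengths(w, s)
    cands = [(strength[n], w[n][t]) for n in w
             if n != s and t in w[n] and n in strength]
    if not cands:
        return None  # no in-neighbor of t is reachable from s
    top = max(st for st, _ in cands)
    # Ties among most reliable in-neighbors are resolved by taking the maximum.
    return max(min(st, x) for st, x in cands if st == top)


example = {
    "S": {"A": 0.9, "B": 0.6},
    "A": {"T": 0.5, "B": 0.7},
    "B": {"T": 0.8},
}
print(min_max_all(example, "S", "T"))
```

Here A is the most reliable in-neighbor (path strength 0.9 against 0.7 for B), so the estimate is the smaller of 0.9 and A's direct trust in T.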

3) MIN-MAX SHORTEST
This method is similar to Min-Max All; however, only the shortest paths are considered when finding the strongest path from S to the in-neighbors of T.

4) FAIRNESS AND GOODNESS
Reference [15] introduces two global metrics, fairness and goodness, of nodes in weighted signed networks. Fairness is how fairly a user rates other users' likeability or trust, whereas goodness indicates how much a user is liked/trusted by others. The trust from S to T is the product of the fairness of S and the goodness of T.

5) TOW (TUG OF WAR)
Reference [17] describes three characteristics of nodes, i.e. controversy, eclecticism and reciprocity. Trust is treated as a three-way tug of war between the controversy of the trustor, the eclecticism of the trustee and the reciprocity of the trustor.

C. EVALUATION METRICS
To evaluate our proposed method, we use the following two measures:

1) ROOT MEAN SQUARE ERROR
It is the square root of the average squared difference between the actual trust values and the predicted or estimated trust values. A smaller value indicates higher accuracy.

2) PEARSON CORRELATION COEFFICIENT
It measures the correlation or trend between the predicted and the actual trust values. Its value lies between -1 and 1, with a value closer to 1 indicating higher correlation.
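Both measures are standard and can be written in a few lines of plain Python. The function names `rmse` and `pcc` are chosen here for illustration; returning NaN for a constant input is a convention adopted in this sketch, since the correlation is undefined in that case.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def pcc(actual, predicted):
    """Pearson correlation coefficient; NaN if either sequence is constant."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_p)


print(rmse([0.0, 0.0], [1.0, -1.0]))
print(pcc([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```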

D. EXPERIMENT 1
The first experiment performed was leave-one-out. This experiment is commonly used to evaluate the accuracy of continuous trust prediction and edge weight prediction algorithms [15], [17]. The predicted trust value for a given pair of users is compared to the actual weight of the edge in the network. For each network we remove an edge, apply our trust estimation framework and the other existing methods to the network without that edge, and compare the predicted trust value with the actual trust value, i.e. the weight of the removed edge. We repeat this process for every edge in the network to find the average RMSE and PCC.

E. EXPERIMENT 2
There may be situations where parts of the network are not visible due to privacy reasons, or where nodes and their trust ratings are marked as unreliable or malicious and therefore cannot be used. A trust prediction method should be robust to such missing edges. To evaluate and compare the robustness of our framework with other methods, we performed a leave-N%-out experiment. In this experiment, for each dataset we randomly remove 10%, 20%, up to 90% of the edges and then use the remaining network to predict the weights of the removed edges. Since the removed edges are random, we repeat this process 20 times and find the average RMSE and PCC. We apply this experiment to the other methods and compare the results with our method. Figure 5 shows the results. As can be seen, the performance of our method degrades slowly and is better than that of the other methods in terms of both RMSE and PCC, especially on the Bitcoin datasets.
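The two evaluation protocols can be sketched as generic harnesses that accept any predictor. Everything here is illustrative: the dict-of-dicts graph `w`, the function names, the `seed` parameter and the toy graph are assumptions, and the Reciprocal baseline is used as the predictor only because it is the simplest one to state.

```python
import math
import random


def reciprocal(w, s, t):
    """Reciprocal baseline: weight of edge (t, s) if it exists, else 0."""
    return w.get(t, {}).get(s, 0.0)


def leave_one_out_rmse(w, predict):
    """Hide each edge in turn, predict its weight, and return the RMSE."""
    sq_errors = []
    for u in list(w):
        for v in list(w[u]):
            weight = w[u].pop(v)  # temporarily hide edge (u, v)
            try:
                sq_errors.append((weight - predict(w, u, v)) ** 2)
            finally:
                w[u][v] = weight  # restore the edge
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def leave_fraction_out_rmse(w, predict, fraction, seed=0):
    """Hide a random fraction of edges at once and predict all of them."""
    edges = [(u, v, x) for u, nbrs in w.items() for v, x in nbrs.items()]
    removed = random.Random(seed).sample(edges, int(round(fraction * len(edges))))
    hidden = {(u, v) for u, v, _ in removed}
    reduced = {u: {v: x for v, x in nbrs.items() if (u, v) not in hidden}
               for u, nbrs in w.items()}
    sq_errors = [(x - predict(reduced, u, v)) ** 2 for u, v, x in removed]
    return math.sqrt(sum(sq_errors) / len(sq_errors)) if sq_errors else float("nan")


toy = {"A": {"B": 0.5}, "B": {"A": 0.5, "C": 1.0}}
print(leave_one_out_rmse(toy, reciprocal))
print(leave_fraction_out_rmse(toy, reciprocal, 1.0))
```

Repeating `leave_fraction_out_rmse` with different `seed` values and averaging the results corresponds to the repeated random removals described above.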

F. EXPERIMENT 3
In this experiment we evaluated the increased accuracy version of our framework. As mentioned earlier, there is a tradeoff between accuracy and coverage. We tried different values of the thresholds tn and ts and measured the accuracy (RMSE and PCC) and the coverage. Table 4 shows the results with tn = 5 and tn = 10, when ts = 0.2 for the Bitcoin and WikiRFA datasets and ts = 0.15 for the Advogato dataset. Since the WikiRFA and Advogato datasets have low reciprocity, in this experiment we do not use PTE_3 and PTE_4 for these datasets, i.e. we set the weights w_3 and w_4 to 0. Comparing the RMSE and PCC obtained in this experiment with those of Table 3 for the Bitcoin datasets, our method gives much better accuracy than any other method at a good coverage of more than 80%. For the Advogato and WikiRFA datasets, although the coverage becomes low when we increase the accuracy, the results still show that accuracy on these datasets can also be increased by applying the restrictions.

VI. CONCLUSION
In this paper we propose a method for accurate and efficient estimation of trust and distrust in online social networks. Based on the idea of co-citation and transpose trust propagation, we show that the difference between two users in trusting or being trusted by other users can be used for accurate trust estimation. Our method gives better accuracy in terms of RMSE and PCC than existing methods.