Multi-View Robust Tensor-Based Subspace Clustering

In this era of technological advancement, huge amounts of data are collected from different disciplines. This data needs to be stored, processed, and analyzed to understand its nature. Networks, or graphs, arise to model real-world systems in these different fields. Early work in network theory adopted simple graphs, where a system's entities and the interactions among them are modeled as nodes and static, single-type edges, respectively. However, this representation is limited when the system's entities interact through different sources. Multi-view networks have recently attracted attention due to their ability to explicitly consider the different interactions between entities. An important tool to understand the structure of multi-view networks is community detection. Community detection, or clustering, reveals the significant communities in the network, which provides dimensionality reduction and a better understanding of the network. In this paper, a new robust clustering algorithm is proposed to detect the community structure in multi-view networks. In particular, the proposed approach constructs a 3-mode tensor from the normalized adjacency matrices that represent the different views. The constructed tensor is decomposed into self-representation and error components, where the extracted self-representation tensor is used to detect the community structure of the multi-view network. Moreover, a common subspace is computed among all views, where the contribution of each view to the common subspace is optimized. The proposed method is applied to several real-world data sets, and the results show that it achieves the best performance compared to other state-of-the-art algorithms.


I. INTRODUCTION
Recently, the world has witnessed great and rapid development in data science. This development has led to massive availability of structured data, where the structure reflects crucial information about the data. Over the past decades, graph and network theory has emerged to analyze and interpret this structure [1], [2]. In fact, real-world systems can be modeled as graphs where objects and the interactions between them are modeled as nodes and edges of the graph, respectively [3]. Early work adopted simple or static graphs to model systems, but this is considered inadequate for complex systems, especially when the data is collected from multiple resources or views [4], [5]. In particular, each view carries unique information about the system and reflects different interactions. This type of system can be modeled properly as a multi-view network [6], [7].
Learning from multi-view network-type data has become an essential and advanced task in machine learning [8].
In multi-view network representation, the interactions between the nodes are considered explicitly for each view. One of the most popular approaches to understanding the structure of multi-view networks is community detection [9]. Community detection aims to reduce the dimensionality of the network and partition it into a set of clusters or communities, where the nodes within a cluster are strongly connected with each other and sparsely connected with nodes from other clusters [10]. Early work in clustering multi-view networks adopted Single-View Clustering (SVC) techniques to reveal the structure of the network [11]. The widely used SVC techniques can be divided into two classes: the first aggregates all views into a single network and then applies classical SVC techniques [12]; the second applies a conventional SVC technique to each view separately and then uses ensemble or consensus clustering to determine the final community structure [13]. Examples of SVC include Low-Rank Representation (LRR) [14] and SubSpace Clustering (SSC) [15]. However, SVC approaches are insufficient to reveal the structure of the multi-view network since each view has its own statistical properties that are not considered properly during the aggregation process. Consequently, Multi-View Clustering (MVC) approaches have recently arisen to overcome the disadvantages of single-view clustering [16], [17]. These approaches include multi-view clustering based on LRR (MVC-LRR) [17], [18], [19], multi-view clustering based on robust principal component analysis (MVC-RPCA) [20], [21], [22] and multi-view clustering based on graphs (MVC-G) [23], [24], [25].
Under MVC-LRR, in [17], a Low-rank Tensor constrained Multi-view Subspace Clustering (LT-MSC) approach was proposed to recover the data structure by exploiting the complementary information of different views. Another technique, called Tensor Singular Value Decomposition for Multi-view Spectral Clustering (t-SVD-MSC), was proposed in [16]. This approach studied how to obtain the low-rankness of the representation tensor using the Tensor Nuclear Norm (TNN) instead of the sum of nuclear norms. On the other hand, the authors in [26] introduced a Diversity-induced Multi-view Subspace Clustering (DiMSC) approach, which studied multi-view clustering based on the Hilbert-Schmidt Independence Criterion (HSIC). The HSIC measures the dependence of variables by mapping them into a reproducing kernel Hilbert space. Furthermore, the authors in [27] introduced an Exclusivity-Consistency regularized Multi-view Subspace Clustering (ECMSC) algorithm. In particular, ECMSC studied multi-view clustering by using the complementary information between different representations through a position-aware exclusivity term. In addition, the authors in [19] proposed a Split Multiplicative Multi-View Subspace Clustering (SM²SC) approach, which is based on multiplicative multi-view clustering and a variable splitting scheme. However, the methods under MVC-LRR use the original data set, i.e. the data samples and their features, to recover a clean tensor. Moreover, they need to follow a two-step strategy to construct the affinity matrix from the recovered clean tensor, which is then used to detect the network's structure. In contrast, our proposed algorithm uses the multi-view network as an input and solves for the clean tensor and the community structure through a single optimization problem.
Under MVC-RPCA, the authors in [28] proposed a single-view clustering method based on robust graph learning. The proposed model vectorized all views as columns to create a single matrix and then applied RPCA to recover the low-rank matrix, which is used to detect the network's structure. For high-dimensional arrays, the authors in [20] proposed an Essential Tensor Learning for Multi-view Spectral Clustering (ETLMSC) approach that considers high-dimensional data, i.e. tensors. ETLMSC constructs the adjacency tensor using random walks and the transition probability matrix definitions. Also, tensor RPCA is used to recover the low-rank component, which is used to detect the network's structure using Markov chain spectral clustering. However, ETLMSC follows a two-step strategy to construct the affinity matrix from the recovered clean tensor, which is then used to detect the network's structure. In [22], the authors studied the different possible errors in multi-view features and proposed an error-robust multi-view spectral clustering model. In [29], the authors proposed a multi-view subspace clustering model based on transition probability matrix learning and a nonconvex low-rank tensor approximation instead of a convex nuclear norm. For more MVC-RPCA methods, please refer to [21] and [30].
Under MVC-G, in [31], the introduced model studied multi-view clustering based on partitioning the graph into specific clusters directly by learning the local manifold structure of the similarity matrix. Moreover, this method can automatically estimate the factors after finitely many iterations, making it suitable for real-world applications. On the other hand, a parameter-free algorithm called Adaptive Weighted Procrustes (AWP) was proposed in [32], which is based on weighting each view according to its clustering capacity. Another spectral clustering-based approach was proposed in [33], namely Co-regularized multi-view Spectral Clustering (CoRegSC). The CoRegSC approach studied multi-view clustering under the assumption that the clustering is consistent across views. In [24], a Multi-View for Graph Learning (MVGL) method was proposed. The purpose of MVGL is to enhance the quality of the graph by learning a global graph from different single-view graphs. The authors in [34] proposed a Multi-view Consensus Graph Clustering (MCGC) approach to minimize the disagreement between all views and constrain the rank of the Laplacian matrix. In addition, the model introduced in [35], called Weighted Multi-view Spectral Clustering (WMSC), was proposed to model the weights of all views using spectral perturbation, where the clustering results on each view are close to the consensus clustering result. In [36], the authors proposed a Non-negative Matrix Factorization with Co-orthogonal Constraints (NMF-CC) model. The NMF-CC model aimed to capture the intra-view diversity matrices to achieve the best clustering representation. In [37], the authors proposed a Consensus Graph Learning for multi-view clustering (CGL) method. The CGL method weights the tensor nuclear norm by assigning different weights to different singular values to improve the flexibility of the tensor nuclear norm in the low-rank approximation problem.
Also, the consensus similarity graph was constructed from the normalized multi-view spectral embedding matrices in an adaptive neighbor graph learning manner to study the network's structure. In [38], a Multiplex Cellular communities for Tissue Phenotyping (MCTP) approach was proposed to cluster multi-dimensional cellular data based on symmetric nonnegative matrix tri-factorization. In [39], a Common Subspace Fusion (CSF) model was proposed to track objects based on a low-rank response map representation of various features and trackers. In [40], the authors proposed a Consensus and Complementary information for Multi-View data (2CMV) model. More precisely, the 2CMV model studied the consensus and complementary information from all views based on Coupled Matrix Factorization (CMF) and NMF. At the same time, the authors in [41] proposed a Diverse Manifold for Multi-Aspect Data (DiMMA) approach to study the diverse manifold for data clustering by using distance information of data points from different views. However, the aforementioned methods under MVC-G are sensitive to noise, and their performance decays in the presence of noise and outliers. Table 1 summarizes examples of SVC and MVC methods and positions the proposed method among them.
In this paper, we propose a novel robust tensor-based multi-view clustering approach to detect the community structure in multi-view networks. In particular, the proposed objective function benefits from prior work in tensor decomposition and subspace decomposition to create a unified framework that jointly extracts a clean low-rank adjacency tensor from the original corrupted multi-view network and uses it to compute a common subspace across all views. The contributions of the proposed method are summarized as follows: 1) The proposed algorithm can efficiently recover the low-rank representation component from the noisy corrupted tensor by minimizing the tensor nuclear norm. In particular, the recovered low-rank tensor is a clean representation of the original network, which we use to detect the community structure of the network. 2) The proposed approach adopts the $L_{2,1}$-norm to minimize the error component in order to mitigate the effect of noise.
3) The proposed approach adopts Tucker decomposition to obtain a common subspace across all views, where the contribution of each view to the common subspace is optimized. The resultant common subspace is then used to obtain the nodes' cluster assignments. 4) The proposed objective function is solved efficiently using an iterative alternating approach. In particular, the low-rank self-representation tensor is obtained using the Alternating Direction Method of Multipliers (ADMM), and the common subspace is estimated using High Order Orthogonal Iteration (HOOI). 5) The performance of the proposed approach is evaluated by conducting extensive experiments on multiple real-world multi-view networks, and the results show that MV-RTSC outperforms other existing state-of-the-art algorithms. The organization of this paper is as follows. In Section II, the background and notations are summarized. The proposed method is introduced and optimized in Section III. The results and discussions are presented in Section IV. Finally, we conclude our work in Section V.

II. BACKGROUND AND NOTATIONS
A. NOTATIONS AND PRELIMINARIES
In this section, the notations and basic operations used throughout the paper are presented. We use $\{\mathcal{A}, A, a, a_{ij}\}$ for tensor, matrix, vector and scalar elements, respectively. Let $A$ be a matrix; its Frobenius norm is $\|A\|_F = \sqrt{\sum_{i}\sum_{j} a_{ij}^2}$. The SVD of $A$ is $A = Q_l \Sigma Q_r^{\top}$, where $Q_l$ and $Q_r$ are the left and right singular vector matrices, respectively, and $\Sigma$ is a diagonal matrix with the singular values of $A$ on its diagonal, where $\sigma_i(A)$ denotes the $i$-th singular value of $A$. The nuclear norm of $A$ is defined as the summation of its singular values, i.e. $\|A\|_* = \sum_i \sigma_i(A)$.
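As a concrete illustration (a sketch with NumPy, not the paper's MATLAB code), the two matrix norms defined above can be computed directly from their definitions:

```python
import numpy as np

def frobenius_norm(A):
    # ||A||_F = square root of the sum of squared entries
    return np.sqrt((A ** 2).sum())

def nuclear_norm(A):
    # ||A||_* = sum of the singular values of A
    return np.linalg.svd(A, compute_uv=False).sum()

# toy example: a diagonal matrix with singular values 4 and 3
A = np.array([[3.0, 0.0], [0.0, 4.0]])
fro = frobenius_norm(A)   # sqrt(9 + 16) = 5
nuc = nuclear_norm(A)     # 4 + 3 = 7
```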
For high-dimensional arrays, let $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ be a 3-mode tensor; a slice of $\mathcal{A}$ is 2-D while a fiber of $\mathcal{A}$ is 1-D. In MATLAB notation, $\mathcal{A}(j,:,:)$, $\mathcal{A}(:,j,:)$ and $\mathcal{A}(:,:,j)$ denote the $j$-th horizontal, lateral and frontal slices of $\mathcal{A}$, respectively. In this paper, we denote the $j$-th frontal slice of the tensor by $A^{(j)}$. Also, $\mathcal{A}(:,i,j)$, $\mathcal{A}(i,:,j)$ and $\mathcal{A}(i,j,:)$ denote mode-1, mode-2, and mode-3 fibers, respectively. The Fourier transform of $\mathcal{A}$ along the third mode can be computed as $\mathcal{A}_f = \mathrm{fft}(\mathcal{A}, [\,], 3)$ in MATLAB notation.
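The slice, fiber, and mode-3 Fourier-transform conventions above map directly onto NumPy indexing; a minimal sketch:

```python
import numpy as np

A = np.random.rand(4, 3, 5)      # a 3-mode tensor, n1 x n2 x n3
front = A[:, :, 0]               # first frontal slice A^(1), a 2-D array
lateral = A[:, 0, :]             # a lateral slice A(:, 1, :), also 2-D
tube = A[0, 0, :]                # a mode-3 fiber A(1, 1, :), a 1-D array
Af = np.fft.fft(A, axis=2)       # DFT along the third mode (fft(A, [], 3) in MATLAB)
```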

B. GRAPH THEORY
A simple graph can be defined as a triplet, G = {V, E, A} where V is the set of vertices and E is the set of edges.
The matrix $A \in \mathbb{R}^{n \times n}$ is known as the adjacency matrix, which indicates the similarity between each pair of vertices, where $A$ is an undirected (symmetric) nonnegative matrix and $a_{ij} \in [0, 1]$. The degree of vertex $i$ is defined as $d_i = \sum_{j=1}^{n} a_{ij}$, and the degree matrix $D$ is defined as the diagonal matrix with $\{d_1, d_2, \ldots, d_n\}$ on its diagonal [42], [43]. The normalized adjacency matrix is defined as $A_N = D^{-1/2} A D^{-1/2}$, which is a positive semi-definite matrix.
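The normalization $A_N = D^{-1/2} A D^{-1/2}$ can be sketched in a few lines of NumPy (the guard for isolated nodes is our addition, not from the paper):

```python
import numpy as np

def normalized_adjacency(A):
    # A_N = D^{-1/2} A D^{-1/2}, with D the diagonal degree matrix
    d = A.sum(axis=1)
    # avoid division by zero for isolated nodes (degree 0)
    d_inv_sqrt = 1.0 / np.sqrt(np.where(d > 0, d, 1.0))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# toy 2-node graph with a single edge: degrees are 1, so A_N equals A
A = np.array([[0.0, 1.0], [1.0, 0.0]])
A_N = normalized_adjacency(A)
```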
A multi-view network, $\mathcal{G}$, with $n$ nodes and $V$ views can be defined as a set of static or simple graphs that reflect the interactions between the nodes in the different views, i.e. $\mathcal{G} = \{G^{(1)}, G^{(2)}, \ldots, G^{(V)}\}$, where each view $G^{(v)}$ is described by its adjacency matrix $A^{(v)} \in \mathbb{R}^{n \times n}$.

C. TENSOR OPERATIONS
Let $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ be a 3-mode tensor; then the block circulant matrix can be computed as:
$$\mathrm{bcirc}(\mathcal{A}) = \begin{bmatrix} A^{(1)} & A^{(n_3)} & \cdots & A^{(2)} \\ A^{(2)} & A^{(1)} & \cdots & A^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ A^{(n_3)} & A^{(n_3-1)} & \cdots & A^{(1)} \end{bmatrix},$$
where $A^{(j)}$ represents the $j$-th frontal slice of the tensor $\mathcal{A}$. The block vectorizing operation is $\mathrm{bvec}(\mathcal{A}) = [A^{(1)}; A^{(2)}; \ldots; A^{(n_3)}]$, and the opposite operation, bvfold, satisfies $\mathrm{bvfold}(\mathrm{bvec}(\mathcal{A})) = \mathcal{A}$. The block diagonal matrix of a tensor $\mathcal{A}$ is given by $\mathrm{bdiag}(\mathcal{A}) = \mathrm{blkdiag}(A^{(1)}, \ldots, A^{(n_3)})$, and the opposite operation, bdfold, satisfies $\mathrm{bdfold}(\mathrm{bdiag}(\mathcal{A})) = \mathcal{A}$. Tensor Product: Let $\mathcal{J}_1 \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{J}_2 \in \mathbb{R}^{n_2 \times n_4 \times n_3}$. The tensor product (t-product) is the tensor $\mathcal{J}_1 * \mathcal{J}_2 \in \mathbb{R}^{n_1 \times n_4 \times n_3}$ given by $\mathcal{J}_1 * \mathcal{J}_2 = \mathrm{bvfold}(\mathrm{bcirc}(\mathcal{J}_1)\,\mathrm{bvec}(\mathcal{J}_2))$. Tensor Transpose: Let $\mathcal{A}$ be a 3-mode tensor of dimension $n_1 \times n_2 \times n_3$; its transpose $\mathcal{A}^{\top} \in \mathbb{R}^{n_2 \times n_1 \times n_3}$ is obtained by transposing each frontal slice and reversing the order of the transposed slices $2$ through $n_3$.
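In practice, the t-product is rarely formed through the block circulant matrix; it is computed slice-wise in the Fourier domain, since the DFT block-diagonalizes the circulant structure. A hedged sketch (our own illustration, not the paper's code):

```python
import numpy as np

def t_product(J1, J2):
    # t-product via the Fourier domain: fft along mode 3,
    # multiply matching frontal slices, then inverse fft.
    n3 = J1.shape[2]
    F1 = np.fft.fft(J1, axis=2)
    F2 = np.fft.fft(J2, axis=2)
    out = np.empty((J1.shape[0], J2.shape[1], n3), dtype=complex)
    for j in range(n3):
        out[:, :, j] = F1[:, :, j] @ F2[:, :, j]
    return np.real(np.fft.ifft(out, axis=2))

# the identity tensor (first frontal slice = I, rest zero) is the
# unit of the t-product, so A * I should return A
n, n3 = 3, 4
I = np.zeros((n, n, n3)); I[:, :, 0] = np.eye(n)
A = np.random.rand(n, n, n3)
```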
Identity Tensor: The tensor $\mathcal{I} \in \mathbb{R}^{n_1 \times n_1 \times n_3}$ is called the identity tensor if and only if its first frontal slice is the $n_1 \times n_1$ identity matrix and all its other frontal slices are zero.
Orthogonal Tensor: A tensor $\mathcal{A}$ is orthogonal if it satisfies $\mathcal{A}^{\top} * \mathcal{A} = \mathcal{A} * \mathcal{A}^{\top} = \mathcal{I}$. Tensor SVD (t-SVD): The t-SVD of $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is defined as $\mathcal{A} = \mathcal{Q}_l * \mathcal{S} * \mathcal{Q}_r^{\top}$, where $\mathcal{Q}_l \in \mathbb{R}^{n_1 \times n_1 \times n_3}$ and $\mathcal{Q}_r \in \mathbb{R}^{n_2 \times n_2 \times n_3}$ are the left and right singular vector tensors, respectively. The tensor $\mathcal{S} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is an f-diagonal tensor that contains the singular values on the diagonals of its frontal slices in the Fourier domain.
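Like the t-product, the t-SVD reduces to ordinary matrix SVDs of the frontal slices in the Fourier domain. A minimal sketch of this reduction, returning the Fourier-domain factors (an illustration under our own conventions, not the paper's implementation):

```python
import numpy as np

def t_svd(A):
    # slice-wise SVD in the Fourier domain
    n1, n2, n3 = A.shape
    Af = np.fft.fft(A, axis=2)
    Uf = np.empty((n1, n1, n3), dtype=complex)
    Sf = np.zeros((n1, n2, n3), dtype=complex)   # f-diagonal factor
    Vf = np.empty((n2, n2, n3), dtype=complex)
    for j in range(n3):
        u, s, vh = np.linalg.svd(Af[:, :, j])    # full matrices by default
        Uf[:, :, j] = u
        for i in range(len(s)):
            Sf[i, i, j] = s[i]
        Vf[:, :, j] = vh.conj().T
    return Uf, Sf, Vf

# sanity check: recomposing the Fourier-domain slices and inverting
# the fft should reproduce the original tensor
A = np.random.rand(4, 3, 5)
Uf, Sf, Vf = t_svd(A)
rec = np.stack([Uf[:, :, j] @ Sf[:, :, j] @ Vf[:, :, j].conj().T
                for j in range(A.shape[2])], axis=2)
A_rec = np.real(np.fft.ifft(rec, axis=2))
```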

T-SVD-Based Tensor Nuclear Norm (T-SVD-TNN):
Let $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$; then the T-SVD-TNN is defined as $\|\mathcal{A}\|_{\circledast} = \sum_{j=1}^{n_3} \sum_{i=1}^{\min(n_1,n_2)} |\mathcal{S}_f(i,i,j)|$, where $|\cdot|$ is the absolute value and $\mathcal{S}_f$ is the f-diagonal tensor of the t-SVD of $\mathcal{A}$ in the Fourier domain. Tucker Decomposition: Assume $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is a 3-mode tensor; the Tucker decomposition provides an orthogonal subspace along each mode of $\mathcal{A}$ and is given as [44]: $\mathcal{A} = \mathcal{C}_r \times_1 U^{(1)} \times_2 U^{(2)} \times_3 U^{(3)}$, where $\mathcal{C}_r \in \mathbb{R}^{r_1 \times r_2 \times r_3}$ is the core tensor, $U^{(i)} \in \mathbb{R}^{n_i \times r_i}$ is the orthogonal matrix along mode $i$, and $r_i \leq n_i$ is the rank along the $i$-th mode. The Tucker decomposition problem can be solved using HOOI.
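Since the TNN sums frontal-slice singular values in the Fourier domain, it can be sketched without materializing the full t-SVD (note that some TNN definitions divide this sum by $n_3$; we follow the unscaled convention here):

```python
import numpy as np

def tsvd_tnn(A):
    # sum of the singular values of every frontal slice in the Fourier domain
    Af = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(Af[:, :, j], compute_uv=False).sum()
               for j in range(A.shape[2]))

# with a single frontal slice the fft is the identity, so the TNN
# reduces to the matrix nuclear norm of that slice
A = np.zeros((2, 2, 1))
A[:, :, 0] = np.diag([3.0, 4.0])
val = tsvd_tnn(A)    # nuclear norm of diag(3, 4) = 7
```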
Multi-view Networks Construction: Given a multi-view data set $\{X^{(v)}\}_{v=1}^{V}$, where each view $X^{(v)} = [x_1^{(v)}, \ldots, x_n^{(v)}] \in \mathbb{R}^{d_v \times n}$ has $n$ samples and $d_v$ features, an adjacency tensor is constructed from the multi-view data set. This is achieved by first constructing a similarity matrix, $A^{(v)} \in \mathbb{R}^{n \times n}$, for each view, where the similarity matrix can be constructed using different similarity measures. In this paper, the similarity measure adopted to construct the similarity matrices is the Gaussian kernel similarity [42], defined as:
$$A^{(v)}(\alpha, \beta) = \exp\left(-\frac{\|x_\alpha^{(v)} - x_\beta^{(v)}\|_2^2}{2\sigma^2}\right), \quad (11)$$
where $\alpha, \beta \in \{1, 2, \ldots, n\}$, $\|\cdot\|_2$ is the $L_2$-norm, and $\sigma$ is used to control the width of the neighborhoods. After constructing the set of similarity matrices $\{A^{(v)}\}_{v=1}^{V}$, a 3-mode tensor, $\mathcal{A} \in \mathbb{R}^{n \times n \times V}$, is constructed by concatenating the adjacency matrices so that each adjacency matrix is a frontal slice of the adjacency tensor, $\mathcal{A}(:,:,v) = A^{(v)}$.
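The construction above can be sketched as follows; the $2\sigma^2$ scaling is one common Gaussian-kernel convention and is an assumption on our part, since the exact scaling is not fixed by the text:

```python
import numpy as np

def gaussian_adjacency(X, sigma=1.0):
    # X: d_v x n feature matrix for one view -> n x n similarity matrix
    sq = (X ** 2).sum(axis=0)
    # pairwise squared Euclidean distances between columns of X
    d2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X
    d2 = np.maximum(d2, 0.0)          # clip tiny negatives from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

# stack one similarity matrix per view into the adjacency tensor (n x n x V)
views = [np.random.rand(5, 10), np.random.rand(8, 10)]   # toy d_v x n views
A = np.stack([gaussian_adjacency(X) for X in views], axis=2)
```

Each frontal slice is symmetric with ones on its diagonal, as expected of a Gaussian similarity matrix.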

D. SPECTRAL CLUSTERING
In graph theory, spectral clustering is an important method to reveal the clusters in static networks. It provides an efficient solution to relaxed versions of the graph-cut problem. More precisely, it solves the following optimization problem [42]:
$$\min_{U} \ \mathrm{tr}(U^{\top} L_N U) \quad \text{s.t.} \quad U^{\top} U = I_k, \quad (12)$$
where $L_N$ is the normalized symmetric Laplacian matrix, defined as $L_N = I - A_N$. Eq. (12) can also be written as follows:
$$\min_{U} \ \mathrm{tr}\big(U^{\top} (I - A_N) U\big) \quad \text{s.t.} \quad U^{\top} U = I_k. \quad (13)$$
The normalized Laplacian matrix, $L_N$, and the normalized adjacency matrix, $A_N$, are always positive semi-definite, so optimizing Eq. (13) is equivalent to optimizing the following problem:
$$\max_{U} \ \mathrm{tr}(U^{\top} A_N U) \quad \text{s.t.} \quad U^{\top} U = I_k. \quad (14)$$
The solution of the optimization problem in Eq. (12) is the matrix $U$ that contains the $k$ eigenvectors corresponding to the smallest $k$ eigenvalues of $L_N$, whereas the solutions of the optimization problems in Eq. (13) and Eq. (14) are found by choosing $U$ as the matrix that contains the $k$ eigenvectors corresponding to the largest $k$ eigenvalues of $A_N$. The nodes' community assignment is then determined by applying k-means to $U$ [45]. In recent work [46], we have shown that minimizing the normalized cut across a multi-view network can be equivalently written as:
$$\max_{U, w} \ \Big\|\sum_{v=1}^{V} w_v\, U^{\top} A_N^{(v)} U\Big\|_F^2 \quad \text{s.t.} \quad U^{\top} U = I_k, \ \|w\|_2 = 1. \quad (15)$$
This optimization problem can be reformulated in terms of Tucker decomposition as:
$$\max_{U, w} \ \|\mathcal{A}_N \times_1 U^{\top} \times_2 U^{\top} \times_3 w^{\top}\|_F^2 \quad \text{s.t.} \quad U^{\top} U = I_k, \ \|w\|_2 = 1, \quad (16)$$
where $\mathcal{A}_N \in \mathbb{R}^{n \times n \times V}$ corresponds to the tensor representation of the normalized adjacency matrices across views. The optimization problem in Eq. (16) can then be solved efficiently using HOOI [47].
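The single-view case of Eq. (14) amounts to taking the top-$k$ eigenvectors of $A_N$ and clustering their rows; a minimal sketch (k-means itself is omitted to stay self-contained):

```python
import numpy as np

def spectral_embedding(A_N, k):
    # eigenvectors of the symmetric normalized adjacency matrix;
    # eigh returns eigenvalues in ascending order, so take the last k
    vals, vecs = np.linalg.eigh(A_N)
    return vecs[:, -k:]     # rows of U are then clustered with k-means

# toy graph with two obvious communities: nodes {0,1} and {2,3}
A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
A_N = A / np.sqrt(d[:, None] * d[None, :])
U = spectral_embedding(A_N, 2)
```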

III. MULTI-VIEW ROBUST TENSOR-BASED SUBSPACE CLUSTERING (MV-RTSC)
In this paper, we propose a novel approach to detect the community structure in multi-view networks. In particular, a 3-mode tensor is assembled from the normalized adjacency matrices that represent the different views of the network. The constructed tensor is decomposed into self-representation and error components, where the extracted self-representation tensor is used to detect the community structure of the multi-view network. The algorithm is explained in detail in the following subsections.

A. PROBLEM FORMULATION
Given a corrupted multi-view adjacency tensor, A ∈ R n×n×V that represents a corrupted multi-view network, G, the objective of the proposed algorithm is to detect the community structure of the input multi-view network.
In order to account for noise or missing edge values, a low-rank approximation of the corrupted adjacency tensor is extracted and used to obtain a common subspace, $U \in \mathbb{R}^{n \times k}$, across all views. In particular, this common subspace is obtained by optimizing the contribution of each view to the common subspace. The resultant common subspace can then be used to determine the community assignment of each node in the network. In order to attain this objective, we propose minimizing the following objective function:
$$\min_{\mathcal{Z}, E, U, w} \ \|\Phi(\mathcal{Z})\|_{\circledast} + \mu \|E\|_{2,1} - \lambda \|\mathcal{Z}_N \times_1 U^{\top} \times_2 U^{\top} \times_3 w^{\top}\|_F^2$$
$$\text{s.t.} \ A^{(v)} = A^{(v)} Z^{(v)} + E^{(v)}, \quad U^{\top} U = I_k, \quad \|w\|_2 = 1, \quad E = [E^{(1)}; E^{(2)}; \ldots; E^{(V)}]. \quad (17)$$
The proposed objective function consists of multiple terms, where each term is included to meet a certain goal, as follows: 1) The first two terms represent the tensor LRR. The first term, $\|\Phi(\mathcal{Z})\|_{\circledast}$, is included to recover the self-representation tensor, $\mathcal{Z} \in \mathbb{R}^{n \times n \times V}$, of the original adjacency tensor, $\mathcal{A}$, by using the T-SVD-TNN. The tensor $\mathcal{Z}$ is constructed by the function $\Phi(\cdot)$, which converts the self-representation matrices into a 3-mode tensor and rotates $\mathcal{Z}$ so that its dimension becomes $n \times V \times n$ [16]. The main advantage of the tensor rotation is to reduce the computational complexity of updating $\mathcal{Z}$. The matrix $Z^{(v)} = [z_1^{(v)}, \ldots, z_n^{(v)}]$ represents a clean version of the corrupted adjacency matrix, $A^{(v)}$. In particular, each $z_i^{(v)}$ is a new representation of node $i$, which corresponds to sample $i$. Ideally, the low-rank representation of each node corresponds to a combination of all the other nodes that belong to the same subspace [15], which leads to a block diagonal structure in the clean adjacency matrix constructed from $Z^{(v)}$. In fact, it was proved in [14] that as the adjacency matrix gets closer to a block diagonal structure, better clustering results can be achieved.
The second term, $\|E\|_{2,1}$, is considered to detect the outliers of the original tensor $\mathcal{A}$ by using the $L_{2,1}$-norm of the matrix $E \in \mathbb{R}^{nV \times n}$, which is obtained by concatenating the error matrices that correspond to each view vertically. The $L_{2,1}$-norm of the matrix $E$ is defined as $\|E\|_{2,1} = \sum_{j=1}^{n} \sqrt{\sum_{i=1}^{m} e_{i,j}^2}$, where $m = nV$ [48].
2) The third term represents the subspace decomposition.
The main goal of this term is to estimate the common subspace across all views, $U$, where $Z_N^{(v)}$ is the normalized adjacency matrix computed from the self-representation matrix of each view. $w$ is the weighting vector across views, where $w_v$, $v \in \{1, \ldots, V\}$, weighs the contribution of view $v$ to the common subspace.
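The $L_{2,1}$-norm used in the error term above sums the $L_2$ norms of the columns of $E$, which encourages column-sparse (sample-specific) errors. A one-line sketch, assuming the column-wise convention common in LRR-style models:

```python
import numpy as np

def l21_norm(E):
    # sum over columns of each column's L2 norm
    return np.sqrt((E ** 2).sum(axis=0)).sum()

# toy example: one corrupted column (norm 5) and one clean column (norm 0)
E = np.array([[3.0, 0.0],
              [4.0, 0.0]])
val = l21_norm(E)   # 5 + 0 = 5
```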

B. OPTIMIZATION OF THE PROPOSED MODEL
The solution of the proposed model can be obtained by following an iterative alternating approach. In particular, fixing the common subspace between all views U, the variables Z and E can be obtained by ADMM, where the convergence of ADMM is guaranteed [16], [49], [50]. After recovering the self-representation tensor, Z, the common subspace, U, can be updated efficiently using HOOI.
This problem can be solved by separating the tensor decomposition problem from the subspace decomposition problem and alternating between the two optimization problems. Once $\mathcal{Z}$ is obtained, $U$ and $w$ are optimized using HOOI. The solution of the optimization problem introduced in Eq. (17) starts with introducing an auxiliary variable, $\mathcal{B}$, to provide variables' separability, where the optimization problem is reformulated as Eq. (18) with $E = [E^{(1)}; E^{(2)}; \ldots; E^{(V)}]$.
Eq. (18) can be solved by dividing it into multiple subproblems and optimizing each variable while fixing the others, as follows: 1) B-Subproblem: In order to update $\mathcal{B}_{l+1}$, we fix all the other variables and consider the terms with $\mathcal{B}$ only; the resulting subproblem is Eq. (19), whose solution can be found in Appendix A. 2) E-Subproblem: To update $E_{l+1}$, we keep only the terms that include $E$ in Eq. (18) and fix $Z^{(v)}$, which yields the subproblem in Eq. (20). 3) Z-Subproblem: The $Z^{(v)}_{l+1}$-subproblem is defined in Eq. (21); its solution is given in Appendix C, where each $Z^{(v)}$ is computed separately. 4) U-Subproblem: Considering the terms with $U$ in Eq. (18) yields the subproblem in Eq. (22).
The solution of Eq. (22) can be found in Appendix D.

5) Updating Lagrange multipliers and penalty parameters: The Lagrange multipliers $Y_1^{(v)}$ and $Y_2$ and the penalty parameters $\gamma_1$ and $\gamma_2$ are updated at every iteration following the standard ADMM rules, where the penalty parameters are multiplied by $\tau$, with $\tau$ set to 2. The stopping criteria of the proposed method require the norms of the constraint residuals to fall below a tolerance $\epsilon$. Finally, k-means is applied to the normalized common subspace, $U_{norm}$, to obtain the final clustering labels. The pseudo code of the proposed algorithm is summarized in Algorithm 1.
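For concreteness, the E-update in step 2 typically reduces to the proximal operator of the $L_{2,1}$-norm, i.e. column-wise shrinkage. The sketch below shows this standard closed form (our illustration; the paper's exact derivation is in its appendices):

```python
import numpy as np

def prox_l21(M, tau):
    # closed-form minimizer of  tau * ||E||_{2,1} + 0.5 * ||E - M||_F^2 :
    # shrink each column's L2 norm by tau, zeroing columns below the threshold
    norms = np.sqrt((M ** 2).sum(axis=0))
    scale = np.maximum(norms - tau, 0.0) / np.where(norms > 0, norms, 1.0)
    return M * scale[None, :]

# column 0 has norm 5 -> scaled by (5-1)/5 = 0.8;
# column 1 has norm ~0.14 < 1 -> shrunk to zero
M = np.array([[3.0, 0.1],
              [4.0, 0.1]])
E = prox_l21(M, 1.0)
```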

C. COMPUTATIONAL COMPLEXITY OF THE PROPOSED METHOD
The computational cost of the proposed method is mainly due to the cost of updating E, B, and U. For each iteration, the computational cost of updating E and B is equal to O(Vn 2 ) and O(Vn 2 log(n) + V 2 n 2 ), respectively. Moreover, the computational complexity of HOOI is O(Vn 3 ). Consequently, the total computational complexity of the proposed method is O(L(Vn 2 log(n)+V 2 n 2 +Vn 3 )) which is approximately equal to O(LVn 3 ), where V , L and n represent the number of views, iterations, and nodes, respectively.

IV. RESULTS AND DISCUSSIONS
Extensive experiments have been conducted to evaluate the performance of the proposed method, MV-RTSC. In particular, MV-RTSC is applied to several well-known real-world multi-view networks, and its performance is compared to other existing state-of-the-art SVC and MVC methods. The SVC methods include LRR [14] and SSC [15]. The MVC methods include LT-MSC [17], t-SVD-MSC [16], ETLMSC [20], DiMSC [26], WMSC [35], NMF-CC [36], CGL [37], MCTP [38], CSF [39], 2CMV [51], and DiMMA [41]. The performance evaluation of the different algorithms is conducted in terms of widely used quality metrics, including ACCuracy (ACC), Normalized Mutual Information (NMI), Adjusted Rand index (AR), F-score, Precision, and Recall [52], [53]. The values of all evaluation metrics are normalized to [0, 1], where higher values indicate better clustering results. All experiments are conducted using MATLAB 2020a on a desktop with an Intel(R) Core(TM) i7 processor and 16 GB of RAM. The MATLAB code of the proposed MV-RTSC approach is available online: https://github.com/wardat99/MV-RTSC

A. EXPERIMENTAL SETTING
In order to test the performance of the proposed algorithm, several well-known real-world multi-view data sets are adopted. The multi-view networks are constructed following Section II-C, where all adjacency matrices are nonnegative, symmetric and undirected. The data sets we use in our experiments are: • MSRC-V1 is a data set consisting of 210 images of different scenes, including cars, trees, bicycles, faces, cows, buildings and airplanes, grouped into 7 communities [54]. Each image is represented by 5 features: Color Moment (CM), Histogram of Oriented Gradients (HOG), GIST, Local Binary Pattern (LBP), and CENTRIST.
• Newsgroups (NGs) is a text data set that contains 500 newsgroup documents collected from the 20 Newsgroups data set, with 5 communities.
• BBCSport is a text data set which consists of 544 news articles from the BBC Sport website with 5 communities: athletics, cricket, football, rugby, and tennis [55].
• BBC4view is a text data set that consists of 685 documents from the BBC website with 5 communities: business, entertainment, politics, sport, and technology [55].
• Flowers is a data set containing 1360 flower images belonging to 17 communities, with three visual features: color, shape, and texture.
• COIL-20 is a data set from the Columbia Object Image Library that contains 1440 images with 20 communities and three views (features): intensity, LBP, and Gabor.
• UCI is a handwritten-digit image data set that contains 2000 images from the UCI machine learning repository with 10 communities (digits 0-9) [56]. It consists of three features: 216 profile correlations, 76 Fourier coefficients, and 6 Karhunen-Loève coefficients of the character shapes.
• CiteSeer is a text data set that contains 3312 documents described by 2 views (citation links and word presence) and 6 communities.
• Scene-15 is a scene data set which consists of 4485 indoor and outdoor scene images with 15 communities and 3 views: Pyramid Histogram Of visual Words (PHOW), LBP, and CENTRIST [57].
• Handwritten digit (Hdigit) is a handwritten-digit image data set which contains 10000 images with 2 views (MNIST Handwritten Digits and USPS Handwritten Digits) and 10 communities (digits 0-9).

TABLE 3. Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on MSRC-V1 data set. We set µ = 0.2 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 4.
Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on NGs data set. We set µ = 0.5 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 5.
Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on BBCSport data set. We set µ = 4 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 6. Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on BBC4view data set. We set µ = 8 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 7.
Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on Flowers data set. We set µ = 4 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 8. Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on COIL-20 data set. We set µ = 9 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 9.
Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on UCI data set. We set µ = 5 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 10.
Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on CiteSeer data set. We set µ = 7 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

TABLE 11. Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on Scene-15 data set. We set µ = 7 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

B. CLUSTERING RESULTS
TABLE 12. Clustering performance comparison between the different methods using multiple quality metrics averaged over 10 runs (Mean±Standard deviation) on Hdigits data set. We set µ = 6 and λ = 0.1 for MV-RTSC. The best performance is boldface and the second best performance is underlined.

The performance of the proposed approach in clustering multi-view networks is compared to the state-of-the-art algorithms using multiple clustering evaluation metrics. The performance comparison is conducted in terms of ACC, NMI, AR, F-score, precision, and recall. For an unbiased comparison, each experiment is repeated ten times, and the average values and standard deviations of the different quality metrics are reported in Tables 3−12. The best and second-best performance results are denoted by bold font and underlining, respectively. The methods under comparison can be divided into SVC and MVC methods. As can be seen from Tables 3−12, the proposed algorithm achieves the highest scores in terms of the different evaluation metrics for all data sets. In addition, it outperforms many recent methods, including t-SVD-MSC, ETLMSC, NMF-CC, CGL, 2CMV, and DiMMA, especially on the BBC4view, Flowers, and Hdigit data sets. For example, the proposed method improves NMI by around 16.5% and 11.3% over t-SVD-MSC on the BBC4view and Flowers data sets, respectively. On the other hand, in line with previous studies [16], [20], we observe that the performance of many MVC methods is better than that of the SVC methods on most data sets, and the proposed MV-RTSC achieves the best results over SVC methods on all data sets. For example, on the Scene-15 data set, the proposed method improves ACC over LRR and SSC by 43.7% and 43.8%, respectively. Furthermore, we notice that the tensor-based optimization methods, such as t-SVD-MSC, ETLMSC, DiMSC, and LT-MSC, accomplish better performance than the matrix-based optimization methods.
Moreover, the proposed approach, which relies on decomposing the multi-view adjacency tensor, achieves the best results compared with the methods that decompose the original data tensor into clean and error components, such as t-SVD-MSC, DiMSC, and LT-MSC. For instance, on the UCI data set, the proposed method outperforms t-SVD-MSC by 4.5% and 6.8% in terms of ACC and NMI, respectively. Likewise, on the COIL-20 data set, it outperforms t-SVD-MSC by 9.7% and 9.5% in terms of ACC and NMI, respectively. This illustrates that decomposing the adjacency tensor leads to better results than decomposing the data tensor. Finally, even though some of the existing algorithms achieve good results in detecting the community structure of some networks, none of them maintains this performance across all networks the way the proposed MV-RTSC does.

C. RUNNING TIME AND CONVERGENCE

The running times of the different methods are reported in Table 13. The proposed MV-RTSC takes ≈ 7−11 iterations to converge. As can be seen from the table, the running time of the proposed algorithm is higher than some existing methods.

D. PARAMETERS SELECTION
Two regularization parameters are included in the objective function of the proposed approach, MV-RTSC, namely µ and λ: µ controls the L2,1-norm of the error component, and λ penalizes the subspace decomposition term. To study the effect of the regularization parameters on the performance of the proposed approach, the two parameters are tuned in the range [0.01, 10]. In particular, the effect of varying one parameter is investigated while the other is fixed. Fig. 1 and Fig. 2 show the influence of tuning µ and λ, respectively, in terms of ACC and NMI for some data sets. As can be concluded from the figures, the ranges that lead to the best performance of MV-RTSC are λ ∈ [0.1, 0.5] and µ ∈ [3, 10].
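The tuning procedure described above amounts to a grid search over (µ, λ). A minimal sketch follows; `evaluate` is a hypothetical stand-in that in practice would train MV-RTSC with the given pair and return a quality score such as NMI — the toy surrogate below merely peaks inside the recommended ranges to illustrate the loop.

```python
import itertools

# Hypothetical stand-in for one MV-RTSC run: in practice this would train the
# model with the given (mu, lam) and return a quality score such as NMI.
def evaluate(mu, lam):
    # Toy surrogate peaking inside the recommended ranges mu in [3, 10]
    # and lam in [0.1, 0.5]; it only illustrates the tuning loop.
    return -((mu - 7) ** 2) - ((lam - 0.1) ** 2)

mu_grid = [0.01, 0.1, 1, 3, 5, 7, 10]
lam_grid = [0.01, 0.1, 0.5, 1, 10]

# Scan the full grid and keep the best-scoring pair.
best_mu, best_lam = max(itertools.product(mu_grid, lam_grid),
                        key=lambda p: evaluate(*p))
print(best_mu, best_lam)  # 7 0.1
```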
Another parameter considered in this paper prior to applying the proposed approach is the parameter σ in the Gaussian kernel similarity defined in Eq. (11). The parameter σ controls the width of the neighborhoods during the construction of the multi-view network. In particular, if σ is chosen too small, the constructed graph will be very sparse, i.e., the similarities between nodes will be close to zero; whereas if σ is chosen too large, the constructed graph will be fully connected, i.e., the similarities between nodes will be close to one. Fig. 3 shows the effect of tuning the parameter σ on the results for some data sets. In order to construct a meaningful graph, we tuned σ ∈ [1, 50], and the recommended range is σ ∈ [1, 30].
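The sparse-versus-dense behavior of σ can be seen in a small sketch of Gaussian-kernel graph construction. We assume the common form exp(−‖xi − xj‖² / 2σ²); the paper's exact kernel is the one defined in Eq. (11).

```python
import numpy as np

def gaussian_adjacency(X, sigma):
    """Similarity graph with W[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)  # squared distances
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

X = np.array([[0.0, 0.0], [3.0, 4.0]])    # two points at distance 5
print(gaussian_adjacency(X, 50.0)[0, 1])  # large sigma: similarity near 1 (dense graph)
print(gaussian_adjacency(X, 0.1)[0, 1])   # small sigma: similarity near 0 (sparse graph)
```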

V. CONCLUSION
In this paper, a novel robust tensor-based approach is proposed to detect the community structure in multi-view networks. The proposed MV-RTSC presents a unified framework that combines tensor and subspace decomposition terms to reveal the community structure in multi-view networks. In particular, the proposed approach recovers a clean adjacency tensor from the noise-corrupted adjacency tensor, which is then used to compute a common subspace across all views. More specifically, this common subspace is computed by optimizing the contribution of each view through Tucker decomposition. Several real-world multi-view networks are used to test the performance of the proposed method and compare it to other existing methods. Finally, the experimental results show that the proposed method achieves better clustering results in terms of multiple quality metrics compared to many state-of-the-art SVC and MVC algorithms.

APPENDIX A SUBPROBLEM B
The subproblem B in Eq. (19) can be written as follows:

$$\mathcal{B}_{l+1} = \arg\min_{\mathcal{B}} \|\mathcal{B}\|_{\circledast} + \langle \mathcal{Y}_{2l}, \mathcal{Z}_l - \mathcal{B} \rangle + \frac{\gamma_2}{2}\|\mathcal{Z}_l - \mathcal{B}\|_F^2. \qquad (27)$$

Eq. (27) can be written as follows:

$$\mathcal{B}_{l+1} = \arg\min_{\mathcal{B}} \|\mathcal{B}\|_{\circledast} + \frac{\gamma_2}{2}\left\|\mathcal{B} - \left(\mathcal{Z}_l + \frac{\mathcal{Y}_{2l}}{\gamma_2}\right)\right\|_F^2. \qquad (28)$$

Let $\mathcal{N}_l = \mathcal{Z}_l + \frac{\mathcal{Y}_{2l}}{\gamma_2}$. By using the tensor tubal-shrinkage operator [16], the solution of Eq. (28) is given as:

$$\mathcal{B}_{l+1} = \mathcal{C}_{\tau}[\mathcal{N}_l] = \mathcal{Q}_L * \mathcal{S}_{\mathcal{J}} * \mathcal{Q}_R, \qquad (29)$$

where the t-SVD of $\mathcal{N}_l$ is $\mathcal{N}_l = \mathcal{Q}_L * \mathcal{S} * \mathcal{Q}_R$, $\mathcal{S}_{\mathcal{J}} = \mathcal{S} * \mathcal{J}$, and $\mathcal{J} \in \mathbb{R}^{n \times n \times V}$ is an f-diagonal tensor whose diagonal elements in the Fourier domain are $\mathcal{J}_f(i,i,j) = \left(1 - \frac{\tau}{\mathcal{S}_f(i,i,j)}\right)_+$, where $\tau = \frac{V}{\gamma_2}$ and $(t)_+$ denotes the positive part of $t$.
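Numerically, the tubal-shrinkage operator amounts to an FFT along the third mode followed by singular-value thresholding of each frontal slice, since scaling the singular values by (1 − τ/S)+ equals replacing them with (S − τ)+. A minimal NumPy sketch (our own illustration, not the authors' implementation):

```python
import numpy as np

def tubal_shrinkage(N, tau):
    """C_tau[N]: FFT along the third mode, then threshold the singular values of
    each frontal slice, since S * (1 - tau / S)_+ = (S - tau)_+."""
    Nf = np.fft.fft(N, axis=2)                # move to the Fourier domain
    Bf = np.empty_like(Nf)
    for j in range(N.shape[2]):
        U, s, Vh = np.linalg.svd(Nf[:, :, j], full_matrices=False)
        Bf[:, :, j] = (U * np.maximum(s - tau, 0.0)) @ Vh
    return np.real(np.fft.ifft(Bf, axis=2))   # back to the original domain
```

With τ = 0 the operator is the identity, and for very large τ every frontal slice is shrunk to zero, mirroring how γ2 trades data fidelity against the tubal nuclear norm.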

APPENDIX B SUBPROBLEM E
The Lagrangian unconstrained problem of Eq. (20) can be written as:

$$\mathcal{E}_{l+1} = \arg\min_{\mathcal{E}} \mu\|\mathcal{E}\|_{2,1} + \sum_{v=1}^{V}\left(\langle Y_{1l}^{(v)}, A^{(v)} - A^{(v)}Z_l^{(v)} - E^{(v)} \rangle + \frac{\gamma_1}{2}\left\|A^{(v)} - A^{(v)}Z_l^{(v)} - E^{(v)}\right\|_F^2\right). \qquad (30)$$

Eq. (30) can be simplified as:

$$\mathcal{E}_{l+1} = \arg\min_{\mathcal{E}} \frac{\mu}{\gamma_1}\|\mathcal{E}\|_{2,1} + \frac{1}{2}\sum_{v=1}^{V}\left\|E^{(v)} - C^{(v)}\right\|_F^2. \qquad (31)$$

Let $C^{(v)} = A^{(v)} - A^{(v)}Z_l^{(v)} + \frac{Y_{1l}^{(v)}}{\gamma_1}$; then a matrix $C$ can be constructed by arranging all views $C^{(v)}$ vertically, and Eq. (31) can be rewritten as:

$$E_{l+1} = \arg\min_{E} \frac{\mu}{\gamma_1}\|E\|_{2,1} + \frac{1}{2}\|E - C\|_F^2. \qquad (32)$$

Then the update of $E$ can be calculated following [14] as:

$$E_{l+1}(:,j) = \begin{cases} \dfrac{\|C_{:,j}\|_2 - \frac{\mu}{\gamma_1}}{\|C_{:,j}\|_2}\, C_{:,j}, & \text{if } \|C_{:,j}\|_2 > \frac{\mu}{\gamma_1}, \\[4pt] 0, & \text{otherwise,} \end{cases} \qquad (33)$$

where $\|C_{:,j}\|_2$ is the $\ell_2$-norm of the $j$-th column of the matrix $C$.
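The column-wise update of E is the standard ℓ2,1 proximal step: shrink each column's ℓ2-norm by τ = µ/γ1 and zero out columns whose norm falls below the threshold. A minimal sketch (our own illustration):

```python
import numpy as np

def l21_shrink(C, tau):
    """Solve min_E tau * ||E||_{2,1} + 0.5 * ||E - C||_F^2 column by column:
    shrink each column's l2-norm by tau, zeroing columns with norm <= tau."""
    norms = np.linalg.norm(C, axis=0)
    scale = np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12)
    return C * scale  # broadcasts the per-column scale factor

C = np.array([[3.0, 0.3],
              [4.0, 0.4]])     # column norms: 5.0 and 0.5
print(l21_shrink(C, 1.0))      # first column scaled by 0.8, second zeroed
```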

APPENDIX C SUBPROBLEM Z
The subproblem in Eq. (21) can be written as:

$$Z_{l+1}^{(v)} = \arg\min_{Z^{(v)}} \langle Y_{1l}^{(v)}, A^{(v)} - A^{(v)}Z^{(v)} - E_{l+1}^{(v)} \rangle + \frac{\gamma_1}{2}\left\|A^{(v)} - A^{(v)}Z^{(v)} - E_{l+1}^{(v)}\right\|_F^2 + \langle Y_{2l}^{(v)}, Z^{(v)} - B_{l+1}^{(v)} \rangle + \frac{\gamma_2}{2}\left\|Z^{(v)} - B_{l+1}^{(v)}\right\|_F^2. \qquad (34)$$

Eq. (34) can be simplified as follows:

$$Z_{l+1}^{(v)} = \arg\min_{Z^{(v)}} \frac{\gamma_1}{2}\left\|A^{(v)} - A^{(v)}Z^{(v)} - E_{l+1}^{(v)} + \frac{Y_{1l}^{(v)}}{\gamma_1}\right\|_F^2 + \frac{\gamma_2}{2}\left\|Z^{(v)} - B_{l+1}^{(v)} + \frac{Y_{2l}^{(v)}}{\gamma_2}\right\|_F^2. \qquad (35)$$

A closed-form solution of Eq. (35) can be computed by taking the gradient of Eq. (35) with respect to $Z^{(v)}$ and setting it to zero [58]. The solution $Z_{l+1}^{(v)}$ is then given by:

$$Z_{l+1}^{(v)} = \left(\gamma_1 A^{(v)\top}A^{(v)} + \gamma_2 I\right)^{-1} F^{(v)}, \qquad (36)$$

where $F^{(v)} = \gamma_1 A^{(v)\top}\left(A^{(v)} - E_{l+1}^{(v)} + \frac{Y_{1l}^{(v)}}{\gamma_1}\right) + \gamma_2\left(B_{l+1}^{(v)} - \frac{Y_{2l}^{(v)}}{\gamma_2}\right)$.
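Assuming the standard ADMM quadratic form of this subproblem (the definition of F is truncated in the text, so the right-hand side below is our reconstruction from the gradient condition), the per-view update reduces to one linear solve. A hedged sketch, not the authors' code:

```python
import numpy as np

def update_z(A, E, B, Y1, Y2, g1, g2):
    """One per-view Z update under the reconstructed normal equations:
    (g1 * A^T A + g2 * I) Z = g1 * A^T (A - E + Y1/g1) + g2 * (B - Y2/g2)."""
    n = A.shape[1]
    lhs = g1 * (A.T @ A) + g2 * np.eye(n)
    rhs = g1 * A.T @ (A - E + Y1 / g1) + g2 * (B - Y2 / g2)
    return np.linalg.solve(lhs, rhs)  # one linear solve instead of an explicit inverse
```

Using `np.linalg.solve` rather than forming the inverse explicitly is both faster and numerically safer; the system matrix depends only on A, γ1, and γ2, so its factorization can be cached across iterations.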