Multiview Tensor Spectral Clustering via Co-Regularization

Graph-based multi-view clustering encodes multi-view data into sample affinities to find a consensus representation, effectively overcoming heterogeneity across different views. However, traditional affinity measures tend to collapse as the feature dimension grows, which makes it difficult to estimate a unified alignment that reveals both cross-view and intra-view relationships. To tackle this challenge, we propose to achieve multi-view uniform clustering via consensus representation co-regularization. First, the sample affinities are encoded by both the popular dyadic affinity and recent high-order affinities to comprehensively characterize the spatial distribution of high-dimension, low-sample-size (HDLSS) data. Second, a fused consensus representation is learned by aligning the multi-view low-dimensional representations through co-regularization. The learning of the fused representation is modeled as a high-order eigenvalue problem within the manifold space to preserve the intrinsic connections and complementary correlations of the original data. A numerical scheme based on manifold minimization is designed to solve the high-order eigenvalue problem efficiently. Experiments on eight HDLSS datasets demonstrate the effectiveness of the proposed method in comparison with thirteen recent benchmark methods.


Hongmin Cai, Senior Member, IEEE, Yu Wang, Fei Qi, Zhuoyao Wang, and Yiu-ming Cheung, Fellow, IEEE

I. INTRODUCTION
CLUSTERING is one of the crucial topics in unsupervised learning [1], [2], [3]. The goal of clustering is to partition unlabeled data into different subgroups. Traditional clustering methods have been extensively studied and have found wide applications in bioinformatics, computer vision, and other fields [4]. In real-world scenarios, multi-view data are ubiquitous because data are acquired from different sources or through multiple feature extractors [5], [6]. Consequently, the seamless integration of heterogeneous data has become a central focus of multi-view clustering [7], [8].
Recent studies have proposed different strategies to integrate complementary correlations from different views, with the aim of improving clustering performance [9], [10]. An intuitive approach directly concatenates the data from different views as vectors and then applies conventional single-view methods to the concatenated data [11]. However, this approach ignores the heterogeneity and the difference of scale among multi-view data. Graph-based multi-view clustering methods have been proposed to align the affinity graphs into a uniform representation [12]. Intuitively, heterogeneous multi-view data can be harmonized by employing affinities with the same scale, thus reducing the disparities between different views. To achieve multi-view clustering, a joint analysis of the graphs across multiple views is required to extract consensus and complementary correlations. Accordingly, various approaches have been proposed to refine a consensus representation from multiple graphs [13]. For example, Kumar et al. [14] proposed to refine the affinity matrix by co-training the spectral results. However, the aforementioned methods share a critical prerequisite: the data relationship can be accurately described by pairwise affinity. This can be challenging in real applications, especially for high-dimension, low-sample-size (HDLSS) data, where the feature dimension $m$ far exceeds the sample size $n$, i.e., $n \ll m$ [15], [16]. The clustering performance on HDLSS data is hindered by the concentration effects, also known as the "curse of dimensionality" [17]. The collapse of pairwise distances in high-dimensional feature spaces presents a formidable challenge for clustering algorithms reliant on pairwise affinity, impeding accurate clustering [18].
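The concentration effect described above can be observed numerically. The following is a minimal sketch, assuming i.i.d. standard normal features and Euclidean distances (neither choice comes from the paper); the helper `relative_contrast` is a hypothetical name introduced only for this illustration. It compares the relative spread of pairwise distances for a low-dimensional and an HDLSS-like sample.

```python
import numpy as np

def relative_contrast(n_samples: int, n_features: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min over all pairwise Euclidean distances.

    Assumption for illustration: features are i.i.d. standard normal.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    # Squared distances from the Gram matrix: |x|^2 + |y|^2 - 2 x.y
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Keep each unordered pair once (strict upper triangle)
    iu = np.triu_indices(n_samples, k=1)
    d = np.sqrt(np.maximum(d2[iu], 0.0))
    return float((d.max() - d.min()) / d.min())

# Same sample size, very different feature dimensions
low = relative_contrast(n_samples=50, n_features=5)
high = relative_contrast(n_samples=50, n_features=5000)
print(f"relative contrast, m=5:    {low:.3f}")
print(f"relative contrast, m=5000: {high:.3f}")
```

When the contrast shrinks toward zero, the nearest and farthest neighbors become almost equidistant, so an affinity built only from pairwise distances carries little discriminative information.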
To relieve the curse of dimensionality caused by HDLSS data, recent works have proposed to utilize high-order affinity to describe the spatial distribution of multiple samples. Employing high-order affinities is anticipated to mitigate the concentration effects on dyadic affinity and unveil crucial relationships within the data. Along this line, Mei et al. [19] leveraged first- and second-order affinities to mine the local structure of pairwise points and incorporated third-order similarity with low-rank constraints for enhanced clustering performance and consensus correlations. Also, Ghoshdastidar [20] demonstrated a relationship between the relaxation of hypergraph clustering and the multilinear singular value decomposition with consideration of multiple affinities. Furthermore, IPS2 [21] validates that higher-order affinities can serve as a complementary description of pairwise relations to enhance clustering performance. Following this vein, our previous work [22] unifies affinities of different orders to overcome the concentration effects, leading to a markedly better estimate of the spatial distribution of HDLSS data. However, these methods are unable to integrate multi-view data, which impedes the extraction of heterogeneous correlations.

Fig. 1. Description of the CRMATS method. CRMATS demonstrates proficiency in accurately acquiring latent representations from multiple views on the manifold space. Specifically, given a multi-view dataset $X^{(1)}, \ldots, X^{(N)}$, the corresponding pairwise, third-order, and fourth-order affinities are computed. Next, the local structure of the intrinsic subspace is encoded and low-dimensional representations are obtained by employing the manifold constraint. To align the representations of each view, we incorporate co-regularized learning, resulting in a consensus representation $W$. Additionally, a self-weighting fusion module is adopted to compute the corresponding weights $\lambda^{(1)}, \ldots, \lambda^{(N)}$ during the alignment process. Finally, we use spectral clustering on the consensus representation to obtain the results.
To solve the key problems of extracting consensus correlations and accurately revealing relationships from multi-view data, especially HDLSS data, we propose the Co-Regularized multi-view clustering via Manifold Alignment on Tensor Spectral embedding (CRMATS) method. Our method presents a unified multi-view clustering framework, which is based on the accurate description of intra-view sample relationships through the introduction of multi-order affinities. The low-dimensional representations from each view are co-regularized on the manifold space, aiming to minimize the geodesic distance and achieve alignment. This alignment process enhances the quality of the consensus representation, which in turn improves the partitioning performance. To accelerate the solution of the unified model, an efficient iterative strategy is designed. Extensive experiments on both synthetic and real-world datasets validate the performance of our method. The framework of CRMATS is shown in Fig. 1, and the main contributions of this paper are summarized as follows:
• To precisely uncover the intra-view spatial correlations of HDLSS data, high-order affinities are incorporated, facilitating the capture of intricate sample interactions and effectively eliminating the concentration effects within each view.
• To effectively integrate multi-view data with heterogeneous correlations, co-regularized learning and a manifold constraint are employed to align the respective low-dimensional representations, effectively leveraging the cross-view spatial complementarity of HDLSS data.
• To improve computational efficiency, a singular value decomposition-based method is utilized to solve the quadratic problem on the manifold space.
The remainder of this paper is organized as follows. Section II introduces the background on multi-view and tensor spectral clustering. Section III presents the proposed method and its optimization algorithm. In Section IV, we report the experimental results on the comparative datasets and methods. Finally, we draw a conclusion in Section V.

II. BACKGROUND ON MULTIVIEW TENSOR SPECTRAL CLUSTERING
Notations: In this paper, we use bold calligraphic, uppercase, and lowercase letters to represent a tensor, a matrix, and a vector, i.e., $(\mathcal{X}, X, x)$, respectively. For a matrix $X \in \mathbb{R}^{n\times n}$, the $j$-th column is denoted as $X_{:,j}$, the trace of $X$ is represented as $\operatorname{Tr}(X)$, and $I$ represents the identity matrix. The square of the Frobenius norm of $X$ is denoted as $\|X\|_F^2$. The Khatri-Rao and Kronecker products are denoted by $*$ and $\otimes$, respectively. For a third-order tensor $\mathcal{X} \in \mathbb{R}^{n\times n\times l}$, we denote the $v$-th frontal slice of $\mathcal{X}$ as $X^{(v)} \in \mathbb{R}^{n\times n}$.

A. Tensor Based Approaches for Multiview Clustering
Tensor-based clustering techniques harness the high-order representation of multi-view data through tensors, thereby elucidating the intricate inter-view relationships prior to performing cluster analysis [23]. Most of the early works relied on tensor decomposition techniques. For example, Yu et al. [24] stacked the original data into a tensor and applied tensor-based factorization to obtain factor matrices that capture high-order relationships. Similarly, Nie et al. [13] proposed a co-clustering method via tensor factorization to learn a low-rank approximation for discovering high-order relationships. Following this approach, Guo et al. [25] utilized the tensor logarithmic Schatten-p norm to obtain a more compact low-rank structure, which explores complementary information and characterizes the high-order correlations among multiple views. Meanwhile, Ji et al. [26] employed tensor decomposition to generate consistent and complementary tensors, while refining a tighter approximation of the tensor rank to explore the high-order consistency in the consistent tensor. Similarly, Li et al. [27] stacked pairwise affinities into a tensor and employed a hypergraph-induced regularization for tensor factorization, enabling them to learn a consistent representation that preserved high-order correlations and improved performance. The aforementioned methods require that data relationships can be accurately described by pairwise affinities, an assumption that rarely holds for HDLSS data. Additionally, using tensors to represent high-order information between views does not leverage the high-order relationships between samples.

B. Revisiting Classical Spectral Clustering
Spectral clustering is a classical method that employs dyadic affinity to learn an optimal low-dimensional embedding from raw data for clustering purposes. Given a data matrix $X \in \mathbb{R}^{n\times m}$, where $n$ is the number of samples and $m$ is the feature dimension, the objective of spectral clustering is to divide these $n$ samples into $c$ subgroups by reformulating the clustering problem as a minimum-cost graph-cut problem. The crucial step of this method is to construct a similarity graph by calculating the dyadic affinity matrix $A \in \mathbb{R}^{n\times n}$. Specifically, the $(i,j)$-th element of $A$ is calculated as $A_{ij} = d(x_i, x_j)$, where $d(\cdot,\cdot)$ is a pairwise metric. The $i$-th diagonal element of the degree matrix $D \in \mathbb{R}^{n\times n}$ is $D_{ii} = \sum_{j=1}^{n} A_{ij}$. The Laplacian matrix is then defined as $L = I - D^{-1/2} A D^{-1/2}$. Spectral clustering seeks a low-dimensional embedding $V$ by minimizing the following objective:
$$\min_{V^\top V = I} \operatorname{Tr}(V^\top L V). \tag{1}$$
Equation (1) can be equivalently solved by seeking the eigenvectors of the Laplacian matrix associated with its smallest eigenvalues, and thus degenerates to a standard eigenvalue problem. Moreover, this equation can be seen as a formulation that addresses the maximum partition problem in graph-cut, as described in [28]. Alternatively, rather than the earlier Laplacian used to impose a strongly positive definite graph Laplacian, one can define a normalized dyadic affinity matrix $L_2 = D^{-1/2} A D^{-1/2}$. Spectral clustering is then commonly expressed as a maximization problem:
$$\max_{V^\top V = I} \operatorname{Tr}(V^\top L_2 V). \tag{2}$$
Then, one can perform a clustering task such as k-means on the obtained embedding.
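The embedding step of classical spectral clustering can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: it assumes a Gaussian kernel for the pairwise affinity (the text only requires some pairwise metric), and the function names `gaussian_affinity` and `spectral_embedding` as well as the toy data are hypothetical. The embedding is taken as the leading eigenvectors of the symmetrically normalized affinity $D^{-1/2} A D^{-1/2}$, which span the same subspace as the eigenvectors of $L = I - D^{-1/2} A D^{-1/2}$ with the smallest eigenvalues.

```python
import numpy as np

def gaussian_affinity(X: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Pairwise affinity A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) (assumed kernel)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def spectral_embedding(A: np.ndarray, k: int) -> np.ndarray:
    """Return the k leading eigenvectors of D^{-1/2} A D^{-1/2} as an n x k matrix."""
    deg = A.sum(axis=1)                      # D_ii = sum_j A_ij
    inv_sqrt = 1.0 / np.sqrt(deg)
    L2 = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    L2 = 0.5 * (L2 + L2.T)                   # remove round-off asymmetry
    _, vecs = np.linalg.eigh(L2)             # eigenvalues in ascending order
    return vecs[:, -k:]                      # orthonormal columns, V^T V = I

# Toy data: two well-separated groups of 10 points each
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (10, 2)), rng.normal(10.0, 0.3, (10, 2))])
V = spectral_embedding(gaussian_affinity(X), k=2)

# V V^T is invariant to rotations of the embedding and exposes the block structure
G = V @ V.T
print("same group:", G[0, 1], " different groups:", G[0, 15])
```

Any standard clustering routine such as k-means can then be run on the rows of `V`; it is omitted here to keep the sketch dependency-free.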

C. Tensor Spectral Clustering
The core of model (2) is to maximize the intra-cluster affinity, thereby preserving the volume of each subgraph after graph cut. In our previous work [22], we introduced a normalized affinity entropy measurement, which effectively evaluates the volume of affinity using any number of samples.
Definition 1 (total normalized similarity): Let $C$ be a group of samples belonging to the dataset $X$, and let $\mathcal{S} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_r}$ be an order-$r$ similarity tensor. The total normalized similarity of the samples in $C$ is defined as follows:
where $\mathcal{L}$ represents the normalized order-$r$ affinity tensor. Let the samples $X$ be partitioned into $k$ sets, i.e., $C_1, C_2, \ldots, C_k$. The normalized associativity (NAssoc) of the resulting clustering is defined:
where $|C_i|$ denotes the cardinality of cluster $C_i$.
Definition 2 (mode-k product): The mode-k product between an order-m tensor

and zero otherwise.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
To obtain an optimal sample assignment $C_1, \ldots, C_k$, the maximization of the normalized associativity in (4) is pursued. This can be achieved through algebraic manipulation, which allows us to reformulate the problem as follows:
Due to the NP-hardness of the maximum normalized associativity problem, it is necessary to use relaxation techniques to make the problem more tractable. One such technique involves relaxing the binary assignment matrix to an orthonormal matrix $V \in \mathbb{R}^{n\times k}$, where $V^\top V = I$. This relaxation removes the strict binary assignment requirement and simplifies the problem as:
where $v_j$ represents the $j$-th column of $V$.

D. Co-Regularized Multiview Clustering
In multi-view clustering, learning a consensus representation is the common approach to capture the local view-specific structure [29]. Based on this hypothesis, co-regularized learning is applied to align the low-dimensional representations of different views, reducing the impact of noise or errors and enhancing the quality of the consensus representation [30]. Given multi-view data $\mathcal{X} \in \mathbb{R}^{n\times m\times l}$ with $n$ samples, $m$ features, and $l$ views, one can construct the corresponding affinity $L^{(i)} \in \mathbb{R}^{n\times n}$ of each view and project it into the low-dimensional space to obtain the representation $V^{(i)} \in \mathbb{R}^{n\times k}$. The corresponding strategy is to align the diverse low-dimensional representations and obtain a consensus representation $W \in \mathbb{R}^{n\times k}$. This alignment essentially establishes a consensus representation by minimizing the geodesic distance on the manifold [31], and the geodesic distance can be measured as:
where $V^{(i)}$ is obtained through (2) to learn the low-dimensional representations within a single subspace, and the optimization of (8) aims to minimize geodesic distances and promote inter-subspace proximity [32]. Building upon these principles, the co-regularized learning process can be written as a maximization problem:
where $L_2^{(i)} \in \mathbb{R}^{n\times n}$ is the pairwise affinity of the $i$-th view. However, noise or errors in different views can adversely affect multi-view performance [33]. To tackle this challenge, $\lambda^{(i)}$ is introduced to control the influence of the deteriorating views, and the final co-regularized model with the manifold constraints can be written as a maximization problem:
where $\lambda^{(i)}$ measures the weight of the $i$-th view.
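To make the alignment idea concrete, the sketch below measures how far apart two orthonormal embeddings are and fuses several view embeddings into one consensus basis. It rests on assumptions that are not taken from the paper: the distance is the widely used projection metric $k - \operatorname{Tr}(V V^\top W W^\top)$ between subspaces, which may differ from the paper's own definition in (8), and the consensus is the set of leading eigenvectors of the weighted sum $\sum_i \lambda^{(i)} V^{(i)} V^{(i)\top}$, in the spirit of centroid-based co-regularization. The names `projection_distance`, `consensus`, and `random_basis`, and the fixed weights, are hypothetical.

```python
import numpy as np

def projection_distance(V: np.ndarray, W: np.ndarray) -> float:
    """k - Tr(V V^T W W^T) for orthonormal n x k bases (assumed distance)."""
    k = V.shape[1]
    # Tr(V V^T W W^T) equals the squared Frobenius norm of V^T W
    return float(k - np.linalg.norm(V.T @ W, "fro") ** 2)

def consensus(views, weights) -> np.ndarray:
    """Leading eigenvectors of sum_i lambda_i V_i V_i^T (illustrative fusion)."""
    k = views[0].shape[1]
    K = sum(lam * (V @ V.T) for V, lam in zip(views, weights))
    K = 0.5 * (K + K.T)
    _, vecs = np.linalg.eigh(K)              # ascending eigenvalues
    return vecs[:, -k:]

def random_basis(n: int, k: int, rng) -> np.ndarray:
    """Random orthonormal n x k matrix standing in for a view embedding."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Q

rng = np.random.default_rng(0)
views = [random_basis(20, 3, rng) for _ in range(3)]
weights = [0.5, 0.3, 0.2]                    # hypothetical view weights
W = consensus(views, weights)

# Weighted misalignment of the consensus versus simply reusing the first view
total_W = sum(l * projection_distance(V, W) for V, l in zip(views, weights))
total_V0 = sum(l * projection_distance(V, views[0]) for V, l in zip(views, weights))
print(f"weighted distance to consensus:  {total_W:.4f}")
print(f"weighted distance to first view: {total_V0:.4f}")
```

Under these assumptions the consensus basis minimizes the weighted sum of projection distances over all orthonormal bases, so it is never worse than picking any single view as the reference.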

III. MULTI-VIEW TENSOR SPECTRAL CLUSTERING VIA CO-REGULARIZATION
Existing multi-view methods primarily rely on linear spaces, rendering them inadequate for analyzing multi-dimensional data characterized by intrinsically complex structures. Fortunately, manifolds can be conceptualized as low-dimensional smooth surfaces embedded within higher-dimensional Euclidean spaces, providing a framework that enables the effective capture and comprehension of the intricate structures in high-dimensional data [34], [35]. Following this vein, Khan et al. [36] introduced manifold-based methods that effectively capture complex structures, leading to significant improvements in clustering performance. These techniques enable the representation of spatial structures, making them particularly advantageous for analyzing HDLSS data. Inspired by these manifold-based works, a unified tensor clustering model can be developed by combining high-order affinities, mitigating the concentration effects in HDLSS data. To enhance the discriminative power of the low-dimensional representation for HDLSS tasks, we impose a constraint that restricts the embedding to the manifold space. In this paper, to demonstrate the effectiveness of our framework in handling both odd-order and even-order affinities, we introduce the third- and fourth-order affinities. Thus, a model that incorporates high-order affinities and manifold constraints is proposed:
where $\mathcal{L}_3$ and $\mathcal{L}_4$ are the triadic and tetradic affinity tensors for quantifying similarities among triplets and quadruplets of data.

A. Co-Regularized Multi-View Clustering via Manifold Alignment on Tensor Spectral Embedding
Equation (11) achieves the low-dimensional representation on the manifold space by integrating multi-order affinities. Building upon this, a co-regularized learning step with the manifold constraints is introduced for handling the concentration effects and extracting heterogeneous correlations in multi-view HDLSS clustering. The goal of the proposed method is to effectively leverage cross-view correlations in HDLSS data by combining (11) and (10). Consequently, a Co-Regularized multi-view clustering via Manifold Alignment on Tensor Spectral embedding (CRMATS) method is proposed as the following minimization problem:
where $L_3$ and $L_4$ are the normalized unfolded forms of the tensor affinities, calculated by:
where $D_{3,1}$ and $D_{3,2}$ are the diagonal matrices with diagonal elements obtained by taking the square root of the Khatri-Rao product of the column-wise sums and the column-wise sums of $T_3$, respectively. $D_4$ is a diagonal matrix whose diagonal elements are computed as the reciprocal square root of the sum of each row of $T_4$. The matrices $T_3$ and $T_4$ are the unfolded forms of the third- and fourth-order affinity tensors. The tensor affinities $T_3$ and $T_4$ can be defined as:
for $i, j, k, l \in \{1, \ldots, n\}$, where $d_{ij}$ denotes the distance between samples $x_i$ and $x_j$, and $\sigma$ is a scaling constant. However, the computation in (12) involves a high-order polynomial function that may lead to numerical difficulties. To circumvent this, one can introduce a slack variable $V_2^{(i)}$ to approximate the term in (12). Furthermore, we introduce $\gamma_1$ and $\gamma_2$ to trade off the affinities term and the co-regularization term. The final model can be:
Equation (16) aims to fuse multiple affinities to produce a consistent representation that is robust against noise and concentration effects. The clustering task is then accomplished by applying spectral clustering, leading to the final group assignments.
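The row-sum normalization described for $D_4$ can be illustrated on a small example. The sketch below is built on a loudly stated assumption: because the defining formulas of the tensor affinities are not reproduced here, the unfolded fourth-order affinity is taken to be the Kronecker product of a Gaussian pairwise affinity with itself, i.e., the entry for the pairs $(i,j)$ and $(k,l)$ is $A_{ik} A_{jl}$. This is a hypothetical stand-in, not the paper's definition, and the function names are introduced only for this illustration. What the sketch does follow from the text is the normalization itself: $D_4$ is diagonal with the reciprocal square roots of the row sums of $T_4$.

```python
import numpy as np

def gaussian_affinity(X: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Pairwise affinity exp(-d_ij^2 / (2 sigma^2)) (assumed kernel)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def normalized_fourth_order(A: np.ndarray) -> np.ndarray:
    """Normalize an unfolded fourth-order affinity as D4 @ T4 @ D4.

    T4 = kron(A, A) is a HYPOTHETICAL decomposable affinity of size n^2 x n^2.
    D4 holds the reciprocal square roots of the row sums of T4.
    """
    T4 = np.kron(A, A)
    row_sums = T4.sum(axis=1)
    D4 = np.diag(1.0 / np.sqrt(row_sums))
    return D4 @ T4 @ D4

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))              # 5 samples, 3 features (toy sizes)
L4 = normalized_fourth_order(gaussian_affinity(X))

print("unfolded shape:", L4.shape)
print("largest eigenvalue:", np.linalg.eigvalsh(0.5 * (L4 + L4.T))[-1])
```

For any symmetric affinity with positive entries, this normalization yields a symmetric matrix whose largest eigenvalue is 1, which keeps the affinity terms of different orders on a comparable scale. Note that the unfolded matrix grows as $n^2 \times n^2$, so the sketch is only practical for small sample sizes.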

B. Numerical Scheme to Solve CRMATS
An efficient alternating direction minimization strategy is employed to solve CRMATS. Using the augmented Lagrangian formulation, the corresponding function of (16), denoted $\mathcal{L}$, is obtained by:
where $Y_i$ and $Z$ are Lagrange multipliers, and $\mu_1 > 0$ and $\mu_2 > 0$ are penalty parameters. Our goal is to minimize $\mathcal{L}$ by dividing it into several subproblems. We accomplish this by considering the following variables alternately, solving for each variable while keeping the others fixed.
Step 1): Solving the subproblem with respect to the variable $V^{(i)}$. When keeping the related items, (17) can be:
In (18), we encounter a quadratic term. By incorporating the manifold constraint, (18) can be simplified as a quadratic optimization problem within each view. To solve this problem, we introduce a universal Stiefel manifold $\mathcal{M}_p = \{V \in \mathbb{R}^{n\times k} \mid V^\top V = I\}$. The quadratic and first-order terms for each view are denoted as $A$ and $B$:
$$\min_{V \in \mathcal{M}_p} \operatorname{Tr}(V^\top A V - 2 V^\top B), \tag{19}$$
where $A \in \mathbb{R}^{n\times n}$ is a symmetric matrix. In order to solve the problem in (19), it can be relaxed into:
$$\max_{V \in \mathcal{M}_p} \operatorname{Tr}(V^\top \hat{A} V + 2 V^\top B), \tag{20}$$
where $\hat{A} = \alpha I - A \in \mathbb{R}^{n\times n}$. The parameter $\alpha$ is an arbitrary constant such that $\hat{A}$ is a positive definite matrix. A closed-form solution to (20) can be achieved through the corresponding partial derivative $M = 2\hat{A}V + 2B$ [37]. Suppose that $M = F \Sigma R^\top$ is the singular value decomposition of $M$, so that (20) can be solved equivalently as:
$$\max_{V \in \mathcal{M}_p} \operatorname{Tr}(V^\top M). \tag{21}$$
Then we have the following equations:
$$\operatorname{Tr}(V^\top M) = \operatorname{Tr}(V^\top F \Sigma R^\top) = \operatorname{Tr}(\Sigma R^\top V^\top F) = \operatorname{Tr}(\Sigma Z) = \sum_{i=1}^{k} \sigma_{ii} z_{ii}, \tag{22}$$
where $Z = R^\top V^\top F \in \mathbb{R}^{k\times n}$. Since $Z Z^\top = I$, we conclude that $z_{ii} \le 1$. In this way, $\operatorname{Tr}(V^\top M)$ reaches its maximum when $Z = [I, 0] \in \mathbb{R}^{k\times n}$. Thus, the optimal solution of $V$ can be:
$$V = F \, [I, 0]^\top R^\top. \tag{23}$$
Algorithm 1 provides a concise description of the quadratic optimization algorithm on the manifold. For a comprehensive analysis of its convergence, we refer readers to Section A of the supplementary materials, available online. Once we have derived the objective function for the subproblem, we can utilize Algorithm 1 to obtain the solution for $V^{(i)}$. As we are imposing the orthogonal constraint of the manifold, the gradient of a differentiable function $f : V_{k,m} \to \mathbb{R}$ can be given as in [38]. Thus, the gradient of the subproblem objective is given by (24), and $V^{(i)}$ can be updated iteratively through (24) and Algorithm 1.

Algorithm 1 (quadratic optimization on the manifold). Until convergence, repeat: compute the gradient $M \in \mathbb{R}^{n\times k} \leftarrow 2\hat{A}V + 2B$; calculate $M = F \Sigma R^\top$ via the compact SVD of $M$; update $V \in \mathbb{R}^{n\times k} \leftarrow F R^\top$. Return the orthonormal basis $V$.
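The iteration of Algorithm 1 (gradient, compact SVD, update $V \leftarrow F R^\top$) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab code: the quadratic problem is taken in the form $\min_{V^\top V = I} \operatorname{Tr}(V^\top A V - 2 V^\top B)$, which is the sign convention consistent with the gradient $M = 2\hat{A}V + 2B$ and $\hat{A} = \alpha I - A$; the shift $\alpha$ is chosen as the largest eigenvalue of $A$ plus one; and the function name `gpi_stiefel`, the stopping rule, and the random test problem are hypothetical.

```python
import numpy as np

def gpi_stiefel(A: np.ndarray, B: np.ndarray, max_iter: int = 500,
                tol: float = 1e-10, seed: int = 0):
    """Minimize Tr(V^T A V - 2 V^T B) subject to V^T V = I (assumed problem form).

    Returns the final orthonormal V and the objective value after each update.
    """
    n, k = B.shape
    A = 0.5 * (A + A.T)                            # enforce symmetry
    alpha = np.linalg.eigvalsh(A)[-1] + 1.0        # makes alpha*I - A positive definite
    A_hat = alpha * np.eye(n) - A
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))   # random orthonormal start
    history = []
    prev = np.inf
    for _ in range(max_iter):
        M = 2.0 * A_hat @ V + 2.0 * B              # gradient of the relaxed objective
        F, _, Rt = np.linalg.svd(M, full_matrices=False)  # compact SVD, M = F S R^T
        V = F @ Rt                                 # update V <- F R^T
        obj = float(np.trace(V.T @ A @ V - 2.0 * V.T @ B))
        history.append(obj)
        if abs(prev - obj) <= tol * max(1.0, abs(obj)):
            break
        prev = obj
    return V, history

# Random symmetric test problem (illustrative sizes)
rng = np.random.default_rng(1)
S = rng.standard_normal((30, 30))
A = S @ S.T
B = rng.standard_normal((30, 3))
V, history = gpi_stiefel(A, B)
print("iterations:", len(history), " final objective:", history[-1])
```

Because $\hat{A}$ is positive definite, each update cannot increase the objective of the minimization problem, which is why recording `history` is a convenient sanity check. The dense $n \times n$ shift and the per-iteration SVD of an $n \times k$ matrix keep the cost modest when $k \ll n$.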
Step 2): Solving the subproblem with respect to the variable $V_2^{(i)}$. Discarding the items irrelevant to $V_2^{(i)}$, the augmented Lagrange function can be simplified as:
The gradient of the objective function is:
By setting the gradient to zero, and noting that $L_4^{(i)}$ is symmetric and its diagonal elements are nonzero, one can obtain the implicit solution as:
Step 3): Solving the subproblem with respect to the latent representation $W$. When the terms associated with $W$ are kept, the following subproblem is obtained:
where (28) is actually a standard k-means problem with a specific kernel. We can obtain $W$ via eigenvalue decomposition of this specific matrix.
Step 4): Solving the subproblem with respect to $\lambda^{(i)}$, which is a maximization problem:
By defining the corresponding trace term and applying the Cauchy-Schwarz inequality [39], the optimal solution for $\lambda^{(i)}$ can be obtained as:

Algorithm 2 (CRMATS), steps 7 to 11:
7: Update $Z$ and $W$ via (32) and (28).
8: Check the convergence conditions.
9: end while
10: return the consensus representation $W$.
11: Perform spectral clustering on $W$ to obtain the sample assignment $Y_{\text{pred}}$.
Step 5): Updating the multipliers $Y_i$ and $Z$. Their formulations are as follows:
where $t$ is the current number of iterations. The five steps are iteratively updated until convergence or until a stopping criterion is met. Algorithm 2 presents a comprehensive outline of the solving process in CRMATS, serving as a valuable reference for a detailed understanding of the methodology. The convergence proof of CRMATS is elaborated in Section A of the supplementary materials, available online.

IV. EXPERIMENTS
In this section, a comprehensive experimental study is conducted on eight HDLSS datasets to showcase the effectiveness of CRMATS. All of the experiments are implemented in Matlab 2020a on a 64-bit Windows PC with an Intel 2.30-GHz CPU.

A. Comparative Datasets and Methods
A total of eight datasets are used to validate the effectiveness of our method, comprising six real datasets and two synthetic HDLSS datasets: Syndata1 and Syndata2. Syndata1 consists of 120 samples divided into two categories, with each category containing 60 samples. Each sample is described from three views. To verify the robustness of our method on HDLSS data, we extend the dimensions and the number of views, and reduce the sample size of Syndata1 to obtain Syndata2. Specifically, Syndata2 comprises 90 samples divided into three categories, with each sample described from four views. Each subcategory of the synthetic data is generated from independent and identically distributed normal distributions with a mean of 2 and a standard deviation of 0.5. In addition, we evaluate the effectiveness of CRMATS on six public benchmark datasets: Coil-20 [40], MSRC_v1 [41], Yale [42], BBCSport [43], 3Sources [44], and Reuters [44]. To demonstrate the effectiveness of our method on HDLSS datasets, we randomly select samples from these datasets for the experiments. More details about the datasets are provided in Table I, and our method is compared with other multi-view clustering methods. The comparative methods are as follows:
I. Scalable Multi-view Subspace Clustering (SMSC) [45] constructs a latent graph after anchor learning.
II. Pure graph-guided multi-view subspace clustering (PGSC) [46] learns a consensus graph by leveraging the sparsity and connectivity of each affinity graph.
III. Robust Multi-View Spectral Clustering (RMSC) [12] considers the low rankness and sparsity of the matrix to learn a common graph after decomposition.
IV. Low-rank Tensor Based Proximity Learning (LTBPL) [42] uses probability affinity to recover the low rankness and high-order correlations.
V. Multiview Subspace Clustering via Low-Rank Symmetric Affinity Graph (LSGMC) [44] pursues a consistent low-rank structure across views.
VI. Measuring Diversity in Graph Learning: A Unified Framework for Structured Multi-View Clustering (CD-MGC) [47] leverages multi-view consistency and diversity in a unified framework.
VII. Co-regularized kernel k-means for multi-view clustering (Co-reg) [39] combines the similarities of different views and the latent representation for clustering.
VIII. Multiview Clustering via Co-Training Robust Representation (CoMSC) [48] finds a consensus matrix and complementary information.
IX. Efficient Multi-view Graph Clustering (EMGC) [49] finds a consistent cluster indicator matrix with a Super Nodes Similarity Minimization module.
X. The method in [50] integrates multiple affinity graphs into a consensus one with the topological relevance.
XI. The method in [51] couples the representation matrix to explore high-order relationships.
XII. Multiview Subspace Clustering by an Enhanced Tensor Nuclear Norm (WTSNM) [52] studies the Schatten p-norm to solve the minimization problem.
XIII. High-order Complementarity Multi-View Clustering with Enhanced Tensor Rank (HCETR) [26] adopts an enhanced tensor rank to find high-order consistency.

B. Clustering Performance
The clustering performance is evaluated using several commonly employed metrics, including Accuracy (ACC), Normalized Mutual Information (NMI), Purity, and F-score [32]. A larger value denotes superior performance, and the best results are highlighted in bold. Considering that the clustering problem does not include the number of groups present in the data, we also use the Calinski-Harabasz index (CHI) to assess the quality of the clustering results. CHI measures the compactness and separation of clustering outcomes by evaluating the ratio of inter-class variance to intra-class variance [53]. A higher CHI value indicates better clustering results. In our evaluation, a comprehensive assessment is provided using five metrics, with each metric representing a specific property of the clustering outcomes. The clustering performance of our method and the comparison methods on the eight benchmark datasets is presented in Tables II and III. Moreover, we report the CHI values between the ground truth labels and the features for each dataset. To ensure robustness, each algorithm is repeated 20 times and the mean value is reported. We then use Student's t-test to test the statistical significance of the results, with the p-value given in parentheses.
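Three of the metrics above are simple enough to sketch directly. The code below is an illustrative implementation, not the evaluation script used in the experiments: ACC is computed by brute-force search over label permutations (adequate only for a small number of clusters; a Hungarian-algorithm solver would normally be used instead), Purity uses the majority class of each predicted cluster, and CHI follows the standard ratio of between-cluster to within-cluster dispersion. All function names and the toy labels are hypothetical.

```python
from itertools import permutations
import numpy as np

def contingency(y_true, y_pred) -> np.ndarray:
    """Square count matrix C[p, t] = #samples with predicted p and true t."""
    _, t_idx = np.unique(np.asarray(y_true), return_inverse=True)
    _, p_idx = np.unique(np.asarray(y_pred), return_inverse=True)
    m = max(t_idx.max(), p_idx.max()) + 1    # pad to square if counts differ
    C = np.zeros((m, m), dtype=int)
    np.add.at(C, (p_idx, t_idx), 1)
    return C

def clustering_accuracy(y_true, y_pred) -> float:
    """Best one-to-one label matching, found by exhaustive permutation search."""
    C = contingency(y_true, y_pred)
    m = C.shape[0]
    best = max(sum(C[i, perm[i]] for i in range(m)) for perm in permutations(range(m)))
    return best / len(y_true)

def purity(y_true, y_pred) -> float:
    """Fraction of samples belonging to the majority class of their cluster."""
    C = contingency(y_true, y_pred)
    return C.max(axis=1).sum() / len(y_true)

def calinski_harabasz(X, labels) -> float:
    """(between-cluster dispersion / (k-1)) / (within-cluster dispersion / (n-k))."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    n, k = X.shape[0], len(ids)
    mean = X.mean(axis=0)
    between = within = 0.0
    for c in ids:
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)
        between += len(Xc) * np.sum((mc - mean) ** 2)
        within += np.sum((Xc - mc) ** 2)
    return float((between / (k - 1)) / (within / (n - k)))

y_true = [0, 0, 0, 1, 1, 1]
y_pred = [1, 1, 0, 0, 0, 0]                  # cluster ids are arbitrary
acc = clustering_accuracy(y_true, y_pred)
pur = purity(y_true, y_pred)
chi = calinski_harabasz([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])
print(acc, pur, chi)
```

ACC and Purity need the ground-truth labels, whereas CHI only needs the features and the predicted labels, which is why it remains usable when the true number of groups is unknown.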
To assess the discriminative power of the consensus representation, we utilize t-SNE to visualize the differences by projecting the latent representation onto a two-dimensional space. Syndata2, which has the highest dimensionality, is specifically selected to represent the synthetic scenarios. When visualizing the raw data from their respective views using t-SNE (Fig. 2(a)-(d)), it is evident that most of the samples in Syndata2 appear intermingled and lack clear separation. In contrast, the consensus representation achieved by our method successfully separates the subcategories without any overlap (Fig. 2(e)). To further validate the effectiveness of CRMATS, an analysis of the similarity heatmaps is performed to evaluate the distinctions between groups. In Fig. 2(f)-(i), the affinity heatmap of the raw samples lacks clear boundaries and block structures. However, Fig. 2(j) shows that the affinity obtained from the low-dimensional embedding after applying CRMATS exhibits distinct boundaries, indicating the method's ability to mitigate potential biases. The results indicate that the fusion of affinities of different orders outperforms the traditional pairwise affinity, low-rankness, and tensor-based methods in fully capturing the data structure on synthetic data.

TABLE II COMPARISON RESULTS (%): THE MEAN AND p VALUE MEASURED BY DIFFERENT CLUSTERING METHODS ON ALL THE CORRESPONDING DATASETS
To visually demonstrate the effectiveness of our method on Coil-20, we present the spatial distributions of the raw data and the consensus representation obtained using t-SNE in Fig. 3. Additionally, Fig. 4 showcases the corresponding heatmaps and consensus representations learned by our method. These visualizations provide evidence supporting the superiority of our approach. In the pairwise affinity-based visualization (Fig. 3(a)-(c)), most samples appear mixed together, making it challenging to accurately distinguish between different subgroups. However, our method generates a consensus representation that exhibits improved separation and reduced overlap, leading to enhanced clustering performance. The effectiveness of our method is further demonstrated through the comparison of affinity heatmaps. In Coil-20, the affinity heatmaps of each view on the raw samples (Fig. 4(a)-(c)) exhibit blurred boundaries and lack a clear block structure. However, after applying our method and generating the affinity matrix from the consensus representation, the boundaries become well-defined. This is supported by the t-SNE visualization of the consensus representation, where most samples are clustered with their corresponding partners, and distinct blocks are observed on the diagonal of the heatmap (Figs. 3(d) and 4(d)). Overall, CRMATS yields superior clustering performance on all real-world datasets, as shown in Tables II and III. Furthermore, we conducted an analysis of the randomness of views and the resource consumption of CRMATS, which can be found in Section B and Section C of the supplementary materials, available online.
Based on the experimental results, CRMATS demonstrates several advantages in handling HDLSS data. First, our method uses both high-order and low-order affinities to comprehensively capture the spatial structure of HDLSS data. Second, our co-regularization approach aligns the different low-dimensional representations to seek a consensus graph and incorporate cross-view correlations, thereby avoiding suboptimal clustering results. Lastly, by learning a consensus representation on the manifold space, we account for the complex connections among samples of the original data, leading to improved clustering performance.

C. Convergence Analysis
An alternating minimization algorithm is developed to solve the optimization problem. A theoretical proof of the convergence of CRMATS is provided in Section A of the supplementary materials, available online. In this subsection, we compare the objective values on benchmark datasets with diverse backgrounds to eliminate randomness, and illustrate them in Fig. 5. We show the objective value over 50 epochs. The objective value on each benchmark dataset decreases sharply in the first 5 iterations and then stays steady with more iterations, implying that CRMATS converges after just a few iterations. The fast convergence of CRMATS is due to the alternating minimization algorithm, which updates each variable separately. Moreover, by integrating co-regularization techniques and leveraging high-order affinities on the manifold space, our proposed method captures the intrinsic structure of the data. Consequently, our method exhibits rapid convergence, highlighting its effectiveness as a solution to the optimization problem in multi-view HDLSS clustering.
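The kind of convergence curve described above can be reproduced on a much simpler model. The sketch below is not the CRMATS solver: it assumes the classical centroid-based co-regularized spectral clustering objective with pairwise affinities only, writes it as a loss to be minimized, and records the loss after each round of alternating updates. All function names, the bandwidth choice, and the toy two-view data are assumptions made for illustration.

```python
import numpy as np

def top_k_eigvecs(M, k):
    """Eigenvectors of symmetric M belonging to the k largest eigenvalues."""
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    return vecs[:, -k:]

def normalized_affinity(X):
    """RBF affinity with symmetric degree normalization D^{-1/2} K D^{-1/2}."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * np.median(d2)))
    d = K.sum(axis=1)
    return K / np.sqrt(np.outer(d, d))

def loss(Ls, Us, U_star, lam):
    """Negative of: sum_v tr(U_v^T L_v U_v) + lam * sum_v ||U_v^T U*||_F^2."""
    fit = sum(np.trace(U.T @ L @ U) for L, U in zip(Ls, Us))
    agree = sum(np.linalg.norm(U.T @ U_star, "fro") ** 2 for U in Us)
    return -(fit + lam * agree)

def co_regularized_embedding(Ls, k, lam=0.5, max_iter=50, tol=1e-8):
    """Alternate between per-view embeddings and the consensus embedding."""
    Us = [top_k_eigvecs(L, k) for L in Ls]
    U_star = top_k_eigvecs(sum(U @ U.T for U in Us), k)
    history = [loss(Ls, Us, U_star, lam)]
    for _ in range(max_iter):
        P = U_star @ U_star.T
        Us = [top_k_eigvecs(L + lam * P, k) for L in Ls]      # update each view
        U_star = top_k_eigvecs(sum(U @ U.T for U in Us), k)   # update consensus
        history.append(loss(Ls, Us, U_star, lam))
        if history[-2] - history[-1] <= tol * max(1.0, abs(history[-2])):
            break
    return U_star, history

rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 10)
centers = 4.0 * np.eye(3)
views = [centers[labels] + rng.normal(scale=s, size=(30, 3)) for s in (0.5, 1.0)]
Ls = [normalized_affinity(X) for X in views]
U_star, history = co_regularized_embedding(Ls, k=3)
print(len(history), history[0], history[-1])
```

Because each block update solves its subproblem exactly, the recorded loss cannot increase from one iteration to the next, which is the property a curve such as Fig. 5 visualizes.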

D. Hyperparameter Sensitivity Analysis
The influence of the hyperparameter settings on the clustering performance in terms of NMI is investigated, and the results are presented in Fig. 6. Notice that (16) consists of two parts, the affinity term and the co-regularization term. We set a hyperparameter for each part, denoted as γ1 and γ2, respectively. We use the grid [1e-6, 1e-5, 1e-4, 1e-3, 0.01, 0.1, 1, 10, 100] for both γ1 and γ2, and evaluate the performance on four benchmark datasets.
According to Fig. 6, the clustering performance is influenced by γ1 in certain instances. For example, the NMI on Syndata1 and Yale (Fig. 6(a) and (c)) decreases with lower γ1. In addition, the NMI changes little as γ2 increases. Although the performance of CRMATS changes with different combinations of γ1 and γ2, as shown in Fig. 6(b) and (d), CRMATS still outperforms its comparative methods on the benchmark datasets, demonstrating the stability of our model. Furthermore, the results in Fig. 6 provide empirical evidence of the robustness of CRMATS to the selection of hyperparameters. Our method consistently outperforms the comparative methods on multiple datasets under different settings, indicating its insensitivity to specific hyperparameter choices and its ability to achieve favorable performance across a wide range of hyperparameter values. The effectiveness and robustness of CRMATS observed in Fig. 6 underscore the practicality and versatility of our method.
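The sensitivity study amounts to a two-dimensional grid sweep over γ1 and γ2. A minimal sketch of such a sweep is given below; the grid values are those listed above, while `sweep`, `toy_score`, and the shape of the scoring callback are hypothetical. In an actual study the callback would run the clustering method with the given pair and return its NMI; the toy function here is a placeholder and does not reproduce any result from Fig. 6.

```python
import numpy as np

# Grid used for both gamma_1 and gamma_2 in the sensitivity study.
GAMMA_GRID = [1e-6, 1e-5, 1e-4, 1e-3, 0.01, 0.1, 1, 10, 100]

def sweep(score_fn, grid=GAMMA_GRID):
    """Evaluate score_fn(gamma1, gamma2) on the full grid; rows index gamma1."""
    scores = np.empty((len(grid), len(grid)))
    for i, g1 in enumerate(grid):
        for j, g2 in enumerate(grid):
            scores[i, j] = score_fn(g1, g2)
    return scores

def toy_score(g1, g2):
    """Placeholder score (NOT real NMI): sensitive to gamma1, nearly flat in gamma2."""
    return 1.0 / (1.0 + abs(np.log10(g1) + 1.0)) - 0.01 * abs(np.log10(g2))

scores = sweep(toy_score)
i, j = np.unravel_index(np.argmax(scores), scores.shape)
best_g1, best_g2 = GAMMA_GRID[i], GAMMA_GRID[j]
print(best_g1, best_g2)
```

The resulting 9 x 9 matrix is what a surface or heatmap plot of the sensitivity analysis would display, one matrix per dataset.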

E. Ablation Analysis
In this subsection, an ablation study of the proposed CRMATS method is conducted to investigate the roles played by affinities of different orders and their combinations. The study proceeds as follows. First, removing L3 and L4 degrades the method into the traditional Co-reg approach; this variant is referred to as CRMATS-L2. Subsequently, each high-order affinity is applied in isolation, yielding the methods CRMATS-L3 and CRMATS-L4, respectively. For the multiple-affinity setting, we combine L2, L3, and L4 in pairs. For instance, CRMATS-L23 is the combination of L2 and L3. The clustering performance of each ablation variant is evaluated using the same benchmark datasets and metrics as in the previous experiments.
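The ablation design is a plain enumeration of the non-empty subsets of the three affinity orders. As an illustrative sketch, the snippet below generates the variant names used in this subsection; the helper name `ablation_variants` and the convention of mapping each name to a tuple of orders are assumptions made for the example.

```python
from itertools import combinations

ORDERS = (2, 3, 4)  # second-, third-, and fourth-order affinities (L2, L3, L4)

def ablation_variants(orders=ORDERS):
    """Map each variant name to the tuple of affinity orders it keeps."""
    variants = {}
    for size in range(1, len(orders) + 1):
        for subset in combinations(orders, size):
            if size == len(orders):
                name = "CRMATS"  # full model with all orders
            else:
                name = "CRMATS-L" + "".join(str(o) for o in subset)
            variants[name] = subset
    return variants

variants = ablation_variants()
for name, subset in variants.items():
    print(name, subset)
```

Each entry identifies which affinity terms would be retained when running the corresponding variant, with the full model keeping all three.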
The experimental results presented in Fig. 7 demonstrate the importance and effectiveness of incorporating multi-order affinities in multi-view clustering tasks. First, the full method achieves the best performance on the evaluated datasets, indicating its effectiveness in capturing the underlying structure of the data. Second, the single-affinity experiments demonstrate that high-order affinities (CRMATS-L3, CRMATS-L4) supplement the information carried by the traditional pairwise affinity. For example, CRMATS-L3 performs better than CRMATS-L2 on Syndata1 and BBC. Finally, the fusion of the second-, third-, and fourth-order affinities exhibits superior performance in comparison to any fusion of two orders (e.g., CRMATS-L23, CRMATS-L24, CRMATS-L34). This observation suggests that each order of affinity contributes additional internal information, resulting in incremental improvements.

V. CONCLUSION
In this paper, we have presented a unified multi-view clustering framework, which aligns the latent representations of views on the manifold space based on an accurate description of intra-view sample relationships through the introduction of multi-order affinities. The efficiency of CRMATS is improved by employing an alternating minimization strategy and singular value decomposition. Furthermore, a novel set of evaluation metrics is devised to comprehensively assess how well CRMATS captures the underlying structure of the data, taking into account both the similarity within clusters and the dissimilarity between clusters. Experimental results on eight HDLSS datasets have demonstrated the effectiveness of the proposed method in comparison with other popular approaches.
Although our method effectively addresses the concentration effects in high-dimensional data clustering and outperforms several baseline methods on the benchmark datasets, there are still potential directions for improvement. First, our strategy of improving the HDLSS clustering results through high-order affinities requires more computation time and memory. Additionally, existing high-order affinity methods have limitations when dealing with complex graph data. To address these limitations, we consider incorporating deep graph neural networks for high-dimensional learning. A deep neural network can leverage GPU computing units to reduce memory costs within each mini-batch. Furthermore, a graph neural network enables us to extract more potential high-order correlations from high-dimensional data.

Algorithm 1: Generalized Power Iteration Method (GPI).
Input: The matrix A ∈ R^{n×k'} with k' = n, i.e., A ∈ R^{n×n}, matrix B ∈ R^{n×k}, and α
Output: The orthonormal basis V of the Stiefel manifold M_p = {V ∈ R^{n×k} | V^T V = I}.
1: Initialize a random orthonormal basis of the Stiefel manifold; define the matrix Â = αI − A via α.
2: while not converged do
3:
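The listing of Algorithm 1 breaks off after step 2. As a rough illustration only, the sketch below follows the commonly published form of generalized power iteration for the problem min over V with V^T V = I of tr(V^T A V) − 2 tr(V^T B). The objective, the loop body (form M = 2ÂV + 2B, then project onto the Stiefel manifold through a compact SVD), the choice of α from the largest eigenvalue of A, the stopping rule, and all names are assumptions rather than details recovered from the listing.

```python
import numpy as np

def gpi(A, B, max_iter=200, tol=1e-10, seed=0):
    """Sketch of generalized power iteration on the Stiefel manifold.

    Assumed problem: min_{V^T V = I} tr(V^T A V) - 2 tr(V^T B), A symmetric.
    """
    n, k = B.shape
    A = (A + A.T) / 2.0
    # Assumed choice of alpha: slightly above the largest eigenvalue of A,
    # so that A_hat = alpha * I - A is positive definite.
    alpha = np.linalg.eigvalsh(A)[-1] + 1e-3
    A_hat = alpha * np.eye(n) - A

    def objective(V):
        return np.trace(V.T @ A @ V) - 2.0 * np.trace(V.T @ B)

    # Step 1: random orthonormal basis via a reduced QR factorization.
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    history = [objective(V)]

    # Step 2 onward (assumed): power-style update followed by SVD projection.
    for _ in range(max_iter):
        M = 2.0 * A_hat @ V + 2.0 * B
        U, _, Zt = np.linalg.svd(M, full_matrices=False)
        V = U @ Zt
        history.append(objective(V))
        if history[-2] - history[-1] <= tol * max(1.0, abs(history[-2])):
            break
    return V, history

rng = np.random.default_rng(1)
G = rng.standard_normal((20, 20))
A = G @ G.T                       # symmetric test matrix
B = rng.standard_normal((20, 3))
V, history = gpi(A, B)
print(len(history), history[0], history[-1])
```

In this formulation the SVD step returns the closest orthonormal matrix to M in the trace sense, and the positive definiteness of Â is what makes the recorded objective non-increasing across iterations.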

Fig. 2. Visualizations of the consensus representation, raw data with t-SNE, and the heatmap on Syndata2. Subfigures (a)-(d) depict the spatial distribution of Syndata2 from different views using t-SNE. Subfigure (e) is the visualization of the obtained consensus representation W using t-SNE. Subfigures (f)-(i) show the heatmaps of Syndata2 from different views. The affinity heatmap on raw samples has blurred boundaries and no apparent block structures. In contrast, Subfigure (j) shows that the affinity from the consensus representation has clear-cut boundaries.

Fig. 3. Visualization of the consensus representation and raw data. Subfigures (a)-(c) depict the visualization of Coil-20 using t-SNE. Subfigure (d) is the visualization of the consensus representation W.

Fig. 4. Visualization of the heatmaps on Coil-20. Subfigures (a)-(c) are the heatmaps from different views. The affinity heatmap on raw samples has blurred boundaries. In contrast, Subfigure (d) shows that the affinity from the consensus representation has clear-cut boundaries.

Fig. 5. The convergence curve of the proposed CRMATS method, on (a) Syndata1; (b) Coil-20; (c) Yale; and (d) BBC. The objective value decreases consistently with respect to the iteration number.

TABLE III CHI RESULTS: THE MEAN AND p VALUE MEASURED BY DIFFERENT CLUSTERING METHODS ON BENCHMARK DATASETS