Human-Guided Functional Connectivity Network Estimation for Chronic Tinnitus Identification: A Modularity View

The functional connectivity network (FCN) has been used to achieve several remarkable advancements in the diagnosis of neuro-degenerative disorders. Therefore, it is imperative to accurately estimate biologically meaningful FCNs. Several efforts have been dedicated to this purpose by encoding biological priors. However, owing to the high complexity of the human brain, the estimation of an 'ideal' FCN remains an open problem. To the best of our knowledge, almost all existing studies lack the integration of domain expert knowledge, which limits their performance. In this study, we focused on incorporating domain expert knowledge into the FCN estimation from a modularity perspective. To achieve this, we presented a human-guided modular representation (MR) FCN estimation framework. Specifically, we designed an adversarial low-rank constraint to describe the module structure of FCNs under the guidance of domain expert knowledge (i.e., a predefined participant index). The chronic tinnitus (TIN) identification task based on the estimated FCNs was conducted to examine the proposed MR method. Remarkably, MR significantly outperformed the baseline and state-of-the-art (SOTA) methods, achieving an accuracy of 92.11%. Moreover, post-hoc analysis revealed that the FCNs estimated by the proposed MR could highlight more biologically meaningful connections, which is beneficial for exploring the underlying mechanisms of TIN and diagnosing early TIN.


I. INTRODUCTION
TINNITUS, the perception of sound in the absence of a corresponding external acoustic stimulus, is experienced by approximately 5% to 15% of adults [1], [2]. The diagnosis of tinnitus and the evaluation of its distress are based on self-rating questionnaires. Thus far, there is no consensus on the neurophysiological model of tinnitus (TIN). Recently, several studies have illustrated that unusual brain activity [3], [4] and aberrant functional disruptions within certain brain networks [5], [6], such as the auditory network, default mode network and executive control network, tend to be highly correlated with TIN. Naturally, a robust approach to help identify tinnitus is to discover more informative biomarkers by effectively analysing the brain networks.
Recently, functional connectivity network (FCN)-based methods have achieved remarkable results in studies conducted on the diagnosis of neuro-developmental disorders, such as mild cognitive impairment [7], autism spectrum disorder [8] and TIN [9]. In particular, in all existing FCN research, the regions of interest (ROIs) are considered as nodes, and the edges in the graph are regarded as representing the relationship between different brain regions [10]. In this study, we utilized this approach and divided the brain into different ROIs to estimate FCNs. Then, the estimated FCNs were utilized for subsequent tasks, such as neuro-developmental disorder diagnosis.
To achieve more accurate FCN-based diagnosis of neuro-developmental disorders, the current studies can be divided into the following two main directions. The first direction focuses on utilizing an advanced diagnosis model (i.e., the classifier-oriented direction). Following this direction, several studies have taken advantage of powerful classifiers to boost the performance of diagnosis. For example, Jie et al. used a graph-kernel support vector machine (SVM) for MCI classification [11], in which a novel topological kernel was derived for the FCN-based classifier. Peng et al. combined the information from both connections and their graph metrics for ASD identification by a multi-kernel SVM [12], which effectively utilized the information of the connectome. Liu et al. utilized the graph convolution network to identify MCI patients from normal controls (NCs) [13], which significantly increased the diagnostic performance. The second direction focuses on estimating biologically meaningful and accurate FCNs (i.e., the FCN-estimation-oriented direction). In this direction, multiple FCN estimation methods are designed by combining the biological priors of FCNs, such as sparsity [14], modularity [7], group-similarity [15]-[17], inter-similarity [18] and scale-free priors [8], into the FCN estimation framework by encoding the regularizer or the prior. The estimated FCN is then used to train a classifier to improve the diagnosis performance.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Although the classifier-oriented methods can achieve impressive results for some neuro-disease diagnoses, they still heavily rely on the quality of the estimated FCNs, which limits their performance. Thus, to boost the performance of neuro-disease diagnosis, we focused on the second direction, i.e., the FCN-estimation-oriented direction. Due to the limited understanding of the human brain, the estimation of an 'ideal' FCN for such a direction remains a challenge. Motivated by the 'No Free Lunch' theorem, we believe that a well-designed FCN estimation model needs to make an appropriate assumption from domain knowledge about the brain to generalize well. However, it is often difficult for existing methods to encode more fine-grained domain expert knowledge, which further restricts their performance. Thus, we attempted to encode the human-guided domain knowledge into a certain regularizer. Since modular knowledge is often available [19], [20], we focused on the modularity prior and suggested the integration of domain expert knowledge (e.g., a modular partition index) into a unified FCN estimation model. Specifically, we proposed a novel FCN estimation framework by encoding the modular representation (MR) into the adversarial low-rank constraint, with the guidance of the given modular partition index (i.e., Power's work, in which 264 ROIs were partitioned into 14 sub-networks/modules) [20]. More specifically, under the guidance of the modular partition index, we reformulated the intra- and inter-modular structures of FCNs as an adversarial low-rank constraint (i.e., low rank for the intra-modular structure and high rank for the inter-modular structure) for FCN estimation.
To evaluate the performance of the FCN estimated by the proposed MR method, we conducted several experiments using real functional magnetic resonance imaging (fMRI) data. Specifically, we first estimated the FCN using the given fMRI data, and the estimated FCN was then adopted to distinguish TIN patients from NCs. Based on the empirical results, we observed that the best results were achieved using the proposed MR method, which significantly outperformed the state-of-the-art (SOTA) methods. Notably, such results were achieved with simple feature selection (t-tests) and the most commonly used classifier (i.e., a linear support vector machine) [21].¹ Additionally, post-hoc analysis further showed that the FCNs estimated via MR tended to be biologically meaningful. Finally, to facilitate the reproduction of our results, the pre-processed data and the corresponding code were provided on GitHub.²
In summary, the highlights of this study are as follows:
1) To the best of our knowledge, this is the first attempt to establish a unified framework that effectively incorporates domain expert knowledge into FCN estimation.
2) A novel FCN estimation model (MR) is derived to encode the modular partition knowledge, which significantly boosts the performance of the estimated FCN in subsequent tasks.
3) The experimental results confirmed the effectiveness of the proposed MR method on the TIN vs. NC classification task with a classification accuracy of 92.11%.
We hypothesized that the estimated FCN could be used to identify the TIN from NC.

A. Pearson's Correlation
The Pearson's correlation (PC)-based FCN estimation method has become the mainstream approach in current FCN studies [8], [10], mainly due to its simplicity. Specifically, suppose that there are $N$ ROIs. The mean BOLD signal time series can be arranged into a data matrix $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{T \times N}$, such that the column vector $x_i$ contains the BOLD signal of the $i$-th ROI, where $T$ is the number of temporal points. Then, the PC-based method estimates the FCN $G = (G_{ij}) \in \mathbb{R}^{N \times N}$ by the following equation.
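For reference, the standard Pearson's correlation coefficient between the $i$-th and $j$-th ROIs can be written in the notation above as follows, with the scalar mean subtracted from every entry of the corresponding signal. The equation number (1) is an assumption inferred from the numbering used later in the text.

```latex
G_{ij} = \frac{(x_i - \bar{x}_i)^{T}(x_j - \bar{x}_j)}
              {\sqrt{(x_i - \bar{x}_i)^{T}(x_i - \bar{x}_i)}\,
               \sqrt{(x_j - \bar{x}_j)^{T}(x_j - \bar{x}_j)}} \tag{1}
```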
where $\bar{x}_i$ is the mean value of $x_i$. Note that the FCNs estimated by PC are always dense because of the noise in the signals. Consequently, to filter out these potentially noisy connections, a straightforward approach is to sparsify the estimated FCNs with a threshold, which can be expressed as follows:

where $G^{new}_{ij}$ denotes the filtered FCN after thresholding.

¹ Notably, the derived FCNs can methodologically be adapted to almost all classification models, such as universum support vector machine [22], ensemble deep transfer learning [23], graph-kernel SVM [11] and graph neural networks [13]. However, it should be emphasized that regardless of which classifier we adopted, the exploration of the effectiveness of the FCN estimation itself is the core of our research.
² [Online]. Available: https://github.com/Cavin-Lee/FBN_MR.
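The PC estimate and the subsequent thresholding can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names `pearson_fcn` and `sparsify` are hypothetical, the data are synthetic, and the thresholding rule (keep a given fraction of the strongest off-diagonal edges by magnitude) is one common choice assumed here.

```python
import numpy as np

def pearson_fcn(X):
    """PC-based FCN from X of shape (T, N): T time points, N ROIs."""
    Xc = X - X.mean(axis=0, keepdims=True)        # remove each ROI's mean
    norms = np.linalg.norm(Xc, axis=0)            # per-ROI signal norm
    return (Xc.T @ Xc) / np.outer(norms, norms)   # N x N correlation matrix

def sparsify(G, keep_ratio):
    """Keep the strongest `keep_ratio` fraction of off-diagonal edges (assumed rule)."""
    N = G.shape[0]
    off = ~np.eye(N, dtype=bool)                  # mask of off-diagonal entries
    strengths = np.abs(G[off])
    k = int(round(keep_ratio * strengths.size))
    if k == 0:
        return np.zeros_like(G)
    cutoff = np.sort(strengths)[-k]               # smallest magnitude that survives
    return np.where(off & (np.abs(G) >= cutoff), G, 0.0)

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 6))                 # synthetic stand-in for BOLD signals
G = pearson_fcn(X)
G_new = sparsify(G, keep_ratio=0.4)
```

Because the correlation matrix is symmetric, edges are kept or dropped in pairs, so the filtered network stays symmetric.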

B. Sparse Representation
Although the PC-based method successfully measures the full correlation between different ROIs, the estimated FCN ignores the interactions among multiple ROIs and is therefore subject to confounding effects. To alleviate this issue, partial correlation was proposed. Unfortunately, the partial correlation estimate is easily ill-posed because it requires inverting the covariance matrix $\Sigma = X^{T}X$. As a result, a common remedy is to incorporate an additional $\ell_1$-norm regularizer [14], which also naturally yields a sparse representation (SR) of the FCN. The SR-based FCN can be estimated by the following equation:

where $\lambda$ is a hyper-parameter. Eq. (3) can be reformulated into the following matrix form:
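As an illustration of SR-based estimation, the sketch below assumes the commonly used matrix form $\min_G \|X - XG\|_F^2 + \lambda\|G\|_1$ with a zero diagonal (no self-connections), and solves it with plain proximal gradient descent (ISTA). Both the exact objective and the solver are assumptions made for this example; the function names are hypothetical.

```python
import numpy as np

def soft_threshold(A, tau):
    """Proximal operator of tau * l1-norm (entry-wise shrinkage)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def sr_objective(X, G, lam):
    """Assumed SR objective: squared fitting error plus l1 penalty."""
    return np.linalg.norm(X - X @ G, "fro") ** 2 + lam * np.abs(G).sum()

def sparse_representation_fcn(X, lam, n_iter=500):
    """ISTA for the assumed SR objective, with G_ii fixed to zero."""
    N = X.shape[1]
    # Step size 1/L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    G = np.zeros((N, N))
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ G - X)            # gradient of the fitting term
        G = soft_threshold(G - step * grad, step * lam)
        np.fill_diagonal(G, 0.0)                  # forbid self-representation
    return G

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))                 # synthetic stand-in for BOLD signals
G = sparse_representation_fcn(X, lam=1.0)
```

Larger values of `lam` shrink more entries to exactly zero; once `lam` exceeds twice the largest absolute entry of `X.T @ X`, the estimate is the empty network.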

C. Modularity Representation
In fact, there are FCN topological priors other than sparsity. For instance, modularity is also an important topological prior of FCNs: connections tend to be concentrated within the same sub-network and sparse between different sub-networks [19], [24]. Motivated by this, Qiao et al. [7] incorporated the modularity prior into the FCN estimation by an additional sparse and low-rank regularizer.
Similarly, (5) can be rewritten in the following matrix form:
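The sparse and low-rank (SLR) regularizer combines an $\ell_1$-norm with a trace (nuclear) norm. The sketch below shows the proximal operator of the trace norm, singular value thresholding, together with an SLR-style objective. The objective form and the names `svt`, `slr_objective`, `lam1` and `lam2` are assumptions for illustration only.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt    # shrink every singular value by tau

def slr_objective(X, G, lam1, lam2):
    """Assumed SLR objective: fitting error + sparsity + low-rank penalties."""
    fit = np.linalg.norm(X - X @ G, "fro") ** 2
    return fit + lam1 * np.abs(G).sum() + lam2 * np.linalg.norm(G, "nuc")

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
s = np.linalg.svd(A, compute_uv=False)            # singular values, descending
A_low = svt(A, tau=s[2])                          # only the two largest survive
```

Shrinking the singular values is what drives the estimate toward low rank, which is how the SLR model encodes modularity without using any partition information.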

D. Matrix-Regularized Network Learning Framework
According to a recent study [8], almost all existing FCN methods can be easily cast into a matrix-regularization framework. Such a framework can be formulated as follows:

where $f(X, G)$ defines the data-fitting term that effectively models statistical information [25], such as the previously mentioned full correlation [8] or the partial correlation [14], and $R(G)$ is the corresponding regularization term, which provides an easy platform to incorporate the biological priors of FCNs. In addition, some specific constraints such as symmetry or non-negativity can be included in $\Delta$ to shrink the search space of $G$. Moreover, $\lambda$ in Eq. (7) is a hyper-parameter. In the current FCN estimation research, only the corresponding biological priors were encoded. Examples include the sparsity prior by using the sparse representation (SR), i.e., the $\ell_1$-norm [14], the modularity prior by using the sparse and low-rank representation (SLR), i.e., the trace norm and the $\ell_1$-norm [7], and the group prior by using the group sparse representation (GSR), i.e., the $\ell_{2,1}$-norm [15], [16] or some variants, e.g., the weighted $\ell_1$-norm [26]. Unfortunately, these methods lack the integration of domain expert knowledge, and thus, their performance may be limited. For instance, a potential problem in the SLR model [7] is that the estimated connections may tend to be homogeneously distributed throughout the estimated FCN, and the modular partition information is not considered.

Algorithm 1 (excerpt):
Update $\partial(\|G_t\|_*)$
3: Update $G_{t+1}$ by Eq. (23);
4: end while
5: Return $G$.

III. HUMAN-GUIDED MODULARITY REPRESENTATION FOR FCN ESTIMATION
As an attempt, we aim to incorporate the domain knowledge of the modules, i.e., the participant index (the module ID of each ROI), into the FCN estimation. Specifically, we provide a novel FCN estimation framework called human-guided modularity representation (MR) based on (7), which can effectively incorporate the information of the human-guided participant index. In the following, we first provide our motivation. Then, we give the main objective. Next, to confirm its rationality, we derive a theoretical analysis. Finally, the entire algorithm is provided.

A. Motivation
This paper focuses on the modularity prior and attempts to incorporate its domain expert knowledge (i.e., the participant index) for FCN estimation. The motivation for MR is shown in Fig. 1. Specifically, the connections in the brain network tend to gather into the same sub-networks [19], [24]. In other words, the connections in the same module are often dense, while the connections between different modules are often sparse and orthogonal. Mathematically, if all nodes in a network have dense connections, the rank of the adjacency matrix is low, but not vice versa. Naturally, the rank within a sub-network tends to be low. In contrast, the connections across modules tend to be sparse, resulting in a potentially high rank for the whole network. Thus, we incorporate an adversarial low-rank penalty with the guidance of the modular participant index, which characterizes the inter- and intra-modular structures of FCNs as an adversarial low-rank constraint.
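The rank argument can be checked on an idealized toy network in which every module is fully connected and there are no connections across modules. The helper `block_network` and the module sizes are hypothetical and serve only to illustrate the intuition.

```python
import numpy as np

def block_network(sizes):
    """Adjacency matrix with all-ones blocks on the diagonal and zeros elsewhere."""
    N = sum(sizes)
    G = np.zeros((N, N))
    start = 0
    for n in sizes:
        G[start:start + n, start:start + n] = 1.0  # one fully connected module
        start += n
    return G

sizes = [4, 3, 5]                                  # three toy modules
G = block_network(sizes)
rank_first = np.linalg.matrix_rank(G[:4, :4])      # a single dense module
rank_full = np.linalg.matrix_rank(G)               # the whole network
```

Each dense module contributes a single independent connectivity pattern (rank 1), whereas the whole network has one pattern per module, so its rank grows with the number of modules.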

B. Human-Guided Modularity Representation
With the guidance of domain expert knowledge (i.e., the participant index), we impose a low-rank constraint on the sub-network of each module and a high-rank constraint on the full network:

Here, we use $\|X - XG\|_F^2$ as the data-fitting term to capture the inverse covariance structure, due to its empirical effectiveness and success [10], from which the second-order statistics between ROIs are captured [27]. $G_k$ is the sub-network indicated by the given participant index, and $\mathrm{rank}(G)$ is the rank of the adjacency matrix $G$. However, optimizing $\mathrm{rank}(G)$ directly is NP-hard. Consequently, we relax $\mathrm{rank}(G)$ to the trace (nuclear) norm $\|G\|_*$ as follows:
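One plausible reading of the relaxed objective is $\|X - XG\|_F^2 + \lambda\big(\sum_k \|G_k\|_* - \|G\|_*\big)$. The sketch below evaluates it under two explicit assumptions that are not confirmed by the text: $G_k$ is taken to be the columns of $G$ belonging to module $k$, and a single $\lambda$ weights both trace-norm terms. The names `mr_penalty`, `mr_objective` and `module_ids` are hypothetical.

```python
import numpy as np

def mr_penalty(G, module_ids):
    """sum_k ||G_k||_* - ||G||_*, with G_k the columns of G in module k (assumed)."""
    module_ids = np.asarray(module_ids)
    intra = sum(np.linalg.norm(G[:, module_ids == k], "nuc")
                for k in np.unique(module_ids))
    return intra - np.linalg.norm(G, "nuc")

def mr_objective(X, G, module_ids, lam):
    """Assumed relaxed MR objective: fitting error plus adversarial low-rank penalty."""
    fit = np.linalg.norm(X - X @ G, "fro") ** 2
    return fit + lam * mr_penalty(G, module_ids)

rng = np.random.default_rng(0)
module_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2])    # hypothetical partition of 8 ROIs
X = rng.standard_normal((50, 8))                   # synthetic stand-in for BOLD signals
G = rng.standard_normal((8, 8))
penalty = mr_penalty(G, module_ids)

# A purely intra-modular (block-diagonal) network incurs no penalty.
G_bd = np.zeros((8, 8))
for k in np.unique(module_ids):
    idx = np.flatnonzero(module_ids == k)
    G_bd[np.ix_(idx, idx)] = rng.standard_normal((idx.size, idx.size))
penalty_bd = mr_penalty(G_bd, module_ids)
```

Under this reading the penalty is never negative and vanishes for an exactly block-diagonal network, which matches the intuition of rewarding intra-modular structure.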

C. Theoretical Analysis
Since there is a negative term in (9), we provide the following theoretical analysis in this subsection to confirm its applicability.

Lemma 2 (Majorization Lemma [30]): If $M, N \in S_+^n$ and $M \succeq N$, then we have:

where $S_+^n$ is the cone of $n \times n$ positive semidefinite matrices and $\{\lambda_j\}_{j=1}^{n}$ are the singular values, with $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$.

Proof: Using SVD on $M$, we have:
Then, using Lemma 2, we easily have:
Thus, we have:
Similarly, we have:
Q.E.D.

Consequently, according to Theorem 2, the term $\sum_{k=1}^{K} \|G_k\|_*$ is an upper bound of $\|G\|_*$, implying that (9) will not degenerate.
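The upper bound can also be seen from an elementary argument, given here as a complement rather than as the paper's proof. It assumes that $G_k$ collects the columns of $G$ that belong to module $k$; $\tilde{G}_k$ is a helper symbol introduced only for this sketch and denotes $G$ with all other columns set to zero.

```latex
G = \sum_{k=1}^{K} \tilde{G}_k, \qquad \|\tilde{G}_k\|_* = \|G_k\|_*
\quad\Longrightarrow\quad
\|G\|_* = \Big\| \sum_{k=1}^{K} \tilde{G}_k \Big\|_*
\le \sum_{k=1}^{K} \|\tilde{G}_k\|_*
= \sum_{k=1}^{K} \|G_k\|_* .
```

The inequality is the triangle inequality of the trace norm, and padding with zero columns does not change the singular values, so the penalty $\sum_k \|G_k\|_* - \|G\|_*$ is bounded below by zero.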

D. Algorithm
The objective function $J(G)$ in (9) is neither differentiable nor convex, making the problem nontrivial. Fortunately, the objective function can be reformulated as the sum of a convex part $J_{vex}(G)$ and a concave part $J_{cav}(G)$:

Obviously, Eq. (21) is a typical difference-of-convex-functions problem; thus, we can utilize the ConCave-Convex Procedure (CCCP) to solve it [31]. As a typical majorization-minimization algorithm, CCCP solves this problem as a sequence of convex problems. Specifically, in each iteration, we approximate the concave part by using its gradient $\partial J_{cav}$, i.e., $\partial(\|G_t\|_*)$, and obtain the following surrogate sub-problem:

where $G_t$ is the solution of the last iteration. The nonlinear conjugate gradient method is adopted to solve the resulting convex sub-problem [32]. Setting the derivative to zero, we have:

The entire algorithm is given in Algorithm 1. Specifically, the sub-gradient of $J_{cav}$ [33] is evaluated as in Algorithm 2.
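The CCCP loop can be sketched as follows. This is a hedged illustration and not the authors' Algorithm 1: it assumes the objective $\|X - XG\|_F^2 + \lambda(\sum_k \|G_k\|_* - \|G\|_*)$ with $G_k$ the columns of $G$ in module $k$, uses $UV^{T}$ from the SVD as the sub-gradient of the trace norm, and swaps the nonlinear conjugate gradient solver for proximal gradient steps on the convex surrogate. All function and parameter names are hypothetical.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def nuclear_subgradient(G):
    """One sub-gradient of ||G||_*: U V^T over the non-zero singular values."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    keep = s > 1e-10
    return U[:, keep] @ Vt[keep, :]

def mr_objective(X, G, blocks, lam):
    """Assumed MR objective with column blocks as sub-networks."""
    intra = sum(np.linalg.norm(G[:, b], "nuc") for b in blocks)
    fit = np.linalg.norm(X - X @ G, "fro") ** 2
    return fit + lam * (intra - np.linalg.norm(G, "nuc"))

def cccp_mr(X, module_ids, lam, outer=20, inner=50):
    module_ids = np.asarray(module_ids)
    blocks = [np.flatnonzero(module_ids == k) for k in np.unique(module_ids)]
    N = X.shape[1]
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant
    G = np.zeros((N, N))
    history = [mr_objective(X, G, blocks, lam)]
    for _ in range(outer):
        S = nuclear_subgradient(G)                  # linearize the concave part at G_t
        for _ in range(inner):                      # proximal gradient on the surrogate
            grad = 2.0 * X.T @ (X @ G - X) - lam * S
            Z = G - step * grad
            for b in blocks:                        # the prox separates over column blocks
                G[:, b] = svt(Z[:, b], step * lam)
        history.append(mr_objective(X, G, blocks, lam))
    return G, history

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 8))                    # synthetic stand-in for BOLD signals
module_ids = [0, 0, 0, 1, 1, 2, 2, 2]               # hypothetical partition of 8 ROIs
G, history = cccp_mr(X, module_ids, lam=1.0)
```

Because each surrogate majorizes the objective and proximal gradient with step $1/L$ never increases the surrogate, the values stored in `history` should be non-increasing, which is the usual majorization-minimization guarantee.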

E. Complexity Analysis
The time complexity of Algorithm 1 and Algorithm 2 is dominated by the singular value decomposition (SVD) for obtaining the sub-gradient and by the matrix multiplications. Specifically, obtaining the sub-gradient of $G \in \mathbb{R}^{N \times N}$ in Algorithm 2 leads to a complexity of $O(N^3)$. Moreover, for each iteration of Algorithm 1, the calculation of $G_{t+1}$ requires $O(N^3 + TN^2)$. Consequently, the total time complexity of MR is $O(\gamma(N^3 + TN^2))$, where $\gamma$ is the number of iterations. It should be noted that it is not difficult to speed up the algorithm by using ADMM [34], but this is beyond the scope of this work.

A. Data Acquisition
Sixty-one chronic tinnitus patients were recruited from the Department of Otolaryngology of Nanjing First Hospital and fifty-five healthy controls were recruited through a health screening at the local community (i.e., participants who were right-handed, with at least 8 years of education). The age, sex and education of the patients and controls were well-matched. The demographics of chronic tinnitus patients and healthy controls are given in Table I. Tinnitus severity and tinnitus distress were assessed by using the Iowa version of the Tinnitus Handicap Questionnaire (THQ) [28]. All participants had clinically normal hearing from 250 Hz to 8 kHz (hearing thresholds < 25 dB). According to the self-rating anxiety scale (SAS) and the self-rating depression scale (SDS), participants with depression and anxiety were excluded from this study [35], [36]. This study was approved by the Ethics Committee of Nanjing Medical University and informed consent was obtained.

B. Data Pre-Processing
All participants were scanned using a 3.0 T MRI scanner with an 8-channel receiver array head coil. Resting-state functional MRI data were obtained using a gradient echo-planar imaging sequence [28]. Structural images were acquired with a three-dimensional turbo fast echo (3D-TFE) T1WI sequence (sagittal: TR = 8.1 ms; TE = 3.7 ms; FA = 8°; FOV = 256 mm × 256 mm; acquisition matrix = 256 × 256; thickness = 1 mm; 172 total slices). The SPM8 toolbox and DPARSFA (version 2.2) [37] were adopted to execute the fMRI pre-processing pipeline. Specifically, we followed a well-defined pre-processing pipeline provided by DPABI [37]. Then, the average time series were extracted for the 264 ROIs defined by Power's atlas [20]. Finally, we utilized these pre-processed BOLD signals as the corresponding data matrix to estimate the FCN.

C. Experimental Setting
In this study, we adopted Power's atlas to define the ROIs, where 264 ROIs are divided into 14 sub-networks/modules [20]. The FCN is then estimated from the pre-processed fMRI signals. To verify the performance of the proposed MR, we compare it with the state-of-the-art FCN estimation methods, including the PC, SR, and SLR (modular representation without participant index knowledge) methods. Since the estimated FCN $G$ tends to be asymmetric, we followed the recent study [7] and forced the estimated FCNs to be symmetric via $G^{*} = (G + G^{T})/2$, which contains $N(N-1)/2$ effective connectivity measures.
To validate the proposed FCN method, we conducted multiple experiments. After we obtained the estimated FCNs, we utilized them to train a classifier for identifying TINs from NCs. It should be noted that we only used the upper triangular elements of the FCN as the input features since the adjacency matrix of the FCN is symmetric. However, it should also be noted that, in our experiment, each FCN has 264 nodes and thus 34,716 edges, resulting in 34,716 features. Thus, the feature dimension was still too high compared to such a small sample size, which not only results in a large computation cost but also degrades the generalization performance of the estimated FCNs.
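The symmetrization and feature extraction steps described above can be sketched as follows. The function name `fcn_to_features` is hypothetical and the matrix is random, used only to check the feature count for 264 ROIs.

```python
import numpy as np

def fcn_to_features(G):
    """Symmetrize an estimated FCN and return its upper-triangular edge weights."""
    G_sym = (G + G.T) / 2.0                         # G* = (G + G^T) / 2
    rows, cols = np.triu_indices(G.shape[0], k=1)   # strictly above the diagonal
    return G_sym[rows, cols]

N = 264                                             # number of ROIs in Power's atlas
G = np.random.default_rng(0).standard_normal((N, N))
features = fcn_to_features(G)
n_features = features.size                          # N * (N - 1) / 2
```

For $N = 264$ this gives $264 \times 263 / 2 = 34{,}716$ features per participant, which is why feature selection is needed before training the classifier.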
As pointed out in a recent study [8], the feature selection method and the classifier play a central role in the final accuracy. Therefore, to avoid these confounding effects, we utilized the most commonly used SVM classifier [21] with a linear kernel and the simplest feature selection method (i.e., t-test with a p-value ≤ 0.01), since our main focus is to validate the performance of the estimated FCNs.

TABLE III CLASSIFICATION RESULTS ON DIFFERENT FCN ESTIMATION METHODS
The results are obtained by 10-fold CV with 50 iterations.

To fairly evaluate the final performance of different FCN estimation models, leave-one-out (LOO) cross-validation was conducted in this paper since the sample size is small [38], [39]. Specifically, in each verification cycle, only one participant is held out for testing, while the other participants are used to train the model and obtain the best parameters. To select the optimal parameters of the model, we further conducted an inner LOO cross-validation on the training set of the outer loop with a grid-search strategy. More specifically, for the regularization parameter $\lambda$ in our proposed MR, the candidate values were $\{2^{-5}, 2^{-4}, \ldots, 2^{4}, 2^{5}\}$; for the threshold of PC, we used multiple sparsity levels in {1%, 10%, ..., 90%, 100%}. Here, the threshold 90% means that 90% of the weak edges are retained in the FCN.
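The outer LOO loop with in-fold feature selection can be sketched with SciPy and scikit-learn. This is a simplified illustration under stated assumptions: the inner LOO grid search over $\lambda$ is omitted, the fallback of keeping the single best feature when none passes the threshold is an addition for robustness, and the function names and synthetic data are hypothetical.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.svm import SVC

def select_by_ttest(F_train, y_train, alpha=0.01):
    """Boolean mask of features whose two-sample t-test p-value is <= alpha."""
    _, p = ttest_ind(F_train[y_train == 1], F_train[y_train == 0], axis=0)
    mask = p <= alpha
    if not mask.any():                 # assumed fallback: keep the best single feature
        mask[np.argmin(p)] = True
    return mask

def loo_accuracy(F, y, alpha=0.01):
    """Leave-one-out accuracy; features are selected on the training fold only."""
    n = len(y)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i      # hold out participant i
        mask = select_by_ttest(F[train], y[train], alpha)
        clf = SVC(kernel="linear").fit(F[train][:, mask], y[train])
        correct += int(clf.predict(F[i:i + 1, mask])[0] == y[i])
    return correct / n

rng = np.random.default_rng(0)
y = np.array([1] * 20 + [0] * 20)      # synthetic labels (1 = TIN, 0 = NC)
F = rng.standard_normal((40, 200))     # synthetic edge features
F[y == 1, :5] += 3.0                   # five informative features
acc = loo_accuracy(F, y)
```

Running the t-test inside each fold matters: selecting features on all participants before the split would leak information from the held-out participant into training.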

D. Classification Results
We collected multiple measurements, including accuracy, specificity, sensitivity, and area under the curve (AUC), to evaluate the final classification performance. The experimental results are given in Table II. Due to the limited sample size, we also conducted 10-fold cross-validation with 50 iterations, and the results are given in Table III. Furthermore, we also provided the ROC curves in Fig. 2. From these two results, we can observe that an appropriate prior can significantly improve the performance of TIN diagnosis. Specifically, SLR and MR significantly outperform SR, indicating the rationality of the modular prior, since both SLR and MR contain additional modular prior information. Meanwhile, the expert domain knowledge can help to estimate a discriminative FCN. In particular, the proposed MR method achieves superior results in all measures, with an improvement of 26.32% over SLR, confirming the performance gain obtained by utilizing the domain expert knowledge, since MR contains more information about domain knowledge (i.e., the modular participant index). In addition, with DeLong's non-parametric statistical significance test [40], the proposed MR achieves better results than PC, SR, and SLR at the 99% confidence level, with p-values of $2 \times 10^{-5}$, $1 \times 10^{-10}$, and $4 \times 10^{-8}$, respectively. The superior performance of MR further illustrates that the adversarial low-rank constraint is able to improve the classification performance with the guidance of domain modular knowledge. In summary, the results reveal that domain knowledge indeed helps to improve the performance of the estimated FCNs.
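For clarity, accuracy, sensitivity and specificity follow the usual confusion-matrix definitions, sketched below with TIN as the positive class. The function name and the toy labels are hypothetical and unrelated to the reported results.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (TIN recall) and specificity (NC recall)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))      # TIN correctly identified
    tn = np.sum((y_true == 0) & (y_pred == 0))      # NC correctly identified
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / y_true.size,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```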

E. Sensitivity Analysis
Note that regardless of which FCN estimation method is adopted, the final performance can be quite sensitive to various parameters (e.g., $\lambda$ in MR or SR and the threshold in PC). Therefore, building on the above classification experiments, we further provide a sensitivity analysis by reporting the classification results for various $\lambda$ via LOOCV. In Fig. 3, we find that MR achieves the best accuracy (92.98%) when $\lambda = 2^{2}$. In addition, the best performance in terms of sensitivity and specificity was also achieved when $\lambda = 2^{2}$ (95.08% sensitivity and 90.57% specificity). Thus, in practice, we can first train an SVM classifier as a diagnosis machine using the whole-brain networks estimated by MR with $\lambda = 2^{2}$, and then use $\lambda = 2^{2}$ to construct the FCNs of new patients as the model input.

F. Complexity Validation
We also validated the time complexity of the different FCN estimation methods through an empirical analysis. Specifically, we compare the time cost of our method with that of PC, SR and SLR. The environment was an i7-6700HQ CPU with 16 GB memory running MATLAB 2015b. The results are given in Table IV. As we can observe, our method achieves a competitive time cost and is nearly 8 and 43 times faster than the SR and SLR methods, respectively. Thus, our method is both efficient and effective for FCN estimation.

V. DISCUSSION
In the current study, our goal was to determine whether the FCN model could help dissociate chronic TIN patients from NC. We selected discriminative features from the estimated brain connectome to train a classifier for distinguishing participants with TIN from NC. Based on the feature selection and a combination of the proposed methods, we further described the altered patterns of the best distinguishing features of TIN through group comparison, aiming to further clarify the neuropathological mechanism of TIN. In particular, the hub nodes, the discriminative connections analysis and the limitations are discussed below.

A. Discriminative Connections and Degree Analysis
To reveal the biological meaning of the FCNs estimated by the proposed MR, we further investigate the most discriminative edges estimated by MR when $\lambda = 2^{2}$ for identifying TINs from NCs. The top 5‰ (i.e., 70) connections based on the p-value (all smaller than 0.00001) are shown in Fig. 4. The thickness of each arc is inversely proportional to the p-value estimated by the t-test. Our results indicate that the edges in the middle temporal gyrus, median cingulate, paracingulate gyri, middle frontal gyrus and fusiform gyrus tend to be more discriminative, which corresponds to the sub-networks of the default mode network, the fronto-parietal task control network and the auditory network. In particular, for the auditory network, several studies have confirmed that such a network plays a key role in auditory function [9], [41]. In addition, an interesting finding is that most of the connections in the auditory network significantly increase, while most of the connections in the fronto-parietal task control network and memory retrieval network tend to decrease. Therefore, the research based on the proposed MR confirms the abnormality of the auditory network in individuals with TIN, and reveals the abnormality of other brain networks that have not been previously mentioned.
Previous studies have found that, compared with NC, TIN patients have increased neural activity and functional connectivity within the auditory network, which is consistent with the current finding [42], [43]. Moreover, Schmidt et al. detected a strong decrease of the connectivity within the default mode network in TIN patients [5], [44]. However, the source or type of aberrant hub nodes and discriminative connections within the above brain networks remains unclear. Our results indicate that the abnormalities in the hub nodes may be responsible for the disruption of the auditory network and default mode network in individuals with TIN, as shown in Fig. 5.

B. Limitations
Several inevitable shortcomings must be acknowledged in this work. First, the sample size in our work is moderate, which may limit the generalization of our results. Furthermore, subjects inevitably hear some scanner noise, which may affect the metabolic level of the attention network. Finally, we adopted one of the most straightforward feature selectors and classifiers for TIN identification, which limits the achievable performance. These limitations should be taken into consideration in our future research.

VI. CONCLUSION
In this paper, we focused on the modular prior for estimating the FCN and attempted to integrate domain expert knowledge into a unified framework. Specifically, a human-guided functional connectivity network estimation model, namely MR, was proposed, which effectively includes the domain knowledge of modular partitions. The estimated FCNs were then validated via the TIN vs. NC identification task, in which superior results were achieved. It should be noted that modularity is not the only biological prior. In the future, we will further focus on the discovery of more domain information of other priors, such as hierarchy, to improve the accuracy of TIN diagnosis. Finally, we want to emphasize that more advanced classifiers, e.g., universum SVM and GNN, can also be considered to further boost the performance of TIN diagnosis, which will also be addressed in our future work.