A Hypothesis Testing for Large Weighted Networks With Applications to Functional Neuroimaging Data

Neuroimaging techniques have been routinely applied in various studies in neuroscience, which contribute to providing novel insights into brain functions. One of the most important and challenging questions related to data collected from such studies is hypothesis testing for the differences between two samples of networks of brain regions. This is due to the fact that networks constructed from neuroimaging studies, which can be weighted and large, are very complex. Focusing on this problem, a novel hypothesis testing procedure is proposed under a general framework for large weighted networks. The asymptotic null distribution is derived and the power guarantee is also provided theoretically. Simulation experiments and practical application to the real brain networks are carried out to demonstrate the effectiveness of the proposed testing method.


I. INTRODUCTION
The past decade has witnessed the development of different techniques for analyzing functional neuroimaging data, especially in the study of human brain's structural and functional systems [1]- [3]. Particularly, there is a strong need to develop neuroimaging-based models to improve our understanding of a diverse span of topics such as cognitive science and mental functions. Recent views in neuroscience claim a variety of implementations via brain networks which comprise different brain regions [4]- [8]. In this setting, for each individual such as a patient, a network is formed by using brain regions of interest (ROIs) as nodes, which are shared across all networks and predefined anatomically. A measure of association between two nodes is represented by the edge between the corresponding ROIs. The way to define this connectivity varies widely with imaging collecting methods. For example, in functional magnetic resonance imaging (fMRI), The associate editor coordinating the review of this manuscript and approving it for publication was Lin Wang .
it is a functional statistical association between each pair of ROIs [9].
The recent advancement in technology and computer prowess has led to a number of important research initiatives in neuroscience, such as Northwestern University Schizophrenia Data and Software Tool (NUSDAST; available at http://schizconnect.org/), International Neuroimaging Data-sharing Initiative (INDI; available at http://fcon_1000.projects.nitrc.org/) [10], [11], and the Child and Adolescent NeuroDevelopment Initiative (CANDI, available at https://www.nitrc.org/). These initiatives make available fMRI datasets collected from around the globe and provide us with the opportunity to perform analysis on many biological studies.
Statistical tasks with brain networks constructed from functional neuroimaging data have received a lot of attentions, such as pattern-based classification, disease states prediction, and performance qualification [4], [12]- [14]. One of the most pressing and important practical problems is hypothesis testing for differences in networks. It covers a VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ variety of interesting applications including for instance, the differences in brain structures between networks of males and females [15], and topological changes in gene regulatory networks coming from two kinds of treatments of breast cancer [16]. However, hypothesis testing for populations of networks remains largely understudied especially for large networks (having large number of nodes), which is an increasingly common trend [17]- [20].
In this paper, motivated by fMRI data from INDI, we investigate the differences between two populations of interest, e.g., schizophrenics (SCZs) and healthy controls (HCs). To attain this goal, we need to develop a statistical test of hypothesis for two samples of networks. The main challenge is that networks are non-traditional forms and are not Euclidean objects, which makes extension of traditional testing methods hard. Focusing on this problem, [15], [21]- [23] have carried out some new hypothesis tests. Nevertheless, these methods are restricted to binary or random dot product networks, which are not suitable for many situations in practice since many real networks are weighted [15], [24]- [26]. Moreover, most studies above are limited to the assumption of fixed nodes number and large sample size except [22]. For example, the statistic in [15] is constructed by using half-vectorization of the graph Laplacian matrices and the inverse of the corresponding pooled covariance matrix, which is not an easy task especially when the nodes number n is large. This can be seen from a case of n = 200, the dimension of the half-vectorization based covariance matrix would be as large as 20100, thus posing challenges for estimating this pooled covariance matrix and its inverse.
The main purpose of this paper is to analyze the differences between networks from neuroimaging data, which therefore can provide diagnostic suggestions. To address this issue, a hypothesis test for two samples of weighted and large networks is developed. Specifically, we expect to investigate asymptotic null distribution and asymptotic power under the alternative theoretically, and further evaluate the proposed method via simulation studies and real functional neuroimaging data studies. We first propose a novel testing statistic based on the singular value. Then, we prove that this statistic asymptotically converges to Tracy-Widom distribution under the null hypothesis and the asymptotic power tends to 1 under the alternative hypothesis.
The main advantage of our approach is that our proposed test adopts a very general framework, allowing the connectivity to be weighted and the number of the nodes to grow to infinity, instead of requiring binary networks or fixed nodes number like in most of the existing methods. Moreover, although our main application in this paper is two-sample test of brain connectivity networks, our method is applicable to any other weighted or unweighted binary networks.
The main contributions of this work can be summarized as follows: (1) A novel two-sample test for populations of networks is developed. Due to the generality of the connectivity structure on the networks, the proposed test can be applied to both weighted and binary large-scale network scenarios. (2) Asymptotic null distribution and power are obtained theoretically. The new statistic is proved to have asymptotic Tracy-Widom distribution under the null hypothesis and full power against alternative models as the nodes number tends to infinity. (3) The performance of the proposed test is assessed by comparing to the current state-of-the-art methods in simulations and application to real human brain data. This paper is organized as follows. In Section II, we propose a testing statistic and analyze its asymptotic properties. Simulation studies are presented in Section III. In Section IV, we illustrate our proposed method with a brain connectivity dataset of SCZs and HCs. Section V gives some conclusions.

A. NOTATIONS
Throughout this paper, we use lightface letters, boldface lowercase letters, boldface uppercase letters and lightface uppercase letters to denote scalars, vectors, matrices and distributions respectively.
The notation R n represents the set of all n-dimensional real column vectors. The symbol · is the Euclidean norm for vector and spectral norm for matrix. For an n × n matrix A, A ij denotes its (i, j) entry. For an n × n symmetric matrix A, λ j (A) denotes its jth largest (real) eigenvalue, ordered as λ 1 (A) ≥ λ 2 (A) ≥ · · · ≥ λ n (A), s 1 (A) is the largest singular value. For two matrices A and B, A • B denotes the Hadamard product of A and B.
The operations P{·} and E[·] denote the probability of some random event and mathematical expectation of random variable respectively. The notation TW(1) represents Tracy-Widom distribution with index 1. Write x n x if a sequence of random variables {x n } converges in distribution to random variable x as n tends to infinity.
For two sequences of real numbers {a n } and {b n }, b n = O n (a n ) means that there exists a constant c such that b n /a n → c as n → ∞; b n = o n (a n ) means b n /a n → 0 as n → ∞; b n = O p (a n ) means for any ε > 0, there exist finite c > 0 and n 0 > 0 such that P{|b n /a n | > c} < ε for any n > n 0 ; and b n = o p (a n ) means P{|b n /a n | ≥ ε} → 0 as n → ∞ for any positive ε.

II. TWO-SAMPLE TEST FOR LARGE WEIGHTED NETWORKS A. PROBLEM FORMULATION
Let F 1 = {F 1,i j } 1≤i<j≤n and F 2 = {F 2,i j } 1≤i<j≤n be two sequences of distributions. For two samples of weighted networks with n nodes, we assume that A 2,i j ∼ F 2,i j , k 2 = 1, 2, . . . , m 2 , 1 ≤ i < j ≤ n. In this paper, {F 1,i j } 1≤i<j≤n and {F 2,i j } 1≤i<j≤n are assumed to be defined on bounded intervals and decided by some parameters. We consider networks that are undirected and with no self-loops, hence A (k 1 ) We formulate the problem of distinguishing two groups of weighted networks as the following two-sample hypothesis testing: (1)

B. SINGULAR VALUE BASED TEST
For any k 1 = 1, 2, . . . , m 1 , k 2 = 1, 2, . . . , m 2 , i, j = 1, 2, . . . , n, and i = j, denote σ 1,i j > 0 and σ 2,i j > 0 as the standard deviations of A 2,i j respectively. In addition,σ 1,i j > 0 andσ 2,i j > 0 are some estimators of σ 1,i j and σ 2,i j respectively, i = j. We first introduce a centralized and re-scaled symmetric matrix V with zeros on diagonal and entries for 1 ≤ i < j ≤ n as u,i j , u = 1, 2. We propose the following singular value based test statistic: Moreover, the rejection rule for testing problem (1) under significance of α ∈ (0, 1) is where τ α/2 is the α/2 upper quantile of TW(1). Define a symmetric n × n matrix V 0 with zero diagonal entries and other elements for 1 ≤ i < j ≤ n as Extending the proof of Theorem 3 in [22], we have the following lemma.
Lemma 1: For V 0 defined by (5), we have We now have the following theorem on the asymptotic null distribution of the test.
Theorem 1: For any u = 1, 2, and i, j = 1, 2, . . . , n, if max (1), then under the null hypothesis H 0 in (1), we have Proof: Under the null hypothesis, σ 1,i j = σ 2,i j , and it is observed that Without loss of generality, assumeσ 1,i j ≤σ 2,i j . According to the condition on the errors of standard deviations, we have the following for the numerator in (10), where the last inequality comes from as follows. Obviously, there exist a constant c > 0 and n 0 > 0 such that for any n > n 0 , for any ε > 0 and n > n 0 , we have From (11) and (13), Combining (14) with (10), we obtain where M is an n × n matrix with entries M ij = o p (n −7/6 ). In addition, from [27], V 0 = O p (1). Then, from the definition of spectral norm, Therefore, From (17) and Lemma 1, we immediately have (8). The asymptotic distribution given by (9) is similar. Note that Theorem 1 is rather general and has great practical value in real applications with the advantage of imposing almost no restrictions on the network structure (undirected and with no self-loops). Moreover, the order of the error condition in the theorem seems to be a little strict, but there still exist some appropriate estimators of σ u,i j and we will discuss some standard estimators and compare their properties in the later discussions.
Next, we analyze the size and power for the proposed rejection rule (4). From we can conclude that the rejection rule given by (4) has size α. Define a symmetric n × n matrixṼ with diagonal elements 0 and for any 1 ≤ i < j ≤ n, where u,i j ], k u = 1, 2, . . . , m u , u = 1, 2. The asymptotic power can be presented in the theorem below.
Theorem 2: Under the same assumptions in Theorem 1, Proof: Define a symmetric n × n matrix W with zero diagonal and entries for all 1 ≤ i < j ≤ n as Recall the definitions of V , V 0 andṼ given by (2), (5) and (19) respectively, from (15), it is observed that Thus The last equation implies where, J is the n × n matrix with all the elements equal to 1, D is an n×n matrix with elements D ij = o p (n −2/3 ). Similarly to the proof of Theorem 1, we have s 1 (Ṽ • (J + D)) = s 1 (Ṽ ) and s 1 (W • (J + D)) = s 1 (W ) with probability 1 as n → ∞.
Applying the triangle inequality of spectral norm, we have with probability 1 as n tends to infinity. Noting that W is a mean zero matrix whose singular value can be bounded by using the TW(1) asymptotic distribution. Hence, for any β ∈ (0, 1), Set τ β = n 2/3 (s 1 (Ṽ ) − 4) − τ α/2 , and plug this in (26), then we have If n −2/3 (s 1 (Ṽ )−4) −1 ≤ o n (1), for a fixed α ∈ (0, 1), we have τ −1 β = o n (1), that is β = o n (1). Therefore, This theorem thus holds. Remark 1: It is noticeable that a test between two large graphs for m 1 = m 2 = 1 has been discussed in [22], in which the test statistic also has the TW(1) asymptotic distribution. However, rather than considering single network in each population, our testing statistic focuses on two-sample test on two populations of multiple networks, which has become increasingly more common in practice. Besides, [22] proves the asymptotic Tracy-Widom law under the binary random network setting, whereas in this paper, we are interested in weighted networks, and simultaneously can yield a two-sample test for binary networks when {F 1,i j } 1≤i<j≤n and {F 2,i j } 1≤i<j≤n are Bernoulli distributions, which will be discussed later. Finally, our theoretical results are based on appropriate estimates of the element-wise standard deviations instead of the unknown true parameters presented in Theorem 3 in [22]. Moreover, compared with the method in [15], which is based on geometric characterization between networks and involves the inverse of covariance matrix, our method is easier to compute.

C. COMPARISON OF DIFFERENT ESTIMATORS OF STANDARD DEVIATION
Various estimators for standard deviations of adjacency matrix entries can be used as the plugging estimates in the test statistic and we will analyze their properties in this subsection.
The most natural and simple estimator is the sample standard deviation applying to each sample group. We denote our proposed test based on sample standard deviation as SAMPT and the corresponding estimator byσ SAMPT,u,i j . Next, we consider its estimate error.
Define two n × n matrices G 1 and G 2 , with elements G u,i j = E A (k u ) u,i j 2 for u = 1, 2, then where the second to last equality comes from Bernstein's inequality [28] and the last equality is due to the boundedness of distributions {F u,i j } 1≤i<i≤n in our framework. It is observed thatσ SAMPT,u,i j and σ u,i j are bounded, so |σ SAMPT,u,i j + σ u,i j | is also bounded, then max It is noted that the SAMPT estimator is more suitable for cases of large sample size. Specifically, to satisfy the estimate error condition in Theorem 1, the sample size needs to be m u = O n (n α u ) and α u > 7/3. This implies a faster increasing rate of m u than that of nodes number n. However, this is hard to hold in some scenarios like for fMRI networks. Actually, as pointed out in [29], it can be hard to obtain adequate number of samples in some neuroimaging studies. Now, we consider an estimator of σ u,i j under the stochastic block model (SBM), which endows the network with planted clusters [30]. This estimate method is inspired by [22]. First, assume the networks are SBMs, or approximate them with SBMs according to a weaker version of Szemerédi's regularity lemma [31]. Then perform the community detection algorithms proposed in [32] to obtain the estimated membership vector v u ∈ {1, . . . ,l u } n and the community set D u,l = {i|1 ≤ i ≤ n, v u,i = l}, wherel u is the estimated community number, l = 1, 2, . . . ,l u and v u,i is the ith element of v u . Then σ u,i j is consequently approximated by the sample standard deviation of all the entries A (k u ) u,i j for i ∈ D u,v u,i , j ∈ D u,v u,j , and k u = 1, 2, . . . , m u . We refer our proposed test under this SBM-based standard deviation estimate as SBMT.
If we assume that each community number is at least proportional to n/l u , where l u is the true community size, then similarly with SAMPT method, max i,j |σ SBMT,u,i j − σ u,i j | = O p (l u m −1/2 u n −1 ). This implies that when m u = O n (n α u ), l u = O n (n γ u ), 0 < γ u < α u /2 − 1/6, the estimate error condition in Theorem 1 can be satisfied. This condition further means α u > 1/3 + 2γ u and is much less restrictive than that in SAMPT method especially when γ u is controlled to be relatively small. Hence, the SAMPT method can generally achieve better performance, which is also demonstrated in simulations in Section III.
Although the theoretical results are based on weighted networks, our proposed method can also be applied to unweighted binary networks, where F u,i j = Bernoulli(P u,i j ), with P u the link probability matrix of the uth group and hence A (k u ) u,i j ∼ Bernoulli(P u,i j ). For this case, we only need to estimate every mean parameter P u,i j instead of standard deviation σ u,i j since σ 2 u,i j = P u,i j (1−P u,i j ). In a similar way, one can adopt the two estimate methods mentioned above but in terms of sample mean rather than sample standard deviation, and both of the properties remain the same. On the other hand, under the binary framework, [33] proposes a modified neighborhood smoothing (MNBS) method to estimate P u,i j . Specifically, for each node, that paper defines its own neighborhood by some distance measure between nodes and employ the average network information restricted in that neighborhood. We refer this method as MNBS-based test (MNBST).

III. SIMULATION STUDY
In this section, we evaluate the performance of the proposed TW(1)-type tests based on different standard deviation and mean value estimators. We compare SAMPT and SBMT with the test based on T 2 in [15] using geometric characterization of the space of graph (namely GEOT) for weighted network scenarios, and compare SAMPT, SBMT, and MNBST with the test based on T fro in [22] involving some estimate of Frobenius distance between two network distributions (namely FROT) for binary network scenarios. For the consideration of covariance matrix inverse in [15], we only implement relatively small n for GEOT in our simulations. To conduct simulations related to MNBS for binary network cases, a tuning parameter in [33] is required to be specified and we set it to be 3 as default. Moreover, the significance level is set at α = 0.05 throughout this section. The testing performance is evaluated by the empirical significance level and empirical power through 5000 Monte Carlo simulations.
To explore the performance of the proposed test on large-scale networks, we consider the second experiment, which is similar with the first one but with different parameters. We vary n growing from 100 to 1000 with step 100, m = 20, 30, x 1 = 2, x 2 = 4, y 1 = 1, y 2 = 8, and set η as 0 and 0.04 for the null hypothesis and the alternative hypothesis respectively. The results are shown in Figure 2.
The third experiment is for binary networks. We create networks with two communities from Bernoulli distributions. The settings of n, m, and membership for each node are the same as those in the second experiment. Under the null hypothesis, for 1 ≤ i < j ≤ n, k u = 1, 2, . . . , m u , u = 1, 2, A 2,i j ∼ Bernoulli(0.7 + η), η = 0.05, 1 ≤ i < j ≤ n/2. The corresponding results are plotted in Figure 3.
The comparison results of the three experiments in Figures 1-3 reveal a superior performance of SBMT than that of SAMPT, GEOT and FROT in both significance level and power for all the cases and also the effectiveness of MNBST in binary setting cases. Specifically, for weighted network cases, Figure 1 and Figure 2 show the undesirable behaviors of GEOT and SAMPT since with increasing number of nodes n, the empirical significance levels of both tests grow quickly away from the significance level of α = 0.05 and can become too large to be used in practice. This is due to that both GEOT and SAMPT require large sample sizes compared with the number of nodes. Figure 2 further shows that the inflated null rejection can be alleviated with larger sample sizes. It is also noted that the powers of GEOT and SAMPT are larger than those of SBMT, but with small differences. In contrast, the performance of SBMT is much better, the empirical significance levels are stable and smaller than α = 0.05, and the powers also improve to 1 as n grows. As for unweighted network situations shown in Figure 3, SBMT and MNBST both perform well with empirical significance levels close to 0.05 and empirical powers increasing to 1 fast, while SAMPT suffers a high type I error and FROT has a large type II error with powers to be zero. The bad performance of FROT in power is partly caused by the minor differences between {A (k 1 ) we set, and the poor behaviors of GEOT and SAMPT are also due to the strong requirement of large sample size compared with n. In addition, SBMT and MNBST fully utilize the inherent connections of networks and thus are more efficient even for large-scale networks.
Overall, for the tests based on three kinds of methods we propose to estimate standard deviation, SAMPT is more suitable for cases of large sample size, whereas SBMT is more robust, especially when the networks are SBM constructed, and MNBST also generally has good performance, but only can be applied to unweighted networks.

IV. SCHIZOPHRENIA DATA ANALYSIS
Schizophrenia is a complex mental illness characterized by abnormal brain function and is difficult to diagnose [34], [35]. Therefore, it is of great interest to compare SCZ and HC groups and automatically test whether there are significant differences between these two groups. A diagnostic system aided by hypothesis testing between their neuroimaging data would lead to a more objective approach to distinct schizophrenia and healthy people.
Consider a dataset contributed by the Centers of Biomedical Research Excellence (COBRE) included in INDI. It contains raw anatomical and functional scans from 147 subjects of 72 SCZs and 75 HCs, and can be downloaded from the public database (http://fcon_1000.projects.nitrc.org/indi/retro/ cobre.html). This connectomics dataset has been processed in [36]. A set of 264 ROIs from the predefined parcellation in [37] are used as nodes to construct the brain networks from fMRI measurements. These ROIs are derived through several meta-analyses of task fMRI data and with the fc-Mapping technique applied to rs-fcMRI data [37], [38]. This ROI definition has been widely used in neuroimaging studies because it is functional based and can provide better validity of networks than structural atlases or arbitrarily defined sampling grids [39]- [41]. The main pre-processing steps for all the scans sequentially contain slice time correction, anatomic normalization, frame-wise displacement analysis, extraction of cerebral spinal volume and white matter compartments, and nuisance regression to the time series of fMRI activity. Finally, the edge weights are measured by the Pearson r-values between the time series in corresponding ROIs, which is a common method to measure functional connectivity [42], [43]. More pre-processing and registration details can be found from supplemental materials in [36].
After a series of pre-processing procedures, 263 ROIs are chosen as nodes since node 75 in COBRE is missing, 54 SCZ and 70 HC subjects are kept and time course adjacency matrices of size 263 × 263 are extracted. Figure 4 shows the mean connectivity networks over SCZ and HC groups, which initially visualizes the differences between them. In Figure 4, for convenience of visualization, the values of the Fisher z-transformed correlations between the nodes have been binarized with respect to the 75th percentile of the overall distribution of the entries in the full COBRE dataset. It can be seen from Figure 4 that there are differences between SCZ and HC groups, but not very distinct. Now we test for health differences between two groups of brain networks extracted from the COBRE data using the proposed TW(1)-type two-sample test. Using the data preprocessed in [36], which are weighted networks, we analyze the graphical properties of these brain functional networks from two aspects. First, we test health differences between SCZs and HCs. Then, we assess the ability of our test to get objects into the same category if they are in the same group (here we choose the group HC), which is helpful for automatic diagnosis. We do not compare our method with GEOT in this case because by half-vectorization of the Laplacian matrix, the dimension of the covariance matrix in the statistic in [15] reaches 34716 which is so large and difficult to be implemented in practice. In addition, the MNBST and FROT methods do not work in this case since they are designed for binary networks and are not applied to the weighted networks here.
In the first situation, the health differences between SCZ and HC groups are tested by SAMPT and SBMT and they both reject the null hypothesis with zero p-value. This implies significant differences between brain networks of SCZs and HCs. In the second situation, we randomly subsample 60 subjects from HCs and compare them with the original HC data. As expected, both tests fail to reject the null hypothesis under significant level of 0.05, since the samples are extracted from the same population. However, when we decrease the subsample size to 50, SBMT still fail to reject the null hypothesis while SAMPT shows a reject result. This confirms that good standard deviation estimators of adjacency entry standard deviations contribute to better testing performance.

V. CONCLUSION
Neuroimaging-based model plays a vital role in increasing our understanding of many clinical conditions. Motivated by the applications to neuroimaging, we have investigated hypothesis testing on whether two populations of weighted networks are from the same distribution. In this paper, we have proposed a novel singular value based hypothesis testing statistic which simultaneously can tackle weighted and large-scale networks. The asymptotic distribution of the null statistic and asymptotic power under alternative both have been derived. The test statistic utilizes some plugin estimates for standard deviations of adjacency matrix entries, and properties of the resulting tests with various estimates have been discussed theoretically and numerically with both simulated and real brain network data. Moreover, the computation is easy, making it strongly practicable. From the example results, the proposed TW(1)-type tests based on SBM and MNBS estimates perform well.
In this paper, our hypothesis testing is based on the asymptotic extreme eigenvalue behaviors for the adjacency matrices of undirected networks with no self-loops. Although our setting is very general compared to the existing literature [15], [21]- [23], functional neuroimaging can be much more sophisticated than the setting in this work. For instance, networks from these data can have self-loops [44], [45]. Besides, directed influence of brain regions on each other can provide effective functional connectivity, however, neuroimaging investigations on the directed networks are still not studied thoroughly because of the technical constraints surrounding the inference [46]- [48]. Such complex networks can capture deeper connectivity information than normal networks. Therefore, in the future work, it is worth extending the hypothesis testing to networks with self-loops and directed structures. Actually, in this work, networks including self-loops can be tackled by removing the assumption A (k 1 ) 1,i i = A (k 2 ) 2,i i = 0 and modifying the scale factor n − 1 to n in the statistic (2) and Theorem 2 would hold for this case. For directed network, maybe we need to build a new test under this framework.

ACKNOWLEDGMENT
This work was initialized while Li Chen visited at University of Notre Dame. VOLUME 8, 2020