Abnormal Brain Regions in Two-Group Cross-Location Dynamics Model of Autism

Resting-state fMRI studies have suggested that autism spectrum disorder (ASD) is associated with aberrant dynamic changes in brain activity. However, existing approaches either have difficulty capturing the brain’s dynamic characteristics or fail to yield stable results. We examined a ‘two-group cross-location hidden Markov model’ of each region of interest (ROI) to identify possible pathogenic features of ASD. Specifically, we selected resting-state fMRI data with complete scales and good quality from the Autism Brain Imaging Data Exchange (ABIDE I). Eligible data included 145 ASD and 157 control (CON) subjects. The two groups were used separately to train hidden Markov models representing the respective populations. Then, we used each model to estimate the likelihood values of all participants. Using the likelihood values as features, we tested 200 ROIs for significant group differences and identified the ROIs that differed significantly under both types of models. Additionally, we investigated the relationship between the likelihood values of the significantly different ROIs and clinical scales. Some ROIs were negatively correlated with the Autism Diagnostic Observation Schedule and positively correlated with full IQ. Finally, we constructed a support vector machine to classify ASD and CON. The average accuracy reached 74.9% after ten cross-validations. Overall, our findings suggest that abnormalities in the frontopolar area, orbitofrontal area, inferior temporal gyrus, middle temporal gyrus and fusiform gyrus are prominent features of ASD and are closely related to clinical functional decline. This ‘two-group cross-localized Hidden Markov Model’ provides a robust and powerful framework for understanding the dysfunctional brain architecture of ASD.


I. INTRODUCTION
Autism spectrum disorder (ASD) is a typical disorder of brain development. ASD occurs during a critical period of nervous system refinement that includes nerve cell proliferation, synapse formation and functional maturation of various regions. Developmental disorders lead to differences at the macro-anatomical level of the cerebral cortex in infants and young children [1]. ASD is characterized by a collection of behavioral abnormalities such as difficulties with social interactions and verbal and nonverbal communication, repetitive behaviors, and a number of comorbid conditions [2], [3]. Although the clinical cause of autism is unclear, studies have shown structural and functional brain abnormalities. From a structural perspective, Khundrakpam et al. found widespread increased cortical thickness in ASD, primarily left lateralized, with differences decreasing gradually during adulthood [3]. Diffusion tensor imaging studies show that individuals with ASD have more densely packed columns of neuronal cells than typically developing individuals [4]. T1-weighted MRI studies have reported a significantly reduced average size of the posterior subregions of the corpus callosum and abnormal connections of the limbic-striatal region, which is the social brain system, in autistic individuals [5]-[7]. Over the years, the development of functional magnetic resonance imaging (fMRI), which measures intrinsic neural activity based on blood oxygen level-dependent (BOLD) signals, has provided a versatile tool for investigating the functional mechanisms underlying cognitive dysfunction. Compared to other imaging methods, fMRI has the advantages of being non-invasive and of offering relatively good temporal and spatial resolution [8].
The associate editor coordinating the review of this manuscript and approving it for publication was Mohammad Zia Ur Rahman.
Subjects do not need to perform cognitive tasks during the collection of resting-state fMRI (rs-fMRI) data, so this type of imaging can reflect the functional organization of the brain [9] and the hemodynamic changes caused by disease [10], [11]. The most common method of rs-fMRI data analysis uses correlation [12]-[15], partial correlation [16], [17], or sparse regression [18], [19] to build a brain functional connectivity (FC) network, which is a measure of the synchronous activation of spatially separated brain regions [20], [21]. Based on FC, some connections in the autistic brain have been found to differ from those of healthy people. Most of the reported underconnectivity in autism is in specific brain regions or networks [22], [23], such as the default mode network, which demonstrates a consistent pattern of deactivation across a network of brain regions that includes the precuneus/posterior cingulate cortex (PCC), medial prefrontal cortex (MPFC) and medial, lateral and inferior parietal cortex, occurring during the initiation of task-related activity [21].
However, constructing a single brain network overcompresses the information on the time scale and cannot accurately reflect the internal dynamics of the brain. Research has shown that the state of the brain is a dynamic process that changes over time [24]. Many attempts have been made to find other methods to describe the dynamic brain, such as calculating multiple brain connection networks based on sliding windows [25]-[27]. The limitation of sliding windows is that the size of the window is usually determined empirically. If the window is too long, it is difficult to estimate the dynamic characteristics of brain activity. If the window is too short, the number of observations is insufficient [28].
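The sliding-window approach discussed above can be illustrated with a minimal Python sketch that computes a windowed Pearson correlation between two ROI signals. The function names, the toy signals, and the window and step sizes are assumptions chosen for illustration and are not taken from the cited studies.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_window_fc(x, y, window, step=1):
    """Correlation between two ROI signals inside each window of `window` samples."""
    return [pearson(x[s:s + window], y[s:s + window])
            for s in range(0, len(x) - window + 1, step)]


# Toy signals (hypothetical): in phase for the first 20 samples, anti-phase afterwards.
roi_a = [math.sin(0.5 * i) for i in range(40)]
roi_b = [math.sin(0.5 * i) if i < 20 else -math.sin(0.5 * i) for i in range(40)]

# Four non-overlapping windows of 10 samples each.
fc = sliding_window_fc(roi_a, roi_b, window=10, step=10)
```

With a window of 10 samples, the sign flip halfway through the toy series shows up as a change in the windowed correlation, whereas a single correlation over the whole series would mix the two regimes. The choice of `window` remains empirical, which is the limitation noted above.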
To avoid possible disadvantages in window length selection, some researchers have begun to use dynamic Bayesian networks, a general framework of probabilistic models that is often used to deal with time-varying signals with good results [29]. One effective method is the Hidden Markov Model (HMM), which describes the dynamic state switching process of the brain as a Markov chain with different transition probabilities between states [30], [31]. Owing to its many favorable characteristics, the HMM has been applied to the study of a variety of clinical diseases, including cancer [32], Alzheimer disease (AD), Huntington disease, Parkinson disease [29] and ASD [8]. Therefore, we consider the HMM to be a good model, with good generalization ability, for measuring the dynamic changes of the brain over time.
In this article, we use ASD and control (CON) subjects to train separate HMMs for each region of interest (ROI) and then use the constructed models to calculate the occurrence probability of each ROI's fMRI signal, called the goodness of fit or likelihood value. Statistical analysis is used to locate areas where the dynamic state differs significantly between autistic patients and normal subjects in the two models and to discuss the clinical significance of these differential ROIs. We also use the likelihood values of the selected ROIs to build a classifier. The contributions of our work can be summarized in three aspects. First, data filtering is based on multisite datasets, so the results are more scalable. Second, more representative models are trained for the two groups of people, and then the intersection of the areas with significant differences is chosen, so the results are more robust. Finally, the brain partitions we use are more detailed and can more accurately locate the abnormal brain areas of ASD. Our research can provide useful information for clinical studies of autism. In addition, the data of the characteristic regions we identify can be used to construct a classifier, which is expected to support an accurate diagnosis of autism.
The remainder of this article is structured as follows: In the methods section, we introduce the exclusion criteria and demographic information for the data. Then, we introduce the data processing and analysis process, including data preprocessing, model construction, and classifier construction. In the results section, we summarize the abnormal brain regions exhibited by autistic patients, their relationship to clinical scales, and the performance of the classifier. The discussion section discusses the significance of the findings, the shortcomings and future perspectives.

II. MATERIALS AND METHOD

A. SUBJECTS
The present study downloaded rs-fMRI time series and acquisition information for samples from 16 international imaging sites that had aggregated and were openly sharing neuroimaging data from 539 individuals affected by ASD and 573 controls (CON), available in ABIDE (http://fcon_1000.projects.nitrc.org/indi/abide) [7]. After a series of screenings of all datasets, 302 subjects were chosen. The selected data consisted of 145 ASD and 157 CON subjects. The demographics and clinical characteristics are summarized in TABLE 1. Full IQ (FIQ) was measured using the Wechsler Adult Intelligence Scale, and the FIQ of CON was significantly higher than that of ASD (p<0.01). The Autism Diagnostic Observation Schedule (ADOS) was measured only in ASD. Data screening criteria are shown in FIGURE 1.

B. PREPROCESSING
FIGURE 1. Data screening criteria.
1. Manual quality assessments of the data by three raters are provided in the columns with the prefix qc_ in the phenotypic file downloaded from the site. To obtain high-quality data, we excluded a subject if any rater's evaluation of the functional data was "Failed".
2. Keep only data with a repetition time (TR) of 2000 ms. Using a single TR ensured that the time series followed the same standard. Data with TR equal to 2000 ms constituted the majority of all data.
3. Exclude subjects older than 35 years. Some research suggested that the difference in the brain between ASD and CON subjects was small over the age of 35.
4. Exclude subjects who lacked the WASI-FIQ and patients who lacked the ADOS. Data with incomplete clinical scales were removed for the purpose of subsequent analysis.
5. Exclude female data. A chi-square test on the sex of the ASD group and the CON group showed a significant difference between the two groups. Female data were much less frequent than male data, so all female data were excluded.

In this research, we preprocessed the datasets with the Configurable Pipeline for the Analysis of Connectomes (C-PAC, http://fcp-indi.github.com). This Python-based pipeline tool makes use of AFNI, ANTs, FSL, and custom Python code. The preprocessing steps include slice-timing correction, motion correction to the average image, skull stripping, global mean intensity normalization to 10,000 and nuisance signal regression. Band-pass filtering (0.01-0.1 Hz) was applied only for one set of strategies. Functional images were registered linearly to anatomical space and were normalized to Montreal Neurological Institute (MNI) 152 stereotactic space (1 mm³ isotropic) with linear and non-linear registrations. The regressed rs-fMRI images were parcellated into 200 ROIs in the cortical regions, and then the mean across the voxels within each ROI was computed. Finally, we obtained a 200-dimensional vector sequence for each subject [33].
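The ROI averaging step described above can be sketched as follows. This is a minimal pure-Python illustration, not the C-PAC implementation: the function name, the flat list-of-voxels layout, and the convention that label 0 marks voxels outside the atlas are all assumptions made for the example.

```python
def roi_mean_timeseries(voxel_ts, labels, n_rois):
    """Average voxel time series within each ROI.

    voxel_ts: list of per-voxel time series (all of equal length).
    labels:   ROI index per voxel, 1..n_rois; 0 marks voxels outside the atlas.
    Returns a list of n_rois mean time series.
    """
    n_time = len(voxel_ts[0])
    sums = [[0.0] * n_time for _ in range(n_rois)]
    counts = [0] * n_rois
    for ts, lab in zip(voxel_ts, labels):
        if lab == 0:  # skip voxels that belong to no ROI
            continue
        counts[lab - 1] += 1
        for t, value in enumerate(ts):
            sums[lab - 1][t] += value
    if 0 in counts:
        raise ValueError("at least one ROI contains no voxels")
    return [[s / c for s in row] for row, c in zip(sums, counts)]


# Toy example (hypothetical): four voxels, two time points, two ROIs.
voxels = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0], [5.0, 5.0]]
labels = [1, 1, 2, 0]
roi_ts = roi_mean_timeseries(voxels, labels, n_rois=2)
```

With a 200-ROI atlas, the same call would return 200 mean time series per subject, which corresponds to the 200-dimensional vector sequence mentioned above.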

C. HMM
Some studies have suggested that the brain switches from one state to another. An HMM is a probabilistic model that infers unknown hidden states from observed data.
In this section, we will describe how to model the brain time series using an HMM framework. We will use the data of ASD and CON to construct HMMs of their populations separately (ASD MODEL and CON MODEL) and use the trained model to calculate the likelihood of all participants (FIGURE 2).
Assume that the number of subjects used to train a model is R (all ASD or all CON). Each subject's data have D dimensions (in fMRI data, D can be the number of voxels, regions of interest (ROIs) or components; here, ROIs), each with a time series of length E. Denote the time series data as $\chi = \{X^{(1)}, \dots, X^{(R)}\}$. In this work, we adopt the hypotheses that (i) the temporal BOLD fluctuations of the ROIs, that is to say, $X_{ROI}^{d} = (x_1^d, \dots, x_E^d)$ for $d = 1, \dots, D$, have their own dynamic patterns, and (ii) there may be significant differences in temporal dynamic models between ASD and CON. Based on these hypotheses, we model region-wise temporal dynamics with HMMs. It is assumed that the hidden states change over time with a certain probability, called the state transition probability. In the transition matrix $A = [a_{ij}]$, each $a_{ij}$ is the probability of moving from state i to state j. Moreover, the initial state distribution is $\pi = [\pi_i]$, in which each $\pi_i$ is the probability that the chain is in hidden state i at the initial time. We denote the set of HMMs as $S = \{S_{ROI}^{d}\}_{d=1}^{D}$, where $S_{ROI}^{d}$ is the HMM of the dth ROI, whose hidden states come from a discrete state set of size K. For the dth ROI, the hidden state at time e is denoted as $z_e^d \in \{1, \dots, K\}$.

FIGURE 2. Description of training the hidden Markov models and calculating the likelihood values. First, we used the data of all ASD subjects (145) to train the HMM (ASD MODEL) of each ROI. The likelihood value of all subjects, including ASD and CON, was estimated by the trained model of the corresponding ROI. Then, a rank sum test was performed on the likelihood values of each ROI, and the test values were subjected to FDR correction. Second, we replaced the training data with all CON subjects (157) (CON MODEL) and repeated the previous steps. Finally, the ROIs that showed significant differences in both tests were selected.

The observations in different hidden states obey Gaussian distributions that share the same functional form but have unequal means and variances, i.e., $p(x_e^d \mid z_e^d = k) = \mathcal{N}(x_e^d; \mu_k, \varepsilon_k)$. Combining $p(x_e^d \mid z_e^d = k)$ with the transition probabilities, the likelihood of the dth ROI is written as:

$$v = p(X_{ROI}^{d} \mid \theta, \phi) = \sum_{z_1^d, \dots, z_E^d} \pi_{z_1^d} \prod_{e=2}^{E} a_{z_{e-1}^d z_e^d} \prod_{e=1}^{E} p(x_e^d \mid z_e^d)$$

Maximizing this probability gives the maximum likelihood estimate (MLE) of the model parameters $\theta = \{\pi, A\}$ and $\phi = \{\mu_k, \varepsilon_k\}_{k=1}^{K}$. The detailed construction process of the HMM is shown in FIGURE 3. The HMM can be fitted with the expectation-maximization (EM) algorithm, which is a common method for maximum likelihood estimation in probabilistic models. Since the likelihood value is very small, we take its natural logarithm. Then, to eliminate the effect of differing lengths of the rs-fMRI BOLD signals, the log-likelihood is scaled by the length of the time series E. Therefore, with v denoting the original likelihood value, the final likelihood value L is expressed as follows:

$$L = \frac{\ln v}{E}$$
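Once the HMM parameters are known, the length-scaled log-likelihood of one ROI signal can be evaluated with the forward algorithm. The sketch below is a minimal pure-Python illustration for scalar Gaussian emissions; it assumes that the parameters (`pi`, `A`, `mu`, `var`) have already been estimated, for example by EM, and all function names and toy values are hypothetical rather than the authors' implementation.

```python
import math


def _log(p):
    """Logarithm that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_gauss(x, mu, var):
    """Log density of a univariate Gaussian with mean mu and variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def scaled_log_likelihood(x, pi, A, mu, var):
    """Forward algorithm in log space; returns ln p(x) divided by len(x)."""
    K = len(pi)
    # Initialisation: start probability times emission of the first sample.
    alpha = [_log(pi[k]) + log_gauss(x[0], mu[k], var[k]) for k in range(K)]
    # Recursion: sum over the previous state, then emit the current sample.
    for obs in x[1:]:
        alpha = [logsumexp([alpha[i] + _log(A[i][k]) for i in range(K)])
                 + log_gauss(obs, mu[k], var[k])
                 for k in range(K)]
    return logsumexp(alpha) / len(x)


# Toy two-state model and signal (hypothetical values).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
signal = [0.1, -0.2, 0.0, 5.1, 4.9, 5.0]
L_match = scaled_log_likelihood(signal, pi, A, mu=[0.0, 5.0], var=[1.0, 1.0])
L_mismatch = scaled_log_likelihood(signal, pi, A, mu=[10.0, 20.0], var=[1.0, 1.0])
```

A model whose state means match the signal yields a larger scaled log-likelihood than a mismatched one, which is the property that allows the likelihood value to serve as a goodness-of-fit feature.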

D. STATISTICAL ANALYSES
To determine the brain regions with abnormal state switching in ASD, we used the rank sum test to statistically analyze the likelihood values of the two groups of subjects and corrected the P values using false discovery rate (FDR) correction [34].
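The statistical procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the rank sum test uses the normal approximation without a tie correction of the variance, the FDR correction is assumed to be the Benjamini-Hochberg procedure, and the significance threshold of 0.05, the function names and the toy p-values are hypothetical, since the text does not specify them.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test (normal approximation, average ranks for ties)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return z, math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value


def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda idx: pvals[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(pvals, alpha=0.05):
    """Indices whose FDR-adjusted p-value is below alpha (alpha is an assumption)."""
    return {i for i, p in enumerate(fdr_bh(pvals)) if p < alpha}


# Hypothetical per-ROI p-values obtained under the two models.
p_asd_model = [0.001, 0.20, 0.004, 0.03]
p_con_model = [0.002, 0.01, 0.30, 0.02]
common = sorted(significant(p_asd_model) & significant(p_con_model))
```

Taking the intersection of the two sets mirrors the selection of ROIs that differ significantly in both ASD MODEL and CON MODEL.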

III. RESULT

A. STATISTICAL RESULTS OF ROI LIKELIHOOD AND THE CORRELATION WITH CLINICAL SCALE
On the one hand, we used ASD MODEL to calculate the likelihood values of ASD and CON subjects and performed statistical tests. On the other hand, we trained CON MODEL and repeated the same process. In the end, a total of 7 ROIs with significant differences were selected using ASD MODEL, all of which were also among the ROIs with significant differences selected using CON MODEL. A boxplot of the likelihood values and the specific positions in the brain of these seven regions are shown in FIGURE 4. In the CC200 atlas, their indices were 39, 42, 57, 100, 113, 124, and 183. The corresponding AAL partitions are shown in TABLE 2. We calculated the correlation between the likelihood values of the ASD group and the ADOS in the 7 ROIs and found that ROI_39 showed a statistically significant negative correlation in both models (FIGURE 5a, FIGURE 5b). Most of the clinical scales were available only for ASD patients, whereas the IQ scale was relatively complete, and the IQs of the two groups differed significantly (p < 0.01). Because IQ is also used as a reference value for clinical diagnosis, we calculated the correlation between these regions and FIQ. The likelihood values of ROI_113 in both models showed significant positive correlations with FIQ (FIGURE 5c, FIGURE 5d).

B. DIFFERENCE IN LIKELIHOOD VARIANCE
In the two models, the variances of the likelihood values of the seven ROIs were calculated for ASD and CON. The two sets of variance values were compared using a paired-sample t-test. The p-value of the paired-sample t-test between the two groups was 0.303 in ASD MODEL and 0.026 in CON MODEL. In CON MODEL, the variance of the likelihood value in the CON group was significantly smaller than that in the ASD group. Although there was no significant difference in ASD MODEL, the median variance of the CON group was still smaller than that of the ASD group. Overall, the variance of the likelihood value in the CON group tended to be smaller than that in the ASD group.
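The variance comparison above can be sketched as follows. This is a minimal illustration that assumes the sample variance (n - 1 denominator) is taken across subjects for each ROI, which the text does not state; the function names and the toy likelihood values are hypothetical. Only the t statistic and its degrees of freedom are computed here; the p-value would be read from a t distribution with those degrees of freedom.

```python
import math
import statistics


def group_variances(likelihoods_by_roi):
    """Sample variance of the likelihood values across subjects, one value per ROI."""
    return [statistics.variance(values) for values in likelihoods_by_roi]


def paired_t_statistic(a, b):
    """Paired-sample t statistic for a - b and its degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Toy likelihood values (hypothetical): three ROIs, four subjects per group.
asd_likelihoods = [[-2.0, -1.0, -3.0, -2.0], [-1.5, -2.5, -0.5, -3.5], [-2.2, -1.8, -3.0, -1.0]]
con_likelihoods = [[-1.6, -1.4, -1.5, -1.5], [-1.2, -1.4, -1.0, -1.6], [-1.3, -1.1, -1.5, -0.9]]

asd_var = group_variances(asd_likelihoods)
con_var = group_variances(con_likelihoods)
t_value, dof = paired_t_statistic(asd_var, con_var)
```

Each ROI contributes one pair of variances, so with the seven selected ROIs the test would have six degrees of freedom.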

C. RESULTS OF CLASSIFICATION
Next, to assist in diagnosing ASD objectively with neuroimaging data, we used the seven brain regions identified by the two models as features. To enrich the features, we also used the ASD and CON data jointly to train an HMM (ASD-CON MODEL) for each of the seven regions. It is worth mentioning that the likelihood values of the seven brain regions under this new model also differed significantly between the groups. Using 302 samples, each consisting of 21 features (the likelihood values of the 7 ROIs under each of the three models), we constructed a linear support vector machine classifier. The average classification accuracy after ten cross-validations reached 74.9%. The receiver operating characteristic (ROC) curve is shown in FIGURE 6. The average area under the curve (AUC) was 0.8.
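The classification protocol above can be illustrated with a small pure-Python sketch. Several assumptions are made and should be read as such: "ten cross-validations" is interpreted as 10-fold cross-validation; the linear SVM is trained by stochastic subgradient descent on the regularised hinge loss and has no intercept, so the features are assumed to be centred; and the toy data have 2 features instead of the 21 likelihood features. The function names and hyperparameters are hypothetical and do not reproduce the authors' classifier.

```python
import random


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=20, seed=0):
    """Linear SVM without intercept: SGD on hinge loss + (lam/2)*||w||^2, y in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]  # shrinkage from the L2 penalty
            if margin < 1:                           # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def k_fold_accuracy(X, y, k=10, seed=0):
    """Mean test accuracy over k folds of a shuffled index list."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        w = train_linear_svm([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(1 for i in test_idx if predict(w, X[i]) == y[i])
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Toy, well-separated data (hypothetical): 20 samples per class, 2 features.
pos = [[2 + 0.1 * (i % 5), 0.3 * ((i % 3) - 1)] for i in range(20)]
neg = [[-a, -b] for a, b in pos]
X = pos + neg
y = [1] * 20 + [-1] * 20
mean_acc = k_fold_accuracy(X, y, k=10)
```

In the setting described above, each row of `X` would instead hold the 21 likelihood values of one subject, and the mean accuracy over the folds corresponds to the reported average classification accuracy.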

IV. DISCUSSION
In this study, we applied a two-group cross-localized Hidden Markov Model approach to brain rs-fMRI data. We focused on areas with abnormal dynamic states in ASD and examined their clinical significance.

A. ABNORMAL BRAIN REGIONS WITH SIGNIFICANT DIFFERENCES
To show clearly where these brain regions belong and to discuss the physiological significance of their abnormalities, we merged these brain regions according to Brodmann divisions. The names and numbers of the Brodmann areas and the corresponding abnormal areas are shown below. Because of the complexity of the causes and symptoms of autism, many areas were abnormal. For the regions listed above, we describe the relatedness of each region to ASD separately.
Neuroimaging studies have implicated the frontopolar regions of the prefrontal cortex in playing a central role in higher cognitive functions such as planning, problem solving, reasoning, and episodic memory retrieval [35]. The significant positive correlation between FIQ and the likelihood value of a part of the ROI in this region could indicate that the function of this region has a greater impact on IQ. This is consistent with our general understanding because the functions affected by the frontopolar area are also the abilities examined in the Wechsler Intelligence Scale. Significant abnormalities in this area might explain why ASD showed lower cognitive abilities than CON. Based on neuroanatomy, Courchesne et al. [36] also confirmed the abnormality of the prefrontal cortex in ASD. They found that there was overgrowth in this area in ASD, and the number of neurons was significantly higher than that in CON. Combined with the results we obtained, we could infer that the function of the prefrontal cortex might be abnormal because of overgrowth in ASD.
Functions of the orbitofrontal cortex include emotions, decision-making processes and cognition [37]-[39], many of which are abnormal in ASD. In particular, some research [40] pointed out that among the many complications of autism, emotional problems are particularly prominent, and it is often difficult for such individuals to maintain their emotions in a relatively stable state. Our results suggest that the emotional problems of autism may be related to abnormalities in the orbitofrontal cortex, which is consistent with the clinical manifestations of autism and previous studies [41], [42]. The functions of this region also have an important impact on IQ, so the correlation between the likelihood of part of the ROI and FIQ further supports the role of orbitofrontal cortex abnormalities in the pathogenesis of autism.
We discuss the next three areas together because they are all related to the visual function of the brain. Anatomic, ablation, and physiological evidence all suggest that the neuronal mechanisms that connect vision and memory in primates are located within the inferior temporal cortex, which consists anatomically of the middle temporal gyrus and the inferior temporal gyrus [43]-[45]. The fusiform gyrus is involved in both detection and identification of faces [46]. From behavioral experiments and clinical observations, it can be seen that individuals with autism scan nonfeature areas of faces significantly more often and core feature areas of faces (such as the nose and eyes) less often [47], [48]. Our findings suggest that disorganized processing of face stimuli is related to the abnormal activation of the visual areas of the brain. Multiple studies involving brain functional connectivity networks and behavioral experiments found abnormalities in autism in the inferior temporal gyrus, middle temporal gyrus, and fusiform gyrus, which supports our results from other perspectives [49]-[53]. In addition, there is evidence that visuospatial processing is related to the development of core autistic sociocommunicative impairments [54]. Moreover, the significant negative correlation between the likelihood values of ROIs in this area and the ADOS also suggests that the brain status in this area can reflect the severity of autism.

B. STATISTICAL DIFFERENCES OF LIKELIHOOD VARIANCE
According to the results of the statistical tests of the different regions and the analysis of FIGURE 4, an interesting phenomenon was found: the likelihood values of the seven different regions in CON were significantly higher than those in ASD, both in the model trained on ASD and in the model trained on CON. According to the statistical test results of the variance in the abnormal areas, the variance of the likelihood value in the CON group was significantly smaller than that of the ASD group in CON MODEL. In the other model, although the difference in variance was not statistically significant, the trend still showed that the median variance of CON was smaller than that of patients with autism. For this reason, we propose a hypothesis in an attempt to give a reasonable explanation for the above experimental results: compared to the resting state of CON, the brain of ASD is often randomly in a chaotic state. When we trained the HMM, a Bayesian probability model, the effects of these random states were averaged. The averaged model was less affected by the random states and was similar to CON MODEL, so there was not much difference between the two models. However, when we calculated the likelihood value based on such a model, ASD would show a lower likelihood value and a higher variance due to the random states. Rudie et al. also found similar results [55]. They found that ASD functional connectivity networks had lower clustering (i.e., local efficiency) and shorter average path lengths, which are the characteristics of randomly connected networks [56].

C. PERFORMANCE IN ASD IDENTIFICATION
After extracting clinically meaningful brain features, a direct way to apply them is to build a classifier with high accuracy and stability to assist doctors in making a more objective clinical diagnosis. We used the likelihood values of the 7 ROIs calculated by ASD MODEL, CON MODEL and ASD-CON MODEL as features to train a linear support vector machine model. The average accuracy reached 74.9%, and the average AUC reached 0.8 after ten cross-validations, which was better than the classification accuracy (67%) obtained by Alexandre using functional connectivity networks [57]. The classification results also support the effectiveness of our localized ROIs and extracted features.

D. LIMITATIONS OF CURRENT RESEARCH
First, in the process of data selection, a large amount of data was eliminated to ensure data quality, complete scales and sex matching. Although we selected the data from multiple sites without discrimination, the representativeness of the experimental data has not yet been determined. At the same time, because the amount of data is not very large, the final classification results have not been verified on an independent test set. Additionally, we only kept male data because the sex ratio could not be matched between the two groups. Therefore, it is unknown whether the results of this study are applicable to women with autism. If enough data for female autism patients can be collected in the future, the above results will be supplemented and updated. Furthermore, the number of hidden states in the experiment was selected by combining three factors. The first is previous research with good results; the second is to make the average likelihood value obtained by the trained model on the training data as large as possible; the third is the computing power currently available to us. We found that within a certain range, the greater the number of hidden states, the larger the mean of the likelihood. Larger likelihood values indicate that the trained model better reflects the characteristics of the data. However, when the number of hidden states was large, the mean value of the likelihood increased slowly. Considering our computing power, we finally chose 20 hidden states. Exploring the physiological meaning of the hidden states and the results with a larger number of hidden states is our next work plan.

V. CONCLUSION
In conclusion, the present study demonstrated that abnormalities in the frontopolar area, orbitofrontal area, inferior temporal gyrus, middle temporal gyrus, and fusiform gyrus are prominent features of ASD and are closely related to the decline in clinical function. This 'two-group cross-localized Hidden Markov Model' provides a robust and powerful framework for understanding the dysfunctional brain architecture in ASD and for assisting diagnosis.