Regional-Asymmetric Adaptive Graph Convolutional Neural Network for Diagnosis of Autism in Children With Resting-State EEG

Resting-state electroencephalography (rs-EEG) has become an effective and low-cost way to identify autism spectrum disorder (ASD) in children. However, extracting useful features from raw rs-EEG data to improve diagnostic performance remains a great challenge. Traditional methods mainly rely on manually designed feature extractors and classifiers, which are applied separately and cannot be optimized jointly. To this end, this paper proposes a new end-to-end diagnostic method based on graph convolutional neural networks for the diagnosis of ASD in children. Inspired by neuroscience findings on the abnormal brain functional connectivity and hemispheric asymmetry observed in autism patients, we design a new Regional-asymmetric Adaptive Graph Convolutional Neural Network (RAGNN). It uses a hierarchical feature extraction and fusion process to learn separable spatiotemporal EEG features from different brain regions, the two hemispheres, and the whole brain. In the temporal feature extraction section, we use convolutional layers that span from the brain area to the hemisphere, which allows temporal features to be captured effectively both within and between brain areas. To better capture the spatial characteristics of multi-channel EEG signals, we employ adaptive graph convolutional learning to capture non-Euclidean features within the brain's hemispheres. Additionally, an attention layer is introduced to highlight the different contributions of the left and right hemispheres, and the fused features are used for classification. We conducted a subject-independent cross-validation experiment on rs-EEG data from 45 children with ASD and 45 typically developing (TD) children. Experimental results show that the proposed RAGNN model outperforms several existing deep learning-based methods (ShallowConvNet, EEGNet, TSception, ST-GCN, and CGRU-MDGN).


I. INTRODUCTION
Autism spectrum disorder (ASD) is a neurodevelopmental disorder characterized by impaired language and social interaction and by limited, repetitive, and stereotyped behavior patterns [1], [2]. It usually involves delayed brain development before the age of 3 years, before the child's behavioral abnormalities appear. Its reported incidence has increased rapidly, from 1 in 56 nationwide in 2016 to 1 in 44 nationwide in 2018 [3], [4]. With media exposure and the high incidence of autism, more and more parents are aware of the early signs and harms of autism, and at the same time the requirements for diagnostic efficiency have increased. As a neurodevelopmental disorder, autism is best diagnosed as early as possible and managed with treatment, after which mildly affected patients can be no different from other people. Considering the complex and lengthy steps of existing clinical diagnosis, efficient auxiliary diagnostic tools are needed to promote early screening and diagnosis of autism. In the future, with the support of technologies such as artificial intelligence (AI) and the medical internet of things, smart healthcare holds great promise for improving the uneven distribution of medical resources and promoting medical fairness.
From a structural and functional point of view, the human brain normally exhibits hemispheric lateralization. As early as 1981, Roger Sperry confirmed the "left and right hemisphere division theory" of brain function lateralization through the famous split-brain experiment. In general, children's brain development shows functional specialization of the cerebral hemispheres, with a left bias for language function [5] and a right bias for visuospatial function [6], [7]. The brain also exhibits distinct asymmetries in gray matter morphology [8], white matter attributes, cortical thickness, and surface area [9], [10], [11], [12], [13]. Studies have shown that the asymmetry of the left and right hemispheres becomes more significant with age [14]. In typically developing children, the hemispheric white matter network tends to move from significant lateralization in childhood toward symmetry and sound development in adolescence [14], [15].
For children with autism, various neuroscience findings indicate that in the early stages these children exhibit relatively symmetrical, atypical characteristics [16]. Alongside abnormalities in long-range white matter pathways, there are also abnormalities in functional connectivity between hemispheres [17]. Anderson et al. demonstrated a sustained reduction in interhemispheric connectivity in ASD patients by evaluating neural synchronization between homologous voxels [18]. Moreover, the abnormality also manifests in the early structural brain development of patients with autism, which proceeds too quickly but tends to normalize in adulthood [19]. Cardon et al. found through structural magnetic resonance imaging that the covariance of brain structure volume between the left and right hemispheres of individuals with ASD is significantly reduced in the somatosensory cortex [20]. At the same time, the resting-state electroencephalography (EEG) of children with autism shows excessive power in the low-frequency delta (δ) and theta (θ) bands and the high-frequency beta (β) and gamma (γ) bands compared to typically developing children, while showing reduced power in the medium-frequency alpha (α) band [21], [22], [23], [24]. Thus, the asymmetry of the left and right hemispheres and the inverted U-shaped frequency characteristics of autistic children can be used effectively to distinguish between autistic and typically developing children.
Most early diagnosis of autism is based on imaging such as functional magnetic resonance imaging (fMRI) [25], [26], magnetoencephalography (MEG) [27], [28], and eye tracking technology [29], [30]. Compared with these approaches, noninvasive EEG has the advantages of high temporal resolution and low cost, and resting-state EEG is suitable for young children who cooperate poorly with tasks [31], [32]. One option is to treat EEG signals as stationary random signals and process them using time-frequency transformations for time series analysis [33], [34], [35]. To preserve the non-random and non-stationary characteristics of EEG signals and extract nonlinear dynamic features from them, deep learning networks can be used. The classical analysis of EEG and functional connectivity, as well as the analysis of symmetry indices in classical graphs, requires manual selection and feature extraction from raw data. Deep learning models, on the other hand, can automatically extract features from raw data, reducing the workload and subjectivity of manual feature engineering. In addition, deep learning models can learn more abstract and advanced feature representations from raw data, which is particularly helpful for complex pattern recognition in disease diagnosis. Deep learning models also have powerful classification abilities and can improve classification accuracy by training large-scale neural networks. In the future, deep learning models can effectively handle large-scale data, which is important for training on large sample sizes and high-dimensional data.
Given the non-Euclidean nature of the spatial relationship between electrodes, more and more graph theory methods are being applied to brain networks [36]. Emotion recognition is a widely studied topic to which various forms of graph neural networks have been applied, including regularized graph neural networks [37], dynamic graph convolution [38], graph convolutional neural networks combined with long short-term memory (LSTM) [39], and DialogueGCN [40]. In the field of disease diagnosis, it is common to use graph theory methods to analyze brain connectivity, for example in Alzheimer's disease [41]. Factor graph-based models have been used in diagnostic methods to identify the specific brain regions involved in epileptic seizures [42]. Asadzadeh et al. [43] proposed using a neural network model to detect and classify epileptic seizures through image generation. Graphs describe the structural and functional connections between neurons, but most graph theory applications currently use sparse structural networks, and an inherent simplification is the assumption that "all nodes and edges are identical and homogeneous in a given network representation". Overall, extracting abnormal structural features of the brains of individuals with autism using graph convolutional neural networks based on resting-state EEG presents a significant challenge.
Motivated by the successful achievements of convolutional neural networks (CNN) and graph convolutional neural networks (GNN) in the field of EEG data analysis, we embed relevant neuroscience findings on autism into the model design and propose a novel end-to-end regional-asymmetric adaptive graph convolutional neural network (RAGNN) for the diagnosis of autism in children with resting-state EEG. In summary, the main contributions of this paper can be divided into three parts:
The remainder of this paper is organized as follows: In Section II, we detail the methodology, including data collection, preprocessing, and our proposed model. In Section III, we conduct experiments to evaluate the proposed method and explore its compatibility with prior knowledge in neuroscience. Finally, we summarize this paper in Section IV.

II. METHODOLOGY
A. Subjects
The dataset in this work was provided by the State Key Experimental Team of Cognitive Neuroscience and Learning at Beijing Normal University. A total of 90 children aged 3-6 years were enrolled, with their parents volunteering for them to participate in the experiment: 45 autistic (ASD) children and 45 typically developing (TD) children (ASD: mean=4.13±0.98, TD: mean=4.13±0.98). We assessed the children using the Autism Behavior Scale (ABC), Social Response Scale (SRS), and Social Communication Questionnaire (SCQ), based on the Diagnostic and Statistical Manual of Mental Disorders Fifth Edition (DSM-V), to ensure the validity of the classification results. All TD children were chosen from a neighborhood kindergarten and tested using the ABC, SCQ, and SRS scales to look for signs of autism. Table I summarizes the behavior scores of all TD and ASD children. None of the 45 TD children reached the cut-off score of the ABC, SCQ, or SRS scales. This study was approved by the Beijing Normal University's Research Ethics Committee (Reference number: IRB_A_009_2021001, approved date: 02/03/2021), and all children's parents gave their consent before the children were included as subjects. Due to the nature of this research, the data contain information that could compromise the privacy of research participants; the participants of this study did not agree to their data being shared publicly.

B. EEG Data Collection and Preprocessing
The children's resting-state EEG data were collected with the 128-channel EGI HydroCel geodesic system (Eugene, Oregon, USA). Each recording lasted between 5 and 10 minutes, during which the child, usually accompanied by a caregiver, sat in a comfortable chair in a quiet room with their eyes open. The sampling frequency was 1000 Hz, all electrodes were referenced to the Cz electrode, and the electrode impedance was kept below 50 kΩ. Using an artifact detection algorithm, electrodes whose impedance exceeded 50 kΩ during recording and electrodes whose amplitude exceeded the 200 µV threshold were marked as bad channels and interpolated, and the screened data segments were re-referenced to the average of the left and right mastoids.

C. Proposed RAGNN Model
Our proposed RAGNN model aims to extract discriminative features and achieve accurate autism diagnosis. We take advantage of the parameter-sharing mechanism of CNN to learn the temporal characteristics of each brain region, and then learn asymmetric spatial features based on brain functional connections using adaptive graph convolution. Finally, the features of the two hemispheres are weighted by an attention mechanism that emphasizes their differences, fused, and sent to the classifier. Fig. 2 shows the overall framework of our proposed RAGNN model, which contains a regional temporal feature extractor, adaptive graph structure learning, an asymmetric spatial feature extractor, attention fusion, and a classifier. We introduce these parts in detail below.
1) Model Input: Following the 10-10 electrode placement system and to avoid excessive interference, we selected 46 channels that maximize coverage of four functional brain regions (frontal, parietal, temporal, and occipital lobes). To ensure that the left and right hemispheres are comparable, we ordered the channels of the two hemispheres symmetrically. We also normalized the data of all participants to reduce inter-participant differences to a certain extent. After bad channels and poor-quality fragments are removed, the remaining recordings have different lengths, so a 10 s slice is randomly selected from each participant's recording. This slice is split into 10 non-overlapping samples (our samples are the segments shown in Fig. 1), which are input to the network. To enable feature learning in the temporal and spatial dimensions separately, we add a dimension to the input data so that multiple convolution kernel channels can be generated. The model input is built from the matrices $X_{left}, X_{right} \in \mathbb{R}^{\frac{c}{2} \times point}$, which contain the EEG data of the left and right hemispheres; $c$ is the total number of channels and $point$ is the segment length, $point = 1000$. Each hemisphere matrix is further divided into four brain regions indexed by $i \in \{1, 2, 3, 4\}$. The specific channel distribution is shown in Fig. 1.
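The slicing described above can be sketched as follows. This is a minimal illustration rather than the authors' preprocessing code: the function names, the toy recording, and the assumption that the first half of the rows holds the left-hemisphere channels are all hypothetical.

```python
import random

def split_hemispheres(segment, n_left):
    """Split a channels x time segment into left and right hemisphere blocks.

    Assumes (hypothetically) that the first n_left rows are left-hemisphere
    channels and the remaining rows are their right-hemisphere counterparts.
    """
    return segment[:n_left], segment[n_left:]

def cut_samples(recording, fs=1000, window_s=10, n_samples=10, seed=0):
    """Pick a random window of window_s seconds and cut it into n_samples
    non-overlapping samples of equal length."""
    n_points = len(recording[0])
    window = fs * window_s
    if n_points < window:
        raise ValueError("recording shorter than the requested window")
    rng = random.Random(seed)
    start = rng.randint(0, n_points - window)
    point = window // n_samples  # time points per sample (1000 in the paper)
    samples = []
    for k in range(n_samples):
        lo = start + k * point
        samples.append([channel[lo:lo + point] for channel in recording])
    return samples

# Toy recording: 46 channels, 12 s at 1000 Hz; each value is its time index.
recording = [[t for t in range(12_000)] for _ in range(46)]
samples = cut_samples(recording)
x_left, x_right = split_hemispheres(samples[0], n_left=23)
```

Each sample then has shape 46 x 1000, and each hemisphere block has shape 23 x 1000, matching $\frac{c}{2} \times point$ for $c = 46$.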
2) Regional-Temporal Feature Extractor: In this part, we mainly use two-dimensional convolution to extract the temporal features of each electrode. In particular, we use the parameter-sharing mechanism of convolution to design a regional temporal feature extractor that reflects the functional characteristics of brain regions in autism. The time series of each electrode undergoes two two-dimensional convolution operations, which differ in the range over which parameters are shared. Fig. 3 shows the details of the regional-temporal feature extractor; only the right hemisphere is shown, and the left hemisphere is handled in the same way. The first part applies convolutional blocks to each of the four brain regions of the left and right hemispheres to extract the temporal features of the subregions. Convolution kernel parameters are shared within each brain region, and each convolutional block consists of a two-dimensional convolutional layer, an average pooling layer, and a ReLU activation function. In the definition of this process, $r \in \{left, right\}$ and $i \in \{1, 2, 3, 4\}$; $X_r^i$ is the raw data input for the $i$-th brain region of hemisphere $r$, and $B_r^i$ is the regional temporal feature of the $i$-th brain region of hemisphere $r$. $\mathrm{Conv2D}(\cdot)$ is a two-dimensional convolution whose kernel scale is $s_1 = \frac{f_s}{2}$, $\sigma(\cdot)$ is the ReLU activation function, and $\mathrm{Avg}_2(\cdot)$ is the average pooling layer, which halves the time dimension; this helps to prevent overfitting and increases robustness.
The EEG frequencies commonly used for autism diagnosis lie between 2 Hz and 70 Hz, which is why we chose the convolution kernel scale $(1, \frac{f_s}{2})$: a kernel of this length can learn temporal features at frequencies above 2 Hz. After the regional-temporal features of each brain region are extracted, convolution blocks of the same form are used to learn the temporal features of each hemisphere. Before that, the brain region feature maps of the same hemisphere are concatenated along the spatial dimension. In the definition of the final feature map produced by the regional-temporal feature extractor, the kernel scale $s_2$ is half of the temporal length of the feature map from the previous stage.
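The kernel-length reasoning can be checked with a few lines of arithmetic. The sketch below assumes an unpadded, stride-1 convolution followed by average pooling with window and stride 2; the paper does not state the padding or stride, so those settings and the function names are illustrative assumptions.

```python
def temporal_kernel_length(fs, lowest_hz):
    """Kernel length (in samples) that spans one full cycle of lowest_hz."""
    return int(fs / lowest_hz)

def block_output_length(n_points, kernel, pool=2):
    """Time length after an unpadded, stride-1 convolution followed by
    average pooling with window and stride `pool` (assumed settings)."""
    conv_len = n_points - kernel + 1
    return conv_len // pool

fs = 1000
s1 = temporal_kernel_length(fs, lowest_hz=2)  # equals fs / 2
after_block = block_output_length(1000, s1)   # time length after one block
```

With a sampling rate of 1000 Hz, a kernel of 500 samples covers 0.5 s, which is one full period of a 2 Hz oscillation; slower rhythms do not complete a cycle inside the kernel.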
3) Adaptive Graph Structure Learning: Instead of using a fixed graph structure, we obtain the adjacency matrix that defines the graph by letting the model learn the feature relationships between adjacent electrodes. In this part, we build the graph structures of the left and right hemispheres separately and obtain their adjacency matrices $A^{left}$ and $A^{right}$.
Let the adjacency matrix entries $A^{left}_{ij}, A^{right}_{ij} = g(x_i, x_j)$, with $i, j \in \{1, 2, \ldots, n\}$, represent the connection between nodes $x_i$ and $x_j$, where $n$ is the number of nodes in the hemisphere and $x_i$, $x_j$ are the vectors of the feature map $T_r$ corresponding to nodes $i$ and $j$. $g(x_i, x_j)$ is calculated from the learnable weight vector $\omega = (\omega_1, \omega_2, \ldots, \omega_f)^{\top} \in \mathbb{R}^{f \times 1}$ and the features of nodes $i$ and $j$. The weight vector $\omega$ is shared across all node connections during learning. The convolution kernel feature layers produced by the preceding convolutional layers are averaged before the adjacency matrix is computed, and a ReLU activation rectifies each linear unit to a non-negative value. Because the adjacency matrix is built on a dynamic temporal feature map, we optimize it iteratively using the total cross-entropy loss directly. Finally, the adjacency matrices of the left and right hemispheres are obtained and passed as graph structures to the graph convolution learning.
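To make the idea of a learned adjacency matrix concrete, the sketch below uses one common form from the adaptive graph learning literature: a shared weight vector scores the absolute feature difference of each node pair, ReLU keeps the score non-negative, and a softmax normalizes each row. This specific form is an assumption for illustration and is not confirmed as the paper's formula; in training, $\omega$ would be updated by backpropagation, whereas here it is fixed by hand.

```python
import math

def adaptive_adjacency(features, omega):
    """Row-normalised adjacency from node features.

    features: list of n node vectors, each of length f.
    omega:    weight vector of length f (learnable in practice, fixed here).
    Score g(x_i, x_j) = ReLU(omega . |x_i - x_j|), then softmax over j.
    """
    n = len(features)
    adjacency = []
    for i in range(n):
        scores = []
        for j in range(n):
            diff = [abs(a - b) for a, b in zip(features[i], features[j])]
            linear = sum(w * d for w, d in zip(omega, diff))
            scores.append(max(0.0, linear))  # ReLU keeps the score non-negative
        exps = [math.exp(s) for s in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency

# Three hypothetical nodes with two features each.
features = [[0.0, 1.0], [0.1, 0.9], [2.0, -1.0]]
omega = [0.5, 0.5]
A = adaptive_adjacency(features, omega)
```

Because the same $\omega$ is applied to every pair, the number of graph parameters depends on the feature length $f$ and not on the number of electrodes.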
4) Asymmetric-Spatial Feature Extractor: Graph convolution generalizes convolution from Euclidean space to non-Euclidean space, which suits EEG data. In this part, we use graph convolution kernels approximated by Chebyshev polynomials. Compared to ordinary spectral graph convolution, the computational complexity is reduced to the order of $K$, where $K$ is the number of Chebyshev polynomial terms. This part combines the hemisphere adjacency matrix with the temporal characteristics of each node to learn the non-Euclidean spatial features of the left and right hemispheres. Chebyshev graph convolution using a polynomial of order $K - 1$ is defined as

$$g_\theta *_G x = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\, x$$

where $g_\theta$ represents the convolution kernel and $*_G$ represents the graph convolution operation. $\tilde{L}$ is the normalized Laplace matrix, $\tilde{L} = \frac{2}{\lambda_{max}} L - I$, where $\lambda_{max}$ is the maximum eigenvalue of the Laplace matrix $L$. As required by Chebyshev polynomials of the first kind, this rescales the eigenvalues of the Laplace matrix into $[-1, 1]$. The recursive formula of Chebyshev polynomials of the first kind is

$$T_k(\tilde{L}) = 2\tilde{L}\, T_{k-1}(\tilde{L}) - T_{k-2}(\tilde{L}), \quad T_0(\tilde{L}) = I, \quad T_1(\tilde{L}) = \tilde{L}$$

from which $T_k(\tilde{L})$, the Chebyshev polynomial of order $k$, is obtained. $\theta_k$ represents the vector of Chebyshev coefficients.
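The recursion can be illustrated on a tiny graph with one scalar feature per node. The sketch below is a plain-Python illustration and not the authors' implementation; it assumes a symmetric normalized Laplacian and approximates $\lambda_{max}$ by 2, a common simplification, and all names are hypothetical.

```python
import math

def scaled_laplacian(adjacency, lambda_max=2.0):
    """Return L~ = (2 / lambda_max) * L - I with L = I - D^-1/2 A D^-1/2."""
    n = len(adjacency)
    deg = [sum(row) for row in adjacency]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    lap = [[(1.0 if i == j else 0.0) - inv_sqrt[i] * adjacency[i][j] * inv_sqrt[j]
            for j in range(n)] for i in range(n)]
    return [[2.0 * lap[i][j] / lambda_max - (1.0 if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def matvec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def cheb_conv(l_tilde, x, theta):
    """Compute sum_k theta[k] * T_k(L~) x using the Chebyshev recursion."""
    t_prev, t_curr = x, matvec(l_tilde, x)  # T_0 x and T_1 x
    out = [theta[0] * v for v in t_prev]
    if len(theta) > 1:
        out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        lx = matvec(l_tilde, t_curr)
        # T_k x = 2 L~ T_{k-1} x - T_{k-2} x
        t_next = [2.0 * a - b for a, b in zip(lx, t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out

# 3-node path graph 0 - 1 - 2 with one scalar feature per node.
A = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
L_tilde = scaled_laplacian(A)
y_identity = cheb_conv(L_tilde, [1.0, 2.0, 3.0], theta=[1.0])
y_k3 = cheb_conv(L_tilde, [1.0, 2.0, 3.0], theta=[0.5, 0.3, 0.2])
```

Only matrix-vector products are needed, so a filter with $K$ terms costs $K$ sparse multiplications and avoids an eigendecomposition of the Laplacian.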
5) Attention-Based Feature Fusion: In this part, we use an attention mechanism to weigh the spatiotemporal features of the left and right hemispheres before they enter the classifier. Depending on the influence of each part (i.e., the left and right hemisphere) on the classification results, the attention mechanism dynamically makes the network focus more on a particular hemisphere, as illustrated in Fig. 5. First, a global pooling layer compresses the global features, and the nonlinear relationship between the left and right hemispheres is learned through a fully connected layer combined with an activation function. The fused spatiotemporal features of EEG asymmetry in autism are then obtained as a weighted average. In the definition of the fused features (Eq. 9), $K$ is the number of graph convolution kernels in the graph convolution, $Q$ is the dimension reduction ratio used in the attention mechanism's compression, and $\beta_1$, $\beta_2$ are the left and right hemisphere weights derived from the attention mechanism.
6) Classifier: Finally, the model is trained by optimizing the classification loss, and the predicted probability of autism is obtained. The spatiotemporal fusion feature matrix is input to a classification layer containing two fully connected layers, a ReLU activation function, and a Softmax activation function, and training is optimized with the cross-entropy loss function to obtain the final diagnosis result $\hat{y}$ (Eq. 10).
[Algorithm listing, final steps: get the fused weighted sum from the attention mechanism by Eq. 9; get $\hat{y}$ by Eq. 10.]
... variables (e.g., per-fold data, random parameters, iterative strategies) and strictly controlled the variables to ensure that the experiments were comparable.

A. Implementation
The proposed method was implemented using the PyTorch framework (PyTorch v1.8.1 as backend). Refer to Section II for the segmentation of the subjects' original data and the model input format. To ensure the reliability and accuracy of our experimental results, we use a robust validation process. Because the severity of autism is age-related, we adopted a random split stratified by age, as shown in Fig. 6, to avoid the influence of age on the experimental results. This strategy ensures that the training and testing sets have the same age distribution in each experiment. Specifically, we divide the children in each age group into five equal parts, forming five splits. Each split is used as the testing set once in turn and accounts for 20% of all data. The remaining four splits are used for model training and account for 80% of all data. Additionally, our data splitting is based on individual participants, so all samples from the same participant fall in the same split. This guarantees participant independence: samples from the same participant are never used for both training and testing, which avoids artificially inflated results. It also ensures that all the data are examined and that the model's performance is evaluated comprehensively. Finally, the average result of the five cross-validation folds is taken as the overall performance of the model.
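A subject-wise, age-stratified five-fold split of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the subject identifiers and age labels are hypothetical, and subjects are dealt round-robin within each age group.

```python
import random
from collections import defaultdict

def age_stratified_folds(subject_ages, n_folds=5, seed=0):
    """Assign subjects to folds so each fold has a similar age distribution.

    subject_ages: dict mapping a subject id to an age group label.
    Returns a list of n_folds lists of subject ids.
    """
    by_age = defaultdict(list)
    for subject, age in subject_ages.items():
        by_age[age].append(subject)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for age in sorted(by_age):
        group = sorted(by_age[age])
        rng.shuffle(group)
        # Deal subjects round-robin; the running offset keeps fold sizes balanced.
        for k, subject in enumerate(group):
            folds[(offset + k) % n_folds].append(subject)
        offset += len(group)
    return folds

def train_test_split(folds, test_index):
    """Use one fold for testing and the remaining folds for training."""
    test = list(folds[test_index])
    train = [s for i, fold in enumerate(folds) if i != test_index for s in fold]
    return train, test

# Hypothetical cohort: 90 subject ids with ages 3 to 6.
ages = {f"sub{i:02d}": 3 + i % 4 for i in range(90)}
folds = age_stratified_folds(ages)
train_ids, test_ids = train_test_split(folds, test_index=0)
```

Because the split operates on subject ids, all samples cut from one child's recording follow that child into a single fold, which is what keeps the evaluation subject-independent.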

B. Performance Comparison
Our proposed RAGNN model is an end-to-end deep learning model that adaptively learns temporal and spatial features, from local to global, in both the left and right hemispheres from the preprocessed EEG signal segments. It requires no manual feature extraction and provides automatic feature learning. Therefore, to better evaluate the performance of our proposed RAGNN model in feature learning and classification, we mainly compare it with existing deep learning models. However, few studies have applied deep learning to EEG data for autism diagnosis. In recent years, deep learning has been widely used in EEG analysis for different tasks, such as motor imagery classification and emotion recognition, and various advanced models have been developed and have achieved excellent performance. We therefore select five representative deep learning models for EEG data from related domains as comparison methods:
• ShallowConvNet [44] (2017) is a fairly versatile convolutional neural network architecture tailored specifically to decode band power characteristics.
• EEGNet [45] (2018) is a typical convolutional neural network model that can be used in different brain-computer interface paradigms, including P300, ERN, MRCP, and SMR. It is a universal and compact convolutional neural network specially designed for general EEG recognition tasks.
• TSception [46] (2021) is a convolutional network model designed for EEG-based emotion recognition. It takes into account the differences between the left and right hemispheres in emotional expression, which also motivated the design of our RAGNN model.
• ST-GCN [47] (2022) is a dynamic spatio-temporal graph convolutional neural network designed to exploit electroencephalogram recordings for the early diagnosis of Alzheimer's disease.
• CGRU-MDGN [36] (2024) is a convolutional gated recurrent unit-driven multidimensional dynamic graph neural network that captures non-Euclidean spatial features between EEG channels to classify emotions.
Fig. 7 shows the confusion matrix results of different models. We can observe that the most common error in all methods is misdiagnosing typically developing children as autistic.
To evaluate the overall performance of different models, we calculate the average accuracy, sensitivity, specificity, NPV, and PPV based on the confusion matrix results. Additionally, we compute the variance of the corresponding results from the five-fold cross-validation. The indicators are calculated from the confusion matrix counts, for example:

$$Sensitivity = \frac{TP}{TP + FN} \quad (13)$$

$$PPV = \frac{TP}{TP + FP} \quad (15)$$

where TP, TN, FP, and FN denote true positive, true negative, false positive, and false negative, respectively. In this experiment, autism samples are defined as positive and typically developing samples as negative. Specificity (Spec) is the proportion of actual negative samples that are correctly predicted, while sensitivity (Sens) is the proportion of actual positive samples that are correctly predicted. PPV is the proportion of true positive results among all positive results reported by the test, and NPV is the proportion of true negative results among all negative results reported by the test.
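The five indicators follow directly from the four confusion matrix counts. The sketch below uses the standard definitions, which agree with the verbal descriptions above; the counts are hypothetical and are not taken from the paper's results.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, PPV and NPV from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # correctly predicted positives
        "specificity": tn / (tn + fp),   # correctly predicted negatives
        "ppv": tp / (tp + fp),           # reliability of a positive result
        "npv": tn / (tn + fn),           # reliability of a negative result
    }

# Hypothetical counts for one test fold (not taken from the paper).
metrics = diagnostic_metrics(tp=85, tn=80, fp=10, fn=5)
```

In a five-fold setting, the function would be applied to each fold's counts and the mean and variance of each indicator reported across folds.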
First, Table II shows that the proposed method is the most accurate and has the lowest variance of all the methods. In real disease diagnosis scenarios, there is a large gap between the number of autistic patients and the number of typically developing children. In this case, both PPV and NPV are extremely important, and the NPV we obtained is the only one that exceeds 95%, reaching 97%. In other words, when the model's diagnosis is negative, there is a 97% probability that the child does not have the disorder. According to the specificity and sensitivity results, RAGNN has a low miss rate (1 − sensitivity) of only 2.77%. Among the comparison models, TSception uses the asymmetry of emotional EEG in a way similar to our proposed model. Both TSception and our model are clearly more effective than the other models. ST-GCN and CGRU-MDGN are two models used in EEG analysis: ST-GCN for diagnosing Alzheimer's disease, and CGRU-MDGN for analyzing emotion in EEG data. Both employ graph convolutional networks (GCNs) to extract spatial features. Our model not only achieves higher accuracy (5.34% higher than ST-GCN and 10.56% higher than CGRU-MDGN), but also outperforms CGRU-MDGN by 30% in terms of NPV. In the current auxiliary diagnosis application, a large number of individuals who visit hospitals for examinations need to be screened. The goal is to achieve the lowest possible rate of missed diagnoses and to minimize misdiagnosis, ensuring that patients are not overlooked and improving diagnostic efficiency.
To visualize the separability of the models more intuitively, we plotted the receiver operating characteristic (ROC) curve and calculated the AUC, as shown in Fig. 8. The AUC is an index of separability that is numerically equal to the area under the ROC curve. The closer the AUC is to 1, the better the separability; an AUC near 0.5 corresponds to chance-level separation. The AUC of our proposed method is the closest to 1, so it exhibits the best separability and is the most effective at distinguishing between children with and without autism.
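The AUC can also be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it with this pairwise definition, which is equivalent to the area under the ROC curve; the labels and scores are made up for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 1, 0, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])
mixed = roc_auc(labels, [0.9, 0.4, 0.7, 0.6, 0.2, 0.1])
constant = roc_auc(labels, [0.5] * 6)
```

A classifier that assigns every sample the same score lands at 0.5, which is why values near 0.5, and not near 0, indicate no separability.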

C. Ablation Study
The proposed model consists of three functional components: the regional-temporal feature extractor, the asymmetric-spatial feature extractor, and the attention-based feature fusion. The combination of these three parts led to the success of the classification task. In this subsection, we evaluate the contribution of each functional module to task performance by removing it from the model. The classification results after removing the regional-temporal feature extractor (noRegion), the asymmetric-spatial feature extractor (noGraph), and the attention-based feature fusion (noAtt) from RAGNN are summarized in Table IV. From Table IV, we can observe that the attention mechanism improves the performance of the model, although its contribution, which is less than 1%, is smaller than that of the adaptive graph convolutional module. The asymmetric-spatial feature extractor improves performance by 2% to 3%. Overall, all three functional components contribute to improving the performance of RAGNN.

D. Discussion and Analysis
Our model is built on the theoretical foundation of autism brain characteristics. In this subsection, we carry out three experiments to explore how the design of the adaptive graph structure, left-right hemisphere asymmetry, and different brain regions affect the performance of the model. The functional abnormalities of the left and right hemispheres and of different brain regions in autism are analyzed.
In the Gaussian kernel graph, the connection between each pair of nodes is calculated from the spatial distance between electrodes on the 3D scalp model, and a Gaussian kernel function is then applied to this distance. In the implementation formula, $loc_{v_i}$ and $loc_{v_j}$ are the Cartesian coordinates of node $i$ and node $j$, and $\sigma$ is the scale parameter that adjusts the connectivity level.
• Distance-based dense graphs (Distance): Use only the spatial distance between electrodes to represent the connections between nodes.
• K-nearest neighbor rule (K-nearest): Every node retains its five nearest edges, so k is set to 5. The final adjacency matrix is an unweighted sparse matrix in which entries corresponding to retained edges are 1 and all other entries are 0.
• Dist-based: Sort all connected edges from shortest to longest, and retain only the shortest ones to construct an unweighted sparse matrix. To divide the regions as accurately as possible while maintaining effective connectivity, we chose experimentally to keep the first 15% of the edges.
• noGraph: Use 2D convolution layers instead of graph convolution layers in the spatio-temporal feature fusion part.

TABLE III RESULT (MEAN ± STD) OF COMPARATIVE METHODS AND EXPERIMENTS I-III

The results are shown in Table III. Compared to the adaptive graph, the structure of the fixed graph is based on the spatial layout of the electrodes, obtained from the x-, y-, and z-axis coordinates converted from the electrode cap parameters. The three-dimensional distance is used in Gaussian kernel functions to map finite-dimensional data to high-dimensional spaces. As the distance between two vectors increases, the value of the Gaussian kernel function decreases monotonically. The sparse matrices produced by the two mainstream rules, the k-nearest neighbor algorithm and the distance-based algorithm, perform slightly worse than the Gaussian model but require less computation. In the pursuit of accuracy, part of the calculation speed is sacrificed. Therefore, the appropriate graph structure should be selected based on the situation.
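The fixed graph constructions compared in this experiment can be sketched as follows. The Gaussian form $\exp(-d^2 / (2\sigma^2))$ is a common choice assumed here for illustration, since the exact formula is not reproduced in this text, and the electrode coordinates and function names are hypothetical.

```python
import math

def pairwise_distances(coords):
    """Euclidean distance between every pair of 3D electrode positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def gaussian_graph(dist, sigma):
    """Dense weighted graph: weights decay monotonically with distance."""
    return [[math.exp(-(d ** 2) / (2.0 * sigma ** 2)) for d in row] for row in dist]

def knn_graph(dist, k):
    """Unweighted sparse graph: each node keeps its k nearest neighbours."""
    n = len(dist)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in order[:k]:
            adj[i][j] = adj[j][i] = 1  # symmetrise so the graph is undirected
    return adj

def shortest_edges_graph(dist, keep_ratio):
    """Unweighted sparse graph keeping the shortest keep_ratio of all edges."""
    n = len(dist)
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    n_keep = max(1, int(len(edges) * keep_ratio))
    adj = [[0] * n for _ in range(n)]
    for _, i, j in edges[:n_keep]:
        adj[i][j] = adj[j][i] = 1
    return adj

# Hypothetical 3D electrode coordinates (not the real cap layout).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 4.0)]
D = pairwise_distances(coords)
A_gauss = gaussian_graph(D, sigma=1.0)
A_knn = knn_graph(D, k=1)
A_short = shortest_edges_graph(D, keep_ratio=0.5)
```

In the setting described above, `k` would be 5 and `keep_ratio` would be 0.15; the small values used here only keep the toy example readable.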
2) Experiment II: Hemisphere: The second experiment investigates the effect on the network of local spatial feature extraction from the left and right hemispheres. The specific experimental settings are as follows:
• Whole: Global spatial feature extraction using only whole-brain data, with the brain treated as a whole.
• Left: Only EEG data from the left brain are used.
• Right: Only EEG data from the right brain are used.
• Hemi-whole: EEG data from the whole brain and from the left and right hemispheres are used simultaneously.
• noAtt: Fuse the features of the left and right brain directly, without attention weighting.
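The difference between the noAtt setting and attention-weighted fusion can be illustrated with a small sketch. The paper does not specify the form of its attention layer in this section, so the softmax scoring below, the scoring vector `w`, and the function names are hypothetical and serve only to show the idea of weighting the two hemispheres before fusing them.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_no_attention(f_left, f_right):
    """noAtt-style fusion: concatenate hemisphere features directly."""
    return np.concatenate([f_left, f_right])

def fuse_with_attention(f_left, f_right, w):
    """Scale each hemisphere's features by a softmax attention weight, then fuse.
    w is a hypothetical learnable scoring vector with the same length as a feature."""
    scores = np.array([f_left @ w, f_right @ w])
    alpha = softmax(scores)                   # the two weights sum to 1
    fused = np.concatenate([alpha[0] * f_left, alpha[1] * f_right])
    return fused, alpha

# Example with random stand-in features for the two hemispheres.
rng = np.random.default_rng(0)
f_left, f_right = rng.normal(size=8), rng.normal(size=8)
w = rng.normal(size=8)
fused, alpha = fuse_with_attention(f_left, f_right, w)
```

In a trained network `w` would be learned jointly with the rest of the model, so the weights in `alpha` would reflect how much each hemisphere contributes to the classification.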
Table III shows that our proposed model, which combines the left and right hemisphere parts, benefits from differential training of the two hemispheres, with or without global data. The Hemi-whole result shows that doubling the features does not improve model performance and instead introduces some interference. Meanwhile, comparing the left and right hemispheres, the discrimination ability of the left hemisphere is about 2.5% higher than that of the right hemisphere, as reflected in the model's performance. This experimental result is consistent with the significant language impairments observed in children with autism: the language center is divided into Broca's area and Wernicke's area, both located in the left hemisphere of the brain [5]. According to the results of noAtt, using attention to weight the fusion of left and right brain spatiotemporal features significantly enhances model performance.
3) Experiment III: Brain Region: The third experiment explores data from various brain regions in autism.
• Frontal lobe [48]: Only data from the frontal lobe is used. This area of the brain is responsible for "higher cognition," which includes managing attention, thinking, voluntary movement, and judgment. Additionally, the left frontal lobe contains an important region called Broca's area, primarily responsible for functional language production.
• Temporal lobe [49], [50]: Only data from the temporal lobe is used. The temporal lobe is located next to the ear and contains the main auditory cortex. Additionally, it includes Wernicke's area, which is responsible for language understanding.
• Parietal lobe [51]: Only data from the parietal lobe is used. The parietal lobe is located between the central sulcus and the occipital ridge.
• Occipital lobe [52]: Only data from the occipital lobe is used. The main visual cortex is responsible for all visual perception. Symptoms such as memory deficits and motor intuition disorders may occur when it is damaged.
• noRegion: Global temporal feature extraction is performed directly, without the two local convolutions over brain regions.
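The single-region settings above amount to selecting a subset of channels before feeding the network. A minimal sketch is given below; the region-to-channel map and the channel names are hypothetical placeholders in 10-20 style naming, and the actual assignment used in the paper is the one shown in its Fig. 9.

```python
import numpy as np

# Hypothetical region-to-channel map; not the assignment used in the paper.
REGIONS = {
    "frontal": ["Fp1", "Fp2", "F3", "F4", "Fz"],
    "temporal": ["T7", "T8", "TP7", "TP8"],
    "parietal": ["P3", "P4", "Pz"],
    "occipital": ["O1", "O2", "Oz"],
}

def select_region(eeg, channel_names, region):
    """Return the rows of eeg (channels x samples) that belong to one region."""
    wanted = REGIONS[region]
    idx = [channel_names.index(name) for name in wanted]
    return eeg[idx, :]

# Example with a stand-in recording: 15 channels, 250 samples.
channel_names = [name for names in REGIONS.values() for name in names]
eeg = np.random.default_rng(0).normal(size=(len(channel_names), 250))
temporal_only = select_region(eeg, channel_names, "temporal")
```

Running the same model on each region's subset in turn gives the per-region comparison described in this experiment.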

TABLE IV ABLATION EXPERIMENTAL RESULTS (MEAN± STD)
The brain regions and specific channels are shown in Fig. 9. A total of 10 electrodes were selected in the parietal lobe region, while 4-5 electrodes were selected in each of the other brain regions. Comparing the results of experiments conducted on a single brain region, the data from the temporal lobe performs best in terms of classification accuracy, while also having the smallest variance. We speculate that this is related to Wernicke's area, located in the temporal lobe, and its relevance to autism [53]. The accuracy of the parietal lobe, where Broca's area [54] is located, is higher than that of the occipital lobe, which is responsible for visual perception. In particular, experiments that use only temporal lobe data have significantly higher accuracy than those that use only left-brain or right-brain data. Therefore, it can be observed that the EEG signals generated by brain abnormalities caused by autism can be utilized in the feature extraction process of convolutional neural networks. While convolutional neural networks can be difficult to explain due to their black-box nature, they can certainly be optimized through careful design. We further conducted an experiment (noRegion) in which no regional temporal features were extracted. The results of this experiment indicate that regional temporal feature extraction is indeed a good strategy for optimizing the diagnosis of autism.

IV. CONCLUSION
We propose a diagnostic model named RAGNN, based on resting-state EEG, to assist in the rapid diagnosis of autism spectrum disorder, for which demand is increasing significantly. Our model adopts an end-to-end design, using self-learning convolutional networks to extract features from EEG data of children with autism and TD children. This model uses temporal feature extraction from different brain regions and adaptive graph convolution learning from each hemisphere to extract temporal and spatial features from resting-state signals. This design utilizes the pathological mechanism of functional connectivity abnormalities and hemispheric asymmetry in autism. We investigate the effectiveness of different components, including graph structure characteristics, hemispheres, and various brain regions, through extensive experiments. Compared with several deep learning-based models, our model has demonstrated significantly better performance, which indicates that it has great potential in practical applications to improve diagnosis accuracy.
This work has three limitations. First, our diagnosis model is a binary classification model, which cannot predict disease severity. Second, our model does not consider the effects of different factors, such as age, gender, and symptoms. Third, our model is developed only with EEG data and ignores other modalities, such as eye-tracking data and fMRI data, which limits diagnosis performance and reliability. To deal with these limitations, in our future work, we will recruit more subjects and closely collaborate with hospitals to collect multimodal data, including EEG, eye-tracking, fMRI, and other data with more detailed information about disease severity, geographic location, and symptoms. We will also attempt to develop advanced interpretable diagnosis models with multimodal data to perform more accurate, trustworthy, and reliable diagnosis of autism in children.

Fig. 1. Illustration of data segmentation, where each segment corresponds to data points within one second.
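The one-second segmentation described in this caption can be sketched as below. The sampling rate `fs`, the non-overlapping windows, and the dropping of leftover samples are assumptions made for illustration; the caption only states that each segment covers one second of data.

```python
import numpy as np

def segment_eeg(eeg, fs, seg_seconds=1.0):
    """Split a (channels, samples) recording into non-overlapping segments.
    Returns an array of shape (n_segments, channels, samples_per_segment).
    Trailing samples that do not fill a whole segment are dropped (assumption)."""
    seg_len = int(round(fs * seg_seconds))
    n_seg = eeg.shape[1] // seg_len
    trimmed = eeg[:, : n_seg * seg_len]
    # Split the time axis into (n_seg, seg_len), then move the segment axis first.
    return trimmed.reshape(eeg.shape[0], n_seg, seg_len).transpose(1, 0, 2)

# Example: 8 channels, 10.5 s of stand-in data at an assumed 250 Hz.
fs = 250
eeg = np.random.default_rng(0).normal(size=(8, int(10.5 * fs)))
segments = segment_eeg(eeg, fs)   # shape (10, 8, 250)
```

Each segment then serves as one input sample for the network, so a single recording yields many training examples.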

1) Experiment I: Graph Structure: The first experiment compares different graph structures, including adaptive graphs and various types of fixed graphs (only the graph construction part is varied, with everything else held fixed). The difference lies in the construction of the adjacency matrix, which measures the functional connectivity between electrodes. Several fixed graph structures are constructed for comparison purposes. The adjacency matrix of a fixed graph is calculated from the spatial Cartesian coordinates of the electrodes.
• Dense graph based on Gaussian kernel function (Gauss):

TABLE II COMPARISON RESULTS OF DIFFERENT METHODS (MEAN ± STD)

Fig. 8. ROC curves of different methods.