Graph Adversarial Transfer Learning for Diabetic Retinopathy Classification

Diabetic retinopathy (DR) is a major cause of vision loss and even blindness in middle-aged and older adults. A system that automatically performs DR diagnosis can spare ophthalmologists a great deal of tedious work, such as DR grading or lesion detection, while allowing patients to detect the disease earlier and receive the correct treatment. However, most existing methods require many DR annotations to train the model, and DR data vary to different degrees across imaging devices. These problems lead to inefficient use of existing data and limit actual deployment. To alleviate them, we propose a novel Graph Adversarial Transfer Learning (GATL) method for DR diagnosis, which performs transfer learning in a deep model through both intra-domain and inter-domain alignment. The proposed GATL has several merits. First, GATL adopts self-supervised training in the target domain, so this domain adaptation method can significantly reduce annotation cost compared with supervised approaches. Second, we introduce a graph neural network to extract potential features between unknown samples. Third, to enhance the robustness of the model, we use adversarial training to perform both inter-domain and intra-domain alignment, which further improves the model's classification accuracy. GATL achieved 94.3%, 97.5%, and 91.1% in accuracy, sensitivity, and specificity on the APTOS dataset and 92.7%, 95.7%, and 89.7% on the EyePACS dataset, respectively. Extensive experimental results on two challenging benchmarks, APTOS 2019 and EyePACS, demonstrate that the proposed GATL performs favorably against baseline DR classification methods.


I. INTRODUCTION
Diabetic retinopathy (DR) is one of the eye diseases with a rapidly increasing incidence in recent years. It is a complication of diabetes, which can cause damage to the blood vessels in the eye and eventually lead to vision loss, even blindness [1], [2]. In modern social life, high blood pressure and high blood sugar have become factors that plague the health of most middle-aged and older people. This further increases the incidence of DR. Therefore, the early detection of diabetic retinopathy is essential, as it significantly improves the effectiveness of treatment. However, in clinical testing, the early diagnosis of DR is relatively difficult. Even professional ophthalmologists need to spend a lot of time and energy comparing color images. Therefore, developing a system for automatic detection or classification of DR is meaningful and necessary. An automated diagnostic system reviews retinal images and performs DR classification based on learned experience. In some public retinal datasets, DR is usually classified by experts into the following five categories [3]: R0-no DR, R1-mild, R2-moderate, R3-severe, and R4-proliferative. In detail, the grade of DR is affected by several factors, such as the number of lesions, their area, and their appearance in fundus images. The five categories can also be merged into two categories according to the standard [3], [4], [5]. For instance, Figure 1 provides an illustration of five DR grades in the Kaggle DR dataset [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Inês Domingues.
VOLUME 10, 2022 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Notably, the features of DR lesions in fundus images are complex and diverse. Therefore, the diagnostic system needs to extract critical, discriminative features from the sample images to perform DR screening accurately. In recent years, deep learning has achieved excellent results in many fields, and many studies apply deep learning to DR classification [7], [8], [9]. Dutta et al. [7] introduce an automated knowledge model to identify the critical antecedents of DR, which identifies the target class thresholds using a weighted Fuzzy C-means algorithm. Kassani et al. [8] propose a new feature extraction method using a modified Xception architecture for the diagnosis of DR, which is based on deep layer aggregation that combines multilevel features from different convolutional layers of the Xception architecture. Pratt et al. [9] develop a CNN-based network with data augmentation, which can identify the intricate features involved in the classification task, such as micro-aneurysms, exudates, and hemorrhages on the retina, and consequently provide a diagnosis automatically and without user input. Methods based on pixel-level supervision [10] or patch-level supervision [5], [11], [12] have also been proposed.
Since the annotation of fundus images requires manual annotation by experienced domain experts, the advanced methods [13], [14], [15] will be limited in flexibility and scalability in actual deployment.
Although the above studies all demonstrate the effectiveness of their methods on a single dataset, they ignore the problem of domain adaptation in practical application scenarios. For example, in clinical settings, the imaging equipment used to acquire image data may be provided by multiple manufacturers. These devices differ in software and hardware, resulting in image quality gaps of different degrees, and annotating sufficient data in new scenarios requires expensive professional labor. Therefore, when a model trained in a fixed source domain is applied to another unknown domain, its diagnostic accuracy usually drops significantly. One way to address this problem is to fine-tune the model trained on the source domain with enough labeled samples from the target domain to align the inter-domain differences. However, annotated medical image data are scarce, manual annotation requires a lot of time and energy from experts, and the economic cost is high. Therefore, it is necessary to make effective use of labeled data. As a result, semi-supervised domain adaptation is of great significance for DR classification.
Traditional domain adaptation methods usually use Convolutional Neural Networks (CNN) as the backbone to extract the internal features of the samples for classification. This ignores the potential connections between retinal images. A Graph Neural Network (GNN), by contrast, can suppress the invalid features among the CNN-extracted features and thereby achieve a noise-reduction effect. There is already a large body of literature available on this application [16], [17], [18], [19], [20], [21], [22]. Therefore, the GNN can extract the potential features between images so that more useful information is available for DR detection. Nevertheless, the problem of model instability is often encountered during experiments. In our experience, a model under adversarial training is often very robust, and adding an adversarial loss is an effective way to enhance the stability of the model. In addition, if the model is constrained only by intra-domain alignment, the model trained with a large amount of source domain data may not be suitable for the target domain. Therefore, adding inter-domain alignment can further improve the model. Thus, we propose a novel Graph Adversarial Transfer Learning (GATL) method for diabetic retinopathy classification in this paper. GATL not only focuses on the pixel characteristics of the retinal image itself, but also extracts the potential relationships between samples, making the most of the original data. At the same time, GATL uses the high-confidence predictions in the target domain as pseudo-labels and uses the adversarial training of two classifiers to further promote the intra-domain alignment of the source domain and the target domain. Finally, the newly designed discriminator narrows the gap between domains and aligns the data distributions of the source domain and the target domain.
The main contributions of our method are as follows:
(1) We provide a new graph-based domain-adaptive approach to investigate additional fundus image characteristics, and we employ transfer learning to make use of additional data for DR classification of unidentified data.
(2) We train the model by designing two classifiers for adversarial purposes to improve both the robustness and the classification performance of the model.
(3) We design a discriminator and obfuscate the discriminator's judgment of the source and target domain data, so that the model is further aligned between domains on top of the intra-domain alignment.

II. RELATED WORK
This section first introduces some recent research results on the classification of diabetic retinopathy and then summarizes the successful applications of some domain adaptation methods in medical images.

A. RETINAL IMAGE CLASSIFICATION
With the continuous development of computer vision, computer-assisted retinal image research has also achieved great success [23], [24], [25], [26], [27], [28], [29], [30], [31]. For example, Chen et al. [23] propose a general deep learning model for DR classification, which uses a 2-stage training method to solve the overfitting problem. They also provide a simple method for addressing the imbalance of DR databases. Erciyas et al. [24] develop a deep learning-based method in which diabetic retinopathy lesions are detected automatically and independently of datasets, and the detected lesions are classified. Vives et al. [25] present a bio-inspired approach based on synaptic metaplasticity in convolutional neural networks to detect diabetic retinopathy. Jadhav et al. [26] design an optimal feature selection-based diabetic retinopathy detection method, which automates DR detection by analyzing retinal abnormalities such as hard exudates, hemorrhages, microaneurysms, and soft exudates. Canayaz et al. [27] propose an approach based on feature selection with wrapper methods for fundus images. Abdelmaksoud et al. [28] introduce E-DenseNet, a hybrid deep learning method that combines the EyeNet and DenseNet models based on transfer learning. Zhang et al. [29] design a Source-Free Transfer Learning technique for detecting referable DR that uses unannotated retinal images and only the source model during the training process. Gangwar et al. [30] tackle the challenge of automated diabetic retinopathy diagnosis and propose a unique deep learning hybrid solution. Yi et al. [31] propose a network known as RA-EfficientNet, in which a residual attention (RA) block is added to EfficientNet in order to extract additional features and address the issue of minute differences between lesions.
However, none of the above methods considers domain adaptation issues. The fundus data taken by different equipment differ to a certain extent. Moreover, in practice, annotated data are difficult to obtain due to the high cost, so domain adaptation methods must be designed to work with limited available data.

B. DOMAIN ADAPTATION METHODS IN MEDICAL IMAGE CLASSIFICATION
In deep learning research, data labels are often the most expensive and difficult resource to obtain. This problem is particularly pronounced in medical imaging. Therefore, how to efficiently use existing labels to predict unknown data has become a hot issue. Domain Adaptation [32] is one of the popular research directions. In recent years, many successful domain adaptation methods [33], [34], [35], [36], [37] have been widely used in medical images. Wang et al. [33] propose a method called deep adversarial domain adaptation to improve the performance of breast cancer screening using mammography. They aim to extract the knowledge from a public dataset and transfer the learned knowledge to improve the detection performance on the target dataset. Castellanos et al. [34] design a method that combines neural networks and domain adaptation in order to carry out unsupervised document binarization. Abbet et al. [35] propose a method for colorectal cancer tissue phenotyping that takes advantage of self-supervised learning to perform domain adaptation and remove the necessity of a fully-labeled source dataset. Konyakhin et al. [36] present their solution to the Traffic4Cast 2021 Core Challenge, which employs multiple domain adaptation techniques to fight the domain shift. Hong et al. [37] report an unsupervised domain adaptation framework for cross-modality liver segmentation via joint adversarial learning and self-learning.
Inspired by the above transfer methods, we propose a GATL network for DR classification to make effective use of publicly available labeled data and reduce the loss caused by domain shift.

III. METHOD
This paper introduces a graph adversarial transfer learning method for DR classification. In this section, we describe GATL in two stages and then report the whole training scheme of our GATL.

A. THE FIRST STAGE OF GATL
We define the input data as follows: $X_S$ denotes the labeled data from the source domain, with corresponding labels $Y_S$, and $X_T$ denotes the unlabeled data from the target domain. The purpose of domain adaptation is to train a classifier $F$ that can accurately predict the sample category with the support of a large amount of source domain data and can be applied well to target domain data. In our proposed GATL, we use ResNet-50 as the backbone of the CNN feature extractor $G_c$ to extract the intrinsic features of the fundus images. At the same time, the GNN feature extractor $G_g$ explores the potential connections between samples. The two classifiers $F_1$ and $F_2$ output the prediction vector of each retinal image.
FIGURE 2. GATL framework description: First, fundus images are fed to a feature extractor that combines CNN and GNN architecture. Then, the extracted key features are respectively transmitted to classifiers $F_1$ and $F_2$ for adversarial training, and discriminator $D$ is trained for confusing the source and target domains. Finally, the entire model is trained simultaneously from two perspectives of intra-domain and inter-domain alignment.
We first feed the source domain data to the CNN feature extractor $G_c$ and update the CNN network with the supervised loss of Eq. (1), where $L_{ce}$ denotes the standard cross-entropy loss function and $\theta_c$, $\theta_{f_1}$, and $\theta_{f_2}$ represent the parameters of $G_c$, $F_1$, and $F_2$, respectively. In a short time, we obtain a feature extractor $G_c$ with a certain feature representation ability. Then we attach the GNN extractor $G_g$ after $G_c$. With the representation $H_S = \{h^s_1, \cdots, h^s_i, \cdots, h^s_{N_S}\}$ obtained from $G_c$, we use the k-nearest neighbor method to build the graph, where $d_{ij}$ denotes the similarity between the $i$-th feature $h^s_i$ and the $j$-th feature $h^s_j$ from $H_S$. By sorting $d_{ij}$, we obtain the top $k$ features with the highest similarity. The corresponding sort index is $Idx = \{i_1, \cdots, i_m, \cdots\}$, in which $i_m$ represents the $m$-th closest neighbor feature to the $i$-th one.
Thus, the adjacency matrix $A$ can be obtained from $Idx$ by connecting each feature to its $k$ nearest neighbors.
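The graph-building step can be illustrated with a small sketch that turns a handful of feature vectors into a binary k-nearest-neighbor adjacency matrix. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names `euclidean` and `knn_adjacency` are hypothetical, Euclidean distance is assumed for $d_{ij}$, self-loops are excluded, and the adjacency is left unweighted and is not symmetrized.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def knn_adjacency(features, k):
    """Return a binary adjacency matrix linking each feature to its k nearest neighbors.

    Assumption: a smaller Euclidean distance means a higher similarity, and a
    feature is never counted as its own neighbor.
    """
    n = len(features)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        # Sort all other features by distance to feature i (the sort index Idx).
        idx = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(features[i], features[j]),
        )
        # Keep only the k closest neighbors of feature i.
        for j in idx[:k]:
            adjacency[i][j] = 1
    return adjacency


# Toy features (made up for illustration): two tight clusters in 2-D.
toy_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
toy_adjacency = knn_adjacency(toy_features, k=1)
```

With `k=1`, each toy feature is linked only to the other member of its own cluster, which is the behavior a kNN graph is expected to show on well-separated groups.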
In this way, we can use $G_g$ to extract special features from the graph structure. Assisted by the labels, we update the backbone and replace Eq. (1) with a supervised loss over both extractors, where $\theta_g$ denotes the parameters of $G_g$. The features extracted through the cooperation of $G_c$ and $G_g$ are sent to $F_1$ and $F_2$. We design a self-supervised loss $L_{ss}$ to enhance the robustness of the model. It is defined through the $l_1$ distance $|\cdot|_1$ between $P_1$ and $P_2$, the prediction vectors output by $F_1$ and $F_2$ for the same source input.
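To make the role of the $l_1$ distance between the two classifier outputs concrete, the sketch below computes a discrepancy between two prediction vectors. It is only an illustration: the function name `l1_discrepancy` is hypothetical, and averaging the absolute differences over the classes is an assumption, since the exact normalization is not recoverable from the text.

```python
def l1_discrepancy(p1, p2):
    """Mean absolute difference between two prediction vectors of equal length.

    Assumption: the l1 distance is averaged over classes.
    """
    if len(p1) != len(p2):
        raise ValueError("prediction vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


# Made-up softmax outputs of two classifiers for the same input.
agreeing = l1_discrepancy([0.9, 0.1], [0.9, 0.1])
disagreeing = l1_discrepancy([0.9, 0.1], [0.6, 0.4])
```

The discrepancy is zero when both classifiers agree and grows as their outputs diverge, which is the quantity that the classifiers and the feature extractors push in opposite directions during adversarial training.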

B. THE SECOND STAGE OF GATL
After training the model with a large amount of labeled source domain data, we feed the target domain data to the network and perform high-confidence label filtering on the softmax output of the classifier. The filtered predictions are used as pseudo-labels $Y_T = \{y^t_1, \cdots, y^t_i, \cdots\}$ for the fundus images of the target domain.
In this way, we can use the target domain data in a self-supervised manner. At the same time, self-supervised training can reduce the number of ambiguous samples near the decision boundary. The corresponding loss is computed over the $n_t$ selected high-confidence pseudo-labels.
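The pseudo-labeling step can be sketched as follows. This is a hedged illustration, not the authors' code: the helper names `softmax`, `select_pseudo_labels`, and `pseudo_label_loss` are hypothetical, the confidence threshold of 0.9 is an assumed value (the text does not state one), and the loss is assumed to be the cross-entropy averaged over the $n_t$ selected samples.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.9):
    """Keep (sample index, predicted class) pairs whose top probability reaches the threshold."""
    selected = []
    for index, logits in enumerate(batch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence)))
    return selected


def pseudo_label_loss(batch_logits, selected):
    """Average cross-entropy over the n_t selected samples, with pseudo-labels as targets."""
    if not selected:
        return 0.0
    total = 0.0
    for index, label in selected:
        total += -math.log(softmax(batch_logits[index])[label])
    return total / len(selected)


# Made-up target-domain logits for three images (binary task).
toy_logits = [[4.0, 0.0], [0.2, 0.0], [0.0, 3.0]]
toy_selected = select_pseudo_labels(toy_logits, threshold=0.9)
toy_loss = pseudo_label_loss(toy_logits, toy_selected)
```

In this toy batch the second image is discarded because neither class reaches the threshold, so only confident predictions contribute to the target-domain loss.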
The difference from the first stage is that, in order to perform inter-domain alignment, we design a discriminator $D$ to identify whether an image belongs to the source domain or the target domain. The purpose is to confuse the discriminator's recognition of inter-domain samples so that the feature extractor can extract cross-domain features. The inter-domain loss $L_{dis}$ is optimized with respect to $\theta_d$, the parameters of $D$. By reducing the sample difference between domains, GATL narrows the distance between the source domain and the target domain and further improves the accuracy of the model applied to the target domain.
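One common way to realize such a domain discriminator objective is a binary cross-entropy over domain labels, sketched below. This is an assumption made for illustration: the exact form of the inter-domain loss is not recoverable from the text, the function name `domain_bce` is hypothetical, and the label convention (source = 1, target = 0) is chosen arbitrarily.

```python
import math


def domain_bce(source_probs, target_probs, eps=1e-12):
    """Binary cross-entropy of a domain discriminator.

    source_probs / target_probs hold the discriminator's predicted probability
    that each source / target sample comes from the source domain
    (assumed label convention: source = 1, target = 0).
    """
    losses = [-math.log(max(p, eps)) for p in source_probs]
    losses += [-math.log(max(1.0 - p, eps)) for p in target_probs]
    return sum(losses) / len(losses)


# Made-up discriminator outputs.
confident = domain_bce([0.9, 0.8], [0.1, 0.2])  # domains are easy to tell apart
confused = domain_bce([0.5, 0.5], [0.5, 0.5])   # domains are indistinguishable
```

When the feature extractor succeeds in confusing the discriminator, every output approaches 0.5 and the loss approaches log 2, which is the situation that inter-domain alignment aims for.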

C. TRAINING STRATEGY
In this work, GATL aims to perform DR classification by aligning the category distribution within and between domains. We divide the GATL training process into the following three steps:
Step 1. Training the model on the labeled source data. Use the large amount of source domain data to learn $G_c$, $G_g$, $F_1$, and $F_2$ together to reduce the empirical risk on the source distribution. The model is trained by minimizing Eq. (8), where $\lambda$ denotes the balance factor.
Step 2. Self-supervised learning for the unlabeled target domain. Add the self-supervised training on the target domain and the inter-domain alignment. Freeze the parameters of the feature extractors $G_c$ and $G_g$, and then update the classifiers $F_1$ and $F_2$ to maximize the difference between their probability outputs while maintaining classification accuracy. The model is trained under the adversarial loss of Eq. (9), where $\lambda_1$, $\lambda_2$, and $\lambda_3$ denote the balance factors.
Step 3. Further optimization of the feature extractors. Freeze the parameters of the classifiers $F_1$ and $F_2$, and update the feature extractors $G_c$ and $G_g$ to minimize the difference between the classifiers while minimizing the difference in sample characteristics between domains. The model is trained by minimizing Eq. (10).
After repeating the above steps several times, GATL can effectively classify DR and perform inter-domain alignment on top of intra-domain alignment. The model optimization is summarized in Algorithm 1.
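The alternation between the three steps can be summarized by a small scheduling helper that reports, for a given epoch, which step is active and which modules are updated. This is a schematic sketch following the epoch ranges of Algorithm 1: the function name `training_phase` and the boundary values used in the example are hypothetical, and the update of the discriminator $D$ is omitted because Algorithm 1 does not list it.

```python
def training_phase(epoch, t1, t2, t3):
    """Map an epoch index to (step number, names of the modules being updated).

    Epochs [0, t1) train everything on source data, epochs [t1, t2) update only
    the classifiers, and epochs [t2, t3) update only the feature extractors.
    """
    if not 0 <= epoch < t3:
        raise ValueError("epoch outside the training schedule")
    if epoch < t1:
        return 1, ("G_c", "G_g", "F_1", "F_2")
    if epoch < t2:
        return 2, ("F_1", "F_2")  # G_c and G_g are frozen
    return 3, ("G_c", "G_g")      # F_1 and F_2 are frozen


# Hypothetical boundaries T1 = 10, T2 = 30, T3 = 50 (not taken from the paper).
schedule = [training_phase(epoch, 10, 30, 50)[0] for epoch in (0, 9, 10, 29, 30, 49)]
```

In a real training loop, the returned module names would decide which parameter groups have gradients enabled before each optimizer step.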

IV. EXPERIMENTS
A. IMPLEMENTATION DETAILS
The entire experimental process is completed on a GeForce 2080 Ti GPU under the PyTorch framework. The model converges at around 40 epochs, at which point the loss curve also becomes smooth. It took about 6 hours from the start of training to model convergence. Each retinal image is resized to 512 × 512 pixels and augmented by random rotation, cropping, and contrast-color augmentation before being input to the network. In the GATL training process, we choose Adam as the optimizer, the batch size is 32, and the number of epochs is set to 50. For the learning rate, we adopt a dynamic adjustment strategy in which the learning rate gradually decreases as the epochs progress.
The initial learning rate is set to 0.1. In detail, we use ResNet-50 as the backbone, and the classifier is composed of linear layers. For parameter settings, the balance factor $\lambda$ and the KNN parameter $k$ in graph building are 0.4 and 4, respectively. As for the data division, all the source data are fed into network training, and the target domain is split into training and testing sets with an 80:20 ratio. The code and trained models will be available at https://github.com/huanw0813/GATL once the paper is published.

Algorithm 1 Graph Adversarial Transfer Learning (GATL)
Input data: Source data $X_S$ with corresponding labels $Y_S$, target data $X_T$ (every image is reshaped to 512 × 512 and augmented), batch size = 32, learning rate = 0.0001, number of epochs $T$, the parameters $\lambda$ and $k$.
Initialize network parameters for $G_c$, $G_g$, $F_1$, $F_2$, and $D$;
for t (epoch) = 0 to $T_1$ do (step 1)
    Extract the feature vectors from $G_c$ and $G_g$ for source data
    Obtain the predicted probabilities from $F_1$ and $F_2$ for source features
    Optimize the parameters in $G_c$, $G_g$, $F_1$, and $F_2$ by minimizing Eq. 8
end for
for t (epoch) = $T_1$ to $T_2$ do (step 2)
    Extract features from $G_c$ and $G_g$ for target data
    Obtain the predicted probabilities from $F_1$ and $F_2$ for target features
    Freeze $G_c$ and $G_g$
    Optimize the parameters in $F_1$ and $F_2$ by minimizing Eq. 9
end for
for t (epoch) = $T_2$ to $T_3$ do (step 3)
    Freeze $F_1$ and $F_2$
    Optimize the parameters in $G_c$ and $G_g$ by minimizing Eq. 10
end for
Return: The trained network parameters

B. DATABASE DESCRIPTION
The proposed model is evaluated on publicly available EyePACS [6] and APTOS 2019 [38] datasets.
EyePACS compiled 88,702 color fundus images from the patients' left and right eyes. Multiple devices took these images under different imaging conditions. To uniformly process these fundus images with specific differences, we resize them to 512 × 512 pixels in the preprocessing stage. After being annotated by experts, these images were classified into five categories: 0 -No DR, 1 -Mild, 2 -Moderate, 3 -Severe, and 4 -Proliferative DR. In addition, the proportion of each image category in this dataset is unbalanced, and the specific distribution is shown in Table 1.
APTOS 2019 is an open-source retinal dataset collated by the Asia Pacific Tele-Ophthalmology Society, in which fundus images were collected under different imaging conditions. There are 3662 images in this dataset, all uniformly preprocessed and rescaled to 512 × 512 pixels. Clinicians labeled these images into the following five categories: 0 -No DR, 1 -Mild, 2 -Moderate, 3 -Severe, and 4 -Proliferative DR. The category distribution of the dataset is also given in Table 1.
We divide the samples in these two datasets into two categories according to the normal/abnormal criterion.
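The binarization of the five-level grades can be written as a one-line mapping. The sketch below assumes that grade 0 (No DR) is treated as normal and grades 1 to 4 as abnormal, which is the usual reading of a normal/abnormal criterion but is not spelled out in the text; the function name `to_binary_label` is hypothetical.

```python
def to_binary_label(grade):
    """Map a five-level DR grade (0-4) to 0 = normal or 1 = abnormal.

    Assumption: only grade 0 (No DR) counts as normal.
    """
    if grade not in range(5):
        raise ValueError("DR grade must be an integer from 0 to 4")
    return 0 if grade == 0 else 1


# Made-up list of five-level grades and their binary counterparts.
toy_grades = [0, 2, 4, 0, 1, 3]
toy_binary = [to_binary_label(g) for g in toy_grades]
```

A referable/non-referable split, which some works use instead, would move the threshold to a higher grade and would need a different mapping.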

C. EVALUATION METRICS
To evaluate the performance of the proposed method, we employ accuracy (ACC), precision (PRE), sensitivity (SEN), and specificity (SPE) for binary normal/abnormal DR grading tasks. In this section, we also use the ROC curve and confusion matrix to evaluate the performance of GATL. The evaluation formulas are as follows,
$$ACC = \frac{TP + TN}{TP + TN + FP + FN}, \quad PRE = \frac{TP}{TP + FP}, \quad SEN = \frac{TP}{TP + FN}, \quad SPE = \frac{TN}{TN + FP},$$
where TP means that both the predicted value and the true value are sick; FP means that the predicted value is sick and the true value is healthy; TN indicates that both the predicted value and the true value are healthy; FN means that the predicted value is healthy and the true value is sick. We also include AUC in the evaluation. AUC is a performance metric that measures the quality of a model, and it is defined as the area under the ROC curve. AUC is commonly used to measure the effectiveness of a binary classification model, and it represents the probability that a positive example is ranked ahead of a negative example. The larger the AUC value, the better the classification performance of the classifier.
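As a worked example of these metrics, the following sketch computes ACC, PRE, SEN, and SPE from confusion counts and evaluates AUC through its ranking interpretation. The function names `binary_metrics` and `auc_by_ranking` are hypothetical, and the counts and scores are made up for illustration; they are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, and specificity from confusion counts."""
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "PRE": tp / (tp + fp),
        "SEN": tp / (tp + fn),
        "SPE": tn / (tn + fp),
    }


def auc_by_ranking(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up confusion counts and prediction scores.
toy_metrics = binary_metrics(tp=45, fp=5, tn=40, fn=10)
toy_auc = auc_by_ranking([0.9, 0.8, 0.4], [0.7, 0.3])
```

The pairwise form of AUC is quadratic in the number of samples and is shown only because it mirrors the probabilistic definition; library implementations typically use a sort-based computation.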

D. EVALUATION ON DR CLASSIFICATION
1) COMPARISON WITH BASELINE METHODS ON THE APTOS DATASET
In order to further analyze the performance of GATL on DR classification, we selected several baseline methods for comparison. The comparison results are shown in Table 2. From Table 2, it can be observed that the proposed GATL achieves the highest accuracy of 94.3% among all the methods, which is 1.9% higher than the supervised methods SE-ResNeXt50 and Vgg-16, 3.1% higher than SFTL, 2.8% higher than AmResNet50, 3.25% higher than PB-CNN, 2.3% higher than ShallowNet+PI, and 3.6% higher than EfficientNet. This shows that GATL has a clear advantage in DR classification accuracy. GATL also achieves a high sensitivity of 97.5%, which is 10.4% higher than SE-ResNeXt50, 16.8% higher than EfficientNet, 2.4% higher than SFTL, 8.5% higher than AmResNet50, and 4.9% higher than Vgg-16. This indicates that the misdiagnosis rate of GATL is low and that it can be highly effective in clinical applications.

2) COMPARISON WITH BASELINE METHODS ON THE EyePACS DATASET
We exchanged the source and target domains of the proposed GATL and chose the corresponding baseline methods for comparison. The comparison results are shown in Table 3. From Table 3, it can be observed that GATL achieves the highest accuracy of 92.7% among all the methods on the EyePACS dataset, which is 1.8% higher than Inception V3, 5.36% higher than Bi-channel CNN, 7.2% higher than ViT, and 9% higher than Vgg-16. Also, in terms of sensitivity, GATL achieves the highest value of 95.7%, which is 41.2% higher than VGG16, 1.2% higher than Custom CNN, 18.77% higher than Bi-channel CNN, 2% higher than ViT, and 15.7% higher than Modified VGGNet.
From the results in the above two tables, we can draw the following conclusions: (1) GATL still has a high accuracy after swapping the source and target domains, which means the proposed model is highly robust owing to the adversarial loss. (2) Our domain adaptation method performs better than some supervised methods. This shows that after intra-domain and inter-domain alignment, the classification performance of GATL is comparable to that of some supervised methods.

3) PERFORMANCE OF ROC VISUALIZATION
In order to evaluate the performance of the proposed model, we draw ROC curves for the two public datasets, as shown in Figures 3 and 4. The areas above and below the diagonal in the ROC plot are opposing areas. The diagonal line represents the result of random classification. The closer the curve is to the upper left corner, the better the classifier's performance. It can be seen from Figure 3 that GATL is very effective in the task of DR classification: the AUC reaches 0.99 for the DR binary classification task. Likewise, in Figure 4, the AUC reaches 0.98. This shows that GATL trains effectively when transferring between the two datasets, which supports the effectiveness of GATL.

4) PERFORMANCE OF TSNE VISUALIZATION
We also made TSNE diagrams for two datasets simultaneously to show the effect of GATL more intuitively.
TSNE is a data visualization tool for dimensionality reduction. It can reduce and visualize high-dimensional data and intuitively display the distribution of data samples. As shown in Figure 5, the fundus image features after passing through the GATL network are clearly divided into two categories. The same effect is also observed on the other dataset, as shown in Figure 6, which indicates that GATL uses the GCN to extract potential features between samples and thereby improves the classification accuracy.

5) PERFORMANCE OF CONFUSION MATRIX
In order to analyze the classification performance of GATL and adjust the parameters, we draw confusion matrices based on the two public datasets, as shown in Figures 7 and 8. The confusion matrix quickly visualizes the proportion of each category that is misclassified into other categories, which can help researchers adjust subsequent models, such as setting weight attenuation for some categories. From Figures 7 and 8, we can observe that the misdiagnosis rate of GATL is very low, which is of great significance in practical applications.

6) PERFORMANCE OF LOSS GRAPH
To illustrate the training behavior of the model, we plot the loss function from the start of training to model convergence. Figure 9 summarizes the training loss curves for the APTOS 2019 and EyePACS datasets, showing that the model begins to converge at 10 epochs and remains stable in the following training steps. This visualization further supports the robustness of our GATL approach and is consistent with the above-mentioned visualizations.

7) FEATURE MAP VISUALIZATION FOR INTERMEDIATE STEPS
To visualize the sample images in the intermediate steps, we generate feature maps from the first convolutional layer for an abnormal retinal image. As shown in Figure 11, the feature maps reveal the shape and texture information captured by the convolutional layers. This indicates that the proposed GATL can identify and recognize patterns within the network.

8) CLASS ACTIVATION MAP VISUALISATION
To further evaluate the performance of GATL, we visualized the Class Activation Map (CAM) of the features extracted by GATL, as shown in Figure 10. CAM is a tool that helps researchers [51] visualize CNNs. It can clearly show the image regions that the network is focusing on. DR can present with microaneurysms, hemorrhages, hard and soft exudates, and microvascular abnormalities within the retina, which are usually in the vicinity of blood vessels. The CAM diagram shows that the GATL network's attention is concentrated on the lesion region surrounding the vessel in the DR image, demonstrating that GATL is capable of accurately localizing the vessel location for lesion identification.

E. FURTHER ANALYSIS
In this section we analyze some factors that affect GATL performance.

1) THE IMPACT OF PRE-TRAINING
In order to explore the effect of pre-training on model performance, we compared the pre-trained and non-pre-trained models. As shown in Figure 12, the red line represents the change in classification accuracy over epochs after adding ImageNet pre-training, and the blue line represents the result without pre-training. From Figure 12, it can be observed that the accuracy of the pre-trained GATL is higher than that of the non-pre-trained model at the beginning of training, because the pre-trained network can already extract simple features. After a period of training, the accuracy of the pre-trained model gradually grows from 68% to 90%, while the accuracy of the non-pre-trained model increases from 60% to 83%. This means that pre-trained models converge faster and achieve good results. In subsequent training, the accuracy of both the pre-trained GATL and the non-pre-trained network gradually levels off, but the accuracy of the pre-trained GATL remains higher. This demonstrates that the pre-trained network can extract essential characteristics and, to some extent, demonstrates the superiority of transfer learning itself, implying that applying transfer learning to DR classification is effective. GATL can effectively use the source domain data, transfer the knowledge of the known domain to the new domain, and further improve the performance of the model. In the diagnosis of some clinical diseases, experts need to spend a lot of energy screening the medical results of patients, which makes labeled data precious and difficult to obtain. Our GATL reduces the model's dependence on labeled data to some extent.

2) IMPACT OF THE GRAPH NEURAL NETWORK
We then analyze the influence of the GNN module. We removed the GNN module when performing experiments on the EyePACS dataset, reduced the dimensionality of the extracted features, and made TSNE maps, as shown in Figure 13. It can be observed that after removing the GNN module, the number of ambiguous samples increases considerably. This shows that the graph neural network is of great significance for the extraction of latent features. The GNN module plays an essential role in GATL.

3) THE EFFECTIVENESS OF DISCRIMINATOR D
The discriminator $D$ provides the domain-discrimination ability and is optimized by the loss $L_{dis}$. To evaluate the effectiveness of the discriminator $D$, we remove $L_{dis}$ from the final objective function and obtain accuracies of 91.8% and 89.8% on the APTOS 2019 and EyePACS datasets (Table 2 and Table 3), respectively. That is, the discriminator improves the accuracy by 2.5% and 2.9% on the two datasets, which verifies the necessity of the discriminator $D$.

4) THE IMPACT OF MULTIPLE CLASSIFICATIONS
To explore whether the proposed GATL generalizes to a multi-class task, we conducted experiments on DR with five classes and collated the results in Table 4 for comparison with other methods. It can be observed that GATL also achieved good results on the multi-class task, reaching 83.5%, 69.2%, and 97.8% in terms of accuracy, specificity, and sensitivity, respectively. Of these, GATL achieved the best results in terms of accuracy and specificity, suggesting that GATL generalizes well and has a low rate of misdiagnosis.
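The three metrics reported above can be computed from prediction counts. The following is a minimal sketch, not the authors' evaluation code. How sensitivity and specificity are derived in the five-class setting is not stated in this section, so the sketch assumes one common convention: grade 0 is treated as "no DR" and grades 1 to 4 as "DR". The function names `binary_metrics` and `collapse_grades` are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return accuracy, sensitivity, specificity


def collapse_grades(grades):
    """Assumed reduction (not from the paper): grade 0 -> 0, grades 1-4 -> 1."""
    return [0 if g == 0 else 1 for g in grades]


# Toy labels, for illustration only; these are not results from the paper.
true_grades = [0, 0, 0, 1, 2, 3, 4, 0]
pred_grades = [0, 1, 0, 1, 2, 0, 4, 0]
acc, sens, spec = binary_metrics(collapse_grades(true_grades),
                                 collapse_grades(pred_grades))
```

Under this convention, a low specificity means that many healthy eyes are flagged as diseased, while a low sensitivity means that diseased eyes are missed.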

5) THE IMPACT OF K
In our model, k is an important adjustable parameter of the relationships initializing module and has a large impact on accuracy. To explore the effect of k on the model, we ran several experiments with different k values for binary classification on the APTOS dataset. Limited by the memory of the graphics card, we only varied k empirically in the range [2, 7]; the results are shown in Figure 14. It can be observed that the accuracy of GATL reaches its highest value when k is 4, so we set k = 4 in the experiments.
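To illustrate the role of k, the sketch below builds a k-nearest-neighbor graph over feature vectors. It assumes that the relationships initializing module connects each sample to its k most similar samples under the Euclidean distance; this section does not spell out the construction, so the sketch should be read as an illustration of the general idea rather than the authors' implementation. The name `knn_graph` is hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(features, k):
    """Map each sample index to the indices of its k nearest neighbors."""
    n = len(features)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")
    graph = {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Sort by distance; ties are broken by index to keep results stable.
        others.sort(key=lambda j: (euclidean(features[i], features[j]), j))
        graph[i] = others[:k]
    return graph


# Toy 2-D features, for illustration only.
features = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 10]]
graph = knn_graph(features, k=2)
num_edges = sum(len(neighbors) for neighbors in graph.values())
```

In such a construction the number of directed edges is n × k, so the graph grows linearly with k, which is consistent with larger k values requiring more memory.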

6) THE IMPACT OF λ
To analyze the effect of the balance coefficients on the experiment, we adjusted them and recorded the accuracy of GATL with different parameter values in each training step, as shown in Tables 5, 6, and 7. As can be observed from Table 5, the model achieves its best results in Step 1 when λ = 0.4. From Table 6, it can be observed that in Step 2, to maximize the discrepancy between classifiers F1 and F2 while ensuring accuracy, the model achieves the highest accuracy and sensitivity when λ1 = 0.4, λ2 = 0.4, and λ3 = 0.2. In Step 3, as seen in Table 7, GATL achieves the best results when λ = 0.4.

7) THE IMPACT OF SIMILARITY METRICS
Many studies have found that, for some data, the Euclidean distance does not reflect the true distance between features very well [56], [57], [58]. To analyze which similarity measure is more suitable for GATL on DR data, we experimented with three similarity metrics on the APTOS dataset and recorded the variation in model performance in Figure 15. It can be observed that choosing between cosine and Euclidean distances has little effect on model accuracy, but the sensitivity is relatively low when GATL uses the cosine distance to calculate feature similarity. This suggests that, for lesion characteristics in DR images, numerical (magnitude) differences between features are more informative than dimensional (directional) differences. In addition, we experimented with the Minkowski distance, but its parameter p was difficult to tune. In the end, we empirically set p to 1.5; the results are also shown in Figure 15. The experimental findings indicate that the Euclidean distance is better suited for measuring how related DR image features are.
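The three metrics compared above can be written down directly from their standard definitions. The sketch below is illustrative only and is not taken from the GATL code. It shows why the cosine distance can miss magnitude differences: two feature vectors that point in the same direction but differ in scale have a cosine distance of approximately zero, while their Euclidean and Minkowski distances are clearly nonzero.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p (p = 1: Manhattan, p = 2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Euclidean distance, the special case p = 2."""
    return minkowski(a, b, 2)


def cosine_distance(a, b):
    """One minus the cosine similarity; depends on direction, not magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Toy vectors: b is a scaled copy of a (same direction, twice the magnitude).
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]
d_cos = cosine_distance(a, b)   # approximately 0
d_euc = euclidean(a, b)         # sqrt(14)
d_min = minkowski(a, b, 1.5)    # p = 1.5, the value used in the experiment
```

For a fixed pair of vectors, the Minkowski distance does not increase as p grows, so the p = 1.5 value lies between the Manhattan (p = 1) and Euclidean (p = 2) distances.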

V. CONCLUSION AND DISCUSSION
This study proposes a novel graph adversarial transfer learning method that performs DR classification within a domain adaptation framework. First, to exploit additional information for DR classification of unlabeled data, we provide a graph-based domain adaptation strategy that explores additional fundus image characteristics. Second, we create two adversarial classifiers for training the model, which improves the model's classification performance and strengthens its robustness. Finally, we design a discriminator and confuse its judgment of source and target domain data so that the model is aligned between domains in addition to the intra-domain alignment. Specifically, GATL combines the characteristics of convolutional neural networks and graph neural networks to extract the pixel features and latent features of fundus images. Under the guidance of adversarial training, the goal is to align features both within and between domains. Extensive results on two public datasets show that our GATL method outperforms several baseline DR classification methods.
GATL is compatible with a wide range of data and makes more efficient use of existing data while maintaining high accuracy in DR classification. However, a limitation of GATL is that it cannot account for the fine-grained classification of lesions. Because DR lesions are small and similar to one another, GATL can determine with high accuracy whether a lesion is present, but further analysis is required to determine the degree of illness. We hope to overcome this problem in future work and contribute to the fine-grained classification of DR.