Research on Medical Image Classification Based on Machine Learning

I. INTRODUCTION
With the increasing demand for faster and more accurate treatment, medical imaging plays an increasingly important role in the early detection, diagnosis and treatment of diseases. Thanks to advances in physics, electronic engineering, and computer science and technology, the resolution of medical images keeps rising and the range of imaging modalities keeps growing. At the same time, the number of medical images is increasing rapidly. At present, X-ray imaging, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET/CT), and ultrasound imaging are widely used in clinical practice.
The key to accurate diagnosis and treatment is the accurate interpretation of medical images. However, image interpretation depends heavily on the subjective judgment of doctors, so the readings of doctors at different levels can deviate greatly from one another. In recent years, with the emergence of a large number of labeled natural image data sets and the breakthrough of deep learning in computer vision, image classification, target detection and image segmentation have improved significantly. Much research has been carried out on early disease detection and diagnosis based on supervised learning. Ciresan [1] applied deep neural networks to medical image analysis, which has played an important role in skin cancer classification [2], breast cancer diagnosis [3], and brain tumor segmentation [4]. Hinton improved the deep convolutional neural network and applied it to medical image analysis [5], [6]. The experimental results of Hafemann et al. [7] show that the feature extraction ability of convolutional neural networks is better than that of traditional texture descriptors, and that the accuracy of image recognition is improved by max pooling, multiplication, summation and other operations. Because the convolutional neural network can only segment the general contour of an object and handles edges poorly, Jonathan proposed the fully convolutional network on the basis of the convolutional neural network, which extends classification from the image level to the pixel level [8]. Olaf improved the fully convolutional network, proposed the U-Net network, and applied it to the segmentation of neuronal structures in microscope images [9]. Rania used U-Net for breast ultrasound image segmentation and removed noisy regions from the segmentation results [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Wei Wei.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In addition to medical image classification, deep learning has also made significant progress in medical image detection [11]-[13], image segmentation [14], [15], image registration [16], [17], image fusion [18], image retrieval [19], [20], image annotation [21], image generation and enhancement [22], [23], and computer-aided diagnosis and prognosis [24]-[26]. Deep learning has made this progress in medical imaging partly because of its superior performance in hierarchical feature extraction and feature learning, and partly because of the large number of labeled data sets. In practical application, however, medical image annotation is strongly influenced by medical professional knowledge, medical industry standards, the medical system, and similar factors. The scale of the available labeled data is far below what is needed, and annotating medical images requires huge human and material resources. Therefore, the classification and detection of medical images with few labeled samples has attracted wide attention from scholars.
To improve the accuracy of image detection, saliency detection is widely used in target detection, scene classification, fixation prediction and other visual tasks. For saliency detection, Achanta proposed the local contrast method [27], Hou and Zhang [28] proposed the spectral residual method, and Harel et al. [29] proposed graph-based saliency detection. With the development of new features and deep learning, especially the emergence of convolutional neural networks, recurrent neural networks and other deep neural network models, the limitations of manual feature extraction have been overcome and features can be learned from the image automatically. However, convolutional neural networks, recurrent neural networks and other deep neural networks need a large number of training samples. Because labeling samples requires medical knowledge and clinical experience, labeled medical images are relatively scarce, which makes it difficult to meet the needs of deep learning.
As a new machine learning model, the adversarial neural network effectively combines a generative model and a discriminative model, optimizes the training process through the dynamic game between them, and finally brings the whole network to the optimal state of Nash equilibrium. Based on this, this paper proposes a semi-supervised image classification method based on the generative adversarial network (GAN). When labeled samples are limited, a large amount of unlabeled sample data is introduced to improve the accuracy of image classification. By combining labeled and unlabeled sample data effectively, feature extraction and classification of medical images are realized both when labeled samples are sufficient and when they are limited.

II. BASIC PRINCIPLE OF THE GENERATIVE ADVERSARIAL NETWORK
GAN has been a hot research field in machine learning in recent years. It has been widely used in image super-resolution, image segmentation, text-to-image generation, image restoration and other fields [8], [30]-[32]. GAN is an unsupervised generative model [33], proposed by Goodfellow in 2014 and based on the two-person zero-sum game in game theory. Its process is shown in Figure 1. The model consists of two parts: a generator G and a discriminator D. The generator learns to reconstruct the distribution of the sample data, and the discriminator estimates the probability that a sample comes from the real data. Specifically, the generator G maps random noise z drawn from a specific distribution $P_z$ to the target domain, learns the probability distribution $P_{data}$ of the real samples, and generates samples G(z) that follow the real data distribution $P_{data}(x)$ as closely as possible. The discriminator D judges whether an input sample comes from the real data x or from the generated data G(z), and outputs the probability that the sample belongs to the real data. The essence of GAN training is to make the discriminator recognize the samples produced by the generator as reliably as possible, and to make the samples produced by the generator deceive the discriminator as much as possible. The two networks are optimized alternately during training, forming a competitive confrontation until both sides reach a dynamic balance.
Feed-forward propagation in a GAN passes the generated samples from the G network to the D network. In back propagation, the G network receives gradient feedback from the D network and is updated accordingly, which completes the information flow between the two networks. In training, in order for the generator to learn the probability distribution $P_{data}$ of the real samples, a noise variable z drawn from the prior $P_z$ is taken as input and mapped to the data space as G(z). The discriminator output D(x) is the probability that the sample x comes from the real data. When the input is a real sample, the goal of the discriminator D is to make the output probability D(x) close to 1; when the input is a fake sample generated from noise, the goal of D is to make D(G(z)) close to 0, while the goal of the generator G is to make D(G(z)) as close to 1 as possible. Therefore, the training of the generative adversarial network is a two-player minimax game [33], and the objective function is:

$$\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))]$$

H. Tang, Z. Hu: Research on Medical Image Classification Based on Machine Learning
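As a small numerical illustration of the minimax objective, the sketch below estimates the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))] of [33] from a handful of discriminator outputs. The function name `gan_value` and the probability values are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
import math

def gan_value(d_real, d_fake):
    """Empirical estimate of V(D, G) from discriminator outputs.

    d_real: values of D(x) on real samples.
    d_fake: values of D(G(z)) on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident discriminator: real samples score near 1, generated ones near 0.
v_confident = gan_value([0.9, 0.95, 0.85], [0.1, 0.05, 0.2])

# A fully fooled discriminator outputs 0.5 everywhere, so V = -log 4.
v_equilibrium = gan_value([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
```

The discriminator tries to push V upward (here `v_confident` is larger than `v_equilibrium`), while the generator tries to pull it down toward the equilibrium value of -log 4.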

III. MEDICAL IMAGE CLASSIFICATION BASED ON GENERATIVE ADVERSARIAL NETWORK

A. DATA AND PREPROCESSING
The images used in this study come from the data center of a hospital. The database includes 12000 CT images of the brain, chest and cervical spine. To improve the generalization performance of the model and eliminate class imbalance, the images were augmented and normalized. Taking brain CT as an example, the brain CT data set includes 5 common brain diseases with high incidence. Considering practical application, normal brain images without brain disease and images of brain diseases other than these 5 are also added, and the data are organized into 8 types. Chest CT and cervical CT are treated in the same way. To ensure a uniform input size, all images are resized to 128 × 128 pixels. Finally, the selected data set is divided into a training set, a validation set and a test set in the ratio 7:1:2. The training data are divided into labeled and unlabeled samples, with far fewer labeled samples than unlabeled ones. After training, the network is used to classify new medical images as normal or diseased, and the classification results are tallied.
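The 7:1:2 division can be sketched as follows. This is a minimal illustration that assumes a simple random shuffle over sample indices; the function name `split_dataset` and the fixed seed are hypothetical, and the paper does not state how the split was actually implemented.

```python
import random

def split_dataset(items, ratios=(7, 1, 2), seed=0):
    """Shuffle items and split them into train/validation/test by ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# 12000 image indices, matching the size of the database described above.
train, val, test = split_dataset(range(12000))
```

With 12000 images this gives 8400 training, 1200 validation and 2400 test samples.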

B. NETWORK STRUCTURE
Building on the GAN model described above, and to address the scarcity of labeled samples in medical image classification, this paper uses a GAN to classify medical images. As a widely used branch of deep neural networks, the convolutional neural network has strong processing and representation ability for complex data; it can effectively extract robust and invariant features from large data sets, which benefits subsequent image classification. Therefore, this paper combines the convolutional neural network with the generative adversarial network in the form of the Deep Convolutional Generative Adversarial Network (DCGAN) [34]. DCGAN replaces the generator and the discriminator of the GAN with convolutional neural networks. To improve the feature extraction ability of the network and the stability of model training, the generator G and the discriminator D are adjusted as follows:
(1) Remove all pooling layers. The generator G uses transposed convolutional layers for upsampling, and the discriminator D uses strided convolutions in place of pooling.
(2) Use batch normalization in both the discriminator D and the generator G.
(3) Remove the fully connected (FC) layers so that the network becomes fully convolutional.
(4) Use ReLU as the activation function in the generator, with tanh in the last layer.
(5) Use the LeakyReLU activation function in the discriminator.
The network structure of the generator G in DCGAN is shown in Figure 2.
FIGURE 2. Generator network structure [34].
The discriminator network is mainly completed by the fully connected layer, which takes the output of the feature extraction network as input and applies multiple nonlinear mappings to the features. Multiple convolutional layers further refine the preceding features, and finally the global average pooling layer gives the image classification result. At this stage, the classification model should reach as high a classification accuracy as possible, so that the features extracted by the convolutional layers are more discriminative; the output features of multiple network layers are then integrated to mark suspicious areas in the image.

C. NETWORK MODEL TRAINING
Training of the classification network comprises two parts: training of the generative adversarial network and training of the convolutional neural network. The inputs for adversarial training are real brain CT pathology images and uniformly sampled noise. First, the 100-dimensional noise samples are mapped through a fully connected layer to obtain 1024 feature maps of size 4 × 4. The feature maps are then upsampled by the upsampling layers with stride 2, and an activation layer applies a nonlinear transformation to the output. The resulting image is 64 × 64 × 3. The generated images are labeled 0 and the original input images are labeled 1, and these labeled data are used to supervise the training of the discriminator. Training uses stochastic gradient descent.
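The path from 4 × 4 feature maps to a 64 × 64 × 3 image can be checked with the usual output-size formula for a transposed convolution, out = (in - 1) · stride - 2 · padding + kernel. In the sketch below, only the 1024 starting channels, the stride of 2 and the 64 × 64 × 3 output come from the text; the kernel size of 4, the padding of 1 and the intermediate channel counts (512, 256, 128) are assumptions.

```python
def transposed_conv_out(size, kernel=4, stride=2, padding=1):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_shapes(start=4, channels=(1024, 512, 256, 128, 3)):
    """List (height, width, channels) after each upsampling stage."""
    shapes = [(start, start, channels[0])]
    size = start
    for ch in channels[1:]:
        size = transposed_conv_out(size)  # each stage doubles the resolution
        shapes.append((size, size, ch))
    return shapes

shapes = generator_shapes()
```

Under these assumptions, four stride-2 stages take the resolution through 4, 8, 16 and 32 to 64, ending at 64 × 64 × 3.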
In actual training, because an analytical solution cannot be obtained directly, an iterative numerical method is used to train the model. Because a limited data set may lead to overfitting of the discriminator, training alternates between k update steps for the discriminator D and one update step for the generator G; the choice of k depends on the complexity of D and the amount of data. In our experiments, k is set to 4. As long as G changes slowly enough, D remains near its optimal solution after each round of training. Finally, the parameters of the GAN model are obtained through multiple training rounds.
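The alternating schedule can be sketched as a generator of update steps. This only illustrates the ordering of updates with k = 4; the function name `training_schedule` is hypothetical, and the actual parameter updates of D and G are omitted.

```python
def training_schedule(rounds, k=4):
    """Yield (network, round_index) pairs: k discriminator steps, then 1 generator step."""
    for r in range(rounds):
        for _ in range(k):
            yield ("D", r)  # update the discriminator k times
        yield ("G", r)      # then update the generator once

steps = list(training_schedule(rounds=2, k=4))
order = "".join(name for name, _ in steps)
```

For two rounds the update order is four D steps, one G step, then the same again.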
Noise z following a Gaussian distribution is fed into the generator network G to produce generated images G(z) that match the real data as closely as possible. The generated image samples G(z) and the database image samples are input into the discriminator network D together; the database samples include a small number of labeled samples and a large number of unlabeled samples. The discriminator D is composed of multiple convolutional layers and fully connected layers. Through continued alternating training, the generator G gradually fits the distribution of the database image samples and generates realistic image samples, while the classification and discrimination performance of the discriminator D on its input improves steadily.
The training stages above yield a stable feature extraction network. With this feature extraction network held fixed, the classification stage is then trained separately. The input image size is 64 × 64 × 3, and the feature size after the feature extraction network is 3 × 3 × 64. Finally, the feature is mapped to 64 dimensions through the global pooling layer, and the convolutional neural network and the global average pooling layer complete the final classification task.
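The reduction of a 3 × 3 × 64 feature map to a 64-dimensional vector can be illustrated with global average pooling, which averages each channel over all spatial positions. The sketch assumes average pooling and uses a toy feature map; the function name `global_average_pool` is hypothetical.

```python
def global_average_pool(feature_map):
    """Average a feature_map[h][w][c] over height and width, per channel."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    pooled = [0.0] * channels
    for row in feature_map:
        for cell in row:
            for c, value in enumerate(cell):
                pooled[c] += value
    return [total / (height * width) for total in pooled]

# Toy 3 x 3 x 64 feature map: channel c holds the constant value c everywhere.
feature_map = [[[float(c) for c in range(64)] for _ in range(3)] for _ in range(3)]
vector = global_average_pool(feature_map)
```

Each of the 64 channels contributes one number, so the output length equals the channel count regardless of the spatial size.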

IV. TEST RESULTS AND ANALYSIS

A. TRADITIONAL CLASSIFICATION METHOD
To compare the accuracy of traditional and deep learning methods for image classification, the support vector machine (SVM) is adopted as the traditional method and the generative adversarial network (GAN) as the deep learning method, and the classification results of SVM and GAN are then compared. Traditional pattern recognition for classification is divided into two parts: feature extraction and model selection. How well the selected features express the image directly affects the final classification result. To ensure classification accuracy, we select the scale-invariant feature transform (SIFT) feature as the descriptor and the support vector machine as the statistical classifier.
In image classification, the most important step is to choose a suitable way to describe the characteristics of the image. Through feature engineering, the image is represented as a series of vectors. If the feature vector of an image has a pattern similar to the feature vector of a category, the image can be assigned to that category; otherwise it cannot. Therefore, the ability of the feature vector to express the image directly affects the performance of the final classification system. Good image features should satisfy three requirements: uniqueness of the feature description, robustness to interference, and efficient matching. Good uniqueness ensures that the description captures small changes in image structure sensitively: when the structure of the image data changes, even within a small local area, the feature description reflects the change. Good robustness lets the descriptor recognize images of the target category accurately when the image data change in scale, rotation or noise. Efficient matching places requirements on the computational and time complexity of the algorithm: if the dimension of the descriptor's feature vector is too high, similarity matching between feature vectors becomes expensive. Because SIFT features perform well in all three respects, this paper uses SIFT features to describe the image.
The SIFT feature, also known as the scale-invariant feature transform, is an algorithm used to detect and describe local features of images. It achieves rotation invariance and scale invariance by finding extreme points in scale space and using them as feature points. SIFT was proposed by Lowe [35], and is widely used in object recognition, robot map perception and navigation, 3D model building, gesture recognition, image tracking and action comparison. The basic idea of SIFT is to introduce a scale parameter into the image processing model. By finding key points in different scale spaces and computing their size, orientation and scale information, descriptors of the feature points are built from this information, which enables edge and corner detection and feature extraction at different resolutions. SIFT feature detection mainly includes the following four basic steps:
1) Construction of the difference-of-Gaussians (DoG) scale space
2) Key point search and localization
3) Orientation assignment
4) Key point description
SIFT matching is the process of matching key points (feature points). The process is shown in Figure 3.
Medical image classification is essentially a supervised learning problem in data mining. First, the image data with known categories are divided into a training set and a test set. The parameters of the statistical classifier are adjusted over multiple iterations to achieve good classification on the training set, and the classification accuracy of the classifier is then tested on the test set. The support vector machine (SVM) is a typical statistical classifier. It can handle linearly separable samples, and it can also map features that are not linearly separable in a low-dimensional space to a high-dimensional space through a kernel function, so that they become separable in the higher dimension.
SVM is widely used in various supervised learning scenarios because of its powerful classification ability [36]-[38]. Therefore, SVM is used as the classifier of the traditional classification method. The support vector machine is a binary classification model. Its purpose is to find a hyperplane that separates the samples. The separation principle is to maximize the margin, which is finally transformed into a convex quadratic programming problem. In the linearly separable case, we assume that the optimal hyperplane is $w^{T}x + b = 0$. We use $|w^{T}x + b|$ to represent the relative distance from a sample point x to the hyperplane, and compare the signs of $w^{T}x + b$ and the label y to determine whether the sample is correctly classified.
If the sign of $w^{T}x + b$ is consistent with the sign of the label y, the sample x is classified correctly; otherwise it is classified wrongly. Therefore, when a sample is classified correctly, $y(w^{T}x + b) > 0$, and when it is not, $y(w^{T}x + b) < 0$. The functional margin can be defined on this basis as follows:

$$\hat{\gamma}_i = y_i (w^{T} x_i + b), \qquad \hat{\gamma} = \min_{i = 1, \dots, n} \hat{\gamma}_i \tag{1}$$

Here x represents the feature vector of a training sample, y represents its label, $(x_i, y_i)$ represents the feature vector and label of the i-th training sample, and n is the number of training samples. The functional margin of the training set is therefore the minimum of the functional margins of all training samples with respect to the classification hyperplane. The optimal hyperplane for the linearly separable case is shown in Figure 4. The farther the classification hyperplane is from the data, the higher the confidence of the resulting classification, so the distance between the hyperplane and the samples of each class should be maximized. The functional margin grows when w and b are scaled up proportionally, whereas the geometric margin, which divides the functional margin by $\|w\|$, removes this influence and depends only on the hyperplane itself. Therefore, the geometric margin rather than the functional margin should be maximized, and the optimal hyperplane is obtained by maximizing the geometric margin. The hyperplane represented by the solid line that separates the two classes of samples is the optimal hyperplane. The distance between the two dashed lines parallel to it is the maximum geometric margin $\max \gamma$. The sample points on the dashed lines are the support vectors, which satisfy $y_i(w^{T}x_i + b) = 1$; all other correctly classified sample points satisfy $y_i(w^{T}x_i + b) > 1$.
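The difference between the functional and the geometric margin can be checked numerically. The sketch below uses a hypothetical two-dimensional toy data set and a hand-picked hyperplane; scaling w and b by the same factor multiplies the functional margin by that factor but leaves the geometric margin unchanged.

```python
import math

def functional_margin(w, b, samples):
    """Smallest y * (w . x + b) over all (x, y) samples."""
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) for x, y in samples)

def geometric_margin(w, b, samples):
    """Functional margin divided by the Euclidean norm of w."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return functional_margin(w, b, samples) / norm

# Toy linearly separable data: (feature vector, label in {+1, -1}).
samples = [((2.0, 2.0), 1), ((3.0, 1.0), 1), ((0.0, 0.0), -1), ((-1.0, 1.0), -1)]
w, b = (1.0, 1.0), -2.0

# The same hyperplane written with w and b scaled by 10.
w_scaled, b_scaled = (10.0, 10.0), -20.0

fm, fm_scaled = functional_margin(w, b, samples), functional_margin(w_scaled, b_scaled, samples)
gm, gm_scaled = geometric_margin(w, b, samples), geometric_margin(w_scaled, b_scaled, samples)
```

Here the functional margin changes from 2 to 20 under scaling, while the geometric margin stays at the square root of 2, which is why the geometric margin is the quantity to maximize.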
For linearly non-separable problems, the data must be mapped to a high-dimensional space so that the problem becomes linearly separable. The mathematical model of SVM is as follows:

$$\min_{w, b} \ \frac{1}{2}\|w\|^{2} \quad \text{s.t.} \quad y_i (w^{T}\phi(x_i) + b) \ge 1, \quad i = 1, \dots, n$$

where $\phi(x_i)$ represents the sample feature after $x_i$ is mapped. The Lagrange multiplier method can be used to obtain the dual of this problem and solve the model. The formula is as follows:

$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \phi(x_i)^{T}\phi(x_j) \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad \alpha_i \ge 0$$

where $\alpha_i$ is the Lagrange multiplier. In the above formula, we need to calculate the inner product $\phi(x_i)^{T}\phi(x_j)$ after the samples are mapped to the high-dimensional space. Because of the high dimension, computing it directly is expensive, so a kernel function is introduced:

$$\kappa(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$$

That is, by substituting the low-dimensional features directly into the kernel function, the inner product of the mapped features is obtained without computing the high-dimensional mapping explicitly. Different kernel functions correspond to different feature mappings. Common kernel functions include the linear kernel, polynomial kernel, Gaussian kernel, sigmoid kernel and Laplace kernel. Generally, the linear kernel is used for text data, and the Gaussian kernel is used in other cases.
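The idea that a kernel returns the inner product of mapped features without performing the mapping can be verified on a small case. The sketch below uses the homogeneous degree-2 polynomial kernel on two-dimensional inputs, for which an explicit feature map is known; the function names and the `gamma` default of the Gaussian kernel are illustrative assumptions.

```python
import math

def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: (x . z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2

def explicit_map(x):
    """Feature map for 2-D inputs whose inner product equals poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

x, z = (1.0, 2.0), (3.0, -1.0)
via_kernel = poly_kernel(x, z)  # computed in the original 2-D space
via_mapping = sum(a * b for a, b in zip(explicit_map(x), explicit_map(z)))  # computed in 3-D
```

Both routes give the same value, but the kernel route never builds the mapped vectors. For the Gaussian kernel the corresponding feature space is infinite-dimensional, so the kernel form is the only practical route.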
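In practical application, the data contain some noise points, so that not all samples can be separated correctly. The concept of the soft margin is therefore introduced, and the mathematical model is modified as follows:

$$\min_{w, b, \xi} \ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.} \quad y_i (w^{T}\phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$$

where $\xi_i$ is the slack variable, indicating the degree to which sample i violates the margin, and $C > 0$ is the penalty coefficient of the standard soft-margin formulation.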

B. RESULTS AND ANALYSIS
In this paper, neural networks and support vector machines are used to classify medical images. In addition, this paper explores combinations of automatic feature extraction by neural networks with traditional classification methods, such as the CNN-SVM and RBF-SVM algorithms, which save the work of manual feature engineering. The image classification results of GAN, the convolutional neural network, RBF-SVM, PCA-CNN, CNN-SVM and SVM are then compared. The comparison of the different classification methods is shown in Figure 5.
The convolutional neural network (CNN) is one of the most representative neural networks in deep learning. One of its advantages over traditional image processing algorithms is that it avoids complex image preprocessing: the original image can be input directly, which is why CNNs are widely used in image-related applications. The basic idea of the RBF neural network is to use radial basis functions as the basis of the hidden units to form the hidden layer space, so that the input vector can be mapped directly to the hidden space without weighted connections. Principal component analysis (PCA) is one of the most widely used dimensionality reduction algorithms. The main idea of PCA is to map n-dimensional features to k new orthogonal features, known as the principal components, which are reconstructed from the original n-dimensional features. By comparing the classification accuracy of DCGAN, CNN, RBF-SVM, PCA-CNN, CNN-SVM and SVM in different situations, this paper analyzes the advantages and disadvantages of DCGAN relative to the other algorithms.
Generally, the generalization performance of a machine learning model is tested through experiments. Specifically, the samples are divided into a training set and a test set. The model is trained on the training set, and its ability to discriminate new samples in the test set is used as an approximation of its generalization performance. At present, there are three ways to divide the training set and the test set: hold-out, cross-validation and bootstrapping.
The hold-out method divides the data set directly into a mutually exclusive training set and test set. About 2/3 to 4/5 of the data is used for training, and the remaining data is used for testing. Although the proportion is roughly fixed, two points should still be noted. First, the proportions of the different categories in the training set and the test set should be consistent, so that the two sets follow the same data distribution. Second, even when this condition is met, many different partitions are possible, and a single partition may bias the result. Therefore, in experiments the data can be partitioned several times, the model trained on each partition, and the results on all test sets averaged as the final estimate of the model's generalization error.
The cross-validation method divides the data set into N mutually exclusive subsets, which should also be stratified so that each subset has the same proportion of the different categories. Each time, N-1 subsets are used for training and the remaining subset is used for testing. By changing which subset is held out, N train/test partitions can be formed and N models can be trained. The final result is the average of the results on these N test sets. Generally, N is taken as 10, which is called 10-fold cross-validation.
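A stratified N-fold partition can be sketched as below. This is a simplified illustration that deals the samples of each class round-robin into the folds without shuffling; the function names and the 70/30 toy labels are hypothetical and do not describe the paper's data.

```python
def stratified_folds(labels, n_folds=10):
    """Return n_folds lists of sample indices with similar class proportions."""
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)  # deal round-robin per class
    return folds

def cross_validation_splits(labels, n_folds=10):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = stratified_folds(labels, n_folds)
    for held_out in range(n_folds):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test

labels = [0] * 70 + [1] * 30  # toy data set with a 70/30 class ratio
splits = list(cross_validation_splits(labels, n_folds=10))
```

Each of the 10 test folds holds 7 samples of class 0 and 3 of class 1, so every fold keeps the 70/30 ratio of the full set.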
The bootstrap method samples the original data with replacement: a sample that has already been drawn may be drawn again in later draws, and the samples that are never drawn are used for testing. Assuming the data set contains M samples and M draws with replacement are made, we obtain a data set of M samples. The probability that a given sample is never drawn is $(1 - 1/M)^{M}$, which tends to $1/e$ as M grows, so about 36.8% of the samples are never sampled.
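The 36.8% figure can be checked both analytically and by simulation. The sketch below is illustrative only; the function names, the sample count of 10000 and the fixed seed are assumptions.

```python
import math
import random

def never_sampled_fraction(m):
    """Probability that a given sample is missed in m draws with replacement."""
    return (1.0 - 1.0 / m) ** m

def bootstrap_split(m, seed=0):
    """Draw m indices with replacement; return the draws and the out-of-bag indices."""
    rng = random.Random(seed)
    drawn = [rng.randrange(m) for _ in range(m)]
    out_of_bag = sorted(set(range(m)) - set(drawn))
    return drawn, out_of_bag

limit = math.exp(-1.0)                      # 1/e, about 0.368
analytic = never_sampled_fraction(10000)    # close to the limit for large m
drawn, out_of_bag = bootstrap_split(10000)
simulated = len(out_of_bag) / 10000
```

The out-of-bag samples play the role of the test set, and their share settles near 1/e as the data set grows.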
Taking all aspects into consideration, this paper uses the hold-out method to divide the training set and the test set. In addition to data set partitioning, class imbalance is often encountered in machine learning. Class imbalance refers to the situation in which the numbers of samples in different categories differ greatly. In general, there are fewer positive samples and more negative samples. In this paper, the contribution of the different classes to the loss function is modified to deal with unbalanced data: the contribution of a positive-example error to the loss is made higher than that of a negative-example error. If the number of positive cases is M and the number of negative cases is N, the ratio of the contributions of positive and negative cases to the loss function is N/M : 1.
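The N/M : 1 weighting can be illustrated with a class-weighted binary cross-entropy. The paper does not specify its loss function, so the choice of cross-entropy here is an assumption, as are the function name `weighted_bce` and the toy labels; the sketch also assumes at least one positive sample.

```python
import math

def weighted_bce(labels, probs, eps=1e-12):
    """Binary cross-entropy with positives weighted N/M and negatives weighted 1."""
    m = sum(1 for y in labels if y == 1)  # number of positive cases, M
    n = len(labels) - m                   # number of negative cases, N
    pos_weight = n / m
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(labels)

labels = [1, 0, 0, 0, 0]                 # M = 1 positive, N = 4 negatives
probs = [0.5, 0.5, 0.5, 0.5, 0.5]        # an uninformative classifier
loss = weighted_bce(labels, probs)
```

With this weighting the single positive sample contributes as much to the loss as the four negative samples together, so the minority class is not drowned out.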
There are many indicators of classification accuracy, among which the most commonly used are the confusion matrix, the overall classification accuracy and the Kappa coefficient. The confusion matrix shows clearly how many samples of each class are classified correctly, as well as which classes they are confused with and how often. However, the confusion matrix does not show the classification accuracy at a glance. Therefore, a variety of accuracy indicators are derived from it, among which the overall classification accuracy (OA) and the Kappa coefficient (Kappa) are the most widely used [39], [40].
Overall accuracy (OA) refers to the ratio of the number of correctly classified samples to the total number of samples. Although the OA value represents the classification accuracy well, it is strongly affected when the classes are extremely unbalanced, and it cannot represent each class well. The Kappa coefficient refers to the proportion of errors reduced by the classification relative to a completely random classification, and its calculation formula is as follows:

$$\kappa = \frac{N\sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}{N^{2} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}$$

where r is the number of classes and
$N$ -- total number of samples.
$x_{i+}$ -- sum of row i of the confusion matrix.
$x_{+i}$ -- sum of column i of the confusion matrix.
$x_{ii}$ -- diagonal elements of the confusion matrix.
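Both indicators can be computed directly from a confusion matrix. The sketch below uses the standard Kappa definition in terms of the diagonal elements and the row and column sums; the 2 × 2 matrix is a hypothetical example, not a result from the paper.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def kappa(cm):
    """Kappa coefficient from diagonal elements and row/column sums."""
    n = sum(sum(row) for row in cm)
    diag = sum(cm[i][i] for i in range(len(cm)))
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(row[j] for row in cm) for j in range(len(cm))]
    chance = sum(r * c for r, c in zip(row_sums, col_sums))
    return (n * diag - chance) / (n * n - chance)

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
cm = [[50, 10],
      [5, 35]]
oa = overall_accuracy(cm)
k = kappa(cm)
```

For this matrix OA is 0.85, while Kappa is lower (3400/4900, about 0.69) because it discounts the agreement expected by chance.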
In this experiment, to compare the classification accuracy and results of the different methods more truly and comprehensively, the overall classification accuracy and the Kappa coefficient are used as evaluation indexes for six different feature extraction and classification methods. To analyze the influence of different models and methods on the classification results more comprehensively, the proposed method is also compared with the convolutional neural network (CNN), the Gaussian kernel support vector machine (RBF-SVM), the convolutional neural network after principal component analysis dimensionality reduction (PCA-CNN), the combination of convolutional neural network and support vector machine (CNN-SVM), and the SVM algorithm [41], [42].
The training and testing samples in this experiment are randomly selected from the whole data set. To reflect the performance of the DCGAN model structure more comprehensively, multiple groups of experiments with different numbers of training samples were carried out. In addition, to eliminate the interference of accidental factors and make the test results more stable, each experiment was repeated five times and the overall average result was taken. Five labeled data set sizes were defined, with 2000, 4000, 6000, 8000 and 10000 labeled samples, respectively. Table 1, Table 2 and Table 3 show the average accuracy of the different algorithms for the classification of brain CT images, chest CT images and cervical CT images, respectively, for the different numbers of labeled samples.
As can be seen from Table 1, for the brain CT image data set, when 10000 training samples are selected, the OA of DCGAN is 99.57%, and its classification accuracy is higher than that of CNN, PCA-CNN and SVM. In addition, the proposed method is superior to the traditional RBF-SVM and CNN-SVM in OA and Kappa by 13.81% and 11.22%, respectively. When the number of selected samples is reduced to one-fifth of the original number, that is, when 2000 training samples are selected, the classification accuracy of the convolutional neural network drops because of the lack of training samples.
In this case, GAN shows better performance. It can be seen from the table that with fewer samples, the method proposed in this paper achieves better classification results than the other classification methods. Therefore, GAN can reduce the dependence of the neural network on a large number of training samples to a certain extent. In other words, GAN can effectively reduce the overfitting of the neural network. Table 2 describes the classification results of the different methods on chest CT images. As with the brain CT images, 2000, 4000, 6000, 8000 and 10000 samples were randomly selected as the training sets of five experiments. The results show that when only 2000 training samples are selected, the average classification accuracy of GAN is 63.43%, which is 1.9 percentage points higher than the OA of CNN (61.53%) and also higher than that of the other classification methods. When the training samples number 4000, 6000, 8000 and 10000, the classification accuracy of GAN does not differ much from that of the other methods. This again indicates that GAN plays a greater role when training samples are limited.
From Table 3, it can be seen that DCGAN can match the classification performance of CNN with less annotated data. For example, GAN reaches 74.10% with only 2000 samples, whereas CNN needs 4000 samples to reach a comparable classification accuracy. When few labels are available, GAN improves the classification performance markedly; as the number of labeled samples increases, this improvement gradually diminishes, although GAN remains better than CNN and the other traditional classification methods in general.
Figure 6 shows how the overall classification accuracy and the Kappa coefficient change with the number of samples. As can be seen from the figure, both the classification accuracy and the Kappa coefficient increase with the number of samples, and they follow the same growth trend. When the number of training samples is increased to 10000, the classification accuracy is improved by 30% and 35.1% compared with 2000 training samples. On the data set of chest CT images, the classification accuracy is improved by 30.63% and 29.11%, respectively. On CT images of the cervical spine, the classification accuracy is improved by 18.93% and 24.32%, respectively. These results show that, in the deep learning model, the number of labeled training samples has a great influence on the optimization of the network parameters. As the number of labeled samples increases, the classification performance of the network improves further.
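For readers who want to reproduce the two metrics reported in the tables, the sketch below computes the overall accuracy (OA) and the Kappa coefficient from a confusion matrix. It assumes the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the OA) and p_e is the agreement expected by chance; the function name and the toy confusion matrix are illustrative and are not taken from the experiments.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (OA, kappa) for a square confusion matrix given as a list of rows.

    confusion[i][j] counts samples of true class i predicted as class j.
    Assumes the chance agreement is below 1, otherwise kappa is undefined.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal (this is the OA).
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Toy two-class example: 85 of 100 samples lie on the diagonal.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(f"OA = {oa:.2f}, kappa = {kappa:.2f}")  # OA = 0.85, kappa = 0.70
```

Because kappa subtracts the chance agreement, it is lower than the OA whenever the classifier is imperfect, which is consistent with the two curves following the same trend at different levels.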
The medical image classification method based on deep learning first feeds the image data into the neural network, minimizes the loss function by means of forward propagation and error back propagation, and obtains the model with the best classification performance by repeatedly updating the weights. This model is then applied to medical image recognition and analysis; the specific process is shown in Figure 7. Traditional classification methods based on statistical machine learning optimize feature extraction and the classifier separately, whereas image classification methods based on deep learning learn the feature representation automatically from the training data, which removes the separate feature extraction step. The strength of deep learning therefore lies in automatic feature learning, but it suffers when the training set is not large enough. Through its dynamic game, the generative adversarial network can automatically generate images, which compensates for the shortage of training samples that otherwise leads to low accuracy.
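The forward propagation, back propagation and weight update cycle described above can be illustrated with a deliberately small stand-in. The sketch below trains a single-layer logistic classifier by gradient descent on toy one-dimensional data; it is not the network used in this paper, and the function name, learning rate, epoch count and data are assumptions chosen only to keep the example self-contained.

```python
import math

def train_logistic(samples, labels, lr=0.5, epochs=200):
    """Fit a single-layer logistic classifier with batch gradient descent.

    Returns the learned weights, the bias and the mean loss recorded at each epoch.
    """
    n = len(samples)
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    losses = []
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            # Forward propagation: linear score followed by a sigmoid.
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            # Cross-entropy loss (small epsilon guards against log(0)).
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
            # Back propagation: for sigmoid plus cross-entropy, dLoss/dz = p - y.
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Weight update with the gradient averaged over the batch.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        losses.append(loss / n)
    return weights, bias, losses

# Toy data: inputs above 1.5 belong to class 1.
weights, bias, losses = train_logistic([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
print(f"first loss = {losses[0]:.3f}, last loss = {losses[-1]:.3f}")
```

A convolutional network follows the same loop, with the single linear layer replaced by stacked convolution and pooling layers and the gradients obtained by applying the chain rule through each layer.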
To compare the classification performance of the different methods more intuitively, Figure 8 shows the overall classification accuracy obtained by the different classification methods on the three medical image data sets of brain, chest and cervical spine. The overall classification accuracy and kappa coefficient are averaged over the different sample numbers. It can be seen from the figure that the deep convolutional generative adversarial network (DCGAN) performs slightly better than the convolutional neural network (CNN), the Gaussian-kernel support vector machine (RBF-SVM), the convolutional neural network with principal component analysis dimensionality reduction (PCA-CNN), the combined convolutional neural network and support vector machine method (CNN-SVM), and the SVM algorithm. DCGAN uses the dynamic game idea of the adversarial network to extract features from the samples, reduces the dependence of deep learning on labeled samples, and keeps the model robust when samples are few. In the experiment, by learning the distribution of the real data, the generator produces images similar to the real data. Comparing the real data with the generated images shows that the generated samples have clear outlines and a reasonable structure.

V. CONCLUSION
To address the weak feature extraction ability of traditional image classification methods and the lack of training samples in practical applications, this paper proposes a semi-supervised image classification method based on the generative adversarial network. When labeled samples are limited, the method effectively combines labeled and unlabeled sample data, improves the decision plane obtained from few labeled samples, and alleviates the problem of the scarcity of labeled samples. Brain, chest and cervical spine CT images were classified using the deep convolutional generative adversarial network, and the overall classification accuracy and Kappa coefficient of the different classification methods were analyzed for different sample numbers. The results show that the overall classification accuracy and Kappa coefficient increase with the number of training samples. In particular, with a small number of training samples, the classification accuracy of the adversarial network is about 10% higher than that of other deep neural networks and traditional classification methods, and its advantage is more obvious. Comparative experiments verify the rationality and validity of the classification model, which provides a new idea for medical image classification.