A novel diabetic retinopathy detection approach based on deep symmetric convolutional neural network

Diabetic retinopathy (DR) is one of the most severe eye diseases and may lead to blindness in diabetic patients. Therefore, using automatic technology to detect DR at an early stage has vital clinical significance. To detect the microaneurysms (MAs) and hard exudates (HEs) of DR, this paper proposes a novel detection method based on a deep symmetric convolutional neural network. The symmetric convolutional structure is used to improve the effectiveness of feature extraction. By increasing the width and depth of the network, the proposed method can also overcome the imbalance between positive and negative samples and avoid overfitting. Furthermore, different network structures (convolution, pooling) are used to achieve different feature filtering in the feature extraction stage. According to the experimental results, the proposed method is superior to state-of-the-art approaches on the public dataset DIARETDB1 (DB1). The detection accuracy is 92.0%, 93.2%, and 93.6% when convolution, max-pooling, and ave-pooling are used as the filtering structure, respectively. The detection of microaneurysms is much improved by using the ave-pooling layer for feature filtering, and the max-pooling layer improves the detection of hard exudates.


I. INTRODUCTION
At present, DR is one of the most serious blinding eye diseases and has become the primary cause of blindness in adults aged 20-74 worldwide [1]. Detecting diabetic lesions as early as possible is the most effective way to prevent blindness. Clinical diagnosis is currently performed mainly by ophthalmologists examining the fundus images of patients. With the development of automatic technology, more methods based on machine learning and deep learning are being used to detect DR, and they achieve good performance. Methods based on traditional machine learning generally rely on manually extracted features, which are then classified with algorithms such as support vector machines (SVM) and k-nearest neighbors (K-NN). Their classification performance depends heavily on the feature extraction algorithms, which are complicated and difficult to design. Methods based on deep learning extract features automatically and can better express the object features. However, the number of fundus image samples is limited, and the numbers of positive and negative samples are highly unbalanced. To solve these problems, this paper proposes a deep learning network based on a symmetric convolutional structure that enhances the feature extraction capability and extracts more complex features to improve the feature representation, so that different kinds of lesions can be better distinguished. Experimental results show that ave-pooling for feature filtering better improves the detection of microaneurysms, and max-pooling significantly improves the detection of hard exudates.

II. RELATED WORK
The automatic detection of DR mainly uses traditional machine learning and deep learning techniques.
Traditional machine learning methods mainly include three stages: data pre-processing, manual feature extraction, and classification. R. S. Biyani and B. M. Patre used a clustering approach for exudate detection in the screening of diabetic retinopathy. Ganjee et al. extracted the candidate regions of microaneurysms using the Markov Chain (MC) method and detected them based on Gaussian distribution, density, and other characteristics [2]. Halloy et al. extracted the region of interest (ROI) for hard exudates using Gaussian space and mathematical morphology, and then used a support vector machine (SVM) classifier to distinguish hard exudates from soft exudates [3]. Mobeen et al. presented a method that preprocesses the retinal images by applying histogram equalization to the different objects and then uses the discrete wavelet transform (DWT) to transform the spatial-domain data into coarse and detail components, which they found made bleeding easier to detect. After feature extraction techniques are applied to the wavelet transform coefficients, SVM and k-nearest neighbor (K-NN) classifiers are used to classify the image features extracted from the coefficient matrix [4]. K. M. Adal et al. proposed a technique for detecting red lesions in retinal images. This technique exploits the fact that red lesions are the main cause of retinal changes and detects targets through small retinal features. The changes associated with diabetic retinopathy are then identified by an SVM classifier that uses the shape and intensity characteristics of the lesions [5].
Because the early symptoms of diabetic retinopathy mainly include microaneurysms and hard exudates, and in order to detect these lesions more specifically, we use the ave-pooling operation for feature filtering when detecting microaneurysms and the max-pooling operation when detecting hard exudates, so that the network model can better distinguish different lesion types. In recent years, deep learning has been widely used in computer vision for object detection. Budak et al. first extracted microaneurysm candidates with Gaussian filtering and other techniques, and then used the extracted regions as the input of a convolutional neural network (CNN) model to classify the microaneurysms [6]. Omar et al. used the Local Binary Pattern (LBP) method to extract the texture features of hard exudates, and used these features as the input of an artificial neural network (ANN) model to detect the hard exudates [7]. Tan et al. built a ten-layer fully CNN model to detect microaneurysms, hemorrhages, and hard exudates for each pixel of the fundus image [8]. V. Sudha et al. worked on a VGG-19 deep neural network that was trained using a feature set derived from the Kaggle fundus image dataset. In their research, segmentation methods were proposed to detect retinal defects such as hard exudates, microaneurysms, and bleeding in digital fundus images, which were then divided into four grades. A rectified linear unit (ReLU) layer and a max-pooling layer are added to each stacked convolutional layer [9]. Kwasigroch et al. introduced a deep learning method to detect diabetic retinopathy automatically. This method integrates a special class coding technique into the training phase of the convolutional neural network. The quadratic weighted kappa between the scores of the dataset and the predicted scores was calculated to analyze the performance of the designed model.
In contrast, this paper adopts a symmetric convolutional structure to improve the feature extraction ability of the model. The structure increases the width and depth of the network to overcome the imbalance between positive and negative samples and to avoid overfitting of the model.

III. THE PROPOSED METHOD
A. SYMMETRIC CONVOLUTION STRUCTURE
The different kinds of lesions detected in this paper differ greatly in shape, color, size, and other characteristics. Microaneurysms are small and fairly regular in shape and appear as small red dots, while hard exudates vary in size and appear mainly as bright, irregularly shaped regions, as shown in Figure 1. To extract more complex and higher-dimensional feature information, the proposed method uses a symmetric convolutional structure to enhance the model's ability to locate and classify the targets. The symmetric convolutional structure increases the depth and width of the network in order to overcome the imbalance between the numbers of positive and negative samples and to avoid overfitting. A 1×1 convolutional kernel is selected for this structure to introduce more nonlinearity and reduce the number of parameters, which improves the generalization performance of the model [10], as shown in Figure 2. The rectified linear unit (ReLU) is selected as the activation function of the network.
The pooling layers extract the main information of the targets. Microaneurysms are relatively regular in shape and are mainly located in the center of the sample, and the average pooling operation retains more of the local center information of a sample. Therefore, to improve the detection performance for microaneurysms, the ave-pooling operation is used in the network to filter the target features when detecting microaneurysms. Hard exudates occupy a large area of the sample and have irregular shapes. With the max-pooling operation, more of the parts of the sample that contain hard exudates can be retained. Therefore, when detecting hard exudates, the max-pooling operation is used to filter out useless features.
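The difference between the two pooling operations discussed above can be illustrated with a minimal sketch in plain Python. The helper `pool2d`, the 2×2 window, the non-overlapping stride, and the toy patch are assumptions made for illustration only; they are not the layer parameters of the proposed network.

```python
def pool2d(patch, size, mode):
    """Non-overlapping size x size pooling over a 2-D list of numbers.

    mode is "max" (keep the strongest response in each window) or
    "ave" (keep the mean response of each window).
    """
    rows, cols = len(patch), len(patch[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [patch[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                pooled_row.append(max(window))
            elif mode == "ave":
                pooled_row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'ave'")
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 patch: a single bright pixel in the top-left window,
# uniform regions elsewhere.
patch = [
    [0, 0, 4, 4],
    [0, 8, 4, 4],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
]

max_pooled = pool2d(patch, 2, "max")  # [[8, 4], [2, 0]]
ave_pooled = pool2d(patch, 2, "ave")  # [[2.0, 4.0], [2.0, 0.0]]
```

In the top-left window, max-pooling keeps only the single brightest value, whereas average pooling summarizes the whole neighbourhood; in uniform windows the two operations agree.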

C. THE PROPOSED NETWORK
The method proposed in this study consists of three stages: pre-processing, feature filtering and feature extraction, and classification. Figure 3 shows the framework of the proposed method. In the pre-processing stage, the fundus image is separated into its red, green, and blue channels. Compared with the red and blue channels, the green channel contains much more image information and better represents the different kinds of lesions in fundus images [11]. Therefore, the green channel is selected as the input of the network. In the feature filtering module, the convolutional layer, the max-pooling layer, and the ave-pooling layer are selected in turn to filter and extract the features of the lesions. Tables 1 and 2 list the parameters of the corresponding network layers. In the classification stage, the SoftMax classifier is used to classify the targets.
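The first and last stages described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the image is represented as rows of (R, G, B) tuples, and the function names `green_channel` and `softmax`, the pixel values, and the three class scores are hypothetical rather than taken from the paper's implementation.

```python
import math


def green_channel(rgb_image):
    """Return the green plane of an image stored as rows of (R, G, B) tuples."""
    return [[pixel[1] for pixel in row] for row in rgb_image]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 2x2 colour image.
image = [
    [(120, 60, 10), (200, 90, 30)],
    [(90, 40, 5), (180, 150, 20)],
]
green = green_channel(image)  # [[60, 90], [40, 150]]

# Hypothetical network outputs for the three sample types
# (microaneurysm, hard exudate, background).
probabilities = softmax([2.0, 1.0, 0.1])
predicted_class = probabilities.index(max(probabilities))  # 0
```

The predicted class is simply the index of the largest probability, which here corresponds to the first of the three assumed sample types.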

B. EXPERIMENTAL SETTINGS
The experiments in this paper were carried out on a PC with an Intel Core i7-6700 CPU running at 3.40 GHz. Pre-processing and sample set construction are implemented in MATLAB 2016b, and the network model is designed with the deep learning framework Caffe [12]. The method is trained and tested on the public database DIARETDB1 (DB1) [13], which contains 89 color fundus images. All images were taken with fundus cameras, and the image size is 1500×1152. In this database, the different lesion types and their positions in the fundus images are annotated by an ophthalmologist. In the pre-processing stage, the centers of the microaneurysms and hard exudates in the fundus image are taken as sample centers, and patches of size 27×27 are sampled around them. The sample set consists of three types of sample patches: microaneurysms, hard exudates, and background. In this paper, the areas of the fundus images that contain neither microaneurysms nor hard exudates are called the background. Table 3 shows the construction of the training and testing sample sets.
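The patch sampling step described above can be sketched as follows. The function name `extract_patch` and the synthetic image are hypothetical, and discarding patches that would cross the image border is an assumption, since the text does not say how lesions near the border are handled.

```python
def extract_patch(image, center_row, center_col, size=27):
    """Return a size x size patch centred on (center_row, center_col).

    The image is a 2-D list (rows of pixel values). Returns None when
    the patch would extend past the image border (an assumed policy).
    """
    half = size // 2
    top, left = center_row - half, center_col - half
    bottom, right = center_row + half + 1, center_col + half + 1
    if top < 0 or left < 0 or bottom > len(image) or right > len(image[0]):
        return None
    return [row[left:right] for row in image[top:bottom]]


# Synthetic 60x80 single-channel image whose pixel value encodes its position.
image = [[r * 100 + c for c in range(80)] for r in range(60)]

patch = extract_patch(image, 30, 40)       # fully inside the image
border_patch = extract_patch(image, 5, 40)  # too close to the top edge -> None
```

With an odd patch size such as 27, the annotated lesion center falls exactly on the middle pixel of the patch (index 13 in both dimensions).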

C. EVALUATION
To evaluate the detection performance of the proposed method on different types of lesions in fundus images, accuracy, sensitivity, and specificity are used in this paper to quantify the detection performance for the different targets. The calculation method is as follows.
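The standard definitions of these three metrics, which are assumed here to be the ones intended, can be written in terms of the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). These four symbols are introduced for illustration and do not appear elsewhere in the text.

```latex
\begin{align}
\mathrm{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN}, \\
\mathrm{Sensitivity} &= \frac{TP}{TP + FN}, \\
\mathrm{Specificity} &= \frac{TN}{TN + FP}.
\end{align}
```

Under these definitions, sensitivity measures the fraction of lesion samples that are detected, and specificity measures the fraction of non-lesion samples that are correctly rejected.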

D. EXPERIMENTAL RESULTS
To compare the influence of different feature filtering operations, three groups of experiments are set up, which use the convolutional layer, the max-pooling layer, and the ave-pooling layer, respectively, to filter and extract sample features. The detection sensitivity, specificity, and accuracy of the three groups of experiments are shown in Tables 4, 5, and 6, respectively. The experimental results show that the network models using a pooling layer are more accurate than the model using the convolutional layer, and that the detection sensitivity for hard exudates with the max-pooling layer is significantly higher than with the ave-pooling layer. The detection sensitivity for microaneurysms with the ave-pooling model is slightly higher than with the max-pooling model. Table 7 compares the microaneurysm detection results on the DB1 database of the proposed method, using ave-pooling to filter and extract features, with those of some existing methods. Table 8 compares the hard exudate detection results on the DB1 database of the proposed method, using max-pooling, with those of some existing methods. Table 9 compares the results of the proposed algorithm with those of different methods that detect multiple types of lesions simultaneously. The methods listed in the table include both traditional machine learning methods and deep learning methods. The results show that the accuracy, sensitivity, and specificity of the proposed method are better than those of most of the compared methods. This paper also uses the FROC curve [14] to evaluate the performance of the model; it describes the relationship between the sensitivity for different kinds of lesions and the average number of false positives per image. Figures 4, 5, and 6 show the FROC curves of the three groups of experiments.
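The FROC relationship described above, sensitivity against the average number of false positives per image, can be computed with a short sketch. The function name `froc_points`, the detection scores, and the thresholds are hypothetical, and the sketch assumes that every true detection corresponds to a distinct annotated lesion.

```python
def froc_points(detections, num_lesions, num_images, thresholds):
    """Compute FROC operating points.

    detections: (score, is_true_lesion) pairs pooled over all images.
    Returns one (average false positives per image, sensitivity) pair
    per threshold, keeping detections whose score is >= the threshold.
    """
    points = []
    for threshold in thresholds:
        kept = [is_lesion for score, is_lesion in detections
                if score >= threshold]
        true_pos = sum(1 for is_lesion in kept if is_lesion)
        false_pos = len(kept) - true_pos
        points.append((false_pos / num_images, true_pos / num_lesions))
    return points


# Hypothetical candidate detections from 2 images containing 5 lesions in total.
detections = [
    (0.95, True), (0.90, True), (0.80, False),
    (0.70, True), (0.40, False), (0.30, True),
]
points = froc_points(detections, num_lesions=5, num_images=2,
                     thresholds=[0.9, 0.6, 0.2])
# points == [(0.0, 0.4), (0.5, 0.6), (1.0, 0.8)]
```

Lowering the threshold moves along the curve: more lesions are found, at the cost of more false positives per image.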
The number of training iterations of the proposed method is set to 2000 in this paper. Figure 7 shows how the loss value changes as the number of iterations increases.

[14]                             0.72    -       -
Manjaramkar et al. (2016) [15]   0.801   0.975   -
Ram et al. (2011) [16]           0.885   -       -
[17]                             0.960   0.920   -

V. CONCLUSION
This research proposes a deep convolutional network based on a symmetric convolutional structure to detect different kinds of lesions in diabetic retinal images. The symmetric convolutional structure can extract more complex lesion features and significantly improves detection performance. Convolution, max-pooling, and average pooling layers are selected in turn in the feature filtering module, and the experimental results of the three groups are compared. The results show that the overall accuracy with a pooling operation is higher than with the convolutional operation. Moreover, when the max-pooling operation is selected, the detection performance for hard exudates is better, with a sensitivity of 97.1% and a specificity of 96.8%. When the average pooling operation is selected, the detection performance for microaneurysms is better, with a sensitivity of 91.0% and a specificity of 96.3%. In future work, we will further modify the model to detect more objects simultaneously and accurately in diabetic retinopathy images.