Multiple Lesions Detection of Fundus Images Based on Convolution Neural Network Algorithm With Improved SFLA

Lesions in fundus images overlap and interlace, large and small blood vessels are densely packed, and the images are severely affected by illumination. To address these problems and to achieve multi-label classification of fundus images, this paper proposes a single-population leapfrog optimized convolutional neural network algorithm (SFCNN) to detect and classify various fundus lesions. The algorithm uses the efficient search ability of the shuffled frog leaping algorithm to optimize the weight initialization and backpropagation of the convolutional neural network. To handle fundus image classification in a big data environment, a novel grouping optimization strategy is presented that combines the Spark platform with the SFCNN algorithm to achieve large-scale fundus image classification and detection of multiple lesions. Experiments on fundus image lesion detection show that SFCNN achieves higher accuracy than the compared algorithms in both single-lesion detection and overall detection.


I. INTRODUCTION
Color fundus images are the most basic means of diagnosing eye diseases [1]. Fundus images also allow various eye diseases, such as glaucoma, optic neuritis and macular degeneration, to be discovered as early as possible, which facilitates timely and effective treatment [2]. Early diagnosis and timely treatment can effectively reduce the prevalence of such diseases. However, due to the large population of China and the relatively limited number of ophthalmologists, relying solely on doctors to diagnose ophthalmic diseases requires a lot of time. For this reason, other methods are urgently needed for large-scale screening. Medical image analysis technology can not only greatly reduce the workload of doctors, but also has the advantages of being objective, fast and accurate [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Victor Hugo Albuquerque.
Many scholars around the world are engaged in research on retinopathy. In the detection of fundus image lesions [4], [5]:
Ganguly et al. [6] proposed a detection method based on an adaptive threshold for diabetic retinopathy with red lesions, a widespread ocular disease. The method processes each fundus image using thresholds of different intensities, so it can be applied to fundus images of different quality and resolution acquired by different cameras. Balasubramanian et al. [7] evaluated glaucoma detection using histogram of oriented gradients (HOG) feature extraction and support vector machine classification of retinal fundus images, with blood vessels extracted by a Gabor filter. Liu et al. [8] proposed an automatic system for measuring the cup-to-disc ratio from fundus images, which can automatically diagnose glaucoma. In terms of fundus image segmentation [9]-[11]: Samad et al. [12] used fuzzy C-means clustering to detect the boundary of diabetic retinopathy objects in fundus images, which can segment diabetic retinopathy efficiently and quickly. Hossain and Reza [13] adopted a vascular segmentation method based on an MRF model, which was more effective than highly complex image segmentation methods. Deep learning methods are also widely used in fundus image recognition [14]-[16]. Kele et al. [17] used a deep convolutional neural network to automatically classify color fundus images for diabetic retinopathy. Prentasic and Loncaric [18] used a convolutional neural network to detect exudates in color fundus images; their experiment showed that a CNN can effectively detect the exudates in fundus images.
Part of the above research concerns the classification of a single disease, such as eye diseases caused by diabetes, or the detection of diseases such as glaucoma. The other part uses algorithms to segment the exudates and blood vessels in the fundus image. These research results are excellent and meaningful. However, in some cases the output of these algorithms is too limited to identify the specific retinopathy present. Therefore, this study proposes an optimized CNN algorithm that can identify a variety of lesions in the fundus image and provide reliable help for medical diagnosis. Many lesions in a fundus image may be intertwined, as shown in Figure 1. Moreover, few studies have proposed fundus image processing methods for the big data environment. Firstly, our study proposes an improved convolutional neural network that can identify multiple lesions present in the eyeball, such as hard exudates and soft exudates. Fundus image content is complex, and images belonging to different categories may be highly similar. Detection and recognition of fundus images are affected by blood vessel deformation, image brightness, contrast, and other diseased areas [19]. A traditional convolutional neural network used for this classification requires a large number of training samples, easily falls into a local optimum, and trains inefficiently. Therefore, this study proposes a single-population leapfrog algorithm to optimize the weight initialization and backpropagation of convolutional neural networks. The frog population is generated by a Gaussian distribution function, the position of the worst frog is updated constantly, and the global optimal solution found is taken as the initial weight of the convolutional neural network. Then, the loss value of each training iteration is monitored, and abnormal gradients are corrected by the leapfrog algorithm.
Finally, the final weight of the network is further optimized to obtain high lesion recognition accuracy. Next, since training a convolutional neural network often requires many iterative calculations, the time cost is even greater for large-scale data sets. To solve this problem, this paper proposes the Spark-SFCNN algorithm, which adapts the improved deep learning algorithm to the Spark platform. The algorithm uses the idea of group optimization on Spark to train on the data set in a distributed way, then gathers the weights from all training groups, optimizes them by the leapfrog algorithm, and takes the optimal weights as the initial values of the next group training. In this way, the uncertainty of the traditional averaging method is avoided, and the distributed training of the convolutional neural network is more effective and reliable. The experimental results show that Spark-SFCNN performs considerably better than the general algorithm in large-scale data training on the Spark platform.
The rest of the paper is structured as follows. Section II introduces the convolutional neural network algorithm and SFLA. Section III introduces the CNN model based on single-population leapfrog optimization. Section IV proposes the grouping optimization strategy and introduces the Spark-SFCNN algorithm. Section V analyzes the experimental results of fundus image classification. The final section summarizes the paper and discusses the current challenges and future research directions.

II. RELATED ALGORITHM

A. CNN MODEL
The convolutional neural network is a multi-layer feedforward network [20], proposed by LeCun et al. [21] in 1989. In recent years, with the rapid increase in the amount of annotated data and the huge improvement in the performance of graphics processing units, research on convolutional neural networks has grown rapidly and has achieved state-of-the-art results in various tasks [22]. Moreover, the CNN is a deep neural network model containing convolutional layers, and it has become a research hotspot in speech analysis and image recognition [23].
The time-delay neural network and LeNet-5 are the earliest such networks [24]. The classic LeNet-5 is shown in Figure 2. Structurally, the convolutional neural network includes an input layer, a hidden layer and an output layer. The hidden layer includes convolution layers, pooling layers, and fully connected layers. The main function of the convolutional layer is to extract features from the picture; it contains several convolution kernels, each consisting of a matrix of weight coefficients and a deviation (bias) value. The convolution calculation formula is as follows:
$$G^{l+1}(i,j) = [G^{l} \otimes w^{l+1}](i,j) + b = \sum_{k=1}^{K_l} \sum_{x=1}^{f} \sum_{y=1}^{f} \left[ G_k^{l}(s_0 i + x,\; s_0 j + y)\, w_k^{l+1}(x,y) \right] + b$$
$$(i,j) \in \{0, 1, \ldots, L_{l+1}\}, \qquad L_{l+1} = \frac{L_l + 2p - f}{s_0} + 1$$
where $b$ represents the deviation value, $G^{l}$ and $G^{l+1}$ represent the input and output of the convolution at layer $l+1$, $L_{l+1}$ represents the size of $G^{l+1}$, $K_l$ is the number of channels of the feature map, $w^{l+1}$ is the convolution kernel, and $f$, $s_0$ and $p$ are the parameters of the convolution layer, namely the filter size, the step size (stride) and the amount of padding.
After the convolution calculation, the output feature map is passed through the activation function and then enters the pooling layer. The pooling layer compresses the results, reducing the spatial size of the data and the number of parameters in the network, so that the consumption of computing resources is reduced and overfitting is effectively controlled. Pooling is also called downsampling, denoted as S = down(C). Common pooling operations include maximum pooling and average pooling, of which maximum pooling is the most widely used [25]. Maximum pooling is shown in Figure 3.
After the multi-layer convolution and pooling calculations, the output feature map is expanded row by row and concatenated into a vector, which enters the fully connected network and is then output through the multi-layer neural network. Like the BP neural network, the convolutional neural network uses gradient descent with backpropagation to update its weights.
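The convolution, activation and maximum pooling operations described in this subsection can be illustrated with a minimal single-channel sketch in plain Python. The function names (`conv2d`, `relu`, `max_pool`), the square input, and the default stride and padding values are assumptions made for illustration; they are not taken from the authors' implementation.

```python
def conv2d(image, kernel, bias=0.0, stride=1, padding=0):
    """Single-channel 2D convolution (cross-correlation form) on a square input."""
    size = len(image)
    f = len(kernel)
    # Zero-pad the input on every side.
    padded_size = size + 2 * padding
    padded = [[0.0] * padded_size for _ in range(padded_size)]
    for r in range(size):
        for c in range(size):
            padded[r + padding][c + padding] = image[r][c]
    # Output size: (L + 2p - f) / s0 + 1, using integer division.
    out_size = (size + 2 * padding - f) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            acc = bias
            for x in range(f):
                for y in range(f):
                    acc += padded[i * stride + x][j * stride + y] * kernel[x][y]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_map):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, window=2):
    """Non-overlapping maximum pooling, S = down(C)."""
    n = len(feature_map) // window
    return [[max(feature_map[i * window + x][j * window + y]
                 for x in range(window) for y in range(window))
             for j in range(n)]
            for i in range(n)]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 1], [1, 1]]
features = relu(conv2d(image, kernel))   # 3 x 3 feature map
pooled = max_pool(image)                 # 2 x 2 map of block maxima
```

A real network would apply many kernels over several input channels; the sketch keeps one channel so that the index arithmetic stays visible.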

B. SHUFFLED FROG LEAPING ALGORITHM
The Shuffled Frog Leaping Algorithm (SFLA) is an effective meta-heuristic swarm computing technique in the field of evolutionary computation [26] and a memetic algorithm that incorporates ideas from genetics [27]. It mainly consists of two parts, local search and global information exchange, which alternate until the convergence conditions are met [28]. The basic workflow of SFLA is as follows. A group of F frogs is randomly generated; the i-th frog is represented as $f_i = [x_1, x_2, \ldots, x_s]$, where $s$ is the dimension of each frog. After the frogs are generated, all frogs are sorted in descending order of fitness and divided into M populations. The division rule is: the first frog is assigned to the first population, the second frog to the second population, the M-th frog to the M-th population, the (M+1)-th frog to the first population again, and so on. Each population contains N frogs, with $F = M \times N$. After the population is divided, the global optimal frog $f_g$ is recorded, and the local search is carried out in each population. After the j-th local search, the frog with the worst fitness in the population is denoted $f_w(j)$, and the frog with the best fitness is denoted $f_b(j)$. The worst frog in each subpopulation is updated as follows:
$$D = \mathrm{Rand}() \times \left( f_b(j) - f_w(j) \right)$$
$$f_{new} = f_w(j) + D, \qquad -D_{\max} \le D \le D_{\max}$$
where Rand() represents a random number between 0 and 1, and $D_{\max}$ represents the maximum frog jumping distance. After the frog position is updated, if there is no improvement, then $f_g$ is substituted for $f_w(j)$. If there is still no improvement, a new frog is randomly generated to replace $f_w(j)$. After all populations have been searched $U_{mem}$ times, all frogs are mixed together, then re-sorted and re-divided into subpopulations according to fitness, so that the meme information of the frogs in each population can be fully exchanged, and the local search continues. These steps are repeated until the convergence conditions are met and the algorithm ends.
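The workflow above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' code: it minimizes a cost (so a lower cost means a fitter frog), it reads the "no improvement" rule in the usual SFLA way (the jump is retried toward the global best frog before a random frog is generated), and the function name `sfla_minimize`, all parameter defaults, and the toy sphere cost are hypothetical.

```python
import random


def sfla_minimize(cost, dim, n_pops=3, pop_size=5, local_steps=5,
                  shuffles=20, d_max=1.0, bounds=(-5.0, 5.0), seed=0):
    """Basic SFLA that minimizes `cost`; a lower cost means a fitter frog."""
    rng = random.Random(seed)
    lo, hi = bounds

    def random_frog():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def leap(worst, target):
        # D = Rand() * (target - worst), each component clipped to [-d_max, d_max].
        r = rng.random()
        step = [max(-d_max, min(d_max, r * (t - w)))
                for w, t in zip(worst, target)]
        return [w + d for w, d in zip(worst, step)]

    frogs = [random_frog() for _ in range(n_pops * pop_size)]
    for _ in range(shuffles):
        frogs.sort(key=cost)                      # fittest frog first
        global_best = frogs[0]
        # Frog 1 -> population 1, frog 2 -> population 2, ..., then wrap around.
        pops = [frogs[k::n_pops] for k in range(n_pops)]
        for pop in pops:
            for _ in range(local_steps):
                pop.sort(key=cost)
                best, worst = pop[0], pop[-1]
                candidate = leap(worst, best)
                if cost(candidate) >= cost(worst):    # no improvement: use f_g
                    candidate = leap(worst, global_best)
                if cost(candidate) >= cost(worst):    # still none: random frog
                    candidate = random_frog()
                pop[-1] = candidate
        # Mix all frogs together before the next sort-and-divide round.
        frogs = [frog for pop in pops for frog in pop]
    return min(frogs, key=cost)


def sphere(frog):
    """Toy cost with its minimum at the origin (illustrative only)."""
    return sum(x * x for x in frog)


best_frog = sfla_minimize(sphere, dim=2)
```

Because only the worst frog of a population is ever replaced, the best frog found so far is never lost, so the returned solution is at least as good as the best frog of the initial population.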

III. CNN MODEL BASED ON IMPROVED SFLA
In view of the complexity of fundus images, this study introduces a single-population leapfrog algorithm into the weight initialization and weight update of the convolutional neural network. By constantly updating the position of the worst frog, the global optimal frog is found and used as the initial weight of the network. During the network iterations, the loss value calculated each time is monitored, and the network weights that generate abnormal loss values are corrected through the leapfrog algorithm. The algorithm flow is shown in Figure 4.

A. WEIGHT INITIALIZATION BASED ON SINGLE-POPULATION LEAPFROG ALGORITHM
In convolutional neural networks, the convolution kernels and the fully connected layer weights are generally generated randomly. Multiple lesions may exist in a fundus image at the same time and are irregularly distributed, and a healthy fundus image is not clearly distinguishable from a fundus image with lesions, as shown in Figure 5. When the initial weights are poor, the network needs more time and samples for gradient descent to correct them, and it can easily fall into a local optimum, which affects the final result. To address this problem, this paper introduces a single-population leapfrog algorithm to initialize the weights of the convolutional neural network.
The shuffled frog leaping algorithm is a meta-heuristic swarm evolution algorithm with excellent global search ability that can effectively search for the global optimal solution. However, directly applying it to the initialization of weights in this study would generate a large number of calculations, increase the time cost of the overall algorithm, and reduce efficiency. Therefore, this study simplifies the shuffled frog leaping algorithm and proposes a single-population leapfrog algorithm. Unlike the shuffled frog leaping algorithm, it no longer divides the initial samples into populations, but treats all the frogs as one population and conducts global optimization directly, with some improvements to the leapfrog rules.
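The single-population idea (see Algorithm 1) can be sketched as below. The exact jump formula with offset is not recoverable from this text, so the update used here, a jump toward the best frog with a random factor drawn from a widened interval plus a small Gaussian offset, is only an assumed stand-in. The function name, the parameters `rand_high` and `offset_scale`, the fixed step budget used as the stop condition, and the toy quadratic loss that replaces the CNN forward pass are all hypothetical.

```python
import random


def single_population_leapfrog(loss_fn, dim, n_frogs=20, steps=200,
                               sigma=1.0, rand_high=1.5, offset_scale=0.01,
                               seed=0):
    """Search for low-loss initial weights with one undivided frog population."""
    rng = random.Random(seed)
    # Step 1: Gaussian-distributed population; each frog holds all weights.
    frogs = [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_frogs)]
    # Step 2: evaluate every frog (stand-in for a CNN forward pass).
    losses = [loss_fn(f) for f in frogs]
    for _ in range(steps):                     # stop condition: fixed step budget
        b = min(range(n_frogs), key=losses.__getitem__)
        w = max(range(n_frogs), key=losses.__getitem__)
        # Step 3: jump the worst frog toward the best one.
        r = rng.uniform(0.0, rand_high)        # widened random interval (assumed)
        candidate = [fw + r * (fb - fw) + rng.gauss(0.0, offset_scale)
                     for fw, fb in zip(frogs[w], frogs[b])]
        candidate_loss = loss_fn(candidate)
        if candidate_loss < losses[w]:         # keep the jump only if it helps
            frogs[w], losses[w] = candidate, candidate_loss
    b = min(range(n_frogs), key=losses.__getitem__)
    return frogs[b], losses[b]


target = [0.5, -0.3, 0.8]                      # hypothetical "ideal" weights


def toy_loss(weights):
    return sum((a - t) ** 2 for a, t in zip(weights, target))


init_weights, init_loss = single_population_leapfrog(toy_loss, dim=3)
```

In the setting of the paper, `loss_fn` would run a forward pass of the CNN on a few reference images, so each evaluation is expensive; that cost is the reason the population is not divided and re-shuffled.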

B. WEIGHT OPTIMIZATION METHOD BASED ON SINGLE-POPULATION LEAPFROG ALGORITHM
Traditional CNN backpropagation is essentially a gradient descent method, which updates the network parameters using the loss value calculated by forward propagation in order to find the optimal solution. However, fundus images contain dense blood vessels and overlapping, interlaced lesions, some of which are very small. In addition, the images are severely affected by illumination at capture time. It is therefore difficult to identify the various lesions in a fundus image. The traditional gradient descent algorithm is prone to abnormal situations with large changes in the loss value, which affect the execution efficiency of the algorithm and can even cause it to converge to a local optimum. In view of these problems, this study uses a single-population leapfrog algorithm to correct poor gradients in the backpropagation of the convolutional neural network.
During the network iterations, the loss value calculated by each forward propagation is monitored. If the absolute value of the difference between the i-th loss value and the (i − 1)-th loss value is greater than the threshold δ, that is, $|l_i - l_{i-1}| > \delta$, the weight used in the i-th forward propagation is considered invalid, and the single-population leapfrog algorithm is used to find the optimal weight. After the convolutional neural network meets its end condition, the single-population leapfrog algorithm is executed once more on the final weights to obtain the weight $f_{qb}$. This last single-population leapfrog optimization helps the network avoid ending in a local optimum.
The flow chart of the weight optimization is shown in Figure 7:
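The monitoring rule $|l_i - l_{i-1}| > \delta$ and the correction step can be sketched as follows. The frog generation formula around the previous weight is not recoverable from this text, so uniform perturbations of the previous weight are used as an assumed stand-in; the function names, the `spread` parameter, and the deliberately unstable toy gradient descent used in the demonstration are hypothetical.

```python
import random


def leapfrog_around(loss_fn, w_prev, n_frogs=10, steps=50, spread=0.1, seed=0):
    """Search near w_prev; w_prev is in the population, so the result is never worse."""
    rng = random.Random(seed)
    frogs = [list(w_prev)]
    frogs += [[w + rng.uniform(-spread, spread) for w in w_prev]
              for _ in range(n_frogs - 1)]
    for _ in range(steps):
        frogs.sort(key=loss_fn)                # best frog first, worst frog last
        best, worst = frogs[0], frogs[-1]
        r = rng.random()
        candidate = [fw + r * (fb - fw) for fw, fb in zip(worst, best)]
        if loss_fn(candidate) < loss_fn(worst):
            frogs[-1] = candidate
    return min(frogs, key=loss_fn)


def train_with_monitor(loss_fn, grad_fn, w, lr, delta, iterations):
    """Gradient descent that replaces a step whose loss change exceeds delta."""
    prev_w, prev_loss = list(w), loss_fn(w)
    corrections = 0
    for _ in range(iterations):
        w = [wi - lr * gi for wi, gi in zip(w, grad_fn(w))]
        loss = loss_fn(w)
        if abs(loss - prev_loss) > delta:      # abnormal loss change detected
            w = leapfrog_around(loss_fn, prev_w)
            loss = loss_fn(w)
            corrections += 1
        prev_w, prev_loss = list(w), loss
    return w, corrections


def quad_loss(w):
    return sum(x * x for x in w)


def quad_grad(w):
    return [2.0 * x for x in w]


# A learning rate of 1.5 makes plain gradient descent diverge on this loss,
# which triggers the monitor on the very first step.
final_w, n_corrections = train_with_monitor(
    quad_loss, quad_grad, [1.0, 1.0], lr=1.5, delta=1.0, iterations=10)
```

Including the previous weight itself in the frog population mirrors the condition in Algorithm 2 that some frog in the group has a loss no greater than that of the recorded weight.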

IV. SFCNN ALGORITHM BASED ON SPARK PLATFORM
With the development of medical and health informatization, the scale and variety of medical data are growing very quickly. In medical big data, medical images and audio-visual data account for a large proportion. As a basic and noninvasive method of retinal examination, fundus imaging produces a huge amount of data. This section focuses on big data classification of multiple lesions in fundus images and proposes a distributed solution that combines the Spark platform with the SFCNN algorithm above, enabling the weight training of SFCNN in a big data environment.

A. GROUP OPTIMIZATION TRAINING METHOD BASED ON SPARK
Training a neural network has always been a long process, because obtaining the optimal weights requires a large number of training samples and thousands of calculation steps. The convolutional neural network is no exception. In the big data environment, training on a large number of samples can improve the recognition accuracy of the SFCNN algorithm, but the time consumption is very large. To solve this problem, this paper combines the SFCNN algorithm with the Spark platform to reduce the execution time of the algorithm and to spread the computing load through distributed computing. This paper mainly addresses the consistency and conflict of the network weights between the master server and the slave servers in distributed gradient descent. The classic data-parallel gradient descent algorithm is realized by synchronous communication between master and slave, but this inevitably introduces a lot of synchronous communication delay, which leads to confusion in the training of the network weights; the many synchronous operations prevent the algorithm from achieving real parallelism and from exploiting the advantages of Spark parallel computing. To avoid the problem that the weights trained on different servers cannot be effectively unified, this paper proposes a grouping optimization training method for the weight parameters based on Spark. The large amount of fundus image data acquired by the master node is divided into N subsets, so the fundus image training data set is:
$$M = \{M_1, M_2, \ldots, M_N\} \qquad (11)$$
First, the first fundus image training subset $M_1$ is taken as an RDD, and the RDD is divided into n partitions; the data in the i-th partition is then $M_{1i}$, as shown in Figure 8.

Algorithm 1 Weight Initialization Algorithm
Step 1: Parameter initialization. Generate the frog population according to the Gaussian formula: where m is the number of frogs, and each frog p contains all weight parameters in the network.
Step 2: Sorting. All frogs whose loss values have not yet been calculated are brought into the CNN model, as shown in Figure 7. Some random images are selected from the training image library as reference images to carry out CNN forward propagation and calculate the loss value of each frog by formula (6), where p represents the network output value, t represents the true value, s represents the dimension of the lesion labels in each group, and b represents the number of types of retinopathy that need to be detected simultaneously.
Step 3: Search and position update within the population. From step 2, the optimal frog $f_b$ and the worst frog $f_w$ are obtained. The position of the worst frog is then updated with the position update function. For this update, an offset is added to the traditional frog jump formula and the random interval of Rand() is appropriately enlarged. The formula is as follows: Here $f_p$ represents the offset. Its dimension is the same as the dimension of each frog. $f_{pi}$ represents the value in the i-th dimension of $f_p$, and $f_{new}$ represents the updated frog. Adding an offset improves the performance of the leapfrog, and the enlarged random interval makes it easier for the algorithm to find the optimal solution. The jumping mode of this step, mapped to 2D coordinates, is shown in Figure 6.
Step 4: Verify the stop condition. Determine whether the algorithm satisfies the convergence condition. If it does, the algorithm stops and the value of the optimal frog is used as the initial weight of the convolutional neural network. Otherwise, go to step 3.

Algorithm 2 Weight Update Algorithm
Step 1: Input the samples. Enter the fundus images and perform one-hot encoding on the labels.
Step 2: Convolution and pooling. Multiple convolution and pooling operations are performed on the input fundus image. At the end of each pooling calculation, the output is activated using the ReLU function.
Step 3: Full connection calculation. The predicted value is obtained through multi-layer full connection calculation. The predicted value is then passed to the softmax() function, and the loss value is calculated by formula (6).
Step 4: Error value comparison. Determine whether the absolute value of the difference between the loss value of the current iteration and the previous loss value is greater than the threshold δ. If it is not, skip to step 10. Otherwise, perform step 5.
Step 5: Parameter initialization. Record the weight $w_b$ of the (i − 1)-th forward propagation and set the number of frogs c.
Step 6: Generate the frogs. Unlike the method above, which generates frogs from a Gaussian distribution, the frogs here are mainly generated around $w_b$. The formula is as follows: Here $w_{ij}$ represents the value in the j-th dimension of the i-th generated frog, and n represents the total number of frogs. The frog group must contain a frog $w_g$ that satisfies the condition $l_g \le l_b$, where $l_g$ and $l_b$ are the loss values of $w_g$ and $w_b$, respectively.
Step 7: Sorting. Randomly extract a small number of training pictures as input values. Then bring all c frogs into the CNN model shown in Figure 2. Calculate the loss values according to formula (6) and sort all the frogs in descending order of loss value.
Step 8: Search and position update within the population. This step is the same as step 3 of the weight initialization algorithm.
Step 9: Judge whether the leapfrog meets the stop condition. Determine whether the algorithm satisfies the convergence condition. If it does, the algorithm stops and the value of the optimal frog becomes the network weight. Otherwise, return to step 8.
Step 10: Verify the network stop condition. Determine whether the network meets the convergence condition. If it does, stop iterating; otherwise, jump to step 1.
Step 11: Optimize the final weight. After training is finished, the weight in the network is the final weight. Take the final weight as the initial frog $w_b$. The frog group is generated by formula (10), and then step 7, step 8 and step 9 are performed in sequence. Finally, the global optimal frog $w_{qb}$ is obtained as the final trained weight of the algorithm.

In the first group training, the initial weights are generated by the weight generation algorithm and saved locally or in HDFS. The master node then distributes the parameters, configuration and network updater status to all slave nodes, so that each slave trains from the same weights in a distributed, parallel way. When the training data of the last RDD partition has been trained, the master node obtains n different weight results. Unlike the parameter averaging approach, this paper uses the single-population leapfrog optimization algorithm to optimize the aggregated weights. The n aggregated weights are taken as the initial frogs, the position of the worst frog is updated continuously, and the weights carried by the optimal frog are selected as the initial weights for the next group training. Compared with the traditional parameter averaging method, the weights obtained by this optimization are more stable and do not suffer from the uncertainty introduced by averaging. Moreover, parameters obtained by averaging are likely to cause the network to fall into a local optimum, and the averaged weights can even be worse than the individual weights that were averaged. The weights obtained by the single-population leapfrog avoid these problems. The algorithm structure is shown in Figure 9.
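The difference between averaging the aggregated weights and optimizing over them can be illustrated with a toy example in plain Python. Everything here is hypothetical: the two-basin loss, the four worker results, and the simple leapfrog update are chosen only to show how an average of good weights can land in a poor region, while a search seeded with the same weights cannot end up worse than the best of them.

```python
import random


def loss(w):
    """Toy non-convex loss: two good basins near w[0] = -1 and w[0] = +1."""
    return (w[0] ** 2 - 1.0) ** 2 + w[1] ** 2


def average_weights(weights):
    """Traditional parameter averaging across workers."""
    n = len(weights)
    return [sum(column) / n for column in zip(*weights)]


def leapfrog_aggregate(weights, loss_fn, steps=100, seed=0):
    """Use the aggregated weights as initial frogs and keep improving the worst."""
    rng = random.Random(seed)
    frogs = [list(w) for w in weights]
    for _ in range(steps):
        frogs.sort(key=loss_fn)                # best frog first, worst frog last
        best, worst = frogs[0], frogs[-1]
        r = rng.random()
        candidate = [fw + r * (fb - fw) for fw, fb in zip(worst, best)]
        if loss_fn(candidate) < loss_fn(worst):
            frogs[-1] = candidate
    return min(frogs, key=loss_fn)


# Hypothetical results returned by four workers after one group training.
worker_weights = [[-1.0, 0.1], [1.0, -0.1], [0.9, 0.0], [-1.1, 0.05]]
averaged = average_weights(worker_weights)
selected = leapfrog_aggregate(worker_weights, loss)
print(loss(averaged), loss(selected))
```

In this example the workers settle in two different basins, so the average falls on the barrier between them, whereas the leapfrog result stays inside one basin.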

B. MAIN STEPS OF SPARK-SFCNN ALGORITHM
Spark-SFCNN can effectively solve the problem of multi-lesion classification of fundus images in a big data environment and realize efficient training of the SFCNN algorithm. The main algorithm steps are as follows.

V. EXPERIMENTAL RESULTS AND ANALYSIS
The proposed algorithm is implemented in Python 3.5. The experimental environment was a Windows 10 PC with a 2.6 GHz CPU and 8 GB of memory. The color retinal fundus images obtained from the DRIVE (Digital Retinal Images for Vessel Extraction) database and the DIARETDB1 (Diabetic Retinopathy Image Database) [29] are analyzed and examined to obtain the classification results of the proposed system.
Fundus images are complex: they are affected by illumination and by the interlacing of lesions. In the experiment, we first use the single-population leapfrog algorithm to find the optimal initial weight, and then monitor the loss value calculated by the forward propagation of the convolutional neural network. When an abnormal loss value is found, the single-population leapfrog algorithm is used to correct the weights that produced it, which improves the efficiency of network execution and helps the network avoid falling into a local optimum. The experimental process is shown in Figure 10.

Algorithm 3 Spark-SFCNN Algorithm
Step 1: Parameter initialization. The total number of running steps S is given as input. After the secondary grouping, the number of running steps s of each group is calculated as follows: where w represents the number of servers used for distributed computing, and t represents the total number of groupings.
Step 2: Network weight initialization. The optimal initial network weights are generated by the above-mentioned Spark-based single-population leapfrog algorithm and saved in the specified directory.
Step 3: Secondary grouping of the training set. The fundus image data of the set in formula (11) participating in this group of training are divided into n groups, where n satisfies the following formula: where w ≥ 2.
Step 4: Group training. The weights of the convolutional neural network are trained in groups. Through independent training, n different network weights are obtained, and these n network weights are gathered at the master node.
Step 5: Frog leaping optimization. Taking these n network weights as the initial frogs, the global optimal frog $f_b$ is found by leaping in the same way as in step (2), and the optimal frog is saved in the specified directory, overwriting the file saved in step (2).
Step 6: Judge the end condition. Judge whether the current grouping count has reached the total number of groupings t. If it has not, skip to step 3. Otherwise, the algorithm ends.
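The control flow of the grouped training loop can be simulated on a single machine as below. This is a hedged sketch, not Spark code and not the authors' implementation: the partitions are processed sequentially instead of on worker nodes, a one-parameter linear model `y = w * x` stands in for the CNN, and a plain best-of-n selection stands in for the leapfrog optimization of the aggregated weights. All names and values are hypothetical.

```python
def local_train(w, samples, lr=0.05, epochs=20):
    """Stand-in for one worker: SGD on the toy model y = w * x."""
    for _ in range(epochs):
        for x, y in samples:
            w -= lr * 2.0 * (w * x - y) * x
    return w


def mse(w, samples):
    """Mean squared error of the toy model on a set of samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def grouped_training(groups, n_partitions, w0):
    """Train group by group; each group starts from the best previous weight."""
    w = w0
    for group in groups:                       # data subsets M_1, M_2, ...
        partitions = [group[i::n_partitions] for i in range(n_partitions)]
        # On a cluster, each call below would run on a different worker.
        candidates = [local_train(w, part) for part in partitions]
        # Selection stand-in for the leapfrog optimization over the n results.
        w = min(candidates, key=lambda c: mse(c, group))
    return w


xs = [0.1 * k for k in range(1, 11)]
group = [(x, 3.0 * x) for x in xs]             # noise-free data with slope 3
w_final = grouped_training([group, group], n_partitions=2, w0=0.0)
```

The key design point carried over from the algorithm is that every partition of a group starts from the same saved weight, and only one optimized weight is passed on to the next group, so no synchronization is needed while the partitions train.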
elements are divided into four groups. In each group, the first value is 1 to indicate no lesion, and the second value is 1 to indicate the occurrence of the lesion. The predicted value y_predict calculated by the forward propagation of the convolutional neural network is also a two-dimensional array containing eight elements. These values are passed through softmax(), and then the loss value is calculated by formula (6). In the experiment, the loss value computed for some weights was found to be so large that it exceeded the value range of the programming language's variables, resulting in an error. Therefore, in the actual experiments each loss value is divided by 1000 to keep the maximum loss value within a manageable range.
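The label layout described above, four groups of two values each, can be sketched as follows. Formula (6) is not recoverable from this text, so the loss below uses per-group cross-entropy as an assumed stand-in; the lesion order in `LESIONS`, the function names, and the clamping constant are also hypothetical. Only the eight-element layout, the use of softmax, and the division by 1000 come from the text.

```python
import math

LESIONS = ("HE", "SE", "M", "H")   # assumed order of the four lesion groups


def encode_labels(present):
    """Map {lesion: bool} to four [no_lesion, lesion] one-hot pairs."""
    return [[0.0, 1.0] if present[name] else [1.0, 0.0] for name in LESIONS]


def softmax(values):
    """Numerically stable softmax over one group."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_loss(logits, y_true, scale=1000.0):
    """Cross-entropy summed over the four groups, then divided by `scale`."""
    total = 0.0
    for z, t in zip(logits, y_true):
        p = softmax(z)
        total -= sum(ti * math.log(max(pi, 1e-12)) for ti, pi in zip(t, p))
    return total / scale


y = encode_labels({"HE": True, "SE": False, "M": False, "H": True})
y_predict = [[0.2, 1.5], [2.0, -1.0], [0.3, 0.1], [-0.5, 0.9]]  # made-up logits
loss_value = grouped_loss(y_predict, y)
```

Subtracting the maximum inside `softmax` is a common way to avoid overflow in the exponentials, which is a separate safeguard from the division by 1000 reported in the text.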

B. SFCNN EXPERIMENTAL RESULTS
The single-population leapfrog algorithm finds the optimal initial weight by global optimization, which effectively reduces the initial loss value of the network, improves the computing efficiency of the network, and can even have a large impact on the final classification result. As shown in TABLE 1, compared with initial values generated randomly from the traditional Gaussian distribution, the initial loss value obtained by the optimized leapfrog algorithm is much lower. Images of different lesions may be similar to some extent, which makes the initial network weights generated by traditional methods unstable and leads to large gaps between the calculated loss values. Poor initial parameters require more gradient descent steps to optimize and may also cause the algorithm to fall into a local optimum. Optimizing the initial value with the single-population leapfrog algorithm therefore ensures the stability of initial value generation for the convolutional neural network in multi-label classification of fundus images. High-quality initial network weights reduce the number of network iterations, improve the efficiency of network execution, and to some extent prevent the network from falling into a local optimum.
As the number of iterations increases, the loss value of the network shows a fluctuating downward trend. However, because fundus images are highly similar and are seriously affected by illumination and tiny lesions, the error of the traditional convolutional neural network fluctuates strongly. This affects the execution efficiency of the algorithm and can even lead to a local optimum, as shown in Figure 11.
In this study, the loss value is monitored against thresholds, and the weights that produce abnormal loss values are corrected by the single-population leapfrog algorithm. The weight comparison method is as follows: where $o_1$ and $o_2$ are the two defined thresholds, $l_i$ is the loss value of the i-th iteration, and abs() is the absolute value function. The loss curve after optimization by the single-population leapfrog algorithm is shown in Figure 12; the fluctuations after optimization are smaller, and excessive fluctuations are fewer than before optimization. Thus, the possibility of falling into a local optimum is effectively reduced. After the training of the convolutional neural network is completed, a single-population leapfrog optimization is performed on the final weights of the network, which helps avoid a local optimum and further optimizes the network weights, so that the detection of lesions in complex fundus images, such as those with dark illumination or overlapping lesions, is more accurate. Figure 13 shows the multi-lesion detection results of the CNN and the proposed algorithm on three kinds of complex fundus images affected by different factors. In the figure, HE stands for hard exudates, SE stands for soft exudates, M stands for microaneurysms, H stands for hemorrhages, and Loss represents the total loss value of the network for the four lesions (accurate to one decimal place). Figure 14 shows the accuracy and the sum of losses of the two algorithms for multi-lesion recognition of similar fundus images. The figures show that the traditional CNN algorithm does not accurately detect individual lesions on some fundus images, while SFCNN achieves higher accuracy and a lower loss value on such images.
The convolutional neural network in the experiment performs multi-lesion classification and detection on the fundus image through two convolution layers, two pooling layers and one fully connected layer. The algorithm is evaluated mainly in terms of accuracy and sensitivity, and the comprehensive recognition rate and the overall recognition rate are added as references. The comprehensive recognition rate is $p = (c_1 + c_2 + c_3 + c_4)/n$, where $c_1$, $c_2$, $c_3$ and $c_4$ represent the numbers of correct detections for the four lesions, respectively, and $n$ represents the total number of samples in the test set. The overall recognition rate is the percentage of samples for which all four lesions are detected correctly at the same time. TABLE 2 and Figure 15 show that SFCNN improves on SFLA-CNN, AlexNet and LeNet-5 to varying degrees. Figure 16 compares the running time of SFCNN and SFLA-CNN. The running time increases with the number of steps, with some randomness. However, for the same number of runs, the running time of SFLA-CNN is generally longer than that of SFCNN. Therefore, the improved SFLA is better suited to the optimization proposed in this paper.
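The two reference metrics can be computed as in the following sketch. One assumption should be noted: the comprehensive rate here is normalized by the number of lesion-level decisions (four per sample) so that it stays between 0 and 1; whether the paper's $n$ counts samples or lesion-level decisions is not clear from the text. The function name and the example predictions are hypothetical.

```python
def recognition_rates(y_true, y_pred):
    """Return (comprehensive rate, overall rate) for four binary lesion labels."""
    n_samples = len(y_true)
    correct_per_lesion = [0, 0, 0, 0]          # c1, c2, c3, c4
    all_correct = 0
    for truth, pred in zip(y_true, y_pred):
        hits = [int(t == p) for t, p in zip(truth, pred)]
        for k in range(4):
            correct_per_lesion[k] += hits[k]
        all_correct += int(all(hits))          # all four lesions right at once
    # Assumption: normalize by 4 decisions per sample.
    comprehensive = sum(correct_per_lesion) / (4 * n_samples)
    overall = all_correct / n_samples
    return comprehensive, overall


# Made-up labels in the order (HE, SE, M, H); 1 means the lesion is present.
truth = [(1, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0)]
predicted = [(1, 0, 0, 1), (0, 1, 1, 0), (1, 1, 0, 1)]
comprehensive_rate, overall_rate = recognition_rates(truth, predicted)
```

The overall rate is the stricter of the two, since a sample with a single wrong lesion still contributes three correct decisions to the comprehensive rate but nothing to the overall rate.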
The experiment also compares the SFCNN algorithm proposed in this paper with the traditional CNN algorithm, the SVM [30] algorithm and the KPCA+SVM [31] fusion algorithm in terms of the accuracy of hard exudate classification in fundus images. The detection accuracy is shown as a histogram in Figure 17. The histogram shows that the SFCNN algorithm improves the accuracy of both multi-lesion detection and single-lesion detection compared with the existing algorithms, which indicates that the single-population leapfrog optimized convolutional neural network has certain superiority and effectiveness.

C. SPARK-SFCNN EXPERIMENTAL RESULTS
We first compare the single-machine single-population leapfrog algorithm with the single-population leapfrog algorithm based on Spark. In the experiment, 100 fundus images were randomly selected as the reference set, and the test frog group was set to 50 frogs. For the distributed fitness calculation of the updated frogs, the 100 fundus images are stored as one RDD, with ten images in each RDD partition. The simulation experiment recorded the time consumption (in seconds) of 50, 100, 150, 200 and 250 leapfrog update steps. The time comparison line chart is shown in Figure 18. The figure shows that the time consumption of the two methods is very close at 50 update steps, and that the gap between the two algorithms then tends to widen as the number of update steps increases. However, in the 200-step experiment the two algorithms are close again. Overall, as the number of running steps increases, the advantage of the Spark-based single-population leapfrog algorithm becomes more obvious.
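The partitioned fitness calculation described above can be illustrated without a cluster. The sketch below is a plain-Python stand-in, not Spark code: it splits 100 placeholder images into partitions of ten, evaluates a frog on each partition independently, and then combines the partial results. In PySpark the same pattern would typically be expressed with `parallelize` and `mapPartitions`. The function names, the scalar "frog" and the toy loss are hypothetical.

```python
def make_partitions(items, partition_size):
    # Split the data into consecutive chunks, like RDD partitions.
    return [items[i:i + partition_size]
            for i in range(0, len(items), partition_size)]

def partition_loss(frog, partition, loss_fn):
    # Work done independently on one partition (what one task would do).
    return sum(loss_fn(frog, image) for image in partition)

def distributed_fitness(frog, partitions, loss_fn):
    # "Map" step: one partial loss per partition.
    partials = [partition_loss(frog, part, loss_fn) for part in partitions]
    # "Reduce" step: combine the partial losses into a mean loss.
    total_images = sum(len(part) for part in partitions)
    return sum(partials) / total_images

# Toy usage: 100 placeholder "images" and a scalar frog.
images = [i / 100 for i in range(100)]
partitions = make_partitions(images, 10)
toy_loss = lambda frog, image: (frog - image) ** 2
fitness_value = distributed_fitness(0.5, partitions, toy_loss)
```

Since each partition is processed independently, the partial losses can be computed in parallel, and only ten numbers have to be combined per frog.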
To simulate the training of the Spark-SFCNN algorithm in a big data environment, this study ran 10000 training iterations for both SFCNN and Spark-SFCNN and compared their time efficiency. As mentioned above, the training set was expanded to about 5000 fundus images, which is still short of 10000. In this experiment, the 10000 iterations were therefore realized through circular training.
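Circular training can be sketched as an iterator that wraps around the training set until the requested number of iterations is reached. This is a minimal illustration; the batch size and the function name `circular_batches` are assumptions, since the section does not state how samples are grouped per iteration.

```python
from itertools import cycle, islice

def circular_batches(samples, batch_size, num_iterations):
    """Yield exactly num_iterations batches, wrapping around the sample list."""
    stream = cycle(samples)  # restarts from the first sample after the last
    for _ in range(num_iterations):
        yield list(islice(stream, batch_size))

# Toy usage: 5 samples, batches of 2, 4 iterations.
batches = list(circular_batches(list(range(5)), batch_size=2, num_iterations=4))
```

The third batch in this toy run spans the end and the start of the sample list, which is the wrap-around behavior that lets the iteration count exceed the number of available samples.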
Firstly, the number of training steps of each group was calculated by formula (11). To simplify the calculation, t is taken as 5 and w as 2, so s is taken as 200, and the number of frog update steps in the leapfrog algorithm is 50. The time comparison is shown in Figure 19. The figure shows that the Spark-based SFCNN algorithm takes less time than the single-machine SFCNN and executes more efficiently, although the time saved is not large. TABLE 3 compares the accuracy and sensitivity of SFCNN and Spark-SFCNN. The table shows that the accuracy of Spark-SFCNN in the classification and detection of hemorrhages and soft exudates is higher than that of SFCNN, while its accuracy for hard exudates and microaneurysms is lower. Overall, however, the accuracy difference between the two algorithms is small, while the time efficiency of Spark-SFCNN is better. This indicates that the Spark-based grouping optimization of the weight parameters has certain advantages.

D. ANALYSIS OF EXPERIMENTAL RESULTS
From the above experimental results, we find that the proposed SFCNN algorithm has the following characteristics:
(1) In exudate detection, the proposed algorithm has some advantages over the other algorithms, as shown in TABLE 1 and Figure 14.
(2) In the simultaneous detection of multiple lesions, the accuracy of this algorithm is also improved, but it still needs further improvement.
(3) In the detection of fundus lesions, the weights trained by this algorithm are more robust than those trained by the traditional CNN algorithm.
(4) For the single-population leapfrog algorithm based on Spark, Figure 18 shows that the time consumption of the single-machine leapfrog algorithm is higher than that of the Spark-based algorithm, but the difference is not large. The two are very close at step 50 and step 200, and the time consumption of both increases with the number of execution steps. This study also analyzes why the gap in the experiment is so small. Firstly, the locally installed virtual machines used in the simulation test have poor performance and, because computing resources are limited, only two virtual machines are available for Spark computing, so the advantages of distributed computing cannot be well reflected. Secondly, due to limited conditions, this test uses the time result of a single run rather than the average of multiple runs. During the calculation, the result is affected by the physical machine hosting the virtual machines, where other programs or the system may occupy computing resources and lengthen the running time. Nevertheless, the overall results show that the leapfrog algorithm combined with the Spark platform runs more efficiently than the traditional approach.
(5) For the weight training method based on Spark, Figure 19 shows that SFCNN takes about 10000 seconds longer than Spark-SFCNN, but this does not fully reflect the advantages of the Spark distributed computing platform. Analysis of the algorithm code shows that the reason is that the code framework has not been optimized in detail. In the distributed computation, a computation graph must be created every time a group training calculation or a leapfrog fitness calculation is started, whereas SFCNN only needs to create a computation graph once in the main run, so the experiment spends a lot of computing resources on aspects unrelated to the algorithm. Even so, this experiment is sufficient to show that SFCNN trained with Spark-based group optimization is more time-efficient. TABLE 3 shows that there is little difference in accuracy between the two, so this method has certain feasibility and research significance.
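Item (4) notes that each reported time comes from a single run. A minimal sketch of how repeated runs could be averaged to reduce the influence of background load is given below; the function name `time_runs` and the toy workload are hypothetical, and the sketch does not reproduce the timings of the experiment.

```python
import statistics
import time

def time_runs(task, repeats):
    """Run task() several times and return (mean, standard deviation) in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    # The sample standard deviation needs at least two measurements.
    spread = statistics.stdev(durations) if repeats > 1 else 0.0
    return statistics.mean(durations), spread

# Toy usage: time a small placeholder workload five times.
mean_seconds, spread_seconds = time_runs(lambda: sum(range(10000)), repeats=5)
```

Reporting the spread alongside the mean makes it easier to judge whether a small gap between two methods, such as the one at step 200, is larger than the run-to-run variation.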

VI. CONCLUSION
In this paper, the simplified SFLA treats the whole population as a single group for leapfrog calculation, and a single-population leapfrog algorithm is proposed to optimize CNN for the recognition of multiple lesions in fundus images. The uncertainty of CNN initial weight selection easily affects the execution efficiency of the network, and the backpropagation calculation of CNN easily falls into a local optimum. This paper therefore uses the single-population leapfrog algorithm to optimize the weights when the loss value is abnormal, which corrects the parameter values, helps avoid local optima, improves the recognition rate of the lesions and reduces the training cost.
The accuracy of multi-category detection is further improved. In the future, the training sample library needs to be expanded and the quality of samples and labels improved. In terms of execution time, both the latest leapfrog algorithm research results and more efficient improvement methods can be used to shorten the training time of the proposed algorithm. In the follow-up study, the application of convolutional neural networks to fundus image segmentation and lesion localization will be explored to solve the knowledge mining problems of electronic health records that are very difficult to solve by traditional methods.
He has delivered more than ten keynote speeches at international conferences and has co-chaired several international conferences and workshops in the area of fuzzy decision-making, data mining, and knowledge engineering.
YING SUN is currently pursuing the master's degree with the School of Information Science and Technology, Nantong University. Her main research interests include rough set and deep learning.
LONGJIE REN is currently pursuing the master's degree with the School of Information Science and Technology, Nantong University. His main research interests include data mining and image recognition.
HENGRONG JU received the B.Sc. and M.Sc. degrees in computer science and technology from the Jiangsu University of Science and Technology, Zhenjiang, China, in 2012 and 2015, respectively, and the Ph.D. degree in management science and engineering from Nanjing University, Nanjing, China, in 2019. He is currently a Lecturer with the School of Information Science and Technology, Nantong University. From 2017 to 2018, he worked as a Visiting Scholar with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada. He has authored or coauthored more than ten scientific articles in international journals and conferences. His current research interests include knowledge discovery and granular computing.
ZHIHAO FENG is currently pursuing the master's degree with the School of Information Science and Technology, Nantong University. His main research interests include data mining and image recognition.
MING LI is currently pursuing the master's degree with the School of Information Science and Technology, Nantong University. His main research interests include big data analysis and processing.