A Lung Dense Deep Convolution Neural Network for Robust Lung Parenchyma Segmentation



I. INTRODUCTION
Nowadays, cancer has become a dangerous disease that threatens human health to a great extent, and among all cancers, lung cancer has the highest mortality rate. With the development of CT imaging technology, CT images have become the most effective and feasible means to detect lung cancer. However, as CT accuracy has improved, the number of CT images generated by each scan has also greatly increased; the diagnostic workload of radiologists is aggravated, and diagnoses may be missed on account of fatigue. Qi et al. [1] reported that there are more than 3.12 million new cancer cases and 2 million cancer deaths per year in China. In addition, lung cancer has replaced liver cancer as the leading cause of cancer death in China. Wu et al. [2] reported that the 5-year survival rate of patients with advanced lung cancer is not higher than 15% in clinical studies, whereas the 5-year survival rate of early-stage lung cancer patients reaches 70%, indicating that the survival rate of lung cancer patients is closely related to early treatment. Therefore, the most effective and feasible way to combat lung cancer is early detection. (The associate editor coordinating the review of this manuscript and approving it for publication was Mohamad Forouzanfar.)
Zhang et al. [3] pointed out that the accurate segmentation of lung parenchyma in lung CT images is an important step in the diagnosis and treatment of lung diseases, and that it is one of the main bottlenecks restricting the application of computer-aided detection technology in the field of pulmonary disease diagnosis. Yuan et al. [4] observed that traditional segmentation methods have difficulty segmenting the boundary of the lungs; when there are blood vessels and small voids in the lung database, a good segmentation effect cannot be obtained by the traditional segmentation methods. Meanwhile, most existing lung segmentation algorithms based on deep neural networks can accurately segment the lung parenchyma regions. However, under non-ideal conditions it is still challenging to design robust lung segmentation algorithms that can accurately segment the lung regions despite the effects of blood vessels, small voids, and user cooperation. ''User cooperation'' means that the source of the CT images may differ, for instance across hospitals and CT equipment; these factors also affect the final lung parenchyma segmentation of lung CT images. The remainder of the paper is organized as follows. Section 2 outlines traditional lung parenchyma segmentation methods and modern neural-network-based segmentation methods, for example FCNs and U-Net. Section 3 describes in detail the proposed architecture for lung parenchyma segmentation. The details of the public lung image database and the corresponding ground-truth masks are presented in Section 4, together with the environmental configuration of the experiment, the network training parameters, and the metrics and measurements used in the experiments. Experimental result evaluation and comparison are presented in Section 5. Section 6 summarizes this paper.
(This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.)

II. RELATED WORKS

A. OVERVIEW OF TRADITIONAL LUNG SEGMENTATION METHODS
Traditional methods do not use machine learning and do not place relatively high demands on equipment. Anter and Hassenian [5] noted that a disadvantage of machine learning is its long training time in comparison with traditional methods, for instance histogram-based, edge-based, region-based, model-based, watershed, and clustering-based methods; hence the traditional methods take less time to execute. Traditional lung parenchyma segmentation algorithms do not reach the accuracy of the neural-network-based algorithms, but because they take less time, they are still of great value.

1) LUNG SEGMENTATION ALGORITHMS BASED ON GRADIENTS OF LUNG CT IMAGES
A computer-aided region segmentation method for plain chest radiographs was proposed by Chondro et al. [6]. Region boundaries are further improved by utilizing statistical-based region growing with an adaptive graph-cut technique that increases accuracy within any dubious gradient. Shariaty et al. [7] proposed a new method for lung segmentation in CT scans based on a thresholding algorithm, which includes a reconstruction operation to detect attached nodules and add them to the lung mask. Gopalakrishnan and Kandaswamy [8] proposed a histogram-based Adaptive Multilevel Thresholding (AMT) for estimating the total number of Gaussians and their initial parameters; in addition, the lung parenchyma segmented by the Gaussian Mixture Model (GMM) undergoes Adaptive Morphological Filtering (AMF) to reduce boundary errors. Zhang et al. [9] proposed an object-localization-improved GrabCut algorithm for lung parenchyma segmentation that can automatically select an appropriate bounding box relative to the lung parenchyma before applying the GrabCut algorithm; the algorithm can adapt to different forms of lung parenchyma and effectively improve segmentation accuracy. A traditional region-growing method is used by Tang et al. [10] to preliminarily locate the lung boundary contour; afterwards the lung boundary noise is removed and the boundary is repaired by an adaptive curvature threshold method, and finally the DRLSE model in the level-set method is used to accurately segment the lung region. Hao et al. [11] proposed a novel automatic segmentation method based on an LBF active contour model with information entropy and a joint vector. This method extracts the region of interest of pulmonary nodules by the standard uptake value (SUV) in Positron Emission Tomography (PET) images, and automatic threshold iteration is used to construct a rough initial contour. This method has great reference significance for the segmentation of lung parenchyma.
Wei et al. [12] introduced, into the objective function of image segmentation based on the Chan-Vese model, a local boundary statistical characteristic energy term that improves the accuracy and speed of lung medical image segmentation. These algorithms need to preprocess lung CT images according to prior knowledge before or while locating the boundaries of the lung parenchyma, which undoubtedly increases their complexity. In addition, the correlated segmentation steps and the various thresholds that need to be determined both make the algorithms very complex.

2) LUNG SEGMENTATION ALGORITHMS BASED ON PIXELS OF LUNG CT IMAGES
Qu et al. [13] proposed a traditional algorithm combined with the fuzzy C-means clustering algorithm. The left and right lungs can be distinguished accurately by the traditional threshold method, with accuracies of 0.9832 and 0.9807. Furthermore, in [13], the final lung parenchyma segmentation results were obtained by clustering, labeling, and merging super-pixel sub-regions of the same type; the average segmentation accuracy on lung CT images reached 0.9946. Khan [14] proposed a novel approach for segmenting lung parenchyma using a combination of colour features and improved fuzzy C-means clustering. This method overcomes the disadvantages of existing CT lung parenchyma segmentation techniques since it combines the colour features of the different pixels present in the entire image; the combination of improved FCM clustering with colour features is its major advantage. A method of lung parenchyma segmentation based on improved fuzzy C-means clustering and the Freeman chain code algorithm was proposed by Zhang et al. [15]. The improved fuzzy C-means clustering algorithm was used for a rough segmentation of the reconstructed CT image; then the missing lung parenchyma margin was repaired by the difference of the three-chain code generated by the Freeman chain code algorithm, to obtain the intact lung parenchyma region. Dharmalingham and Kumar [16] proposed a unique pathological lung segmentation method, called reference-model-based segmentation, that uses the shape property of the human lung; it constructs a reference lung model from the input slices using a novel Sampling Lines Algorithm (SLA) and extracts the shape features. Nithila and Kumar [17] proposed a fast new algorithm to segment the lung from CT images which automatically places the initial contour to locate the boundary and identify the concave edges.
The result of this algorithm is excellent compared with other active contour models. Peng et al. [18] proposed a hybrid semi-automatic method called the Hull-Closed Polygonal Line Method (Hull-CPLM) to detect the boundaries of the lung Region of Interest (ROI). The first step of Hull-CPLM is an image preprocessing method that implements the coarse segmentation using as few as 15% of the manually delineated points as initial points.
Majda and Abdelhamid [19] proposed an improved version of the standard graph-cuts algorithm based on a patch-based similarity metric. The weights between each pixel and its neighboring pixels are based on the new term obtained; the graph is then created using these weights between its nodes, and the segmentation is completed with the minimum-cut/max-flow algorithm. Similar to the lung parenchyma segmentation algorithms based on image gradients, those based on pixels are also easily affected by blood vessels and small voids, which decreases segmentation accuracy; these algorithms are likewise not adequately robust.
These traditional methods are less time consuming and less demanding on equipment, but their segmentation accuracy is not high. Comparative analyses between our proposed method and other lung segmentation methods are summarized in TABLE 1.

B. OVERVIEW OF SEGMENTATION METHODS OF LUNG PARENCHYMA BASED ON DEEP NEURAL NETWORK
In recent years, great success has been achieved by Convolutional Neural Networks (CNNs) in image classification [33]-[35], target recognition [36]-[38], detection, and other fields. CNNs are used as a feature extraction method by many networks, and the networks have been improved to achieve better results. At present, great results in the segmentation of lung parenchyma images are achieved by CNNs. Long et al. [22] proposed Fully Convolutional Networks (FCNs), which adapt contemporary classification networks (AlexNet, the VGG nets, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. An encoder and a decoder are contained in FCNs, whereas only the encoder is included in CNNs. Moreover, FCNs replace the fully connected layers with 1 × 1 convolutions to acquire the prediction masks. Yuan et al. [4] proposed an automatic segmentation algorithm for lung CT images based on U-Net, in which the accuracy of four classes reached 0.991, 0.978, 0.983, and 0.997 respectively. Gaussian and Laplacian filtering are performed on the original CT images; then the preprocessed images and the original images are taken as inputs respectively, using U-Net to execute segmentation, and all the segmented lung regions are fused by linear regression to extract the parenchymal region of the lung. U-Net can be used for more than lung parenchyma segmentation: Wang et al. [39] proposed the Dense U-Net structure for retinal vessel segmentation, which is similar to that of LDDNet. However, in the LDDNet network the Dense-Block structure is only used in the encoder part, not in the decoder part, whereas the Dense-Block structure is utilized in both the encoder and decoder parts of Dense U-Net. In addition, the method adopted in that paper divides the image into small pieces and then performs separate segmentation.
After the segmentation is completed, the segmented small images are combined to obtain the final segmentation result. Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures for semantic image segmentation that exceed human-designed ones on large-scale image classification. Liu et al. [31] proposed to search the network-level structure in addition to the cell-level structure, forming a hierarchical architecture search space; Auto-DeepLab, the architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining. Isensee et al. [32] presented nnU-Net ('no-new-Net'), a framework that automatically adapts itself to any given new dataset. nnU-Net strips away the architectural bells and whistles that are typically proposed in the literature and relies on just a simple U-Net architecture embedded in a robust training scheme. Ronneberger et al. [40] used U-Net to segment biomedical images; the U-Net architecture is shown in FIGURE 1.
Harrison et al. [23] proposed progressive holistically-nested networks (P-HNNs) for pathological lung segmentation, a bottom-up deep-learning-based approach. This method is expressive enough to handle variations in appearance while remaining unaffected by any variations in shape. P-HNNs incorporate the deeply supervised learning framework, enhanced with a simple, effective, progressive multi-path scheme that more reliably merges outputs from different network stages. Jin et al. [24] developed a 3D generative adversarial network (GAN), which effectively learns lung nodule property distributions in 3D space. They used the GAN to generate simulated training images where nodules lie on the lung border, which are cases where the published P-HNN (progressive holistically-nested network) model struggles; with this method, the P-HNN model learns to better segment lung regions under challenging situations. Gordienko et al. [25] proved the efficiency of 2D CXR analysis by lung segmentation and skeletal exclusion techniques using deep learning methods. A lung CT image segmentation using the U-Net architecture, one of the most used architectures in deep learning for image segmentation, was proposed by Skourt et al. [26]; this network can be trained end-to-end from very few images and outperforms many methods. A multi-stage training strategy, network-wise training, was proposed by Hwang and Park [27], in which the current-stage network is fed with both the input images and the outputs from the pre-stage network; this strategy is shown to reduce falsely predicted labels and produce smooth boundaries of the lung fields. Deep convolutional neural network architectures are used for automated multi-class segmentation of anatomical organs in chest radiographs (CXRs), namely the lungs, clavicles, and heart, by Novikov et al. [28].
Delayed subsampling, exponential linear units, highly restrictive regularization, and a large number of high-resolution low-level abstract features are used by them. Anthimopoulos et al. [29] introduced a deep, purely convolutional neural network for the semantic segmentation of interstitial lung diseases; the training was performed in an end-to-end and semi-supervised fashion, utilizing both labeled and non-labeled image regions. Dai et al. [30] proposed the Structure Correcting Adversarial Network (SCAN) to segment lung fields and the heart in CXR images. A critic network is incorporated by SCAN to impose on the convolutional segmentation network the structural regularities inherent in human physiology. This method generalizes well to CXR images from different patient populations and disease profiles; moreover, using only the very limited training data available, the model reaches human-level performance without relying on any pre-trained model. The segmentation accuracy of these deep neural network methods is greatly improved, but they are more time-consuming than the traditional methods. Further, they do not segment out the small voids and blood vessels in lung CT images, nor do they solve the problem of left and right lung adhesion.

III. THE PROPOSED ARCHITECTURE FOR LUNG PARENCHYMA SEGMENTATION

A. FLOWCHART OF THE LUNG PARENCHYMA SEGMENTATION SYSTEM
An encoder-decoder structure is used by our lung parenchyma segmentation system. The input lung CT images are encoded into a fixed-length vector representation by the encoder, and this fixed-length vector is decoded into the final segmentation result by the decoder. The original lung images serve as the input images, and the output is binary images, in which the lung parenchyma is represented by a pixel value of 255 and the background by a pixel value of 0. The flowchart of the lung parenchyma segmentation system is shown in FIGURE 2.

B. LDDNet FOR LUNG CT SEGMENTATION
We propose a lung dense deep convolutional neural network (LDDNet). LDDNet is a deep neural network with direct connections between any two layers: the input of each layer is the fusion of the outputs of all previous layers, and the feature map learned by each layer is passed directly to all subsequent layers as input. In this way, the depth of the network can be reduced without reducing segmentation performance, and the same segmentation effect can be achieved as with a deeper network structure. Moreover, this kind of structure alleviates vanishing gradients, saves parameters, eases underfitting, and reuses features. Furthermore, many 1 × 1 convolutions are used, which reduce the dimensionality; for instance, the amount of calculation can be reduced from 4 × 4 × 5 to 4 × 4 × 3 through 1 × 1 convolutions. As a result, the accuracy rate is slightly improved. The network structure of LDDNet can be seen in FIGURE 3.
The Dense-Block structure is used in LDDNet. Every Dense-Block is composed of layers of Batch Normalization, ReLU, and 3 × 3 Convolution, in addition to a Concatenation layer. We can see from FIGURE 3 and FIGURE 4 that the Dense-Block structure is adopted by the LDDNet network. As demonstrated in FIGURE 4, the earlier convolutional layers connect to the later convolutional layers, and each convolutional layer is followed by rectified linear unit (ReLU) and batch normalization (BN) layers. Assuming there are L convolutional layers, a traditional convolutional network has L connections, whereas there are L × (L+1)/2 connections in LDDNet based on the dense block. There is a direct connection between every two layers, which makes the most of the features of the lung CT images; the effect is that features are reused to a greater extent, which is a goal deep neural networks have long pursued. Overfitting and gradient vanishing are effectively alleviated by the dense block, which adds the larger feature values of the bottom layers to the small feature values of the top layers. In addition, since there is currently no lung CT image dataset of large size, a deep neural network tends to underfit in training, and the disappearance of gradients during training seriously restricts the improvement of accuracy. The Dense-Block structure alleviates this problem to some extent through its dense connections.
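The connectivity arithmetic above can be sketched in a few lines of Python. This is only an illustration of the dense pattern, not the authors' implementation; the 64-channel input and per-layer growth of 32 channels are assumed values chosen so that a two-layer block ends at 128 channels, as described for LDDNet's blocks.

```python
def dense_block_channels(num_layers, in_channels, growth_rate):
    """Track how many channels each layer of a dense block receives.

    Each layer consumes the concatenation of the block input and all
    earlier layer outputs, and emits `growth_rate` new channels.
    """
    inputs = []
    current = in_channels
    for _ in range(num_layers):
        inputs.append(current)   # channels fed into this layer
        current += growth_rate   # concatenation grows the channel stack
    return inputs, current

def dense_connections(num_layers):
    # every layer connects directly to all later layers (and the output)
    return num_layers * (num_layers + 1) // 2

layer_inputs, out_channels = dense_block_channels(2, in_channels=64, growth_rate=32)
print(layer_inputs, out_channels)  # [64, 96] 128
print(dense_connections(5))        # 15 direct connections vs. 5 in a plain stack
```

The quadratic growth of `dense_connections` relative to a plain chain is exactly what enables the feature reuse discussed above.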
The network structure of LDDNet is shown in TABLE 2. This neural network can reuse features in the case of a small dataset, reducing the speed at which gradients vanish. The network used in this paper is divided into two parts. The first half of the network is a convolution process, namely the LDDNet-encoder. The second half is the LDDNet-decoder, in which deconvolution is used in addition to fusing the outputs of the average pooling layers. The convolutional layer is abbreviated as 'Conv', the deconvolution layer as 'Conv_d', the final output of the LDDNet-encoder as 'Final', the fusion of the output of a pooling layer and a deconvolution layer as 'ADD', and the dropout layer as 'Dt'.
In the LDDNet-encoder, the size of the input image in this experiment is 512 × 512 × 1. The input first passes through Conv1, in which the number of convolution kernels is 64, the size is 3 × 3, and the stride is 1. Five Dense-Blocks follow Conv1. Each Dense-Block contains two convolutional layers, in which the number of convolution kernels is 64, the size is 3 × 3, and the stride is 1; each Dense-Block increases the number of channels to 128. A pooling layer is connected after each of the five Dense-Blocks. The pooling we use is average pooling, with a 2 × 2 window and a stride of 2; the pooling layers shrink the image size. Following the above structure are two convolutional layers, namely Conv2 and Conv3. Conv2 is composed of 8192 kernels of size 7 × 7 with stride 1; Conv3 is composed of 8192 kernels of size 1 × 1 with stride 1. The Final layer is composed of n kernels of size 1 × 1 with stride 1, where n is 2 in this paper, namely for the lung pixels and non-lung pixels. Two merged concatenations are also contained. In addition, Conv2 and Conv3 are each followed by a dropout layer, in which the discard probability is 0.5.
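Given a 512 × 512 input and five 2 × 2 average-pooling layers with stride 2, the spatial size reaching Conv2 can be checked with a small helper. This is a sketch only; the convolutions are assumed to preserve spatial size, so just the pooling is modeled.

```python
def pooled_size(size, num_pools, window=2, stride=2):
    """Spatial size after repeatedly applying a pooling layer."""
    for _ in range(num_pools):
        size = (size - window) // stride + 1
    return size

# 512 -> 256 -> 128 -> 64 -> 32 -> 16 after the five pooling layers
print(pooled_size(512, 5))  # 16
```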
The details of the LDDNet structure can be seen in the TABLE 2.
In the LDDNet-decoder, Conv_d1, Conv_d2, Conv_d3, Conv_d4, and Conv_d5 are performed to obtain the segmentation result. The output size of Conv_d1 is the same as the output size of Pool4. Conv_d1 is merged with Pool5 to get Add1, Conv_d2 is merged with Pool4 to get Add2, Conv_d3 is merged with Pool3 to get Add3, Conv_d4 is merged with Pool2 to get Add4, and Conv_d5 is merged with Pool1 to get Add5. Then, through the structure BN-ReLU-Conv8, Conv8 is composed of 192 kernels of size 3 × 3 with stride 1; through the structure BN-ReLU-Conv9, Conv9 is composed of 192 kernels of size 1 × 1 with stride 1. As described above, the deconvolution results are Conv_d1, Conv_d2, Conv_d3, Conv_d4, Conv_d5, and Conv_d6, and the fusion results are Add1, Add2, Add3, Add4, and Add5. The final output is a 512 × 512 binarized segmentation map. The number of classes in the experiment is 2, namely the lung parenchyma area and the background area; in the segmentation results, a pixel value of 255 represents lung parenchyma and a pixel value of 0 represents background.
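The 'ADD' fusion of a deconvolved map with a same-size pooled map can be sketched as below. Nearest-neighbour upsampling stands in for the learned deconvolution, and element-wise addition is assumed from the 'ADD' naming; the shapes are illustrative.

```python
import numpy as np

def upsample2x(x):
    # nearest-neighbour upsampling as a stand-in for a learned deconvolution
    return x.repeat(2, axis=0).repeat(2, axis=1)

# A 16x16 decoder map is upsampled to match a 32x32 pooled encoder map.
conv_d = upsample2x(np.ones((16, 16)))
pooled = np.ones((32, 32))
add2 = conv_d + pooled   # 'ADD': element-wise fusion of matching-size maps
print(add2.shape)        # (32, 32)
```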

C. THE DIFFERENCE BETWEEN LDDNet AND U-Net
U-Net is an improvement based on FCN. The difference is that U-Net does not simply encode and decode the pictures like FCN: U-Net accurately locates the high-resolution features extracted along the contracting path, and these feature maps are combined during decoding to preserve important feature information from the encoding process.
As can be seen from FIGURE 3 and FIGURE 1, only L connections are contained in a traditional L-layer convolutional neural network, while L × (L+1)/2 connections are contained in LDDNet. The difference between U-Net and LDDNet is that, in the encoder part, a network layer in U-Net does not have a direct connection to all the previous layers, whereas in LDDNet each layer has a direct connection to every previous layer: the fusion of all the previous layers is the input of the next layer.
As can be seen from FIGURE 3 and FIGURE 1, the U-Net network is very similar to the dense network. The essential difference between U-Net and LDDNet is expressed by Equation 1 and Equation 2:

C_n = C_(n-1) + W_n(C_(n-1))    (1)

C_n = W_n([C_1, C_2, ..., C_(n-1)])    (2)

Equation 1 describes U-Net. Here, n indexes the layers, C_n represents the output of layer n, and W_n represents a nonlinear transformation. For U-Net, the output of layer n is the output of layer n-1 plus the nonlinear transformation of the output of layer n-1. Equation 2 describes LDDNet. [C_1, C_2, ..., C_(n-1)] indicates that the output feature maps of layers 1 to n-1 are concatenated; concatenation is the merging of channels, whereas U-Net sums values and the number of channels is constant. W_n comprises BN, ReLU, and 3 × 3 convolutions. The output of layer n is thus a transformation of the concatenation of the outputs of layers 1 to n-1. At the end of the Dense-decoder, the cross-entropy function is adopted as the cost function.
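The difference between the two fusion rules (summation in Equation 1 versus channel concatenation in Equation 2) can be made concrete with numpy, using arbitrary illustrative shapes:

```python
import numpy as np

# Two 8x8 feature maps with 64 channels each.
a = np.ones((8, 8, 64))
b = np.ones((8, 8, 64))

# U-Net-style fusion: values are summed, the channel count is unchanged.
summed = a + b

# LDDNet-style fusion: channels are concatenated, so they accumulate.
concat = np.concatenate([a, b], axis=-1)

print(summed.shape, concat.shape)  # (8, 8, 64) (8, 8, 128)
```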

IV. EXPERIMENTAL CONFIGURATION

A. DESCRIPTION OF THE LUNG CT IMAGE DATABASE

1) PROPERTIES FOR CT IMAGES
Diagnostic and lung cancer screening thoracic CT scans with annotated lesions are included in the Lung Image Database Consortium image collection (LIDC-IDRI) [41]. It is an internet-accessible international resource for developing, training, and evaluating computer-aided diagnostic (CAD) methods for lung cancer detection and diagnosis. LIDC-IDRI was initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA). Seven academic centers and eight medical imaging companies collaborated to create this dataset, which contains 1018 cases and 244527 images. Each subject includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. Each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories (''nodule >= 3 mm,'' ''nodule < 3 mm,'' and ''non-nodule >= 3 mm''). The detailed parameters of LIDC-IDRI are shown in TABLE 3.
In practice, lung CT images are acquired in DICOM format by the CT machines in hospitals. However, the DICOM format must be converted to an image format suitable for deep neural network training; here the PNG format was selected as the image format for LDDNet. In addition, the extra information in the DICOM format, beyond the image itself, would burden the network training. The pydicom library is used to extract the image information from the DICOM files, and the images are saved in PNG format with OpenCV (cv2).
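The intensity mapping behind such a conversion can be sketched as follows. In practice the pixel array read by pydicom would be the input; the lung-window center and width used here are typical values, not parameters stated in this paper.

```python
import numpy as np

def hu_to_uint8(pixels, window_center=-600.0, window_width=1500.0):
    """Map CT Hounsfield units to 8-bit grayscale with a lung window.

    The default window center/width are common lung-window values
    (an assumption, not taken from the paper).
    """
    lo = window_center - window_width / 2.0
    hi = window_center + window_width / 2.0
    clipped = np.clip(pixels.astype(np.float64), lo, hi)
    return ((clipped - lo) / (hi - lo) * 255.0).round().astype(np.uint8)

# Three sample HU values: below, at, and above the window.
slice_hu = np.array([[-1350.0, -600.0, 150.0]])
print(hu_to_uint8(slice_hu))  # [[  0 128 255]]
```

The resulting `uint8` array is what would be written out as a PNG slice.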

2) ENHANCEMENT AND LABEL FOR CT IMAGES
The number of lung CT images chosen for the experiment from the LIDC-IDRI dataset is relatively small; hence we apply data enhancement. The images are expanded by a factor of 20 in this paper, using methods such as rotation, translation, clipping, and blurring. Instances of the enhancement of lung images can be seen in FIGURE 5: the images in the first line are translated, and the images in the bottom line are rotated.
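A minimal sketch of such enhancement is given below; the specific transforms and their parameters are illustrative stand-ins, not the exact operations used to reach the 20-fold expansion.

```python
import numpy as np

def augment(image):
    """Yield simple variants (rotation, translation, flip) of one slice."""
    yield np.rot90(image)             # 90-degree rotation
    yield np.roll(image, 1, axis=1)   # horizontal translation ("panning")
    yield np.fliplr(image)            # mirror flip

img = np.arange(16).reshape(4, 4)
variants = list(augment(img))
print(len(variants), variants[0].shape)  # 3 (4, 4)
```

Applying several such transforms, each with several parameter settings, to every slice yields the enlarged training set.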
The LIDC-IDRI dataset is public, but it provides no segmentation masks for the images. In this paper, the database is manually labeled by us with the software LabelMe. To evaluate the robustness of LDDNet, we use LabelMe to label both the lung regions and the noise regions occluding them. First, the outer ring of the lung is labeled as '_lung_'. Second, the external region of the lung is labeled as '_background_'. Finally, the blood vessels and voids occluding the lung area are labeled as '_background_'. From the third row of column c in FIGURE 9 and the second row of column c in FIGURE 15, we can see that LDDNet has good robustness: the blood vessels and voids are segmented out from the lung parenchyma region. The labeling process is shown in FIGURE 6.

3) CLASSIFICATION OF CT IMAGES
This experiment was conducted on the LIDC-IDRI database. The images for the experiment were derived from LIDC-IDRI, and the selected images were manually labeled. Two teachers and six graduate students, all engaged in lung CT image segmentation and detection research in the school of software, participated in the labeling work. Furthermore, our labeling has been validated by a radiologist of Jiangxi Cancer Hospital in Jiangxi Province, China. Our database was marked by eight people, each of whom completed five groups of 100 pictures. The number of pictures we selected is 4000; the training set, validation set, and test set contain 2400, 800, and 800 images respectively, a ratio of 0.6 : 0.2 : 0.2. The detailed parameters of the dataset are displayed in TABLE 4.
The LIDC-IDRI database consists of complete lung regions and lung regions occluded by blood vessels and small voids. LIDC-IDRI comes from different collection equipment and different people, and the patients' lungs also have different health conditions; for example, canceration of the lungs or fluid in the lungs results in different geometrical shapes in the patients' lung CT images. In order to fully verify the robustness of the network, the images are divided into four classes according to their geometry, and each class is used with LDDNet to test the segmentation performance. The classification criteria are whether the left and right lungs are clear, whether the left and right lungs are adhered, whether the slices are upper slices of the lung CT scan, and whether the left and right lungs are symmetrical. The specific characteristics of each class are described below.
As can be seen from FIGURE 7, all the lung CT images are classified into four classes: (1) Class 1 consists of lung images in which the left and right lungs are clear and symmetrical; such images occupy the largest proportion of the dataset in TABLE 4. (2) Class 2 consists of lung images in which the left and right lungs are adhered. This kind of sticky image is relatively difficult to segment, and the proportion of such images in the dataset is the smallest, but the segmentation accuracy is still above 0.95, which shows that LDDNet has good robustness. As is shown in the red circles in

B. EXPERIMENT ENVIRONMENTAL CONFIGURATION AND NETWORK TRAINING PARAMETERS
TABLE 5 shows the details of the experiment environment. Because the memory of our experimental equipment is relatively small (32 GB), the batch size is set to 2. The learning rate is set to 0.00001, the dropout rate is set to 0.5, and the number of iterations is set to 50000. From FIGURE 8 it can be seen that the model accuracy becomes steady after about 30000 iteration steps; hence we set 50000 as the iteration number to ensure that training converges sufficiently and to make the most of the lung CT images. Let J denote the cost function, y_j(z) the desired label probability of training image z for class j, and f_j(z) the actual network output:

J = - Σ_j y_j(z) log f_j(z)    (3)

For optimization, the mini-batch Adam algorithm is adopted to minimize the cost function. The lung CT images are composed of mostly non-lung regions and only a few lung regions, which makes the standard stochastic gradient descent (SGD) algorithm less suitable. The Adam algorithm increases the learning rate for sparse data and decreases it for common data, updating quickly for sparse features and slowly for common features.
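The cross-entropy cost described above can be sketched in numpy as follows; this is a two-pixel, two-class toy example, with the network's softmax output assumed as input.

```python
import numpy as np

def cross_entropy(desired, actual, eps=1e-12):
    """Mean pixel-wise cross-entropy J between desired label probabilities
    and the network's (softmax) output probabilities."""
    actual = np.clip(actual, eps, 1.0)  # guard against log(0)
    return -np.sum(desired * np.log(actual), axis=-1).mean()

# Two pixels, two classes (lung / non-lung); one-hot desired labels.
desired = np.array([[1.0, 0.0], [0.0, 1.0]])
actual  = np.array([[0.9, 0.1], [0.2, 0.8]])
print(round(cross_entropy(desired, actual), 4))  # 0.1643
```

In training, an optimizer such as mini-batch Adam would minimize this quantity averaged over all pixels of a batch.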

C. METRICS AND MEASUREMENTS FOR EXPERIMENTS
For the segmentation results of the experimental images, the Jaccard Similarity (JS) is used as the evaluation index for segmentation accuracy, as shown in Equation 4:

JS = |A ∩ B| / |A ∪ B|    (4)

where A represents the area segmented by the algorithm and B represents the area marked manually. The better the neural network segmentation, the larger the value of JS; the worse the segmentation, the smaller the value. JS ranges from 0 to 1. Here FP (false positive) and FN (false negative) denote the number of misclassified non-lung pixels and lung pixels in the test images respectively, while TP (true positive) and TN (true negative) denote the number of correctly recognized lung pixels and non-lung pixels respectively. The accuracy represents the proportion of correctly segmented pixels and is calculated as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
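All of these counts and ratios follow directly from a pair of binary masks, as the following numpy sketch shows; the tiny masks are illustrative, and the standard definitions of the overlap metrics are used.

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Compute overlap metrics from two binary masks
    (pred = network output, truth = manual label)."""
    tp = np.sum((pred == 1) & (truth == 1))
    tn = np.sum((pred == 0) & (truth == 0))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    return {
        "JS": tp / (tp + fp + fn),                    # |A ∩ B| / |A ∪ B|
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),                     # equals sensitivity (SE)
        "specificity": tn / (tn + fp),
        "DSC": 2 * tp / (2 * tp + fp + fn),
    }

pred  = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
m = segmentation_metrics(pred, truth)
print(round(m["JS"], 2), round(m["DSC"], 2))  # 0.5 0.67
```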
The precision (p), recall (r), specificity (SP), sensitivity (SE), and Dice similarity coefficient (DSC) are calculated as follows:

p = TP / (TP + FP), r = SE = TP / (TP + FN), SP = TN / (TN + FP), DSC = 2TP / (2TP + FP + FN)

Image preprocessing can reduce or improve the performance of deep neural networks, so part of our work compares the impact of some common preprocessing methods on LDDNet. The preprocessing methods applied to the lung CT images are contrast enhancement, median filtering, and Laplacian filtering. Median filtering is a nonlinear smoothing technique that sets the gray value of each pixel to the median of the gray values of all pixels in a neighborhood window around that point. The Laplacian is a differential operator; applying it enhances areas with abrupt changes in gray level and weakens areas with slow changes. For the contrast-enhancement experiments, we use a mask to enhance the image contrast. Contrast enhancement, median filtering, and Laplacian filtering are each used to preprocess the images before they are input into LDDNet, while LDDNet with no preprocessing is run as the comparison. The experimental results are shown in TABLE 6. Furthermore, the accuracy of LDDNet is higher than that of the traditional methods [8], [14], [19]. From the above, the conclusion can be drawn that LDDNet performs better than most traditional methods, and that some preprocessing methods, such as the median filter and Laplacian filter, further improve its performance in lung parenchyma segmentation.
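The median-filtering step described above can be sketched as a minimal 3×3 window filter; the function name and the choice to leave border pixels unchanged are our assumptions, not the paper's actual preprocessing pipeline:

```python
def median_filter_3x3(img):
    # Each interior pixel is replaced by the median gray value of its
    # 3x3 neighbourhood window, as described in the text; border pixels
    # are left unchanged in this simplified sketch.
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = sorted(img[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = window[4]  # median of 9 values
    return out
```

As a usage example, a single bright impulse-noise pixel surrounded by dark pixels is removed, which is why median filtering smooths CT noise without blurring edges as strongly as averaging does.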

B. EFFECT OF THE CT LUNG IMAGE CLASSIFICATION
The following experimental images are the segmentation results of the four classes of lung CT images produced by LDDNet; the segmented images are shown at the bottom of FIGURE 9. The blood vessels and small voids are segmented out of the lung region in class 1, class 2, class 3, and class 4, and the regions marked with red circles in FIGURE 9 demonstrate the robustness of LDDNet. The specific parameters of the four classes of segmentation results are given in TABLE 13. As shown in FIGURE 9 (class 2), the left and right lungs in this CT section are joined relatively tightly. A human annotator can mark them well by magnifying the image; nevertheless, the traditional method does not achieve good separation of the left and right lungs.
FIGURE 10, FIGURE 11, FIGURE 12, and FIGURE 13 show the segmentation results of each class of images in detail. In each figure, column a shows the original lung CT images, column b the ground truth, and column c the segmentation results of the LDDNet network. In the images of column d, the false-positive and false-negative errors are shown in green and red, respectively.
For the relatively good images of class 1, our network produces good segmentation results. As can be seen from column a in FIGURE 10, the class 1 images contain relatively complete left and right lungs that are not adhered to each other, and the image quality of this class is relatively high; accordingly, LDDNet segments this class well. The first and third rows of column d in FIGURE 10 show that not only is the lung parenchyma segmented, but the blood vessels and small voids are also well separated.
For the class 2 images, in which the left and right lungs are bonded, the first row of column d in FIGURE 11 shows that the left and right lungs have been separated, and the segmentation of the lung edge regions is very good, for example in the red circles marked in column d. As column a in FIGURE 11 shows, all images of this class have bonded left and right lungs; the image quality is worse than that of class 1, and LDDNet's segmentation of this class is accordingly somewhat poorer. Column d in FIGURE 11 shows that the lung parenchyma is segmented and the bonded parts of the left and right lungs are also divided. As for the class 3 images, taken from the upper part of the CT lung slice, FIGURE 12 shows that the lung regions are smaller than in class 1 and class 2, which makes segmentation difficult; even so, the second row of column d shows a good segmentation result. As column a in FIGURE 12 shows, class 3 consists entirely of upper-slice lung CT images. Their quality is worse than that of class 2 and the left and right lungs occupy fewer pixels, so LDDNet's segmentation of this class is also relatively poor. Nevertheless, column d in FIGURE 12 shows that the lung parenchyma is segmented and its edges are divided relatively well, for example in the red-circled parts of column d.
As for the asymmetrical images of the left and right lungs, FIGURE 13 shows that the segmentation results are also very good. Column a in FIGURE 13 shows that the class 4 images are asymmetric images of the left and right lungs, most of them missing the upper part of the right lung. The image quality of this class is worse than that of class 1 but better than that of class 2 and class 3, and LDDNet's segmentation of this class is also somewhat poorer. Column d in FIGURE 13 shows that the lung parenchyma is segmented and that the upper edge of the right lung is relatively smooth, for example in the parts of column d marked with red circles.
From FIGURE 10, FIGURE 11, FIGURE 12, and FIGURE 13 above, it can be seen that good segmentation results are obtained by LDDNet for all four classes of images, which proves that LDDNet is robust in lung parenchyma segmentation.

C. EFFECT OF THE BLOOD VESSELS AND SMALL VOIDS
Lung parenchyma segmentation algorithms can be interfered with by blood vessels and small voids. These regions cannot be segmented out by traditional methods, nor are they captured in the corresponding ground-truth masks.
FIGURE 14 and FIGURE 15 show the processing results of a total of three CT slices each. Column a shows the original lung CT images, column b the ground-truth masks, column c the network segmentation results, and column d the visual segmentation results, in which false-positive and false-negative errors are shown in green and red, respectively. LDDNet segmentation is used as the contrast experiment for the effect of blood vessels and small voids, which can be seen in FIGURE 14 and FIGURE 15. The red circles in column c of FIGURE 14 show that the blood vessels are segmented out of the lung parenchyma by LDDNet. In the third row of column c in FIGURE 14, although the blood-vessel segmentation in the red circle is not very sharp, the basic outline is still segmented well. The first row of column d in FIGURE 14 also shows that the left and right lungs were successfully separated.
The red circles marked in FIGURE 15 show that LDDNet also segments the lung parenchyma well, and column c in FIGURE 15 shows that the small voids inside the lung parenchyma have been segmented out. To further test the robustness of LDDNet, we labeled the small voids in the lung parenchyma but deliberately left some of them unlabeled: in the second and third rows of column b in FIGURE 15, the small voids are not labeled, and LDDNet performs the segmentation directly. The results in the second and third rows of column c in FIGURE 15 show that LDDNet still segments the small voids out of the lung parenchyma. From these two sets of comparative experiments, we can conclude that the LDDNet network is robust in segmenting lung parenchyma containing interference such as small voids and blood vessels.

D. EFFECT OF INPUT IMAGE SIZE
Medical CT images are generally large, and it is impractical to feed the original lung CT image into LDDNet during segmentation. Hence, in this paper, the image is resized to 512 × 512 and to other pixel sizes for comparison.
The following experiments compare the effect of image size on LDDNet; the results are shown in TABLE 7.
As demonstrated in TABLE 7, LDDNet is trained with input images of different sizes, 512 × 512 and smaller. The performance with larger image sizes is better, because shrinking the image leads to a loss of semantic information, so that the LDDNet network cannot learn enough features. LDDNet needs to learn pixel-level information from the images during training; for instance, a 512 × 512 image contains sixteen times as many pixels as a 128 × 128 image. Hence, when the pixels available to LDDNet are insufficient, the network cannot learn adequate segmentation performance, and images with more pixels achieve higher accuracy than images with fewer pixels. It can be concluded from TABLE 7 that as the image size becomes smaller, LDDNet's segmentation of the lung parenchyma gradually gets worse.
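The pixel-information loss caused by shrinking the input can be illustrated with a minimal nearest-neighbour downsampling sketch; the function name is hypothetical and this is not the resizing routine used in the paper:

```python
def resize_nearest(img, new_h, new_w):
    # Nearest-neighbour resize: each output pixel copies the nearest
    # source pixel, so downsampling simply discards most of the
    # original pixel information.
    h, w = len(img), len(img[0])
    return [[img[i * h // new_h][j * w // new_w]
             for j in range(new_w)] for i in range(new_h)]
```

Downsampling a 4 × 4 grid to 2 × 2 keeps only 4 of the 16 values; by the same ratio, a 128 × 128 input retains only one sixteenth of the pixels of a 512 × 512 slice.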

E. EFFECT OF TYPE OF POOLING
The most commonly used pooling layers are maximum pooling and average pooling. Average pooling averages the feature points in the neighborhood, which reduces the error caused by the increased variance of the estimate in a limited-size neighborhood. Maximum pooling takes the maximum of the feature points in the neighborhood, which reduces the error in the estimated mean caused by convolution-layer parameter errors.
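The two pooling types can be compared with a minimal 2 × 2, stride-2 sketch; the function name and the even height/width assumption are ours:

```python
def pool2x2(feature_map, mode="max"):
    # 2x2 pooling with stride 2: "max" keeps the strongest activation in
    # each window, "avg" averages the window (height/width assumed even).
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            window = [feature_map[i][j], feature_map[i][j + 1],
                      feature_map[i + 1][j], feature_map[i + 1][j + 1]]
            row.append(max(window) if mode == "max" else sum(window) / 4)
        out.append(row)
    return out
```

On a window of [1, 2, 3, 4], max pooling yields 4 while average pooling yields 2.5, which illustrates why the two variants produce similar but not identical feature maps.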
Two different types of pooling layer are used for comparative experiments. As indicated in TABLE 8, max pooling gives slightly better accuracy than average pooling for LDDNet. Precision, recall, DSC, sensitivity, specificity, and Jaccard Similarity are very similar for the two types of pooling; the DSC is even identical. So we can conclude that average pooling and maximum pooling have very little effect on LDDNet for lung parenchyma segmentation. The structures of the two types of pooling layer are shown in FIGURE 16 below.
To further explore the effect of the pooling-layer configuration on LDDNet's segmentation of the lung parenchyma, we designed comparative experiments with four-layer average pooling, four-layer maximum pooling, five-layer average pooling, five-layer maximum pooling, six-layer average pooling, and six-layer maximum pooling. The results of these comparative experiments can be seen in TABLE 9.
As can be seen from TABLE 9, the effect of the pooling layer on the experimental results is very subtle; the pooling layer does not greatly change LDDNet's performance in segmenting the lung parenchyma. Different types and numbers of pooling layers produce only slight variations in accuracy, for instance between four-layer average and four-layer maximum pooling. As TABLE 8 and TABLE 9 show, LDDNet still achieves an accuracy of more than 0.99 under different pooling-layer settings, which also illustrates the robustness of LDDNet.

F. EFFECT OF DENSE CONNECTION
LDDNet uses dense connections, which obtain better results than a deep neural network of the same depth without them. We also use Dense Blocks to reduce overfitting: the dense blocks effectively alleviate overfitting and gradient vanishing by passing the larger feature values of the bottom layers through to the smaller feature values of the top layers. Hence, we designed two comparative experiments, one with dense connections and one without, to explore the effect of dense connections on lung parenchyma segmentation by LDDNet.
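The dense-connectivity pattern, in which every layer receives the outputs of all preceding layers, can be sketched abstractly; this is a toy illustration of the connection scheme, not LDDNet's actual Dense Block:

```python
def dense_block(x, layers):
    # Dense connectivity: each layer is a function of the list of ALL
    # earlier feature maps (input included), so bottom-layer features are
    # reused directly by the top layers instead of being lost.
    features = [x]
    for layer in layers:
        features.append(layer(features))
    return features

# Toy "layers" that simply combine everything they can see:
toy_layers = [lambda fs: sum(fs) + 1, lambda fs: sum(fs)]
```

Because every layer's input includes the original features, gradients have short paths back to the bottom layers, which is the mechanism behind the reduced gradient vanishing and overfitting noted above.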
As TABLE 10 shows, removing the dense connections from LDDNet produces much worse results: the accuracy drops from 0.9943 to 0.9289. Dense connections make full use of the images to train the segmentation model, and without enough reused features the accuracy of LDDNet naturally drops considerably. Therefore, dense connection is also one of the methods LDDNet uses to reduce overfitting and improve accuracy. As can be seen from TABLE 11, the Dense Block is very important for LDDNet to make full use of the features; with different numbers of dense blocks, the accuracy of LDDNet still reaches more than 0.994.

G. COMPARISON WITH OTHER METHODS AND ROBUSTNESS EVALUATION
Agnes et al. [20] proposed a convolutional deep and wide network (CDWN) to segment the lung region from chest CT scans for further medical diagnosis; CDWN learns the filters required to extract hierarchical feature representations at its convolutional layers. Geng et al. [21] used the first three parts of the VGG-16 network for convolution and fused the multi-scale convolution features; each pixel is then predicted with an MLP to segment the parenchymal region. Ronneberger et al. [40] presented a network and training strategy that relies on strong data augmentation to use the available annotated samples more efficiently. Heewon et al. [43] proposed a novel lung segmentation method to minimize the juxta-pleural nodule issue, adding the final nodule candidates to the area of the Chan-Vese (CV) model results to modify the lung contour. Chen et al. [44] designed modules that employ atrous convolution in cascade or in parallel to capture multi-scale context through multiple atrous rates. CRF models [44] are composed of unary potentials on individual pixels or image patches and pairwise potentials on neighboring pixels or patches; the resulting adjacency CRF structure is limited in its ability to model long-range connections within the image and generally results in excessive smoothing of object boundaries. To improve segmentation and labeling accuracy, the CRF framework has been extended to incorporate hierarchical connectivity and higher-order potentials defined on image regions. DenseCRF is a densely interconnected conditional random field in which all pairs of variables are connected by pairwise potentials whose parameters are learned from training data.
The results in TABLE 12 confirm that LDDNet achieves substantially better results than the other methods in terms of average accuracy, precision, recall, DSC, sensitivity, and specificity. The average accuracy for the segmentation of lung regions in the lung dataset is 0.99, which is significantly better than the thresholding result [20], whose accuracy is 0.77 without post-processing, and better than the other methods as well. The accuracy of LDDNet reaches 0.994378, an improvement over thresholding of 29.12%, over CDWN with dropout of 9.26%, over CDWN without dropout of 1.45%, over VGG-16 of 0.77%, over Heewon of 3.51%, and over Deeplab of 1.29%. Further, by adjusting parameters such as the dense connections, pooling layers, and input image size, we obtain accuracy rates of 0.994726 and 0.995109 with median filtering and four-layer average pooling, respectively. Therefore, it can be concluded that LDDNet segments the lung parenchyma better than the other methods, and the proposed LDDNet is meaningful in the field of lung parenchyma segmentation from CT images.
Comparative experiments on the four classes of images were also conducted in this paper. The segmentation parameters of LDDNet for the four classes of lung CT images can be seen in TABLE 13.
For the LDDNet network, the training results on our images classified into four classes are compared with the existing methods. High accuracy is obtained by LDDNet in class 1, class 2, class 3, and class 4, with the average segmentation accuracy reaching 0.98 or higher in TABLE 13; this demonstrates that LDDNet can segment lung CT images under different conditions. Class 1 and class 4 achieve higher accuracy than the other methods because their images are relatively complete. Class 2 and class 3 do not achieve better accuracy than the other methods; such results are acceptable because the class 2 and class 3 images are not of very high quality. Class 2 images are those in which the left and right lungs are bonded and the edges are not smooth, so LDDNet has a weaker ability to learn pixels in the edge region. However, as FIGURE 11, FIGURE 14, and FIGURE 15 show, although the segmentation accuracy is not high, the internal blood vessels and small voids are still segmented out by LDDNet. Class 3 images are the upper slices of the lung CT, in which the left and right lungs occupy fewer pixels, so the information available for LDDNet to learn is insufficient and the segmentation accuracy decreases. Nevertheless, as FIGURE 12 shows, LDDNet still segments out the small voids and blood vessels. Thus, whether LDDNet is tested per image class or trained on the dataset as a whole, it achieves good accuracy. In summary, LDDNet has good robustness while ensuring good segmentation performance.

VI. CONCLUSION
Lung segmentation algorithms play an important role in lung cancer detection systems and directly affect the accuracy of lung cancer verification and recognition. The lung dense deep convolutional neural network (LDDNet) for lung segmentation is introduced in this paper. LDDNet exploits the grayscale and texture features of lung CT images and achieves high accuracy and robustness. Contrasting runs with and without image preprocessing shows that some preprocessing methods, such as the median filter and Laplacian filter, improve LDDNet's lung parenchyma segmentation results, while others, such as contrast enhancement, reduce the segmentation accuracy. Besides manually marking the ground truth for LIDC-IDRI, we also labeled the blood vessels and small voids in order to analyze the effect of these interference factors on lung segmentation. Finally, better segmentation results are achieved in terms of accuracy and robustness: with LDDNet, the final result improves on most traditional lung segmentation methods and deep convolutional networks such as VGG, U-Net, and CDWN. The segmentation of lung CT images will therefore continue to be studied for specific clinical applications. Too many blood vessels and small voids in lung CT images can seriously interfere with segmentation accuracy; designing a more robust algorithm for automatic segmentation of the lung parenchyma and a more accurate method for labeling it are among our future research directions, and more attention should be paid to the segmentation of blood vessels and small voids. Our goal is to improve the segmentation accuracy of lung CT images and thereby support doctors' clinical diagnoses and promote medical progress.
YERONG WANG was born in 1993. He received the B.S. degree in software engineering from the Changchun Institute of Technology, Changchun, China. He is currently pursuing the M.S. degree with Nanchang Hangkong University, under the supervision of Prof. Y. Chen.
His main research interests include software engineering, image processing, and machine learning.