Automatic Segmentation Using a Hybrid Dense Network Integrated With a 3D-Atrous Spatial Pyramid Pooling Module for Computed Tomography (CT) Imaging

Contrast-enhanced computed tomography (CT) is extensively used for the assessment and segmentation of multiple organs, especially organs at risk, and is an important factor in clinical decision making. Automatic segmentation and extraction of abdominal organs, such as thoracic organs at risk, from CT images are challenging tasks because of the low contrast between an organ and the organs surrounding it. Various deep learning models based on 2D and 3D convolutional neural networks have been proposed for the segmentation of medical images because they extract features automatically from large labeled datasets. In this paper, we proposed a 3D-atrous spatial pyramid pooling (ASPP) module integrated with a proposed 3D DenseNet encoder-decoder network for the volumetric segmentation of abdominal organs from CT. The proposed network used a 3D-ASPP block to capture spatial information at multiple scales in the input feature maps from the decoder side. Combined with the 3D DenseNet network, the 3D-ASPP block processed 3D medical volumetric images automatically. The proposed hybrid network for volumetric segmentation of CT images was named 3D-ASPPDN. We tested our proposed approach on a public dataset, Thoracic Organs at Risk (SegTHOR) 2019. The proposed method achieved a Dice score of 97.89% on the SegTHOR dataset and performed well in comparison with existing state-of-the-art DL methods. The results showed that 3D-ASPPDN exhibited enhanced performance in volumetric biomedical segmentation. The proposed model could be used for volumetric segmentation in clinical applications to diagnose problems in multiclass organs.


I. INTRODUCTION
Contrast-enhanced computed tomography (CT) has been used effectively for thoracic diseases and is of great importance in the medical field. Radiologists spend a lot of time identifying and localizing organs in CT images. Automatic segmentation and localization of organs could help diagnose the thoracic organs at risk in CT images. CT images contain multiple organs with a three-dimensional (3D) structure, and manually segmenting each slice of a 3D volume is a time-consuming process. Automatic segmentation based on deep learning models could produce effective results for the segmentation and detection of organs in 3D CT images. A lot of work has been done on the automatic segmentation of CT images in the biomedical field. Most segmentation in the medical field has been done on the individual slices of each patient, for example with the 2D U-Net model. The U-Net model consists of an encoder part and a decoder part with skip connections and produces improved results in medical image segmentation.
The associate editor coordinating the review of this manuscript and approving it for publication was Madhu S. Nair.
In 3D deep learning models, all slices of a CT volume are segmented at once. However, the first issue in 3D segmentation models is computational complexity. The second issue is that the large number of weight parameters can cause the model to overfit when the training data are limited. The second issue could be resolved by using small 3D convolutional layers or by decreasing the number of channels per convolutional layer, but this idea might not work well, especially for small training datasets. Semantic segmentation produces an effective and rich representation of tumors and achieves accurate extraction of contact surface area [1], irregularity [2], and morphological traits.
Automatic segmentation is generally used in addition to semiautomatic and manual segmentation. Manual segmentation requires expertise. It is subjective, requires considerable processing time, and is poorly reproducible. It depends entirely on handcrafted features; hence, it is impractical for real applications [3]. Similarly, semiautomatic segmentation initially depends on human involvement, which could cause mistakes and errors. By contrast, automatic segmentation could be accurate, produce minimal errors, and help surgeons with segmentation. However, structural variability, noise, the complexity of 3D spatial multiclass features, large-scale spatial variability, partial volume effects, and the similarity of nearby organs make automatic segmentation a difficult task [3].
Convolutional neural networks (CNNs) have recently been used for classification and other important applications, such as object detection and volumetric image segmentation. They achieve state-of-the-art performance compared with traditional machine learning models. CNN models can automatically extract hierarchical features from raw images without depending on handcrafted features. The deep layers in CNN models capture global information because of their large receptive fields [4], while shallow layers grasp only local information. The automatic extraction of structural information and the delineation of organs from images [5] are highly needed to perform visual augmentation [6], interventions [7], and computer-assisted diagnosis [8]. Interventional and diagnostic imaging consists of 3D images, and volumetric segmentation can consider the entire volume content at once, which is a requirement for the automatic segmentation of biomedical images. The depth of 3D CNNs is limited compared with 2D networks because of many constraints, such as the memory consumption of the graphics processing unit (GPU), high computational cost, and the requirement for a large number of annotated datasets. These 3D CNN models are also not flexible and efficient for 3D image sequences because of the large number of parameters and the limited hardware resources.
Many approaches based on patch-wise image classification, postprocessing steps, and different techniques, such as Markov random fields [9], voting strategies [10], and level sets [11], are used in combination with CNN models to perform volumetric segmentation. These techniques generally fail to produce accurate and efficient volumetric segmentation. Patch-wise approaches do not perform well because of their high computational cost, and they produce redundant information that increases the runtime of the algorithm. Therefore, efficient computational schemes are required to tackle 3D CNN segmentation problems.
Fully convolutional networks (FCNs) [12] have been widely used for dense segmentation. They replace fully connected layers with upsampling layers to retain the spatial structure. The upsampling layers take downsampled feature maps and restore them to the original input resolution. FCN models automatically produce a pixel-to-pixel segmentation map for medical image segmentation [12]. For the dense semantic segmentation task, dense labels provide accurate solutions for small lesions or objects and reduce the classification error rate. This approach, however, has two issues. First, the receptive field is enlarged by using convolutional layers with pooling or striding. Second, the resolution of input images is downsampled, and small objects with different scales and shapes in multiclass scenarios are difficult to classify [13].
Thus, the application of deconvolution or bilinear interpolation by using an upsampling layer at the decoder part of the network could not assure an accurate segmentation map. Various state-of-the-art networks, such as DeepLab with an atrous spatial pyramid pooling (ASPP) module [14], U-Net [15], FCN [12], dense FCN, and residual dense FCN, have been proposed to handle this issue. DeepLabv2 [16] introduces ASPP for semantic segmentation.
The ASPP network handles objects at multiple scales [17] to capture multiscale information on the basis of parallel atrous-based CNN layers by using multiple atrous rates. It achieves efficient and robust performance in dense semantic labeling and detects small objects in various scales and shapes.
ASPP consists of different numbers of atrous CNN layers and combines these layers in a parallel branch. Each layer uses different multiple kernel functions to capture feature maps with specific field of view. Simple 3D CNN models might not perform well, and we need a hybrid solution to tackle 3D segmentation problems and incorporate some valuable information at the dataset level and in model architecture.
In this study, we proposed a 3D-ASPP module in a 3D DenseNet network (3D-DN) model for the automatic segmentation of CT images. An ASPP block can be integrated within any CNN model. The ASPP module was embedded in the proposed 3D-DN as a bottom layer to extract contextual information at multiple resolutions. We extended 2D ASPP into a 3D-ASPP block and integrated it with the proposed 3D-DN for volumetric segmentation. According to our literature review and to the best of our knowledge, the 3D-ASPP module was proposed for the first time. A deep learning (DL)-based model, which processed volumetric slices for volumetric segmentation on the basis of abdominal CT volumes, was proposed. We also proposed volumetric convolutions that processed 3D input slices as a volume. The main contributions of this work are as follows:
• First, we proposed 3D-DN for 3D segmentation. 3D-DN involved an encoder-decoder structure. Various 3D DenseNet blocks were introduced into the encoder-decoder part of the network.
• Second, we introduced a 3D-ASPP module into 3D-DN at the bottom of the encoder-decoder module. The 3D-ASPP block captured spatial information at multiple scales and produced improved segmentation. ASPP extracted contextual information in a multiscale form and recovered spatial information with different fields of view to determine sharp object boundaries from encoder to decoder.
Our approach for 3D volumetric segmentation provided an automatic solution by processing the complete volume of a patient at once for accurate and robust segmentation. In our proposed method, no postprocessing steps were required for further processing and evaluation. We tested our model on an abdominal CT dataset, Thoracic Organs at Risk (SegTHOR) 2019 [18]. The proposed technique was generalized for the SegTHOR 2019 dataset, and its performance was compared with that of existing state-of-the-art segmentation models.
VOLUME 8, 2020

II. RELATED WORK
DL models have been used for natural language processing, image analysis [19], image classification, and image segmentation. These deep neural network (DNN) models have been successfully used in medical imaging challenges [20]. Meanwhile, CNN-based models have shown the best performance in the medical imaging domain by using either patch-based or multiscale pixel-based segmentation approaches that could improve the segmentation results.
For example, Zhang et al. introduced CNN-based brain tissue segmentation based on multimodal magnetic resonance imaging (MRI) [21]. Pereira et al. introduced a CNN model for the segmentation of brain tumors in multimodal MRI [22] and claimed satisfactory results for complete tumors. Lee et al. proposed a DL-based CNN model for brain segmentation features [23]. Li et al. proposed a 2D CNN model using CT slices for segmentation and compared its performance with that of traditional machine learning approaches, such as AdaBoost [24], random forests [25], and support vector machine [26]. This study determined that CNNs had limitations in tumor segmentation because of unclear borders and uneven density. However, existing CNN models have been used for segmentation on the basis of 2D slices that are extracted from 3D volumes. These models are a reasonable choice when computational resources and memory are limited. We required an automatic 3D volumetric segmentation solution based on DL models that could consider the axial direction of 3D volumes to assist medical doctors and would be beneficial in health applications.
In early studies, Shakeri et al. introduced a 2D CNN combined with a 3D conditional random field algorithm as postprocessing for tumor detection from brain slices [27] to achieve volumetric consistency. Cihik et al. first used a 3D CNN-based U-Net for the segmentation of sparsely sequential volumetric images and then a 3D model on a large scale [28]. Dolz et al. proposed a 3D CNN based on an FCN model for the segmentation of brain MRI images [29]. They introduced small kernels in DNNs to reduce computational complexity and memory cost in 3D CNN models. Andermatt et al. [30] introduced a 3D recurrent neural network for the segmentation of gray and white matter in a brain MRI dataset.
Bui et al. introduced a 3D dense CNN for the segmentation of a brain volumetric dataset [31]. A densely connected 3D model captured multiscale contextual information and achieved fast convergence with an effective discrimination capability [32]. Oktay et al. introduced a 3D attention U-Net model for medical image segmentation. This novel technique captured target structures of different shapes and sizes [33] from a medical imaging dataset. Nevertheless, 3D CNNs still encounter bottlenecks due to hardware limitation, complex datasets, and memory constraints. In the literature, most deep 3D CNN models use a brain MRI dataset. On the contrary, we covered some recent advancements in abdominal datasets by using 3D CNN models.
Lu et al. presented a 3D CNN model with a graph cut technique for the segmentation of liver tumors but tested it on only one dataset to judge the generalized behavior of the model [34]. A few authors [35]-[38] used deep learning models for the segmentation of other organs in CT, such as the liver and its tumors, and applied FCN models based on encoding and decoding techniques. Most of the DL models that have been used for liver or lesion segmentation are based on 2D slices extracted from 3D image volumes. These models do not fully consider the spatial information of 3D volumes.
We tested our proposed model on the publicly available SegTHOR 2019 dataset. The main challenge in this dataset is that the shape and position of each organ vary greatly from slice to slice and that the contrast in CT images is very low. This is a multiclass segmentation problem, and the dataset contains four organs at risk: heart, aorta, trachea, and esophagus. Recently, deep learning-based models using the SegTHOR 2019 dataset have been published for this segmentation problem [40]. He et al. [41] presented a Dense V-Net deep learning model and achieved optimal performance with some postprocessing steps. They used a patch processing approach with Dense V-Net to avoid extra computational burden. Vesal et al. [42] introduced a 2D T-Net approach using the organs-at-risk dataset. Han et al. [43] proposed a 3D U-Net with multitasking techniques and also applied some auxiliary tasks to improve generalization on the SegTHOR 2019 dataset. Kim et al. [44] presented a 2D dilated residual U-Net model and produced better results. Chen et al. [45] presented a V-Net model for the organs-at-risk dataset. Zhang et al. [46] used a cascading approach based on two 3D U-Net models, which could increase the computational complexity of training. Lalande et al. [47] introduced an ensemble approach to fuse three-axis information for segmentation of the SegTHOR dataset. Milletari et al. [48] proposed a two-level deep learning model and achieved better performance on the organs-at-risk dataset. The details of each DL-based method can be found in recently published works [40]-[48] and are shown in Table 1 in the results section.

A. DATASETS

1) SEGTHOR 2019 DATASET
The SegTHOR 2019 [18] dataset contains 40 cases for training and 20 cases for testing. The number of slices per scan varies from 150 to 284, with an in-plane image size of 512 × 512. The in-plane spatial resolution varies from 0.90 mm to 1.37 mm, and the slice thickness is between 2 and 3.7 mm. In this experiment, we used 32 CT cases (the number of patients) for training, 8 for validation, and 20 for testing. The problem was a multiclass segmentation one, and each patient in the dataset had four classes (aorta, esophagus, trachea, and heart). All CT scans showed a high variation in the z direction with anisotropic dimensions. The dataset is publicly available on all grand challenge websites. A test sample from SegTHOR 2019 is shown in Fig. 1. Each block of the proposed network is explained in detail in the following subsections.

1) 3D-ASPP LAYER
3D-ASPP comprised several atrous convolutional layers with multiple rates of convolution kernels. It combined blocks of different atrous convolutional layers with a spatial pyramid pooling layer and captured spatial information from input feature maps at multiple scales and shapes for accurate semantic segmentation and classification. ASPP provided contextual information by using multiple blocks of atrous layers. Atrous convolution produced promising results by capturing features at different resolutions with different rates in deep CNNs. It adjusted the receptive field to acquire multiscale information from the set of features. Atrous convolution is applied to an input x with a kernel filter w and produces the output y as follows [16]:
$$y[i] = \sum_{k} x[i + r \cdot k]\, w[k] \quad (1)$$
where r represents the atrous rate that controls the stride with which the input is sampled. Atrous convolution inserts r − 1 zeros between consecutive filter values and convolves the input x with the resulting filter. The receptive field of the filters can be modified by using different values of the rate r. The proposed 3D-ASPP module was integrated into the proposed 3D-DN to improve the performance of volumetric segmentation. ASPP used a combination of four parallel convolutions with different atrous rates and one global average pooling layer. The ASPP module consisted of parallel 3 × 3 × 3 3D convolutional layers with atrous rates of 3, 5, and 7 and one 1 × 1 × 1 3D convolution. One average pooling layer was applied to the same input feature maps to provide image-level features. The feature maps from all branches were bilinearly upsampled. After the concatenation of all branches, a 1 × 1 × 1 convolution was used. The 3D-ASPP module was applied on the feature maps from the proposed 3D decoder part. After the transition layer, the output feature maps were fed into the 3D decoder part of the network, as shown in Fig. 2.
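The sampling behaviour of atrous convolution can be illustrated with a one-dimensional sketch in plain Python. This is not the authors' implementation: the function names, the toy signal, and the use of valid-mode (no padding) convolution are assumptions made only for illustration. The second helper computes the extent covered by a dilated kernel, k + (k − 1)(r − 1), for the atrous rates 3, 5, and 7 mentioned in the text.

```python
def atrous_conv1d(x, w, rate):
    """Valid-mode 1D atrous convolution: y[i] = sum_k x[i + rate*k] * w[k]."""
    span = rate * (len(w) - 1)  # distance between the first and last sampled input
    return [
        sum(x[i + rate * k] * w[k] for k in range(len(w)))
        for i in range(len(x) - span)
    ]


def effective_kernel_size(kernel_size, rate):
    """Extent of a dilated kernel along one axis: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


# Toy input and filter (hypothetical values chosen for easy checking).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
w = [1, 0, -1]

print(atrous_conv1d(x, w, rate=1))   # ordinary convolution: x[i] - x[i + 2]
print(atrous_conv1d(x, w, rate=3))   # dilated: x[i] - x[i + 6]
print([effective_kernel_size(3, r) for r in (3, 5, 7)])  # [7, 11, 15]
```

With the same three filter weights, a larger rate widens the field of view without adding parameters, which is the property the parallel ASPP branches rely on.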

2) PROPOSED MODEL
Our model was based on a hybrid approach using 3D-ASPP with 3D-DN, as shown in Fig. 3. We incorporated the 3D-ASPP module at the bottom of the 3D encoder-decoder DenseNet network to boost the contextual information of the spatial features of the input volume. The decoder part is a regular 3D-DN based on the proposed 3D dense blocks, except for the bottom layer. The proposed model consisted of an encoder path and a decoder path with skip connections and a bottom block. Appropriate features were extracted using a 3D dense convolutional block at each level in the decoder, and their resolution was reduced using a 3D strided convolutional layer. The decoder path was divided into stages with different resolutions. Each stage in the decoder path consisted of one to three dense convolutional layers with a volumetric kernel of 3 × 3 × 3. The input resolution was halved after every stage by a convolutional layer with 2 × 2 × 2 voxel kernels applied with stride 2. The number of features was doubled at each stage in the decoder block of the proposed network. After each convolutional layer, an activation function (PReLU) was applied throughout the network. The dense block was used at each stage of the decoder and encoder paths. The DenseNet function consisted of repeated blocks of convolutional and PReLU layers with batch normalization (BN) layers. The dense block consisted of a combination of a 3D convolutional layer, ReLU, and a BN layer. The input of every stage in the dense block was passed through the combination of convolution and nonlinearities (ReLU+BN). The information obtained from the last layer was used for the next stage of the dense block.
The size of input signal was reduced through downsampling, and the receptive field was increased in the successive network layer. The proposed 3D-ASPP module was integrated at the bottom of the decoder and encoder path of the proposed 3D-DN.
The module used the feature maps from the last bottom layers from the decoder side and fed feature maps to the deconvolutional layer from the encoder side of the network. In the encoder path, after each stage, the deconvolutional layer increased the input dimension size in the block of convolutional layers. The network extracted and expanded the spatial feature size from a low resolution to gather essential information for 3D volumetric segmentation. The feature maps from the dense block in the decoder side were concatenated with a downsampled DenseNet block in the encoder side. The 1 × 1 × 1 convolutional layer used to compute two feature maps produced an output with the same size as the input volume. The softmax layer produced a voxel-wise segmentation map for foreground and background regions. The proposed model is shown in Fig. 3.
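The effect of the stride-2 stages described in this subsection (resolution halved, number of features doubled) can be traced with a short bookkeeping sketch. The input size of 64 × 256 × 256, the 16 starting feature maps, and the four stages are hypothetical values chosen for illustration; the paper does not state them. Only the 2 × 2 × 2 kernel with stride 2 comes from the text.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Standard convolution output length along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def downsampling_stages(input_shape, base_features, num_stages):
    """Return (features, spatial_shape) before each stage and after the last one.

    Each stage ends with a 2 x 2 x 2 convolution of stride 2 (as in the text),
    and the number of feature maps is doubled.
    """
    shape, features, stages = tuple(input_shape), base_features, []
    for _ in range(num_stages):
        stages.append((features, shape))
        shape = tuple(conv_output_size(s, kernel=2, stride=2) for s in shape)
        features *= 2
    stages.append((features, shape))
    return stages


# Hypothetical configuration: 64 slices of 256 x 256, 16 initial feature maps.
for features, shape in downsampling_stages((64, 256, 256), 16, 4):
    print(features, shape)
```

Under these assumed values, the bottom of the network would see 256 feature maps of size 4 × 16 × 16, which is where a multiscale module such as 3D-ASPP operates on a compact representation.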

3) LOSS FUNCTION
The Dice coefficient (DC) proposed in [10] was used to compute the loss function. The loss function between predicted and GT segmentation for 3D was optimized using (2). The loss L is shown in (2) and directly evaluates the similarity of two samples.
where s_i and g_i denote the predicted segmentation map and the manually provided segmentation map, respectively, and N is the total number of voxels.
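A minimal sketch of a Dice-based loss over N voxels is given below. The exact form used by the authors is not recoverable from the text, so two choices here are assumptions: the denominator uses squared sums (the form popularized by V-Net), and the loss is taken as one minus the Dice coefficient. The smoothing term `eps` is likewise a hypothetical addition that avoids division by zero for empty masks.

```python
def soft_dice_loss(pred, target, eps=1e-6):
    """One minus the soft Dice coefficient between two flattened volumes.

    pred   : predicted probabilities s_i in [0, 1]
    target : ground-truth labels g_i in {0, 1}
    Assumed form: DC = 2 * sum(s_i * g_i) / (sum(s_i^2) + sum(g_i^2)).
    """
    intersection = sum(s * g for s, g in zip(pred, target))
    denominator = sum(s * s for s in pred) + sum(g * g for g in target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


print(soft_dice_loss([1, 0, 1, 0], [1, 0, 1, 0]))   # perfect overlap
print(soft_dice_loss([1, 0], [0, 1]))               # no overlap, close to 1
print(soft_dice_loss([0.5, 0.5], [1, 0]))           # partial, uncertain prediction
```

Because the loss is computed from overlap rather than per-voxel error, it is less sensitive to the imbalance between small organs and the large background than a plain voxel-wise loss would be.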

4) IMPLEMENTATION DETAIL
The proposed model was built using the PyTorch library. All models were trained from scratch. The Adam optimizer was used with a learning rate of 0.0008. The number of epochs was set to 10000 for SegTHOR 2019. The batch size was set to 2. The proposed model was trained on an NVIDIA GTX 1080 GPU with 12 GB of memory.

IV. RESULTS
Eighty percent of the dataset was used for training, and the remaining 20% was used for prediction on the SegTHOR 2019 dataset. The input slices of all cases were downsampled to an in-plane resolution of 256 × 256 to simplify the computation. The slices containing lesions, together with five additional empty slices at the start and end of each volume, were selected. The qualitative and quantitative results are presented in the next section. Fig. 4 shows the segmentation results of the proposed model. Two different cases are described in this manuscript. The axial, coronal, sagittal, and 3D views of the GT and predicted segmentations are shown in Fig. 4. No interpolation of tumor slices was needed, which avoided information loss from the input volume. Lesions or masks with a large size were segmented, whereas some lesions with small organs were difficult to segment using our proposed model. Some other organs, such as the esophagus, trachea, and aorta, are small, and the contrast around the heart is low. Our proposed model could still segment such small organs. In Fig. 4, the first row represents the GT, and the second row represents the predicted values based on the proposed model. Fig. 5 shows the segmentation map for another subject from the SegTHOR 2019 dataset.

B. PERFORMANCE METRICS
Performance metrics [47] were used for the test and evaluation of our proposed model on the datasets. Specifically, the following evaluation metrics were adopted to test the proposed model. The results were compared with those of existing DL models.

1) SENSITIVITY
Sensitivity was used to compute the positive portion of voxels by using GT and predicted segmentation masks.

2) SPECIFICITY
Specificity, also called true negative rate (TNR), was used to compute the performance on the basis of GT and predicted segmentation masks.
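Both metrics can be computed from voxel-wise confusion counts. The sketch below assumes the conventional definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), applied to flattened binary masks; the function names and the toy masks are illustrative only.

```python
def confusion_counts(gt, pred):
    """Count true/false positives and negatives for two flattened binary masks."""
    tp = sum(1 for g, p in zip(gt, pred) if g and p)
    tn = sum(1 for g, p in zip(gt, pred) if not g and not p)
    fp = sum(1 for g, p in zip(gt, pred) if not g and p)
    fn = sum(1 for g, p in zip(gt, pred) if g and not p)
    return tp, tn, fp, fn


def sensitivity(gt, pred):
    """True positive rate; NaN when the GT mask has no foreground voxels."""
    tp, _, _, fn = confusion_counts(gt, pred)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(gt, pred):
    """True negative rate; NaN when the GT mask has no background voxels."""
    _, tn, fp, _ = confusion_counts(gt, pred)
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Toy masks: 3 foreground voxels in the GT, one missed and one false alarm.
gt = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(sensitivity(gt, pred), specificity(gt, pred))
```

For a multiclass problem such as SegTHOR, one common choice is to evaluate each organ as its own binary mask against the rest of the volume.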

3) JACCARD COEFFICIENT (JC)
JC is defined as
$$JC = \frac{|A \cap B|}{|A \cup B|}$$
where A is the GT, and B denotes the predicted volume.

4) DICE SIMILARITY COEFFICIENTS (DCS)
DC is commonly used for the validation of medical volume segmentations. It is also called the overlap index. It measures the overlap between GT and predicted segmentation masks. For the GT mask A and the predicted mask B, DC is defined as
$$DC = \frac{2\,|A \cap B|}{|A| + |B|}$$
Two types of DC can be measured. The first is the global Dice, which is computed on the complete volume over all cases, and the second is the local Dice, which is computed for each individual test sample.
FIGURE 5. The first row represents the manual masks for axial, sagittal, coronal, and 3D views of the SegTHOR dataset. The second row shows the predicted values of axial, coronal, sagittal, and 3D views by using the proposed model. The heart (green color), esophagus (red), trachea (blue), and aorta (yellow) are presented.
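The difference between the global and the local Dice can be seen in a small sketch. It assumes that the global score pools the voxels of all cases before computing one Dice value, while the local score is computed per case; the two toy cases and the convention of returning 1.0 for two empty masks are assumptions for illustration.

```python
def dice(gt, pred):
    """Dice coefficient of two flattened binary masks (1.0 if both are empty)."""
    overlap = sum(1 for g, p in zip(gt, pred) if g and p)
    total = sum(1 for g in gt if g) + sum(1 for p in pred if p)
    return 2.0 * overlap / total if total else 1.0


def local_dice(cases):
    """One Dice value per (gt, pred) case."""
    return [dice(gt, pred) for gt, pred in cases]


def global_dice(cases):
    """A single Dice value computed after pooling the voxels of all cases."""
    all_gt = [v for gt, _ in cases for v in gt]
    all_pred = [v for _, pred in cases for v in pred]
    return dice(all_gt, all_pred)


# Toy data: a large, perfectly segmented organ and a small, missed one.
cases = [
    ([1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 0, 0]),
    ([1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0]),
]
per_case = local_dice(cases)
print(per_case, sum(per_case) / len(per_case))  # [1.0, 0.0] and their mean 0.5
print(global_dice(cases))                       # 0.8
```

The pooled score is dominated by large structures, so reporting both values gives a clearer picture when small organs such as the esophagus are involved.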

5) VOLUME OVERLAP ERROR (VOE)
VOE is defined as
$$VOE = 1 - \frac{|A \cap B|}{|A \cup B|}$$
where A is the GT, and B denotes the predicted volume.

6) RELATIVE VOLUME DIFFERENCE (RVD)
RVD is expressed as follows:
$$RVD = \frac{|B| - |A|}{|A|}$$
where A is the GT, and B denotes the predicted volume.
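The overlap-based metrics of this section can be computed together from two sets of foreground voxel coordinates. The sketch assumes the conventional definitions (JC as intersection over union, VOE as one minus JC, and RVD as the signed volume difference relative to the GT) and a non-empty GT mask; the coordinates below are toy values.

```python
def overlap_metrics(gt_voxels, pred_voxels):
    """JC, DC, VOE, and RVD for two collections of foreground voxel coordinates.

    Assumes the GT set is non-empty; A is the GT and B the prediction.
    """
    a, b = set(gt_voxels), set(pred_voxels)
    inter, union = len(a & b), len(a | b)
    jc = inter / union
    return {
        "JC": jc,
        "DC": 2 * inter / (len(a) + len(b)),
        "VOE": 1 - jc,
        "RVD": (len(b) - len(a)) / len(a),  # positive means over-segmentation
    }


# Toy masks given as (z, y, x) coordinates of foreground voxels.
gt_voxels = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
pred_voxels = [(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]
metrics = overlap_metrics(gt_voxels, pred_voxels)
print(metrics)
```

Under these definitions DC and JC are tied by DC = 2·JC / (1 + JC), so the two always rank segmentations in the same order, whereas RVD only compares sizes and can be zero for two masks that do not overlap at all.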

8) HAUSDORFF DISTANCE (HD)
The symmetric HD was computed between the binary objects in the two segmentation masks. It is defined as the maximum surface distance (MSD) between the objects.
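A brute-force sketch of the symmetric Hausdorff distance between two point sets is shown below. It assumes that the surface points of each mask have already been extracted and expressed in physical units (for example, millimetres after applying the voxel spacing); both steps are outside this sketch, and the point sets are toy values.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from a point in A to its nearest point in B."""
    return max(min(math.dist(a, b) for b in points_b) for a in points_a)


def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(
        directed_hausdorff(points_a, points_b),
        directed_hausdorff(points_b, points_a),
    )


# Toy surfaces: the second set has one point far from the first set.
surface_gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
surface_pred = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(directed_hausdorff(surface_gt, surface_pred))  # 1.0
print(directed_hausdorff(surface_pred, surface_gt))  # 3.0
print(hausdorff(surface_gt, surface_pred))           # 3.0
```

The pairwise search costs O(|A|·|B|) distance evaluations, which is acceptable for small examples; for full CT surfaces a spatial index or a distance transform is usually preferred.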

C. COMPARISON WITH STATE-OF-THE-ART METHODS
The segmentation models provided an automatic segmentation map and could be evaluated using different performance metrics.
Various performance metrics, such as accuracy, DCs, JC, MSD, RVD, and average symmetric surface distance (ASSD), were used to compute the performance of the proposed and existing state-of-the-art DL models. High values of Dice and three other metrics (JC, sensitivity, and specificity) indicate improved segmentation performance.
The proposed 3D-DN with a 3D-ASPP block was evaluated and compared with existing state-of-the-art methods on the SegTHOR 2019 dataset. Our proposed model produced optimal results compared with existing methods, and its robust performance was validated on the recently published SegTHOR 2019 dataset. We reimplemented the 3D simple-VNet and 3D attention U-Net models and compared their results with those of our proposed model and existing state-of-the-art results. The DCs for the 20 test patients of SegTHOR 2019 were computed using the proposed, attention U-Net, and simple-VNet models and are presented in Fig. 5. The proposed model clearly showed good performance compared with existing models. The correlation coefficients between GT and predicted values for the proposed and existing 3D models are shown in Fig. 6. These correlation coefficients validated that our proposed model performed better than existing models. The results confirmed that the proposed model successfully segmented the SegTHOR 2019 dataset and could be used for the evaluation of 3D volumetric biomedical images. A comparison of the proposed model with state-of-the-art models by using various performance metrics on the SegTHOR 2019 dataset is shown in Table 1. The proposed model produced better DCs and other performance metrics compared with existing 3D models.

D. DISCUSSION
The considerable improvement in computer hardware and data availability in 3D medical imaging has enabled 3D medical segmentation by utilizing spatial information. Comprehensive information can be produced in any direction on the basis of volumetric images rather than be viewed in a single direction in 2D approaches. A deep model could usually be used to extract highly informative features from complicated organs via segmentation algorithms for volumetric images. The main challenge is the training of these deep networks for 3D models. Most DL architectures are based on FCN, and hybrid-based FCN models are proposed for tumor and liver segmentation tasks. The FCN used a fixed receptive field and could fail in the segmentation of objects with varying sizes. The fixed-size receptive field issue in FCN could be resolved by increasing the field of view.
A sliding window over complete images with uniform patches can be used in the FCN model to handle the problem of a fixed-size receptive field. The proposed model based on ASPP extracted multiscale information from input images and provided an efficient solution for networks with a fixed-size receptive field. DNNs encounter some issues during training. The first issue is overfitting, which occurs during training because the number of samples in the datasets is small in comparison with the number of weights of 3D DL models [30].
A small training dataset could produce an overfitting problem, which could be minimized using data augmentation and a dropout layer. The dropout layer was used in the proposed model to handle the overfitting issue. The considerable training time is the second issue when hardware resources are limited. The computation time for training can be reduced by using strided convolution [19], which provides the same effect as a pooling layer while achieving faster convergence. BN was also used for rapid convergence of the deep models. Pooling and downsampling techniques could reduce the performance and might lose beneficial information. Gradient vanishing is the third issue, which could happen during deep network training. We used dense blocks that carry feature maps from previous layers to handle the vanishing gradient problem during backpropagation while training the proposed model.
Target organs present a heterogeneous appearance that depends on shape, location, and size from patient to patient [42] and could induce a great challenge in pixel-based image segmentation.
The limited contrast and ambiguous boundaries between target organs and surrounding tissues constitute the fourth issue and are usually caused by the attenuation coefficient in CT [23]. Superpixel information and weighting techniques designed for class imbalance could be used to handle such an issue [12].
The aforementioned issues in 3D DL models might lead to decreased performance when handling 3D volumetric datasets because of the small number of data samples, the large number of parameters compared with the number of input 3D data samples, and the low variance between neighboring voxels. The proposed model used multiscale contextual feature information by utilizing an ASPP module and handled the heterogeneous appearance and varying sizes, shapes, and locations of target organs and neighboring tissues.
The main objective of this study is to enhance the accuracy and quality of segmentation based on volumetric 3D CT images. A model based on 3D-ASPP was proposed to improve the segmentation accuracy by capturing multiscale features. It could estimate the pixels at the decoder side of the network with an improved capability to reconstruct small organs with different sizes, shapes, and irregular structures from abdominal CT medical images. The visual and quantitative experimental results showed that the 3D-ASPP-integrated DenseNet model achieved better performance in 3D medical segmentation compared with existing segmentation models.
Automatic segmentation based on abdominal CT scans could provide enhanced solutions based on morphometric features. However, the computational complexity would also increase if the number of feature maps and the filter size are large in the encoder-decoder part of the architecture. In consideration of the limited computational resources and GPU memory, we need to design an effective and deep 3D model for improved segmentation performance. The proposed model produced excellent results by using limited memory and computational resources.

V. CONCLUSION
The proposed model in this paper was applied to the SegTHOR 2019 dataset for the segmentation of thoracic organs at risk from CT images. The proposed approach utilized 3D-ASPP and could acquire multiscale features in a 3D way with 3D-DN based on encoder-decoder to extract discriminative features. Consequently, the proposed model achieved accurate and detailed segmentation results on the basis of CT images. It also achieved excellent performance metrics compared with existing 3D models in the biomedical segmentation field. It could be used for volumetric segmentation in clinical applications to diagnose problems from medical images without the intervention of medical doctors and might be helpful in the medical community for the classification and segmentation of medical images.
In the future, we will explore new models based on the attention mechanism or incorporate different parameters into the proposed model to generalize and train well for another medical dataset in the biomedical domain.