MCMC Guided CNN Training and Segmentation for Pancreas Extraction

Efficient organ segmentation is a precondition for various quantitative analyses. Segmenting the pancreas from abdominal CT images is a challenging task because of its high anatomical variability in shape, size, and location. Moreover, the pancreas occupies only a small portion of the abdomen, and the organ border is very fuzzy. All these factors make segmentation methods designed for other organs less suitable for the pancreas. In this report, we propose a Markov Chain Monte Carlo (MCMC) sampling guided convolutional neural network (CNN) approach to handle such morphological and photometric variability. Specifically, the proposed method contains the following steps. First, registration is carried out to mitigate the variability in body weight and location. Second, MCMC sampling is employed to guide the sampling of 3D patches, which are fed to the CNN for training. At the same time, the pancreas distribution is learned for the subsequent segmentation. Third, an MCMC process, sampling from the learned distribution, guides the segmentation. Lastly, the patch-based segmentations are fused using a Bayesian voting scheme. The method is evaluated on the NIH pancreas dataset, which contains 82 abdominal contrast-enhanced CT volumes. We achieve a competitive result of 78.13% Dice Similarity Coefficient and 82.65% Recall on the testing data.


Introduction
Accurate organ segmentation is a prerequisite for many subsequent computer-based analyses. In recent years, with the rapid development of deep neural networks, automatic segmentation of many organs and tissues has achieved good results, for example for the cortical and sub-cortical structures, lung, liver, and heart Mittal et al. (2018); Park et al. (2019); Qin et al. (2018); Li et al. (2018); Avendi et al. (2016); Ngo et al. (2017). For the pancreas, however, while segmentation results have improved substantially since the pre-deep-learning era, the accuracy is still relatively low compared with other organs Zhou et al. (2017); Fu et al. (2018); Roth et al. (2015); Farag et al. (2017); Roth et al. (2016a, 2018). This is mainly caused by several factors: 1) the shape, size, and position of the pancreas in the abdomen vary greatly among patients; 2) the contrast between the pancreas and the surrounding tissue is weak; 3) the pancreas is relatively soft and easily pushed by surrounding organs, which can result in large deformations; 4) the pancreas occupies a very small portion of the CT image. All these factors make accurate segmentation a challenging task.

Related works
In recent years, more attention has been paid to the segmentation of the pancreas, and many methods have been proposed. In Karasawa et al. (2017), the authors adopted a multi-atlas framework for pancreas segmentation. Specifically, the pancreas region is extracted using the relative position and structure of the pancreas and liver. Then, using the vessel structure around the pancreas, images from a training dataset are registered to the image to be segmented. The best registration, evaluated by the similarity of the vessel structure, is then chosen. This method reports a Dice similarity coefficient (DSC) of 78.5±14.0%.
Due to the low-contrast nature of the pancreas, more advanced deep learning and machine learning approaches have been used, such as CNN-based classification Zhou et al. (2017); Fu et al. (2018) and combinations of CNNs and random forests Roth et al. (2015); Farag et al. (2017); Roth et al. (2016a, 2018).
Unlike the top-down multi-atlas approach in Karasawa et al. (2017), Farag et al. (2014, 2017) proposed a bottom-up pancreas segmentation strategy. It decomposes all 2D slices of a patient into boundary-preserving superpixels by over-segmentation. It then extracts superpixel-level features from the original CT slices and the dense image patch response maps, and classifies superpixels as pancreas or non-pancreas using a two-level cascade random forest classifier. Compared with Karasawa et al. (2017), these methods have lower data requirements, but the results have a slightly lower DSC of 68.8±25.6% in Farag et al. (2014) and 70.7±13.0% in Farag et al. (2017).
Similar to Farag et al. (2017), the methods proposed by Roth et al. (2015, 2016a, 2018) all combine random forest classification with CNNs. In Roth et al. (2015), the authors present a coarse-to-fine approach in which a multi-level CNN is employed on both image patches and regions. In this approach, an initial set of superpixel regions is generated from the input CT images by a coarse cascade of random forests based on Farag et al. (2014). Serving as regional candidates, these superpixel regions possess high sensitivity but low precision. Next, trained CNNs are used to classify the superpixel regions as pancreas or non-pancreas. Finally, 3D Gaussian smoothing and 2D conditional random fields are used for post-processing. In contrast to Roth et al. (2015), Roth et al. (2016a) proposed a method that uses random forest classification to classify superpixels, similar to Farag et al. (2014) and Farag et al. (2017). However, the superpixels and features are generated via Holistically-Nested Networks, which extract mid-level cues of the pancreas interior and boundary. Before this step, the model obtains the region of interest by the method in Farag et al. (2014). Building on Roth et al. (2016a), Roth et al. (2018) made some improvements, the major one being that the region of interest is obtained by a new, general deep-learning-based approach. In Roth et al. (2018), the algorithm first learns mid-level cues via Holistically-Nested Networks. It then obtains the region of interest by multi-view aggregated Holistically-Nested Networks and largest connected component analysis. Finally, random forest classification is again adopted to classify superpixels.
There have also been methods that use only a CNN for the segmentation. For example, a fixed-point model that shrinks the input region using the predicted segmentation mask was proposed in Zhou et al. (2017). While the parameters of the network remain unchanged, the regions are optimized by an iterative process. In contrast to Zhou et al. (2017), the approach in Fu et al. (2018) pays more attention to the structure of the network: it proposes a novel network based on the Richer Feature Convolutional Network, which replaces the simple up-sampling operation with a multi-layer up-sampling structure in all stages.

Contribution of this paper
In this paper, we propose a robust and efficient segmentation approach based on a Markov Chain Monte Carlo (MCMC) guided CNN. The proposed approach mainly consists of three parts. First, it locates the pancreas in the abdominal CT image by image registration, yielding an irregular candidate region in which a large number of background pixels are rejected while almost all foreground pixels are preserved. Then, MCMC samples guide the training of a 3D CNN to classify pixels in the candidate region. Finally, in the segmentation stage, the MCMC guides the trained network to perform the fused segmentation across the image domain.
The contributions of our work are summarized in the following four points: 1) Inspired by coarse-to-fine segmentation methods and considering that the pancreas occupies a small region in the abdomen, we use registration to obtain a coarse segmentation that locates the pancreas in the abdomen; to improve the accuracy of this coarse segmentation, we adopt the multi-atlas method. 2) By adding some background pixels to the pancreas region of the coarse segmentation, we generate an irregular candidate region that acts like a bounding box; as a result, the numbers of background and foreground pixels fed to the CNN are more balanced. 3) Compared with a 2D CNN, a 3D CNN can use the information between slices, but it faces the problem of insufficient data. To avoid this problem, we train the CNN on 3D image patches fetched from the candidate region instead of on the whole candidate region. 4) When training and testing the CNN, we employ the Markov Chain Monte Carlo method to focus the process on the irregular candidate region, so that the CNN learns and tests the internal and marginal features of the pancreas while being shielded from the interference of surrounding organs and tissues.
The remaining parts of the manuscript are organized as follows: the proposed learning and segmentation framework is detailed in Section 2. Experiments are then conducted on the dataset in Section 3. Finally, the work is concluded and future directions are discussed in Section 4.

Method
In this section, the proposed MCMC guided CNN learning and segmentation framework is detailed. Specifically, a Markov Chain Monte Carlo process is used to guide the learning in the sample space so that it focuses on the target region (Section 2.1). After learning, the segmentation is likewise governed by the optimal filtering, which results in a robust and fast 3D segmentation (Section 2.3).

Joint learning of appearance and location
Denote the image to be segmented as $I : \mathbb{R}^3 \to \mathbb{R}$. Segmenting $I$ amounts to finding an indicator function $J : \mathbb{R}^3 \to \{0, 1\}$ whose 0-valued pixels indicate the background and whose 1-valued pixels indicate the pancreas. Such a characteristic function $J$ can be viewed as a special case of a probability density function (pdf) $p : \mathbb{R}^3 \to [0, 1]$ whose value measures how likely a pixel is to be inside the pancreas.
Viewed this way, the pdf $p$ can be estimated using a filtering based convolutional neural network framework. Specifically, samples are drawn from $p$ by an MCMC process, which guides the CNN to train on the key regions of the target as well as the border area. Then, in the segmentation stage, the same MCMC process again guides the trained CNN to sample and segment each patch. This effectively avoids sampling from the entire image domain; instead, the network only learns and discriminates on the regions with a higher probability of containing the target. The iterative process provides a final accurate segmentation of the pancreas in $I$ by fusing the patch segmentations.
More explicitly, given $x \in \mathbb{R}^3$ and its neighborhood $N(x)$, denote by $L(x)$ the image patch over $N(x)$. Then, the probability $p(L(x), x)$ ultimately provides the information on whether the point $x$ belongs to the pancreas in $I$.
By the definition of conditional probability, we have
$$p(L(x), x) = p(x)\, p(L(x) \mid x), \tag{1}$$
where $p(x)$ is the prior distribution and $p(L(x) \mid x)$ is the probability of a certain patch $L(x)$ being inside the pancreas, conditioned on its spatial location. The goal is to obtain an estimate of $p(L(x), x)$.
To that end, we have a set of training images $I_i : \mathbb{R}^3 \to \mathbb{R}$, $i = 1, \dots, M$, with their pancreas segmentations denoted as $J_i : \mathbb{R}^3 \to \{0, 1\}$, $i = 1, \dots, M$. The training set provides us with the location and contextual information for the pancreas. Specifically, the prior distribution $p(x)$ in Equation 1 can be learned from the $J_i$ directly. The second term is learned through an optimal filter guided CNN.
To proceed, note that the training set has a large variance in the shape and size of the pancreas. Ideally, all such variations could be learned with the above framework. However, a normalizing registration helps to reduce the variance and leaves less burden on the learning.
Before the emergence of convolutional neural network approaches, multi-atlas segmentation was a robust algorithm that addressed medical image segmentation through (multiple) registrations Gao et al. (2015). It achieved very high segmentation accuracy, especially for brain structures Wang et al. (2013); Gao et al. (2012); Huo et al. (2016); Erus et al. (2018); Huo et al. (2018). Unfortunately, compared to brain segmentation, the performance on other sites is much worse.
Such a discrepancy is understandable: the shape, size, and image appearance in abdominal images vary much more severely than in brain images. As a result, non-linear registration performs worse on abdominal images.
Based on this rationale, in this study we only use affine registration among the images to mitigate the training variance, leaving the rest to the machine learning framework.
To proceed, a random training image is picked from $\{I_i : i = 1, \dots, M\}$; we denote this image as $\tilde{I}$ and its segmentation as $\tilde{J}$. The registrations are then computed as
$$T_i = \arg\min_{T} D(\tilde{I}, I_i \circ T), \tag{2}$$
where $T$ can be written as $T(x) = Ax + b$ with $A \in \mathbb{R}^{3 \times 3}$ and $b \in \mathbb{R}^3$, and $D(I_i, I_j)$ denotes a suitable dissimilarity functional between the two images $I_i$ and $I_j$. The optimal registration transformations $T_i$ can be computed through regular gradient or Newton based procedures. As a result, $\tilde{I} \approx I_i \circ T_i =: \tilde{I}_i$ and $\tilde{J} \approx J_i \circ T_i =: \tilde{J}_i$ for all $i$.
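To make the composition $I_i \circ T_i$ concrete, the following is a minimal sketch of pulling a volume back through an affine map. It assumes the affine form $T(x) = Ax + b$ in voxel coordinates and uses nearest-neighbour lookup; the function name `warp_affine_nearest` is hypothetical, and the sketch does not reproduce the optimization of $T_i$ or the registration library used in this work.

```python
import numpy as np

def warp_affine_nearest(image, A, b):
    """Return (image o T)(x) = image(A x + b) sampled with nearest-neighbour lookup."""
    shape = np.array(image.shape)
    # Voxel coordinates x of the output grid, shape (X, Y, Z, 3).
    axes = [np.arange(n) for n in image.shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    # Apply T(x) = A x + b to every coordinate and round to the nearest voxel.
    mapped = grid @ np.asarray(A, dtype=float).T + np.asarray(b, dtype=float)
    src = np.rint(mapped).astype(int)
    # Source positions that fall outside the image are left at zero.
    inside = np.all((src >= 0) & (src < shape), axis=-1)
    out = np.zeros_like(image)
    hit = src[inside]
    out[inside] = image[hit[:, 0], hit[:, 1], hit[:, 2]]
    return out

# Toy example: a pure translation by one voxel along the first axis.
volume = np.arange(27).reshape(3, 3, 3)
shifted = warp_affine_nearest(volume, np.eye(3), [1, 0, 0])
```

In practice a registration package would also interpolate intensities and work in physical (millimetre) coordinates; nearest-neighbour lookup is used here only because it is also appropriate for binary masks such as $J_i$.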
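Once registered to a common space, the collection of $\tilde{J}_i$ represents the spatial distribution of the pancreas in the training data. Specifically, compute $P : \mathbb{R}^3 \to \mathbb{R}^+$ as
$$P(x) = \frac{\sum_{i=1}^{M} \tilde{J}_i(x)}{\int_{\mathbb{R}^3} \sum_{i=1}^{M} \tilde{J}_i(x)\, dx}, \tag{3}$$
which gives the prior distribution of the pancreas.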
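As a rough illustration of how such a prior can be built on a voxel grid, the sketch below sums registered binary masks and normalizes the result. The function name `spatial_prior` and the toy masks are hypothetical, and the voxel sum is assumed to stand in for the integral over the image domain.

```python
import numpy as np

def spatial_prior(registered_masks):
    """Sum registered binary masks and normalize so that the prior sums to 1."""
    stack = np.stack([np.asarray(m, dtype=np.float64) for m in registered_masks])
    votes = stack.sum(axis=0)      # number of training masks covering each voxel
    total = votes.sum()            # discrete analogue of the normalizing integral
    if total == 0:
        raise ValueError("all masks are empty")
    return votes / total

# Three toy 4x4x4 masks that partially overlap.
masks = [np.zeros((4, 4, 4), dtype=np.uint8) for _ in range(3)]
masks[0][1:3, 1:3, 1:3] = 1
masks[1][1:3, 1:3, 2:4] = 1
masks[2][2:4, 1:3, 1:3] = 1
P = spatial_prior(masks)
```

Voxels covered by more training masks receive a proportionally higher prior, which is what later steers the sampling toward the typical pancreas location.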

MCMC sampling guided training with 3D CNN
Learning the segmentations of patches from 2D or 3D images and applying the learned model patch-wise to the testing image is a common approach in learning-based segmentation.
However, one difficulty is how the samples are drawn from the image domain. If the patches are taken uniformly from the entire image domain, a large portion of them are empty (all-zero mask). Learning from such patches is therefore a process of mapping a certain grayscale image to an all-zero image using, for example, a convolutional neural network.
Unfortunately, learning such empty masks does more harm than merely wasting time. Indeed, it has a trivial global optimum: when all the parameters are set to zero, the output is an empty mask. This situation arises especially when the object takes up a small portion of the entire image domain and patches are acquired uniformly. Therefore, many researchers use an extra mechanism, such as a bounding box, to limit the volume from which the patches are generated. However, how to determine such a bounding box for the testing image poses a new problem.
Therefore, on the one hand, we want the training patches to contain a certain number of empty patches to learn the negative appearance. On the other hand, we do not want so many of them that they steer the learning towards an unwanted global optimum.
To address this issue, we propose to use the prior distribution $P(x)$ defined in Eq. 3 to guide the sampling process. Indeed, a high value of $P(x)$ indicates that $x$ is more likely to be inside the pancreas. Consequently, the image appearance in those regions is more typical of the pancreas, and more emphasis should be put on those regions in the training process. Specifically, one can draw samples $\{s_i \in \mathbb{R}^3 : s_i \sim P(x)\}$. Patches can then be taken around the $s_i$ and fed to the neural network for learning.
Due to the arbitrariness of $P(x)$, one cannot use regular sampling schemes such as those for a normal distribution. In this work, we use Markov Chain Monte Carlo to draw samples from $P(x)$.
Markov chain Monte Carlo (MCMC) is a class of algorithms often applied to produce samples from a multidimensional probability distribution, such that the samples approximate the distribution Hastings (1970). To draw samples by MCMC, we first construct a Markov chain whose stationary distribution equals the target distribution. We then draw samples by running this Markov chain. Once the chain has reached its stationary distribution, the subsequent states are the target samples.
Specifically, the Metropolis-Hastings (MH) algorithm is used in this work to obtain random samples from $P(x)$, which is hard to sample directly. The main steps of the MH guided patch generation algorithm Chib and Greenberg (1995) are detailed as follows.
Algorithm 1 Metropolis-Hastings guided patch generation
1: Input a state transition matrix $q$ of the Markov chain and the stationary probability distribution $P(x)$
2: Set the threshold on the number of state transitions as $n_1$ and the required number of samples as $n_2$
3: Generate an initial state $x_0$
4: for $t = 0, 1, 2, \dots, n_1 + n_2 - 1$ do
5:     Generate a proposal state $x^*$ from $q(x \mid x_t)$
6:     Draw a random number $u$ from $\mathrm{Uniform}(0, 1)$
7:     Calculate the acceptance probability $\alpha(x_t, x^*) = \min\left\{ \frac{P(x^*)\, q(x^*, x_t)}{P(x_t)\, q(x_t, x^*)},\ 1 \right\}$
8:     if $u < \alpha(x_t, x^*)$ then
9:         accept the proposal, set $x_{t+1} = x^*$, extract a 3D patch at $x_{t+1}$
10:     else
11:         $x_{t+1} = x_t$
12:     end if
13: end for

After $n_1$ steps, the samples $\{x_{n_1}, x_{n_1+1}, \dots, x_{n_1+n_2-1}\}$ follow the stationary probability distribution $P(x)$. Once the samples $s_i$ are drawn, 3D patches $Q_i$ of size $z$ are taken around each $s_i$:
$$Q_i := \tilde{I}_j \big|_{N_z(s_i)}, \quad \text{where } j \sim U(\{1, \dots, M\}), \tag{4}$$
and $N_z(s_i)$ denotes the cubic neighborhood of size $z$ centered at $s_i$. These 3D patches are then set as the input for the 3D CNN to learn.
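The sampling and patch extraction steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the proposal is assumed to be a symmetric random walk on the voxel grid (so the $q$ terms in the acceptance probability cancel), the prior is treated as zero outside the volume, and the names `mh_sample_centers`, `extract_patch`, `step`, and `seed` are hypothetical.

```python
import numpy as np

def mh_sample_centers(prior, n_burn, n_keep, step=2, seed=0):
    """Draw n_keep voxel positions approximately distributed as `prior` via random-walk MH."""
    rng = np.random.default_rng(seed)
    shape = np.array(prior.shape)
    # Start at the mode so that the initial state has nonzero prior.
    x = np.array(np.unravel_index(np.argmax(prior), prior.shape))
    if prior[tuple(x)] <= 0:
        raise ValueError("prior must be positive somewhere")
    samples = []
    for t in range(n_burn + n_keep):
        # Symmetric proposal: q(x, x*) = q(x*, x), so alpha = min(1, P(x*) / P(x)).
        proposal = x + rng.integers(-step, step + 1, size=3)
        if np.all(proposal >= 0) and np.all(proposal < shape):
            alpha = min(1.0, prior[tuple(proposal)] / prior[tuple(x)])
            if rng.random() < alpha:
                x = proposal
        # Proposals outside the volume have zero prior and are always rejected.
        if t >= n_burn:
            samples.append(x.copy())
    return np.array(samples)

def extract_patch(volume, center, size=16):
    """Cut a size^3 patch around `center`, zero-padding at the volume border."""
    half = size // 2
    padded = np.pad(volume, half, mode="constant")
    i, j, k = (int(c) for c in center)  # original index c - half maps to padded index c
    return padded[i:i + size, j:j + size, k:k + size]

# Toy prior concentrated on a small cube inside a 20^3 volume.
prior = np.zeros((20, 20, 20))
prior[8:12, 8:12, 8:12] = 1.0
prior /= prior.sum()
centers = mh_sample_centers(prior, n_burn=100, n_keep=50)
volume = np.arange(20 ** 3, dtype=float).reshape(20, 20, 20)
patch = extract_patch(volume, centers[0], size=8)
```

Consecutive MH states are correlated and a rejected proposal repeats the previous state, so in practice one may thin the chain or drop duplicates before extracting patches.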
Various convolutional neural network architectures have been proposed in recent years Krizhevsky et al. (2012); Simonyan and Zisserman (2014), and they have achieved substantial success, in particular in the biomedical image segmentation field. U-Net adopts a symmetric encoder-decoder structure. In the encoder, the image is down-sampled several times: the feature maps become smaller while the number of feature channels increases. Through this encoding, the network captures increasingly abstract, high-level information about the image. In the decoder, the feature maps are up-sampled four times, which restores the spatial resolution and, with it, the low-level detail. Both the low-level and the high-level information of the image are thus captured. This symmetric structure has achieved good performance in various biomedical segmentation applications.
Compared to the originally proposed 2D U-Net, the 3D U-Net uses the information between slices, making the segmentation results of adjacent slices more coherent and smooth, whereas the 2D U-Net loses this information Cicek et al. (2016). Since our CT images are 3D, we directly use the 3D U-Net. The structure of the 3D U-Net and the detailed information of the 3D U-Net we used are shown in Figure 1.
Moreover, it is worth mentioning that many variants of U-Net have been proposed since the original version. While they bring improvements in certain scenarios, we only use the vanilla U-Net to demonstrate the benefit of the proposed MCMC guided network. The proposed framework can certainly be combined with more specifically designed U-Nets or other CNN structures.

Prior guided segmentation
Given a new image $I$ to be segmented, it is first registered, with an optimal affine transformation $T$, to the image $\tilde{I}$ in the training set, i.e.,
$$\hat{I} := I \circ T \approx \tilde{I}.$$
Once registered, the image $\hat{I}$ falls into the same domain as the prior map $P(x)$. As a result, a similar MCMC sampling scheme can be used to generate sample regions from the testing image. Denote the regions as $r_i \subset \mathbb{R}^3$, $i = 1, \dots, R$.
Applying the trained CNN model to the image patches $U_i := \hat{I}|_{r_i}$, we get the segmentations $V_i$. Note that each $V_i$ only fills a portion of the entire image domain. In order to form a segmentation $W(x)$ for the entire domain, we use a voting scheme that accumulates, at each point $x$, the votes of all $V_i$ whose region $r_i$ contains $x$. However, the $r_i$ are not uniformly sampled from the entire domain, and different patches do overlap. As a result, certain regions in the domain may receive more votes simply because more patches are taken from that location. To mitigate this bias, a "sample prior" $K$ is constructed from the number of sampled regions that cover each location. The final segmentation for $\hat{I}$ is therefore obtained from the votes $W$ corrected by the sample prior $K$, and that for $I$ follows by mapping the result back through the inverse transformation $T^{-1}$.
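One possible reading of this fusion step is sketched below. It is only an illustration under explicit assumptions: the votes are taken to be $W(x) = \sum_i V_i(x)$, the sample prior to be the coverage count $K(x) = \sum_i \chi_{r_i}(x)$, and the final mask to be a threshold on $W/K$. The function name `fuse_patch_votes`, the corner-based region encoding, and the `threshold` parameter are hypothetical and are not taken from this work.

```python
import numpy as np

def fuse_patch_votes(shape, corners, patch_preds, threshold=0.5):
    """Fuse overlapping patch predictions by coverage-normalized voting."""
    W = np.zeros(shape)  # accumulated votes of all patch predictions
    K = np.zeros(shape)  # sample prior: number of patches covering each voxel
    for (i, j, k), V in zip(corners, patch_preds):
        sx, sy, sz = V.shape
        W[i:i + sx, j:j + sy, k:k + sz] += V
        K[i:i + sx, j:j + sy, k:k + sz] += 1
    # Normalize the votes by the coverage; voxels never sampled stay background.
    ratio = np.divide(W, K, out=np.zeros(shape), where=K > 0)
    return ratio >= threshold, W, K

# Two overlapping 2x2x2 patches in a 4x4x4 volume: one all foreground, one all background.
corners = [(0, 0, 0), (1, 1, 1)]
preds = [np.ones((2, 2, 2)), np.zeros((2, 2, 2))]
mask, W, K = fuse_patch_votes((4, 4, 4), corners, preds)
```

Dividing by $K$ means that a voxel visited by many patches is judged by the fraction of positive votes rather than by their raw count, which is the bias the sample prior is meant to remove.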

Implementation, Experiments and Results
We detail the algorithm implementation and the experiments in Sections 3.1 and 3.2, respectively.

Implementation
In Section 3.1.1, we detail the necessary data preprocessing, including the image registration. After that, we generate the prior distribution, train the 3D CNN, and perform the segmentation in Section 3.1.2.

Data Preprocessing
In this work, the image registration is performed using the DeedsBCV library Heinrich et al. (2013). As discussed in Section 2.1, the affine registration of DeedsBCV is used, which takes about one minute for a 3D registration task. The registered training mask images form the prior $P(x)$ in Eq. 3.
The shape of the pancreas varies greatly between people, so the accuracy of some segmentations is not satisfactory when a single moving image is used for registration. To improve the accuracy, we register multiple moving images to the fixed image. Adopting the multi-atlas idea, we register $N$ moving images to one fixed image and obtain $N$ pancreas segmentations for this fixed image. The resulting prior image $P$ can be thresholded with a threshold $d$ to form a binary image $R$:
$$R(x) = \begin{cases} 1, & P(x) \ge d, \\ 0, & \text{otherwise.} \end{cases}$$
However, $R$ is only a rough segmentation of the pancreas.
The false negatives, false positives, true negatives, and true positives should be classified by the 3D U-Net. However, the pancreas is surrounded by many organs, which makes the peripancreatic morphology variable and the classification more difficult. Moreover, the pancreas is small in the abdomen, so true negatives far outnumber the other classes, and most of these true negatives have nothing to do with the features of the pancreas or its border. As a result, the input of the network does not adequately represent the morphological features of the pancreas.
To reduce the burden of the network's classification task while making sure the whole pancreas region is covered, we expand the pancreatic region in the image $R$ by $k$ pixels. After this step, the new image $E$ contains false positives, true positives, and true negatives. The foreground region of $E$ is the region from which we extract patches, so that the input of the network consists only of the morphological and edge features of the pancreas. In addition, because the pancreas is small in the abdomen, the true negatives would otherwise far outnumber the other classes; the candidate region avoids this problem. Patches are extracted from the corresponding raw image and ground truth.
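The thresholding and expansion steps can be sketched as below. This is a minimal sketch under assumptions: the input is taken to be a per-voxel count of registered atlas masks, the comparison with $d$ is assumed to be inclusive, and the expansion by $k$ pixels is assumed to be $k$ rounds of dilation with a 6-connected (face-neighbour) structuring element. The names `dilate_once` and `candidate_region` are hypothetical.

```python
import numpy as np

def dilate_once(mask):
    """One step of binary dilation with a 6-connected (face-neighbour) element."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = padded.copy()
    for axis in range(mask.ndim):
        # Shift by one voxel in both directions; the zero border absorbs wrap-around.
        out |= np.roll(padded, 1, axis=axis)
        out |= np.roll(padded, -1, axis=axis)
    core = tuple(slice(1, -1) for _ in range(mask.ndim))
    return out[core]

def candidate_region(vote_map, d, k):
    """Threshold the atlas votes at d to get R, then expand R by k voxels to get E."""
    R = np.asarray(vote_map) >= d
    E = R.copy()
    for _ in range(k):
        E = dilate_once(E)
    return R, E

# Toy vote map with a single voxel above the threshold.
votes = np.zeros((7, 7, 7))
votes[3, 3, 3] = 30
R, E = candidate_region(votes, d=24, k=2)
```

A different structuring element (for example a full 3 × 3 × 3 cube) would give a slightly larger candidate region for the same $k$; which one is used here is not specified.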

3D U-Net Output to Final Segmentation
Once the prior is formed, the MCMC samples drawn from $P(x)$ guide the CNN training. In particular, the U-Net has the structure given in Figure 1. The 3D patches extracted from the image have a size of 16 × 16 × 16, as depicted in Figure 2.

Evaluation Criteria
In the experiments, we evaluate the results by Precision (positive predictive value), Recall (sensitivity), Dice similarity coefficient (DSC), and Jaccard similarity coefficient. Let $tp$, $fn$, $fp$, and $tn$ represent the numbers of true positives, false negatives, false positives, and true negatives, respectively.
Precision is the proportion of correctly predicted positives among all positives in the prediction image:
$$\mathrm{Precision} = \frac{tp}{tp + fp}.$$
Recall is the proportion of correctly predicted positives among all positives in the ground truth:
$$\mathrm{Recall} = \frac{tp}{tp + fn}.$$
The Dice similarity coefficient (DSC) is a statistic used to measure the similarity of the prediction image and the ground truth:
$$\mathrm{DSC} = \frac{2\,tp}{2\,tp + fp + fn}.$$
The Jaccard similarity coefficient is a statistic used to measure the similarity and diversity of the prediction image and the ground truth:
$$\mathrm{Jaccard} = \frac{tp}{tp + fp + fn}.$$

Experiments
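For reference, the four evaluation criteria (Precision, Recall, DSC, and Jaccard) can be computed from a pair of binary masks as in the sketch below. The function name `overlap_metrics` is hypothetical, and returning 0.0 when a denominator is empty is an assumed convention rather than something specified in this work.

```python
import numpy as np

def overlap_metrics(prediction, ground_truth):
    """Compute Precision, Recall, DSC, and Jaccard from two binary masks."""
    pred = np.asarray(prediction, dtype=bool)
    truth = np.asarray(ground_truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)    # predicted pancreas, truly pancreas
    fp = np.count_nonzero(pred & ~truth)   # predicted pancreas, truly background
    fn = np.count_nonzero(~pred & truth)   # predicted background, truly pancreas

    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den > 0 else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
    }

# Toy example with one true positive, one false positive, and one false negative.
metrics = overlap_metrics([1, 1, 0, 0], [1, 0, 1, 0])
```

Note that true negatives do not enter any of the four measures, which is why they remain informative even though the pancreas occupies only a small part of the volume.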

Dataset and Pre-processing
To facilitate the comparison of results across different publications, we use the dataset provided by NIH Roth et al. (2016b, 2015); Clark et al. (2013). The dataset contains 82 abdominal contrast-enhanced 3D CT images, in which the pancreas has been manually segmented slice by slice as ground truth. Among them, 72 are picked for training and the remaining 10 are used for testing.

Parameter optimization
After image registration, we obtain the prior distribution $P$. To obtain the binary image $R$, we set a threshold $d$ that classifies pixel values into 0 and 1. To determine the value of $d$, an image is randomly selected and its pixels are classified into 0 and 1 by $d$, whose value lies in (0, 72). We vary $d$ over (1, 50) and compute the average DSC for each value of $d$. The result is shown in Figure 3. The DSC between $R$ and the ground truth is maximal when $d = 24$, so we set the threshold $d$ to 24. In the next step, we expand the pancreatic region in the image $R$ by $k$ pixels. When $k = 5$, the candidate region contains most of the pancreatic region while the non-pancreatic region stays in a suitable range, so we set $k$ to 5. For the 3D U-Net, we set the patch size to 16 × 16 × 16 and the batch size to 100. We use the binary cross entropy as the loss function.
Finally, we search the threshold $f$ in $\{1, \dots, 20\}$ for the best average DSC. The result is shown in Figure 4. As can be seen, the average DSC on the training data is highest when $f = 10$.
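A threshold search of this kind can be sketched as a simple grid search that keeps the value with the highest mean DSC. The sketch assumes that the quantity being thresholded is available as one score map per image; the names `dice` and `best_threshold` and the toy data are hypothetical.

```python
import numpy as np

def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = np.count_nonzero(pred) + np.count_nonzero(truth)
    # Assumed convention: two empty masks agree perfectly.
    return 2.0 * np.count_nonzero(pred & truth) / denom if denom > 0 else 1.0

def best_threshold(score_maps, truths, candidates):
    """Return the candidate threshold with the highest mean DSC, plus the full table."""
    mean_dsc = {
        d: float(np.mean([dice(s >= d, t) for s, t in zip(score_maps, truths)]))
        for d in candidates
    }
    best = max(mean_dsc, key=mean_dsc.get)
    return best, mean_dsc

# Toy example: a single 1D "image" whose true foreground is where the score is at least 2.
scores = [np.array([0, 1, 2, 3])]
truths = [np.array([0, 0, 1, 1])]
best, table = best_threshold(scores, truths, range(1, 4))
```

Plotting the returned table against the candidate values gives a curve of the same kind as the ones reported in Figures 3 and 4.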

Experimental Results
In this work, we proposed a pancreas segmentation method based on MCMC guided deep learning. The method effectively reduces the burden of network training and allows the network to locate the target more robustly. As a result, our model obtains a competitive average Recall of 82.65% and an average DSC of 78.13%.
The training of the 3D U-Net takes 8 hours for 50000 epochs on a GPU (Nvidia GTX Titan X). Table 1 shows the performance after the multi-atlas step. The mean Recall after the multi-atlas step is 74.48%, which means that the position of the pancreas in the abdomen is roughly located.
For the extracted candidate regions, 90% of cases have a Recall above 88.94%, and the mean Recall reaches 93.04%. Most of the pancreas region is therefore covered by these candidate regions. Table 2 shows the model's performance on the training images, and Table 3 shows the model's performance on the testing images. Moreover, only one outlier case has a Recall below 70%. For Precision, 80% of cases are above 71.56%. The Recall of our model is higher than its Precision, which may indicate that our model preserves the pancreas area well in the prediction process.
The average Hausdorff distance between our predictions and the ground truths is 11.90 mm. The maximum Hausdorff distance over our predicted cases is 23.88 mm, and the minimum is 4.49 mm.
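For readers unfamiliar with the measure, the symmetric Hausdorff distance between two binary masks can be sketched as below. This brute-force version is an assumption-laden illustration: it uses all foreground voxels rather than surface voxels, takes a hypothetical `spacing` argument for the voxel size in mm, and builds the full pairwise distance matrix, so it is only suitable for small masks. It is not necessarily how the values above were computed.

```python
import numpy as np

def hausdorff_distance(mask_a, mask_b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance (in the units of `spacing`) between two non-empty masks."""
    a = np.argwhere(np.asarray(mask_a, dtype=bool)) * np.asarray(spacing, dtype=float)
    b = np.argwhere(np.asarray(mask_b, dtype=bool)) * np.asarray(spacing, dtype=float)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both masks must contain at least one foreground voxel")
    # Pairwise Euclidean distances between every point of A and every point of B.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dist.min(axis=1).max()  # farthest point of A from its nearest point of B
    b_to_a = dist.min(axis=0).max()  # farthest point of B from its nearest point of A
    return float(max(a_to_b, b_to_a))

# Toy example: two single-voxel masks that are 3 and 4 voxels apart along two axes.
mask_a = np.zeros((5, 5, 5), dtype=bool)
mask_b = np.zeros((5, 5, 5), dtype=bool)
mask_a[0, 0, 0] = True
mask_b[3, 4, 0] = True
h = hausdorff_distance(mask_a, mask_b)
```

Because the measure is driven by the single worst-matched point, it complements overlap scores such as the DSC, which can remain high even when a small stray region is predicted far from the organ.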
Figure 5 shows the ROC curves after the multi-atlas step and after the U-Net step. Because the left side of this plot is too crowded to see clearly, we plot the false positive rate on a logarithmic scale in Figure 6. In Figure 6, we find that when $0 < f < 19$, the ROC curve of the U-Net step lies distinctly above the ROC curve of the multi-atlas step. This reveals that when we extract the candidate regions with $f = 10$, our model successfully rejects a substantial amount of false-positive regions.
In Figure 7, we show three examples of the outputs of the multi-atlas step, the candidate regions, and the final segmentation results in testing. The first row is the best-performing image, with 87.49% DSC, 92.68% Recall, and 82.85% Precision. The second row shows the image whose performance is close to the average, with 76.72% DSC, 76.26% Recall, and 77.19% Precision. The third row is the worst-performing image, with 66.50% DSC, 81.54% Recall, and 56.15% Precision. In Figure 8, we show the 3D rendering of these three examples. From right to left, the DSC of these examples increases. As can be seen, the multi-atlas step approximately locates the pancreas region, but some pancreas regions are missed. With the extraction of candidate regions, more pancreas regions are correctly included. In addition, the marginal region of the pancreas is included, so that the marginal information of the pancreas can be obtained.
The results are compared with some recent state-of-the-art methods on pancreas segmentation. Comparing the DSC of different models, our model's lowest DSC is substantially higher than that of the other models. Moreover, our model's lowest Recall and Precision are significantly higher, while the mean Recall and Precision are similar. These results show that our model is robust.

Fig. 5. ROC curves. The red curve is the ROC curve after the multi-atlas step, and the blue curve is the ROC curve after the U-Net step.

Fig. 6. ROC curves with the false positive rate on a logarithmic scale.

Fig. 7. Three examples of the outputs of the multi-atlas step, the candidate regions, and the final segmentation results, compared with the ground truth. The ground truth is marked as a green curve, and the output is marked as a red curve. From left to right: the original CT images, the outputs of the multi-atlas step, the candidate regions, and the final segmentation results.

Conclusion, Discussion, and Future Directions
In this work, we proposed a general purpose segmentation framework that uses Markov Chain Monte Carlo (MCMC) to guide the segmentation of 3D images. Specifically, the prior spatial distribution is learned and an MCMC scheme is used to generate samples from the prior. These samples guide the sampling of patches from the training images, which are input to the convolutional neural network. During the segmentation, the MCMC is employed again to sample from the high probability regions in the target image. The sampled regions are fed to the trained CNN, and the final segmentation consensus is constructed from its outputs. The proposed framework is applied to abdominal CT images to extract the pancreas, and an average Recall of 82.65% and an average DSC of 78.13% are achieved.
Future directions include investigating the variance induced by the imaging parameters, such as the field of view, the presence or absence of contrast agent, and the slice or slab thickness. Moreover, the proposed method will be used in conjunction with the classification of various pancreatic diseases.

Fig. 1. The structure of the 3D U-Net and the detailed information of the 3D U-Net we used.

Fig. 3. The average DSC for different values of d.

Fig. 4. The average DSC for different values of f.

Fig. 8. 3D renderings of the three examples shown in Figure 7. The ground truth is marked as a green volume, and the output is marked as a red volume.

Table 1. Performance after multi-atlas in testing images

Table 2. Model's performance in training images

Table 3. Model's performance in testing images