Automated Fetal Head Classification and Segmentation Using Ultrasound Video

During pregnancy, fetal ultrasound provides essential insight into a baby's growth and development, and accurate assessment of fetal head biometry is critical to the clinical management of pregnancy. Current methods for fetal head biometry rely heavily on the sonographer's skill and experience in locating the baby's head. This paper proposes a novel approach that automates fetal head biometry from a live ultrasound feed and can also cope with low abdominal contrast against the surroundings. The proposed model uses ALEXNET and UNET for classification and segmentation, respectively, of headframes from ultrasound video. To compute biparietal diameter (BPD) and head circumference (HC), which are required to compute gestational age, an ellipse is drawn on the contour of the annotated segmented fetal head. To validate the gestational age estimate, ellipses are drawn on multiple best-classified headframes obtained using ALEXNET. The proposed system estimates gestational age within the clinically acceptable ± one week of observed gestational age with an accuracy of 96%. Moreover, it uses robust machine vision features to reduce the sonographer's interaction with the system, thus reducing the overall procedure time and making it independent of the sonographer's skill.


I. INTRODUCTION
The current process of fetal ultrasound requires sonographers to locate fetal organs such as the head, abdomen, and femur by examining the video feed while moving the probe [1] - [3]. The slightest movement of the probe drastically changes the appearance of an organ in the image, and noise in the video feed further hinders locating the organ [4]. Consequently, the error in fetal biometry and the time required are inconsistent, and both may vary over time as the sonographer gains experience [5] - [7].
Furthermore, continuous movement of the probe may irritate the mother, which can lead to undesirable movement and, in turn, increase the resultant error [8]. It should also be noted that the availability of skilled sonographers poses a great challenge [9], [10]. Consistency in ultrasound (US) measurement during the second and third trimesters is of paramount importance, as any error in measurement may result in inaccurate predictions regarding fetal health [11] - [15].
For example, an 8 mm error in HC causes a one-week error in gestational age estimation in the second trimester, while in the third trimester a 5 mm error in HC causes a one-week error in gestational age [16]. Importantly, this error leads to a wrong Estimated Delivery Date (EDD), which increases the risk to the health of mother and child [17] - [19].
Head circumference (HC) and biparietal diameter (BPD), shown in Figure 1, are essential biometry parameters for estimating fetal gestational age (GA) [20], [21]. In the recent past, several systems have been proposed to automatically calculate fetal HC and BPD [22] - [32]. Lu W et al. used the randomized Hough transform and the K-means algorithm to automate the measurement of BPD and HC [33]. Furthermore, McManigle et al. applied a novel boundary fragment model using random forest (RF) edge classification to automate segmentation and estimation of the HC ellipse [34]. Similarly, Li et al. used RF with an ellipse fitting method and developed a learning-based framework that used prior knowledge of gestational age (GA) and ultrasound scanning [35]. Ciurte et al. proposed a framework based on semi-supervised segmentation and the minimum cut problem to help formulate fetal head segmentation and tumor detection [36]. Moreover, in addition to the Hough transform, Foi et al. [37] and Van den Heuval et al. [38] have used the global multiscale and multi-start Nelder-Mead algorithm, respectively, with dynamic programming, to automatically detect the fetal head in 2D images.
Recently, some studies have sought to enhance the accuracy of fetal measurement examinations. For instance, Danny Rahayu et al. [39] used a Gaussian filter, morphological operations, and Canny edge detection to reduce noise and enhance the scanning results for accurate fetal measurements. Similarly, Kirthana. LP et al. [40] used a deep-learning-based method, UNET, with the Hough transform to improve head segmentation.
Despite these advances in fetal biometry automation, the accuracy and reliability of the proposed approaches depend heavily on the sonographer's skill, because the input images are handpicked by the sonographer, which results in a very limited sample size for any given fetus. It is therefore desirable to automate the ultrasound measurement, i.e., to make the procedure independent of the skill and experience of the sonographer.
This study presents a robust automated technique for estimating fetal gestational age from a live ultrasound feed. The process starts by using ALEXNET to classify headframes obtained from the video, rather than from an image captured by a sonographer. Furthermore, the occipitofrontal diameter (OFD) measurement, shown in Figure 1, is used to validate the classified headframes.
Subsequently, the classified head frames are segmented by a UNET trained with mask and annotated images, and a Least Square Ellipse (LSE) fit is then employed to obtain the HC and BPD measurements. These measurements, in turn, are used in the Hadlock formula [41] to compute the fetus's gestational age (GA).

A. PROBLEM STATEMENT
The standard practice for a sonographer or doctor measuring fetal biometry is to take images and measure the linear contour and ellipsoid diameter on screen, as required by the American Institute of Ultrasound in Medicine (AIUM). All measurements are taken manually by the physician or sonographer, which requires both knowledge and experience. Ultrasound images are produced from sound waves and are full of unwanted noise [42], such as Gaussian noise, Poisson noise, and atmospheric absorption and scattering [43]. Owing to this noise, identifying organs becomes difficult and time-consuming for examiners [44], [45]. Furthermore, manual measurements are usually inconsistent [46] and require substantial experience [47] to be repeatable. In addition, continued keystrokes during image acquisition may cause muscle injury to the examiner [48] - [49].
There is a shortage of trained sonographers, especially in developing countries [50] - [51], which creates a tremendous demand for an automated fetal biometry system. This study employs clinically applicable automated classification, detection, localization, and segmentation for fetal head biometry using ultrasound video to estimate GA.

II. METHODOLOGY
This research proposes an approach to automate fetal head biometry in real time rather than on static images, as employed by [22] - [32], [36]. Initially, the system autodetects, i.e., classifies, the frames of interest for measuring fetal head biometry from an ultrasound video, as shown in Figure 2. For this purpose, multiple ultrasound videos of fully developed singleton fetuses, i.e., during the second and third trimesters, have been used, courtesy of Agha Khan Hospital, Karachi.
Once the fetal head has been extracted from the ultrasound video, segmentation (as shown in Figure 3) is performed to obtain the parameters, i.e., BPD and HC, required for fetal gestational age estimation.

A. DATASET CONSISTENCY
This study utilizes two different datasets provided by AKUH. The first dataset contains 10,000 labeled images of each class, i.e., fetal head, femur, and abdomen, with fetal head biometry such as HC, BPD, and gestational age (GA) specified by the sonographer. These images were used to train ALEXNET for classification. The second dataset consists of 1,000 ultrasound videos (DICOM files) used to validate the proposed model. Each video contains all three classes, i.e., fetal head, femur, and abdomen.

B. INCLUSION & EXCLUSION CRITERIA
Inclusion criteria were:
• Only fetal ultrasound of the second and third trimesters, with gestational age between 18 and 42 weeks; and
• A singleton pregnancy in which the external fetal body is fully developed for measurement (abdomen, femur, and head).
Exclusion criteria were:
• Suspected fetal growth restriction or malformation of the head;
• Obesity, as it makes it difficult to visualize fetal structures;
• Oligohydramnios, determined by an amniotic fluid index of less than 5 cm; and
• Fetal distress or unstable maternal condition.
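The criteria above can be read as a simple screening predicate. The following is a minimal sketch only: the `Scan` record and its field names are hypothetical, and the "fully developed external body" criterion is assumed to be judged clinically and is therefore not encoded.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """Hypothetical record holding the screening fields named in the criteria."""
    gestational_age_weeks: float
    singleton: bool
    suspected_growth_restriction: bool
    head_malformation: bool
    maternal_obesity: bool
    amniotic_fluid_index_cm: float
    fetal_distress: bool
    unstable_maternal_condition: bool


def is_eligible(scan: Scan) -> bool:
    """Return True when the inclusion criteria hold and no exclusion applies."""
    included = 18 <= scan.gestational_age_weeks <= 42 and scan.singleton
    excluded = (
        scan.suspected_growth_restriction
        or scan.head_malformation
        or scan.maternal_obesity
        or scan.amniotic_fluid_index_cm < 5.0  # oligohydramnios
        or scan.fetal_distress
        or scan.unstable_maternal_condition
    )
    return included and not excluded
```

Keeping the inclusion and exclusion tests in separate expressions mirrors the two lists, so a change to either list touches only one expression.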

C. CLASSIFICATION
ALEXNET was used to classify, i.e., extract, the fetal head from ultrasound videos, which contain the abdomen and femur in addition to the fetal head. For this purpose, ALEXNET was trained and tested using 10,000 static images of each class, i.e., fetal head, abdomen, and femur. Furthermore, augmentation was performed to increase the data size and simulate various organ orientations in ultrasound videos. Figure 4 lists the steps taken for data augmentation.
The classification scores of the ALEXNET architecture are shown in Table 1, whereas Figure 5 depicts the training and validation accuracy and loss of ALEXNET.
After successful training of ALEXNET, the model is evaluated on a video that contains all three classes, i.e., the fetal head, abdomen, and femur. Figure 6 shows a head-classified ultrasound video with a speed of 60 frames per second. The ultrasound video clip containing the head is approximately eight seconds long, with approximately 480 frames. Figure 7 depicts the accuracy of ALEXNET in extracting the fetal head, which in turn is used to obtain the threshold value for identifying the best fetal head frames in the video, as shown in Figure 8. After extensive trials and in consultation with a gynecologist, this threshold value has been set at 90%, i.e., only frames given a score of 90% or above by ALEXNET are used for fetal head biometry. Figure 9 shows some of the best head-classified frames obtained using ALEXNET.
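The frame-selection rule described above can be sketched as follows. This is an illustration under stated assumptions: the per-frame head scores are hypothetical values standing in for the classifier output, and the function names are not from the paper.

```python
HEAD_THRESHOLD = 0.90  # 90% head score, as stated in the text


def select_head_frames(head_scores, threshold=HEAD_THRESHOLD):
    """Return the indices of frames whose head-class score meets the threshold."""
    return [i for i, score in enumerate(head_scores) if score >= threshold]


def expected_frame_count(duration_s, fps):
    """Number of frames in a clip of the given duration and frame rate."""
    return round(duration_s * fps)


# Hypothetical per-frame head probabilities from the classifier.
scores = [0.42, 0.91, 0.97, 0.89, 0.90]
print(select_head_frames(scores))   # [1, 2, 4]
print(expected_frame_count(8, 60))  # 480
```

The second helper simply confirms the clip arithmetic: eight seconds at 60 frames per second gives about 480 frames.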

1) Segmentation
Upon completion of classification, all valid head frames are used to produce a video containing the fetal head only. A single frame of such a video is shown in Figure 6, which lists the probability of each organ and the frame number. This video is then used for head segmentation and biometry.
For segmentation, the UNET model is trained with 10,000 labeled fetal head images. The model is trained in the different variations listed in Table 2, using both the mask and annotated approaches. The results show that the highest accuracies obtained using the mask and annotated approaches are 98.44% and 97.82%, respectively, and that adding convolution and transpose layers to the UNET architecture does not improve the results. Figure 10 shows the result of the best-performing UNET configuration (selected from Table 2) with the mask approach.
However, during validation (some results are shown in Figure 11), it was found that the mask approach sometimes fails to produce a complete ellipse, which is an essential requirement for fetal head biometry.
After the UNET model is trained with masks, localization and segmentation are performed on all classified head frames. To obtain a perfect ellipse depicting the fetal head, the Least Square Ellipse [55] - [57] approach is applied to the contour of the predicted mask. It can be seen from Figure 12 that the resultant ellipse is not a true representation of the fetal head. Training the UNET model with the mask approach was therefore abandoned, and the annotated approach was used instead, as shown in Figure 13.
Furthermore, Figure 14 shows results for different precisions of the head class obtained using ALEXNET. As discussed earlier, only frames with a precision of 90% or more are used for segmentation.
It can be seen from Figure 15 that better segmentation results, i.e., results in which the biparietal bones are clearly visible, are obtained for valid head frames (threshold > 90%) obtained using ALEXNET.
The primary objective of this study is to identify the best frame on which fetal biometry can be performed. It should be noted that a sonographer spends a significant amount of time ascertaining the best frames for repeated measurements [58]. To measure fetal head circumference, the grayscale images shown in Figure 15 were first binarized using eq(1), as shown in Figure 16. To draw an ellipse, as shown in Figure 17, the Least Square Ellipse method is then used. It can be seen that in some cases, i.e., Frames 0, 2, 6, 8, and 9, the ellipses do not properly touch the contour of the biparietal bones.
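The binarization step can be illustrated with a fixed global threshold. Since eq(1) is not reproduced in the text, the rule and the threshold value of 128 below are assumptions chosen only for illustration.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to a 0/1 image.

    The threshold of 128 is an illustrative assumption, not the paper's value.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]


# Tiny hypothetical 2x3 grayscale patch.
gray = [
    [10, 200, 30],
    [250, 90, 128],
]
print(binarize(gray))  # [[0, 1, 0], [1, 0, 1]]
```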
Afterward, for validation purposes, the resultant ellipses are superimposed on the actual frames, as shown in Figure 18.
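Once an ellipse has been fitted, its axes can be converted into biometry values. The sketch below rests on several assumptions: the major axis is taken as OFD (as the text states) and the minor axis as BPD, the perimeter is approximated with Ramanujan's formula rather than any method named in the paper, and `mm_per_pixel` is a hypothetical calibration factor taken from the scanner metadata.

```python
import math


def ellipse_biometry(semi_major_px, semi_minor_px, mm_per_pixel):
    """Convert fitted ellipse semi-axes (pixels) to OFD, BPD and HC in mm."""
    a = semi_major_px * mm_per_pixel  # semi-major axis in mm
    b = semi_minor_px * mm_per_pixel  # semi-minor axis in mm
    ofd = 2 * a  # major axis
    bpd = 2 * b  # minor axis
    # Ramanujan's approximation of the ellipse perimeter.
    hc = math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))
    return ofd, bpd, hc


# Hypothetical fit: semi-axes of 220 px and 180 px at 0.25 mm per pixel.
ofd, bpd, hc = ellipse_biometry(220, 180, 0.25)
print(f"OFD={ofd:.1f} mm, BPD={bpd:.1f} mm, HC={hc:.1f} mm")
```

For a circle the approximation reduces exactly to 2πr, which gives a quick sanity check on the implementation.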
As discussed earlier, in some cases the ellipses do not properly touch the contour of the biparietal bones, which results in erroneous measurements. To exclude such frames, an annotated image is first created for each frame by superimposing the respective binary ellipse on a dark image, i.e., one in which all pixels are 0, as shown in Figure 19. An interception, i.e., a pixel-wise intersection, is then performed between each binarized segmented frame (shown in Figure 16) and the corresponding annotated image (shown in Figure 19). This yields only the common pixels, as shown in Figure 20. Based on pixel connectivity, it can be seen that when the ellipse overlaps the biparietal bones there are only two objects, whereas otherwise the number of objects is greater than two, as shown in Figure 20. However, this interception process is quite time-consuming. To overcome this problem, frames with an invalid OFD are identified and discarded. For this purpose, the major axis of the ellipse, i.e., the OFD, is computed for each frame. This OFD value is then compared with the specified range of OFD defined for the second and third trimesters [59] [60]. Interception is then performed using eq(2) only for frames with a valid OFD.
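The interception and object-counting check can be sketched in plain Python. This is one plausible reading of the step, in which the intercepted image for each frame is the pixel-wise AND of the binarized frame and the annotated ellipse image; the use of 8-connectivity and all function names are assumptions.

```python
from collections import deque


def intersect(a, b):
    """Pixel-wise AND of two equally sized 0/1 images."""
    return [[pa & pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def count_objects(img):
    """Count 8-connected groups of 1-pixels using breadth-first search."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            count += 1  # found a pixel belonging to a new object
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def is_valid_frame(segmented, ellipse):
    """A frame is kept when the intersection contains exactly two objects."""
    return count_objects(intersect(segmented, ellipse)) == 2


# Toy example: the ellipse overlaps two separate bone segments.
segmented = [[1, 1, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 1, 1]]
ellipse = [[1, 1, 0, 0, 0], [1, 0, 0, 0, 1], [0, 0, 0, 1, 1]]
print(is_valid_frame(segmented, ellipse))  # True
```

Screening frames by OFD range first, as the text describes, would reduce how often this comparatively slow check has to run.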
where n is the number of classified head frames from the ultrasound video, A is the binarized segmented frame, B is the annotated frame, and I is the intercepted image.
Fetal gestational age is then computed for all frames from the two parameters, i.e., HC and BPD, separately, using the respective Hadlock formulas [61], as shown in Table 3. It is evident from Table 3 that, as expected, frames with only two objects yield a gestational age closer to the gestational age observed by the sonographer. To compare frames containing two objects with frames containing more than two objects, two different videos were made. One video contained all frames (shown in Figure 21), and the other contained only frames having two objects (shown in Figure 22).
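The per-parameter age computation can be illustrated as below. The text does not reproduce the formulas, so the coefficients here are an assumption: they are the commonly cited Hadlock (1984) single-parameter regressions, with BPD and HC in centimetres and age in weeks. The `format_ga` helper, which renders ages in the week/day notation used in the tables, is hypothetical.

```python
def ga_from_bpd(bpd_cm):
    """Gestational age in weeks from BPD in cm (assumed Hadlock regression)."""
    return 9.54 + 1.482 * bpd_cm + 0.1676 * bpd_cm ** 2


def ga_from_hc(hc_cm):
    """Gestational age in weeks from HC in cm (assumed Hadlock regression)."""
    return 8.96 + 0.540 * hc_cm + 0.0003 * hc_cm ** 3


def format_ga(weeks):
    """Render a fractional week count as, e.g., '35W5D'."""
    total_days = round(weeks * 7)
    whole_weeks, days = divmod(total_days, 7)
    return f"{whole_weeks}W{days}D"


# Hypothetical measurements for a late third-trimester head.
print(format_ga(ga_from_bpd(9.0)))
print(format_ga(ga_from_hc(32.0)))
```

Computing the two ages independently is what makes the later cross-check between BPD-based and HC-based estimates possible.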
It can be seen from these videos that, as expected, the gestational age measured using the video containing only frames with two objects is closer to the gestational age observed by the sonographer. However, even for such frames, the ages measured using BPD and HC are occasionally far apart, as shown in Figure 22. It is also noted that age estimation using BPD is more accurate than age computed using HC. This is primarily because, owing to surrounding noise, the biparietal bones do not appear to meet each other, so approximation is involved in computing HC. To overcome this problem, the ages computed using HC and BPD are compared with each other, as shown in Table 4. Afterwards, only frames for which the ages estimated from the two parameters, i.e., BPD and HC, are within one week of each other using eq(3, 4 and 5), i.e., within the allowable tolerance, are shortlisted for fetal biometry, as shown in Table 5. Finally, a video is made containing only the shortlisted frames, as shown in Table 6 and Figure 23.
As discussed, an ultrasound video of 36 weeks gestational age is used as the sample; all figures illustrating pre-processing, post-processing, classification, and segmentation show frames of this sample video. All steps are then combined to process an ultrasound video clip containing the fetal head at 36 weeks gestational age, with approximately 480 frames at a speed of 60 frames per second. Finally, the result reported in Table 7 is obtained, with 27 frames selected from the ultrasound video clip, each classified as a fetal head frame. The proposed system thus retains only 27 frames, and all of them yield a minimal error in the estimated fetal gestational age, as shown in Figure 23.
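The shortlisting rule can be sketched as follows. Because eq(3) to eq(5) are not reproduced in the text, this is an assumed reading: a frame is kept when its BPD-based and HC-based ages differ by at most seven days, and the proposed age is taken as the mean of the per-frame averages. Ages are expressed in days, and the sample values are hypothetical.

```python
def shortlist(frames, tolerance_days=7):
    """Return per-frame mean ages for frames whose two estimates agree.

    `frames` is a list of (ga_bpd_days, ga_hc_days) pairs.
    """
    kept = []
    for ga_bpd_days, ga_hc_days in frames:
        if abs(ga_bpd_days - ga_hc_days) <= tolerance_days:
            kept.append((ga_bpd_days + ga_hc_days) / 2)
    return kept


def proposed_age_days(frames, tolerance_days=7):
    """Mean age over shortlisted frames, or None when no frame qualifies."""
    kept = shortlist(frames, tolerance_days)
    return sum(kept) / len(kept) if kept else None


# Hypothetical (BPD age, HC age) pairs in days; the second pair is rejected.
frames = [(253, 247), (253, 239), (246, 252)]
print(shortlist(frames))          # [250.0, 249.0]
print(proposed_age_days(frames))  # 249.5
```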

III. RESULTS AND DISCUSSION
A total of 1000 ultrasound videos of the second and third trimesters (range: 18 to 40 weeks gestational age) were used to evaluate the performance of the proposed system in predicting fetal gestational age. It should be noted that the allowable tolerance is ±1 week. For this purpose, gestational age is initially calculated using the fetal biometry parameters, i.e., HC and BPD, separately, as shown in Figure 24 and Figure 25, respectively.
Comparison of the calculated age with the observed age for the different trimesters, using the HC and BPD measurements, shows a reasonable accuracy of 92% to 97%, as depicted in Figure 26 and Figure 27, respectively.
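The accuracy figure can be understood as the fraction of videos whose calculated age falls within the allowable tolerance of the observed age. The sketch below assumes that definition; the function name and the sample ages are hypothetical and are not the study's data.

```python
def accuracy_within_tolerance(calculated_weeks, observed_weeks, tolerance_weeks=1.0):
    """Fraction of cases whose calculated age is within the tolerance."""
    if len(calculated_weeks) != len(observed_weeks) or not observed_weeks:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(calc - obs) <= tolerance_weeks
        for calc, obs in zip(calculated_weeks, observed_weeks)
    )
    return hits / len(observed_weeks)


# Hypothetical ages in weeks for four videos; the third misses by two weeks.
calculated = [36.0, 30.5, 22.0, 27.0]
observed = [36.5, 30.0, 24.0, 27.9]
print(accuracy_within_tolerance(calculated, observed))  # 0.75
```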
It was observed that age estimation using the BPD parameter provides better accuracy than estimation using the HC parameter for both the second and third trimesters, as depicted in Table 8. This is primarily because, in a few instances, excessive noise makes the solid tissues of the fetal head invisible in some regions and causes gaps, as shown in Figure 28, which in turn causes errors in drawing the best-fit ellipse. Furthermore, accuracy improves in the third trimester in comparison to the second trimester, because the organs grow as pregnancy progresses and therefore appear with more clarity in ultrasound. It can be seen that when both parameters, i.e., HC and BPD, are used simultaneously to calculate gestational age, accuracy improves slightly, as shown in Table 8.
It is therefore proposed to use both parameters in computing gestational age, as this results in an accuracy of 96% to 97%.
Week-wise results of the proposed approach are shown in Figure 29.

IV. CONCLUSION AND FUTURE WORK
The proposed system can estimate gestational age within the clinically acceptable ± one week of observed gestational age with an accuracy of 96%. Moreover, it uses robust machine vision features to reduce the sonographer's interaction with the system, thus reducing the overall procedure time and making it independent of the sonographer's skill. The study can be extended to automated measurement of femur length and abdominal circumference for fetal weight estimation.
FIGURE 15. Segmentation on all best classified fetal head frames of ultrasound video with annotated UNET model.

V. ACKNOWLEDGMENTS
The authors would like to thank Dr. Noruddin Badruddin, Head of Obstetrics and Gynecology Department, AKUH, and his team for their helpful advice during the data acquisition.

35W3D  36W1D  0W5D  35W6D
2  36W0D  36W1D  35W2D  0W6D  35W5D
2  36W0D  36W1D  35W1D  1W0D  35W4D
2  36W0D  35W1D  36W0D  0W6D  35W4D
Proposed Age: 35W5D
Where:
• BI: Number of objects identified during interception;
• Obs Age: Gestational age observed by sonographer;
• Cal Age: Calculated gestational age using proposed approach;
• diff: Difference in age between age calculated using BPD and HC.