Abstract:
Deep learning-based grading of retinal fundus images is an active area of research. Existing studies use different deep learning architectures on different datasets, and the results of some studies could not be replicated in others. Thus, a benchmarking study across multiple architectures, spanning both classification and localization, is needed. We present a comparative study of state-of-the-art architectures trained on a proprietary dataset and tested on the publicly available Messidor-2 dataset. Although evidence is of utmost importance in AI-based medical diagnosis, most studies limit themselves to classification performance and do not quantify the performance of abnormality localization. To address this, we also report a comparison of localization scores for the different architectures using class activation maps. For classification, we found that models perform better as their number of parameters increases, with NASNet yielding the highest accuracy and average precision, recall, and F1-scores of around 95%. For localization, VGG19 outperformed all other models with a mean Intersection over Minimum of 0.45. We also found a trade-off between classification and localization performance: as models get deeper, their receptive fields grow, so they perform well on classification but underperform on the localization of fine-grained abnormalities.
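The abstract scores localization by comparing class-activation-map (CAM) regions against annotated abnormality regions using Intersection over Minimum (IoM). The paper's exact CAM thresholding and metric details are not given here, so the following is only a minimal illustrative sketch: it assumes a CAM is a weighted sum of final-layer feature maps, that the CAM is binarized by a fixed threshold, and that IoM is |A ∩ B| / min(|A|, |B|) over pixel sets. All names and the toy data are hypothetical.

```python
def class_activation_map(feature_maps, weights):
    """Weighted sum of per-channel feature maps (the standard CAM form).

    feature_maps: list of 2D lists (one per channel), all the same shape.
    weights: per-channel classifier weights for the target class.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(wk * fm[i][j] for wk, fm in zip(weights, feature_maps))
             for j in range(w)] for i in range(h)]


def threshold_to_region(cam, t):
    """Binarize a CAM into a set of (row, col) pixels at or above t."""
    return {(i, j) for i, row in enumerate(cam)
            for j, v in enumerate(row) if v >= t}


def intersection_over_minimum(pred, truth):
    """IoM = |pred ∩ truth| / min(|pred|, |truth|).

    Unlike IoU, IoM does not penalize a small region lying entirely
    inside a larger one, which suits fine-grained abnormalities.
    """
    if not pred or not truth:
        return 0.0
    return len(pred & truth) / min(len(pred), len(truth))


# Toy example: two 2x2 feature maps and hypothetical class weights.
fmaps = [[[1, 0], [0, 0]],
         [[0, 1], [0, 1]]]
cam = class_activation_map(fmaps, weights=[1.0, 0.5])  # [[1.0, 0.5], [0.0, 0.5]]
region = threshold_to_region(cam, t=0.5)               # {(0,0), (0,1), (1,1)}
lesion = {(0, 0)}                                      # annotated abnormality
print(intersection_over_minimum(region, lesion))       # 1 / min(3, 1) = 1.0
```

A perfect IoM of 1.0 here only says the small lesion is fully covered by the activated region; the paper's reported mean of 0.45 aggregates such scores across images and abnormality types.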
Date of Conference: 08-11 April 2019
Date Added to IEEE Xplore: 11 July 2019