I. Introduction
Taking in multiple levels of features or depiction of the data, deep learning (DL) has manifested as an efficient machine learning (ML) tool which gives excellent results. Its application demonstrated remarkable performance in a variety of areas, especially in object detection, image classification, and localization. Latest developments of DL methods give positive affirmation to fine-grained image classification which aims to differentiate secondary-level categories [1]. These tasks have become steeply challenging due to its high intra-class and low inter-class variance. Object recognition and classification have witnessed significant advancements with the advent of DL and Deep Convolutional Neural Networks (DCNN). These modern techniques have outperformed traditional ML algorithms such as Support Vector Machines (SVM) [2] and Naive Bayes [3] by leveraging their ability to extract higher-level features directly from raw data.