On the Frontiers Of Rice Grain Analysis, Classification and Quality Grading: A Review

Rice is a high-value subsistence crop that feeds more than 3.5 billion people worldwide. Its importance can be gauged from the fact that the top five rice exporting countries had combined net exports worth around 19 billion dollars in 2018. A robust rice grain analysis and classification system can significantly improve performance in terms of both accuracy and time. In recent decades, this research area has garnered considerable attention due to its socio-economic impact. In this paper, we review the work done in image-based rice classification and gradation. The contribution of this study is three-fold. First, it divides the algorithms and techniques of this area into five approaches, namely geometric, statistical, supervised, unsupervised, and deep learning. Among these, deep learning techniques have shown the most promising results and have gained attention for future research. Second, it divides the rice grain literature historically into three eras. Third, it summarizes various algorithms and techniques related to rice quality grading and rice disease identification.


I. INTRODUCTION
RICE is an important staple food that is harvested from an area spanning 163 million hectares in more than 100 countries to meet the food requirements of around 3.5 billion people worldwide [1], [2]. It is the third most cultivated product after maize and sugarcane [3]. Rice is cultivated in several areas, with a predominant presence in China and the south and east Asia regions [4]. Moreover, the top five rice exporting countries had combined exports upwards of 19 billion dollars in 2018¹. The revenue generation² and yield³ of the top ten rice producing countries are shown in Figures 1 and 2 respectively. For around 520 million people living below the poverty line in Asia, rice meets up to 50% of the dietary caloric requirements. Over the years, rice cultivation has evolved significantly, becoming a primary source of income for around 200 million households across the developing world. Factors such as low cost, easy and quick preparation, and long shelf-life have contributed to ...
Fig. 3a gives a breakdown of all the papers that we have reviewed by publication type, namely conference or journal. It further classifies the reviewed papers based on the techniques used by the respective authors. These can broadly be classified into geometric, statistical, supervised, unsupervised, and deep learning based methods. The rest of the paper is organized as follows. In sections 3 and 4, we give a detailed survey of the rice classification and grading techniques respectively. We also provide a comprehensive literature review on the identification of rice diseases, pests, and foreign particles in section 5.

II. RESEARCH METHODOLOGY
The following research questions guided our study.
RQ1: What types of approaches has the research community proposed for rice classification?
RQ2: What research trends has the research community followed chronologically?
RQ3: How has automated grading superseded manual rice grading?
RQ4: What algorithms and techniques are available for automatic identification and classification of rice diseases?

III. QUALITATIVE ANALYSIS
This study presents two views for analyzing research papers related to rice grain classification and grading. The former classifies 107 research articles by approach, including geometric, statistical, machine learning, and deep learning, while the latter highlights research trends by dividing the articles into time-based eras. Moreover, this study also discusses various algorithms and techniques for rice disease identification and classification.

A. STAGE 1: SCREENING
We selected 211 articles, published after 1996, from various databases such as IEEE, Springer, Elsevier, ACM based on keywords ("Machine Learning", "Rice Classification", "Rice Grading", "Rice Pests and Diseases" and "Rice Production") and search criteria. The articles were then refined based on title and abstract. Those articles which complied with rice grain or rice disease classification were included in our study whereas the rest of the papers were excluded. At the end of screening process, 155 papers fulfilled the inclusion criteria.

B. STAGE 2: ELIGIBILITY ANALYSIS
After shortlisting the papers, we reviewed them in detail to determine their suitability for inclusion in this work. For this purpose, we took into consideration each paper's contribution to the field, the novelty of its solution, and its relevance to rice and pest classification and identification. We identified those articles that concentrated on the nature of the approach and its impact on rice grains or their diseases. This further reduced the sample size to 115 papers, which was sufficient to draw conclusions and inferences about the impact of different classification approaches applied to rice grains or their diseases.

C. STAGE 3: DATA EXTRACTION
The following information was summarized for each selected paper: (i) type of algorithm (geometric, statistical, supervised, unsupervised, and deep learning), (ii) research trends, (iii) pest and disease classification, and (iv) future work. This information was extracted, analyzed, and presented in this manuscript to respond to each research objective.

IV. APPROACHES OF RICE GRAIN ANALYSIS AND CLASSIFICATION
Research objective 1: What are the types of approaches used for rice classification? Over the years, researchers have explored different solutions for rice grain analysis and classification. These approaches can broadly be classified into geometric, statistical, and learning based approaches, the last including unsupervised (such as k-means, k-NN) as well as supervised (such as neural networks, support vector machines, and more recently deep learning). As seen in Figure 3b, the supervised approach accounts for the largest share among all the approaches. The use of handcrafted spatial features in combination with various classifiers helped the supervised approaches achieve better results. The recent evolution of deep learning can also be observed in this figure.

A. GEOMETRIC APPROACHES
Geometric approaches use key features of rice grain morphology, such as compactness, length, ratio of major to minor axes, slenderness, and spread, computed on the binary image, for grain classification. Ajay et al. in 2013 presented a quality evaluation method for milled rice [23]. Filtering was first applied to the input rice image in order to remove noise. This was followed by image segmentation to separate connected objects for extracting shape and texture based features. The minimum rectangular method was then applied to the resultant image to determine the quality of rice grains.
In 1996, Sakai et al. presented a rice classification algorithm using image processing [24]. The scanned rice image underwent preprocessing operations including thresholding, smoothing, labeling, searching contours, and filling holes. After that, geometric features including area, maximum length and width, compactness, and elongation of the rice were extracted. Based on these features, the image was classified into brown and polished rice with 95.4% accuracy.
Asif et al. introduced PCA based rice grain classification and quality analysis algorithm [25]. Five different types of rice were considered: Super Colonel, Khushboo, Basmati, Kainat, Sella, and Old Awami. Six colored images of different rice varieties were captured with black background from which noise was removed. Canny edge detection and segmentation were applied to these images after binarizing them. Five morphological features were then extracted from these segmented images. PCA was used as a classifier to identify rice variety which resulted in an accuracy of 92.3%.
Vu et al. presented a rice variety inspection method using geometrical and morphological features [26]. The rice image, containing 48 rice seeds, underwent segmentation followed by a seed normalization process. Once the seeds were normalized, morphological and geometrical features were extracted. These features were used with an AdaBoost classifier, which outperformed other classifiers including DT and RF. Geometric features have also been used in combination with statistical and learning based approaches, specifically deep learning, which are discussed in Sections IV-B and IV-C respectively.
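As an illustration of the kind of geometric features described in this subsection, the following minimal sketch computes area, length, width, elongation, and compactness from a binary grain mask. It is not taken from any of the reviewed papers: the function name grain_features, the axis-aligned bounding box (which assumes the grain is aligned with the image axes), the edge-counting perimeter, and the compactness definition 4*pi*area/perimeter^2 are all assumptions chosen for brevity.

```python
import math

def grain_features(mask):
    """Compute simple shape features for one grain in a binary mask.

    mask: list of rows, each a list of 0/1 values (1 = grain pixel).
    Assumes the grain is roughly aligned with the image axes.
    """
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no grain pixels")
    area = len(pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    length, breadth = max(height, width), min(height, width)

    # Perimeter: count pixel edges that border background or the image edge.
    grain = set(pixels)
    perimeter = sum(
        1
        for r, c in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if (r + dr, c + dc) not in grain
    )
    return {
        "area": area,
        "length": length,
        "width": breadth,
        "elongation": length / breadth,
        "compactness": 4 * math.pi * area / perimeter ** 2,
    }

# A 2 x 6 pixel rectangle standing in for an elongated grain.
mask = [[0] * 8, [0] + [1] * 6 + [0], [0] + [1] * 6 + [0], [0] * 8]
features = grain_features(mask)
```

In practice the grain would first be rotated to its principal axis or fitted with an ellipse, so that length and width do not depend on orientation.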

B. STATISTICAL APPROACHES
Statistical approaches primarily focus on summarizing the data and making inferences regarding the population. The authors in [27] proposed a method for assessing Indian Basmati rice quality. They applied ISEF edge detection algorithm [28], an advanced form of the Canny edge detection algorithm [29], to a greyscale image and extracted geometric parameters including area, major and minor axes lengths, and eccentricity. A histogram was created corresponding to each parameter. The parameter values of rice samples were then compared with the histograms and the rice quality was classified.
In 2014, [18] proposed a rice quality estimation system based on geometric and color features. For the input rice image, the contrast was enhanced and edge detection was then applied. Furthermore, red, green, and blue values of the resultant rice image were extracted and their corresponding histograms were developed. The histogram with maximum readings was used as an appropriate light filter for classification of the rice sample. Along similar lines, [30] proposed a visual framework to classify the varieties of rice seeds using various features including color, size, and shape. Moreover, their framework also distinguished between sticky rice grains and the non-sticky variety using RGB color model and its histogram.
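The per-channel colour histograms used in these works can be sketched as follows. This is only an illustrative example under assumed conventions (8-bit RGB pixels, equal-width bins, a hypothetical function name channel_histograms); the reviewed papers do not specify their binning.

```python
def channel_histograms(pixels, bins=4):
    """Return one histogram per RGB channel for 8-bit pixels.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    bins:   number of equal-width bins (an assumed parameter).
    """
    width = 256 // bins
    hists = {"R": [0] * bins, "G": [0] * bins, "B": [0] * bins}
    for r, g, b in pixels:
        for name, value in (("R", r), ("G", g), ("B", b)):
            # Clamp so the top values fall in the last bin.
            hists[name][min(value // width, bins - 1)] += 1
    return hists

# Two hypothetical pixels from a grain region.
hists = channel_histograms([(255, 0, 0), (200, 10, 70)])
```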
Another method for quality assessment of Indian Basmati rice grains was presented in [31], in which the authors applied morphological closing and opening operations [32] to a greyscale image, followed by a top-hat transformation [33]. The resultant image was then segmented, and the major and minor axis lengths of the rice grains were computed and added to their corresponding histograms. The proposed algorithm then classified the rice grains into small, medium, and large. In 2015, [5] proposed an algorithm to classify rice grains using NIR spectroscopy based on starch content. Starch is one of the main components found in rice grains. Three different rice samples (boiled, brown, and raw) were selected and placed in a spectrometer. Ten different reflectance spectra of each rice sample were obtained over a wavelength range of 400-2500 nm. Those spectra were then pre-processed using two techniques, MSC and SNV. PCA with MSC and SNV was applied repeatedly until the different rice samples were clearly separated. PCA was then used further to classify each rice variety group.
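For readers unfamiliar with SNV (standard normal variate), a common spectral pre-processing step, the sketch below shows its usual definition: each spectrum is centred on its own mean and scaled by its own standard deviation. The function name snv and the use of the sample standard deviation are assumptions; the reviewed paper does not give implementation details.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum
    using its own mean and (sample) standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("constant spectrum cannot be scaled")
    return [(x - m) / s for x in spectrum]

# A toy three-point reflectance spectrum.
corrected = snv([1.0, 2.0, 3.0])
```

Because every spectrum is normalised independently, SNV reduces multiplicative scatter effects between samples before a method such as PCA is applied.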

C. MACHINE LEARNING APPROACHES
Machine learning approaches can be broadly classified into supervised, unsupervised, and deep learning [?], [34].

1) Unsupervised learning
The unsupervised learning techniques do not require large datasets to train the classifier. These approaches rely on clustering to separate the data into different classes based on their similarities. One such technique for rice classification relying on clustering is presented in [16]. The authors captured two images of a rice sample consisting of eight different rice varieties. The images were then pre-processed with the removal of noise and lens distortion, thresholding, and edge detection techniques. Once the techniques were applied, reliable morphological features of various rice grains were acquired including average length, shape, and compactness ratio. Two dendrograms were developed for each captured image to show similarities of features between rice samples which helped in classifying short, medium, and long brown/white rice.
A PCA based approach for classification of various Basmati rice varieties was introduced in [35], which used KNN based clustering instead of a dendrogram. The rice image was pre-processed for noise removal and smoothing in order to enhance and clarify the input image. The image was then segmented and binarized. Once these operations were performed, morphological features such as area, major axis length, minor axis length, eccentricity, and perimeter were extracted. KNN was then used as a classifier to cluster the different rice grain varieties. Along similar lines, in [36], the authors devised a rice quality classification system that also used KNN clustering. Six different rice seeds were used for experimentation. The colored rice image was segmented, and vital morphological, color, and textural features were extracted from the resultant image and then classified using KNN.
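A minimal sketch of k-nearest-neighbour classification on morphological features is given below. It follows the standard definition of k-NN, which votes among labelled training samples; the feature values, class names, and the function name knn_predict are hypothetical and are not drawn from [35] or [36].

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (major axis, minor axis) measurements in millimetres.
train = [
    ((7.4, 1.8), "long"), ((7.1, 1.9), "long"), ((7.6, 1.7), "long"),
    ((5.2, 2.6), "short"), ((5.0, 2.7), "short"), ((5.4, 2.5), "short"),
]
label = knn_predict(train, (7.0, 2.0))
```

Features with different units or ranges would normally be standardised first, since Euclidean distance is otherwise dominated by the largest-valued feature.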
In 2016, Watanachaturaporn presented a symbolic regression method for rice identification [37]. Nine morphological and six color features were extracted from the colored Khao Dawk Mali (KDM) rice image. The symbolic regression model was created with the Eureqa software [38] using the extracted features. These features were substituted into Equation 1 to calculate the number of KDM rice grains, where MinorAxisLength, IntegratedDensity, and Area denote features extracted from the rice image.

2) Supervised learning
In [15], the authors proposed a new approach based on a two-layer tan-sigmoid/log-sigmoid neural network [39] for rice seed identification. Various morphological features including area of the seed, seed boundary, bounding box around the seed, width, major and minor axis lengths, thinness ratio, aspect ratio, rectangular aspect ratio, equivalent diameter, filled area, area under the major axis of the ellipse, convex area, solidity, and extent were acquired from the rice image. In addition, various color features including the red, green, and blue color bands, hue, saturation, intensity, and standard deviation of hue were extracted. From these features, 4 principal components were extracted using PCA to perform dimensionality reduction. These components were used with the neural network for rice seed classification. An automatic quality evaluation framework for rice kernels was introduced in [17]. The input rice image was pre-processed using background segmentation and color blob extraction [17]. The resultant image was then used for feature extraction, which included extracting geometric features of rice via a shape descriptor and acquiring image statistics in the RGB and CIELAB color spaces [40]. A PNN classifier was trained and later used for classification. In the same year, an approach was presented for distinguishing rice grains and their quality using pattern classification [41]. The rice grain image was pre-processed using image enhancement techniques. The resultant image was then segmented before the final feature extraction, in which color, textural, and morphological features were extracted. These features were used with a feed-forward neural network for rice grain classification and quality recognition.
In 2010, Verma proposed an approach for rice grain identification in which rice image, obtained from a flatbed scanner, was pre-processed with several image smoothing operations [42]. Once the operations were performed, the average length, width, and perimeter of the rice grains were extracted and used with the neural network for classification.
The authors of [43] proposed a BPNN based rice classification algorithm using color and texture features. The rice image, obtained from a flatbed scanner, underwent a background segmentation process. From the resultant image, a total of 60 color and texture features were extracted. These features were passed to four feature selection algorithms, branch and bound [44], standard forward sequential [45], standard backward sequential [45], and plus-1-takeaway-r [46], to obtain an optimized set of 22 features that was later used in building the BPNN model. Once the BPNN was built, it was evaluated on the test data, resulting in an average accuracy of 96.67%. Similarly, the authors proposed another rice kernel classification algorithm using optimal morphological features and BPNN [47]. The rice image, containing 300 kernels, was acquired from a flatbed scanner using a black background. It was then pre-processed with background segmentation to extract the rice objects. For each rice object, 18 morphological features were extracted. As in their previous work, these features were passed to four feature selection algorithms (branch and bound, standard forward sequential, standard backward sequential, and plus-1-takeaway-r) to obtain an optimized set of six morphological features, which was later used in building the BPNN model. The BPNN was then evaluated on the test data, attaining 98.4% accuracy. Kong et al. presented a rice seed identification method using hyperspectral imaging and multivariate data analysis [48]. The hyperspectral images were obtained from a laboratory based imaging system, and spectral data ranging from 1039 nm to 1612 nm was extracted from those images. The extracted data was then used to build KNN, SVM, RF, PLS-DA, and SIMCA classification models. When these models were evaluated, RF performed better than the others.
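To make the idea of sequential forward selection concrete, the following sketch greedily adds the feature that most improves a scoring criterion until a target subset size is reached. It is a generic illustration, not the procedure of [43] or [47]: the function name forward_select, the toy scoring function, and its weights are assumptions, and in practice the score would be a validation accuracy or a class-separability measure.

```python
def forward_select(n_features, score, target_size):
    """Greedy sequential forward selection.

    score: callable taking a tuple of feature indices and returning a
           number where higher is better (for example validation accuracy).
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < target_size:
        # Try each remaining feature and keep the one with the best score.
        best = max(remaining, key=lambda f: score(tuple(selected + [f])))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy criterion (an assumption for illustration): features 2 and 0 help,
# the others carry a small penalty.
weights = [0.3, -0.1, 0.5, -0.2]

def toy_score(subset):
    return sum(weights[i] for i in subset)

chosen = forward_select(4, toy_score, target_size=2)
```

Backward selection works the same way in reverse, starting from the full set and removing the least useful feature at each step.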
In the same year, a NN based approach was presented by Silva C.S. and Sonnadara U. for classification of rice grains [49]. In this approach, the input rice image was pre-processed with several operations: Gaussian filter, morphological opening, contrast stretching, dilation, and erosion. As a result, a unique binary image was produced, containing distinct rice grains' representation. From this image, a total of 34 features (13 morphological, 6 color, and 15 textural) were acquired. PCA was applied to perform dimensionality reduction. As a first step, an individual neural network was created for each feature set, and later the combination of feature set model was implemented yielding an overall classification accuracy of 92% on a dataset which included grain samples from nine rice varieties grown in Sri Lanka.
In 2014, Pazoki et al. proposed an approach that extracted 24 color, 11 morphological, and 4 shape features. These features were fed to a neuro-fuzzy network [50] and an MLP, attaining accuracies of 99.73% and 99.46% respectively [51]. Singh K.R. and Chaudhary S. presented an efficient technique for the classification of rice grains using BPNN and wavelet decomposition [52]. A total of 45 features (texture, wavelet, and color) were acquired from the rice image. Those features were then given to the BPNN. Once the training phase was complete, the BPNN was evaluated with test data and, according to the study, it classified rice varieties with an accuracy of more than 96%. In the same year, Kuo et al. presented a sparse-representation classification (SRC) method using an image processing technique [3]. In this method, the input image was acquired under a controlled lab environment with static lighting conditions. After background segmentation, 9 color, 12 morphological, 7 textural, and 20 Fourier features were acquired. These features were given to the SRC classifier. Once the classifier was trained, it was evaluated on a set of test rice images and classified rice variety with 89.1% accuracy.
It has been observed that occluded and overlapping rice kernels adversely affect classifier accuracy. In [53], this problem was addressed by the use of contour detection and Watershed algorithm [54]. They developed a rice segmentation and classification system based on color and texture features using SVM, achieving an accuracy of 88%. The color and texture features were computed using LBP. In the same year, the authors of [55] introduced a heuristic feature based guided machine vision approach for rice variety classification. Each rice image was converted to grayscale, before applying noise reduction and binarization operations. For detecting edges, the Canny algorithm [56] was applied. Once the preprocessing phase was complete, 10 morphological and 13 color features were extracted. These features along with training data were fed to the ANN model. Once the model was trained, it was evaluated with features from test rice images. In doing so, it classified rice variety with an accuracy rate of 82.21%.
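As background on LBP (local binary patterns), the sketch below computes the basic 8-neighbour code for a pixel and a histogram of codes over an image. The neighbour ordering (clockwise from top-left) and the greater-or-equal comparison are one common convention and are assumed here; [53] does not state which variant was used.

```python
def lbp_code(image, r, c):
    """8-neighbour local binary pattern code for the pixel at (r, c).

    Neighbours are visited clockwise from the top-left; each contributes
    one bit, set when the neighbour is >= the centre value.
    """
    centre = image[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# A toy 3 x 3 greyscale patch with bright corners.
patch = [[9, 1, 9],
         [1, 5, 1],
         [9, 1, 9]]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

The resulting 256-bin histogram is the texture descriptor that would typically be passed to a classifier such as an SVM.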
In 2019, Srimulyani et al. conducted a comparative study which observed that a rice identification system using BPNN outperformed LVQ based approach [57]. In the preprocessing phase, the image segmentation using thresholding and contour tracking was performed. Six color, two texture, and four shape features were obtained from the segmented rice image. These features were fed into both BPNN and LVQ for training and testing. In 2019, Ibrahim et al. proposed an approach using morphological and HSV features. These were then fed to a multi-class SVM for rice grain classification [58]. The pre-processing phase consisted of applying Roberts segmentation [59]. Another BPNN based classification method was proposed which predicted rice quality using electronic tongue [60]. The characteristic current and potential arrays were obtained from rice sample which were then converted to peak current and potential phasor plane values as main features. These features were used to train tandem BPNN model. Once the model was trained, it was then evaluated with the extracted features of test rice samples yielding 90% accuracy.
Duong and Hoang presented a rice variety recognition approach based on feature selection [61]. In this approach, the rice image was coded in 8 different color spaces (RGB, HSV, ...). For each coded rice image, histogram of oriented gradients (HOG) [62] descriptors were applied to extract features. Fisher's Scoring [63], a supervised feature selection algorithm, was then applied to each random set of coded features to compute their scores and rank them accordingly, yielding an accuracy of 93.34%. This approach helped reduce dimensionality and running time during the recognition process.
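The ranking step can be illustrated with the widely used Fisher score, which rates each feature by the ratio of between-class to within-class variance. This is a minimal sketch assuming that definition; the function name fisher_scores and the toy data are hypothetical, and the exact formulation used in [61] may differ.

```python
from statistics import mean, pvariance

def fisher_scores(samples, labels):
    """Fisher score per feature: between-class over within-class variance."""
    n_features = len(samples[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_features):
        overall = mean(s[j] for s in samples)
        between = within = 0.0
        for k in classes:
            values = [s[j] for s, y in zip(samples, labels) if y == k]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / within if within > 0 else float("inf"))
    return scores

# Toy data: feature 0 separates the classes, feature 1 does not.
samples = [(1.0, 5.0), (1.2, 3.0), (3.0, 4.0), (3.2, 6.0)]
labels = ["a", "a", "b", "b"]
scores = fisher_scores(samples, labels)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

Keeping only the top-ranked features is what reduces the dimensionality of the descriptor and the running time of the classifier.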
In 2020, the authors presented a feature based cascade network for the classification of rice grains which achieved an accuracy of 97.75% on the UCI dataset [64] [65]. These features included 11 morphological, 18 color, 27 texture, and 24 wavelet while the pre-processing was performed after the conversion of RGB rice image to HSV as the hue channel performed better discrimination between rice kernel and background. The proposed network consisted of four outputs and two hidden layers. Hidden nodes were computed based on the Equation 2. The authors of [66] proposed a rice seeds classification system using spectral and spatial features. The rice image was captured by RGB camera and hyperspectral imaging (HSI) camera. The input images, including RGB and hyperspectral, were normalized using lens distortion and planar calibration techniques. The spatial features (Area, Major Axis Length, Minor Axis Length, Aspect Ratio, Perimeter over Area Ratio and Eccentricity) were extracted from RGB images whereas spectral features were obtained from HSI image. These features were used with Random Forest and classified rice seeds with a test accuracy of 78.8%.
where N, I, O and Y denote number of hidden nodes, input nodes, output nodes and number of training data respectively.

3) Deep learning
In 2017, Patel et al. proposed two approaches based on CNN [67] for rice grain classification, one with transfer learning [68] and one without it [69]. Transfer learning reuses CNN models pre-trained on another, related dataset. It is used when the dataset of a domain is not large enough to train an accurate model. After pre-processing, 4000 training images were used to train both CNNs, while testing was performed using 1000 images. It was observed that the CNN with transfer learning outperformed the other approach. In [70], a rice identification model was introduced using CNN, which resulted in 99.52% accuracy. The pre-processing was done with image enhancement operations. The authors of [71] proposed a rice variety classification system based on the combination of hyperspectral imaging with CNN. The rice image was captured with a visible near-infrared hyperspectral imaging system. The acquired image was corrected and pre-processed with a wavelet transform and an image segmentation process. Images with spectral ranges of 441-949 nm and 975-1646 nm were selected for the training and testing phases. Between 100 and 3000 rice samples of every rice variety were used to train SVM, KNN, and CNN, whereas a total of 8907 rice seeds were used for testing. The results showed that the CNN classified rice varieties with better accuracy than KNN and SVM.
Chatnuntawech et al. developed a mixed rice variety inspection system which classified rice using a spatio-spectral deep convolutional neural network [72]. A near-infrared hyperspectral imaging system was utilized to obtain spatial and spectral data of the rice sample. The former is based on visual appearance while the latter corresponds to chemical properties. These spatio-spectral data are also known as datacubes, as they contain information in two spatial dimensions and one spectral dimension. These datacubes were then passed to a deep CNN known as Residual Network (ResNet) [73] that achieved an accuracy of 91.09%. To evaluate the proposed classifier, two different datasets were used, namely processed rice and paddy rice. One drawback of the proposed approach is that it required the grains to have the same orientation in the image. Another DCNN architecture based rice variety detection system was presented in [74]. The main purpose of this system was to minimize human involvement while attaining the maximum level of accuracy in rice variety classification. The colored rice image was first converted to grayscale, followed by segmentation of each rice kernel. It was then processed using simple scaling, mean subtraction, and feature standardization to compute the dimensionality of the rice kernel, which was used to tune the CNN. Furthermore, stochastic gradient descent (SGD), with momentum 0.9 and decay 0.0005, was used to minimize the back-propagation error and tune the network parameters during the training phase. Once the CNN framework was trained, it was applied to the test dataset, resulting in a 95.5% accuracy rate.
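The SGD settings quoted above can be made concrete with a short sketch of one common formulation: classical momentum with L2 weight decay. Interpreting "decay 0.0005" as weight decay is an assumption, as are the learning rate of 0.01, the function name sgd_momentum_step, and the toy quadratic objective; [74] does not report these details here.

```python
def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9, weight_decay=0.0005):
    """One SGD update with classical momentum and L2 weight decay.

    w, v, grad: lists of floats (parameters, velocities, gradients).
    Returns the updated (w, v).
    """
    new_v = [momentum * vi - lr * (gi + weight_decay * wi)
             for wi, vi, gi in zip(w, v, grad)]
    new_w = [wi + vi for wi, vi in zip(w, new_v)]
    return new_w, new_v

# Minimise the toy objective f(w) = w^2 (gradient 2w) starting from w = 1.0.
w, v = [1.0], [0.0]
for _ in range(200):
    w, v = sgd_momentum_step(w, v, [2 * w[0]])
```

The velocity term accumulates past gradients, which smooths the updates, while the weight decay term pulls parameters toward zero as a form of regularisation.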
In 2019, the authors of [75] presented a rice quality classification algorithm using CNN. Rice objects were detected in the rice image and cropped according to whether they were whole or broken rice. Each object was pre-processed with a background subtraction operation to eliminate noise and error. The processed rice object was then fed into CNN, SVM, and KNN models for training. Once the models were trained, they were evaluated on the test data. The results showed that the CNN delivered better accuracy than SVM and KNN in predicting whole and broken rice. Another CNN based rice grain classification algorithm was proposed in [76]. In this algorithm, the rice images were pre-processed with cropping, scaling, and auto-alignment operations. Then, a marker-based Watershed algorithm [77] was applied to extract the area and contour of each rice grain, which were then used with a ResNet-50 [?] model, yielding an accuracy of 80%.

V. CHRONICLES OF RICE GRAIN ANALYSIS TECHNIQUES: FROM BEGINNING TO DATE
Research objective 2: What is the research trend followed by the research community chronologically? In this section, we take a bird's eye view of the research trends in this area over the years. Based on the significance of research developments, the literature is divided into three eras. Supervised approaches are mainly used in Era 1, while Era 2 added more approaches including statistical, unsupervised, and geometrical. Era 3, on the other hand, sees the advent of deep learning. As can be seen in Fig. 4, Era 1 has the most citations⁴ while the number of publications and impact factor⁵ are highest in Era 2.

A. ERA 1 (1996-2010)
This era laid the foundation of algorithms and techniques for rice grain classification. Since this was the time when research on rice classification took root, the immediate goal of researchers was to improve classification accuracy, even if the running times of the proposed solutions were not optimal. Later eras also emphasized the efficiency of the proposed solutions. Table 1 provides an overview of the key papers published in this era with various attributes, advantages, and limitations. It can be observed that accuracy as high as 94% was achieved using the Probabilistic Neural Network. In this era, around 72% of publications belonged to supervised learning, as can be seen in Fig. 5a.
The authors of [24] presented a rice classification method achieving a 95.4% accuracy rate. In this method, the rice image was pre-processed with image enhancement operations. Once the image was enhanced, geometric features were extracted, and the rice image was classified based on them. Liu et al. presented an algorithm in which rice seed varieties were identified using a neural network [15]. A total of 21 features, including 7 color and 14 morphological ones, were extracted from 240 rice kernels and fed to the neural network for training. Once the network was trained, 60 rice kernels were used for the testing phase, yielding an accuracy of 84.43%. The algorithm performed well even on rice grains with a high degree of similarity. Along similar lines, Hobson et al. proposed an unsupervised clustering technique for rice classification [16]. The clustering was based on the morphological features of 8 different rice grains.
In 2008, Agustin et al. proposed a histogram based quality evaluation framework for milled rice kernels [17]. The color histograms of RGB and CIELAB⁶ were used to acquire 24 color features of milled rice, which were fed to a Probabilistic Neural Network (PNN). The PNN was trained with color features of 17,420 kernels and tested with 1161 kernels, yielding an accuracy of 94%. However, such accuracy rates were highly dependent on the selection of an appropriate CIELAB threshold.
In 2010, Shantaiya et al. proposed a feature based learning system for classification and quality estimation of rice grains [41]. The features, which included 9 color and 9 morphological attributes, were fed to a neural network, yielding an accuracy of 84.43%. Similarly, Verma proposed a learning based system for the classification of rice grains [42]. The rice image, containing 500 rice kernels, was pre-processed and 10 different features were extracted from it. Between 7,500 and 10,000 rice grains were used to train the neural network, yielding an accuracy of 90-95%.
Pabamalie and Premaratne proposed 21 texture and 10 color based features for the classification of milled rice [78]. These features were given to a neural network, achieving a test accuracy of 80.5% on a data split of 72% for training and 28% for testing. The data set consisted of 360 samples, where each sample had an area of 10 × 7 cm² containing various rice grains. The limitation of this approach was the controlled testing environment with studio light settings.

B. ERA 2 (2011-2016)
This era is characterized by the diversity of the proposed solutions. Even though supervised learning-based techniques still formed a major share, this era also saw novel rice classification proposals relying on unsupervised, geometric, and statistical approaches, as shown in Fig. 5b. Table 2 provides an overview of the key papers published in this era with various attributes, advantages, and limitations. This era saw an accuracy as high as 99%, which was achieved using a neuro-fuzzy based system [79].
⁶ https://spie.org/publications/fg04_p71_cielab?SSO=1
Rad et al. presented a BPNN based classification algorithm [43]. 60 color and texture features were extracted from the segmented rice image and within this feature set, 22 features were used to train and evaluate BPNN. In this study, features of 350 images were used to train BPNN whereas features of the remaining 150 images were utilized for the testing phase. BPNN classified the test data with an accuracy rate of 96.67%. Similar to their previous work, the same authors presented another rice kernel classification algorithm using optimal morphological features and BPNN, achieving 98.4% accuracy [47]. Here, 1050 kernels were used to train BPNN whereas 450 kernels were used as test data.
In 2013, Kaur and Singh proposed an automated method to classify rice kernels using multi-class SVM [80]. Smoothing, segmentation, and binarization operations were applied to the input rice image before extracting geometric features. A data set of 400 rice kernels was used to train the classifier on these features. Another 400 kernels were then used for testing the model, which yielded an accuracy of 86%. In the same year, Silva and Sonnadara proposed another neural network based rice grain classification using 34 features [49]. These features, which consisted of 13 morphological, 6 color, and 15 textural ones, were given to a neural network, achieving a testing accuracy of 92%. A total of 315 images were used to train the model, which was then tested using an additional 68 images. In the same period, [48].
In 2014, Auttawaitkul et al. developed a histogram based visual classification system for identifying rice seeds based on their color and appearance features [18]. Three different color histograms were obtained from the RGB intensity values of the rice image, where each rice image contained several rice grains. The system was tested on a dataset of 300 rice images, yielding an accuracy of 91% under a controlled laboratory environment. In the same year, Mahale and Korde introduced a semi-automatic approach based on an edge detection algorithm [82]. After extracting the rice boundaries, manual measurements of rice length, breadth, and their ratio were made using Vernier calipers, and these were then used for classification.
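To illustrate the general idea behind histogram based classification such as [18], the following is a minimal sketch only. The bin count, the histogram-intersection similarity, and the nearest-reference decision rule are assumptions made for illustration; the source does not state how the three RGB histograms were compared. The function names are hypothetical.

```python
def channel_histograms(pixels, bins=8):
    """Normalized R, G and B histograms for an iterable of (r, g, b) tuples in 0..255."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("at least one pixel is required")
    bin_width = 256 / bins
    hists = [[0] * bins for _ in range(3)]
    for px in pixels:
        for channel in range(3):
            # Map the intensity to its bin; clamp so that 255 stays in the last bin.
            index = min(int(px[channel] / bin_width), bins - 1)
            hists[channel][index] += 1
    return [[count / len(pixels) for count in h] for h in hists]


def histogram_similarity(hists_a, hists_b):
    """Mean histogram intersection over the three channels (1.0 means identical)."""
    per_channel = [
        sum(min(a, b) for a, b in zip(ha, hb))
        for ha, hb in zip(hists_a, hists_b)
    ]
    return sum(per_channel) / len(per_channel)


def classify(query_hists, references):
    """Return the label whose reference histograms best match the query (assumed rule)."""
    return max(
        references,
        key=lambda label: histogram_similarity(query_hists, references[label]),
    )
```

In practice the reference histograms would be averaged over many labelled images per variety, and the result remains sensitive to illumination, which is consistent with the controlled laboratory setting reported above.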
Pazoki et al. introduced an MLP and neuro-fuzzy based rice classification algorithm [51] using 39 features. These consisted of 24 color, 11 morphological and 4 shape features. The system was trained and tested using image repositories of size 300 and 150 respectively. The model achieved an accuracy as high as 99.73%. Kamboh et al. developed a morphological feature based classification system using k-NN and PCA. The pre-processing was done using smoothing and segmentation techniques. The system achieved an accuracy of 79% on a diverse dataset consisting of 260 Classic, 187 Rozana, and 339 Mini rice grains.
Mahajan and Kaur proposed a histogram based rice grain classification system in which pre-processing was done using morphological opening and closing operations followed by a top-hat transformation [31]. Another rice grain classification method was proposed that used only morphological features along with SCG-NN, achieving an accuracy of 98.2% [81]. A limitation of this approach was its inability to handle occluded and overlapping rice grains.
Priya et al. used NIR spectroscopy followed by PCA for rice grain classification [5]. NIR spectroscopy was used to classify rice samples based on their carbohydrate and starch content. It was computed on 250 grams of rice samples within the range of 1100 nm to 2200 nm. In the same year, Zareiforoush et al. proposed a hybrid automatic system based on fuzzy logic for quality measurement of milled rice, achieving an accuracy of 89.8% on unknown milled rice images [83]. Another geometric feature based rice quality and grading system was proposed by Patil, achieving an accuracy of 93% [84].
Another work, which relied on feature based symbolic regression algorithm, was proposed to identify adulteration of rice varieties [37]. These features, which consisted of 9 morphological and 6 color features, were extracted from 800 rice grains. Meanwhile, Singh and Chaudhury developed a rice classification technique using BPNN and wavelet decomposition achieving 96% accuracy on a dataset of 400 rice images [52]. The feature vector consisted of 18

C. ERA 3 (2017-2020)
This era marked the application of deep learning approaches for rice grain classification and quality grading, as shown in Fig. 5c. While deep learning falls under supervised approaches, this study considers it as a separate approach due to its importance and promising results. Table 3 provides an overview of the key papers published in this era with various attributes, advantages and limitations. It can be observed that, on average, the accuracy achieved in this era was higher than in the previous ones. In 2017, Lin et al. presented an algorithm that utilized a Convolutional Neural Network (CNN) for rice kernel classification [70] on training and testing data sets of 2854 and 965 images respectively, achieving an accuracy of 99.52%. The pre-processing of rice images was done using re-scaling, mean subtraction, and feature standardization. In the same year, Patel used VGG-16 on training and testing datasets of 4000 and 500 segmented rice images respectively, achieving a testing accuracy of 94.20% [85].
Asif et al. proposed a classification system using PCA that achieved an accuracy of 92.3%. The features used were area, eccentricity, perimeter, length of major and minor axes, eigenvalues, and eigenvectors [25]. Nagoda and Ranathunga developed a rice segmentation and classification algorithm based on color features using SVM, achieving an accuracy of 88.0% [53]. However, their approach required uniform lighting conditions. In the same year, Chatnuntawech et al. provided a rice classification algorithm that utilized a deep CNN for identification [72]. Even though the classification accuracy was as high as 98.7%, the method performed well only for a particular orientation of rice grains. Another DCNN based rice classification system was proposed which predicted rice species with an accuracy of 95.5% [74]. The CNN was trained and tested on 5554 and 1854 images respectively. The addition of the Stochastic Gradient Descent approach further enhanced the CNN's accuracy.
In 2018, Wijerathna and Ranathunga proposed a rice classification system using ANN which achieved an accuracy of 82.21% [55]. The feature vector consisted of 10 morphological and 13 color attributes. In the same year, Mandal [86] also proposed a rice classification system with 7 morphological features using ANFIS [87] and achieved an accuracy of 98.6%. However, this approach required the rice grains to be non-overlapping and well separated. Vu et al. proposed a novel rice variety inspection method using morphological and geometric features [26]. The rice image was segmented and normalized, followed by the extraction of geometric and morphological features of the rice seed. Those features were evaluated using DT, RF [36], and AdaBoost classifiers, among which AdaBoost performed best with an accuracy of 95%. Qiu et al. presented a hyperspectral-CNN based rice variety classification algorithm [71]. The rice image from the hyperspectral imaging system was pre-processed and segmented, and hyperspectral data was then acquired from it. A rice variety recognition algorithm based on Fisher's scoring for feature selection was proposed that exhibited an accuracy of 93.34% [61]. A BPNN based rice grain classification method was proposed in [60] which used potential and current phasor plane values to achieve an accuracy of 90%. Aukkapinyo et al. proposed a rice grain classification method using ResNet-50 which achieved an accuracy of 80% [76]. Another supervised approach using spatial and spectral features was proposed for rice seed classification, obtaining an accuracy of 78.8% on a data set of 8640 images.
Ibrahim et al. proposed an automatic rice classification algorithm using multi-class SVM [58]. The pre-processing consisted of image enhancement techniques followed by the computation of 4 color and 4 shape based features. Most recently, in 2020, another automatic cascade network based rice classification algorithm was developed that used various morphological, color, texture, and wavelet features, achieving an accuracy of 97.5% on a data set of 340 images [65].

VI. AUTOMATED GRADING OF RICE GRAINS
Research objective 3: How has the automated grading approach prevailed over manual rice grading?
As was the case with rice breed classification, manual approaches for rice quality grading are too slow and have a greater chance of human error. It is therefore essential to automate the process [82]. In this regard, the approach proposed by Mahale and Korde used color images of rice grain samples, which were processed to remove noise using the morphological operations of dilation and erosion [32], followed by segmentation. Later, an edge detection method was applied to compute the boundary of the rice grains. Length, width, and their ratio were also computed for rice grain classification. In 2014, Tanck et al. proposed an approach that focused on grading rice quality based on Agmark 7 standards [88]. The color image was pre-processed with noise removal and image enhancement techniques. A binarization process was then applied to convert it to black and white. Doing so aided a more precise extraction of morphological features, including area, perimeter, and major axis length, that were used for classifying the rice grains.
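The morphological features named above (area, perimeter, length, width and their ratio) can be sketched on a binarized image of a single grain as follows. This is an illustrative simplification rather than the procedure of [82] or [88]: the mask is assumed to contain exactly one grain, the perimeter is approximated by counting foreground pixels that touch the background, and length and width are taken from the axis-aligned bounding box instead of the true major and minor axes. The function name `grain_features` is hypothetical.

```python
def grain_features(mask):
    """Simple shape features for one grain in a binary mask (rows of 0/1 values)."""
    height, width = len(mask), len(mask[0])
    coords = [(r, c) for r in range(height) for c in range(width) if mask[r][c]]
    if not coords:
        raise ValueError("mask contains no foreground pixels")

    def is_background(r, c):
        # Pixels outside the image are treated as background.
        return r < 0 or r >= height or c < 0 or c >= width or not mask[r][c]

    # Boundary pixels: foreground pixels with at least one 4-connected background neighbor.
    perimeter = sum(
        1
        for r, c in coords
        if any(
            is_background(r + dr, c + dc)
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
        )
    )

    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    extent_rows = max(rows) - min(rows) + 1
    extent_cols = max(cols) - min(cols) + 1
    length = max(extent_rows, extent_cols)
    breadth = min(extent_rows, extent_cols)

    return {
        "area": len(coords),
        "perimeter": perimeter,
        "length": length,
        "width": breadth,
        "aspect_ratio": length / breadth,
    }
```

For rotated grains, the bounding-box extents would overestimate the width, so a real system would more likely fit an ellipse or use image moments to obtain the major axis length.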
Zareiforoush et al. introduced a hybrid Fuzzy Inference System (FIS) to automate the qualitative grading of milled rice [83]. The RGB rice image, captured from a CCD camera, was converted to grayscale which was then used to calculate the percentage of broken kernels and degree of milling. These two variables were fed to the FIS for qualitative grading purpose, yielding a test accuracy of 89.83%.
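As a rough illustration of how two inputs such as the percentage of broken kernels and the degree of milling can be combined by fuzzy rules, the sketch below uses a zero-order Sugeno-style inference with the minimum operator for AND. It is not the system of [83]: the membership breakpoints (0 to 20% broken, 0 to 100% milling), the four rules, and the output scores are all hypothetical values chosen for the example.

```python
def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def fuzzy_grade(broken_pct, milling_pct):
    """Return a quality score in [0, 1]; 1.0 is best. All parameters are assumed."""
    # Fuzzification with complementary linear membership functions.
    broken_low = clamp01((20.0 - broken_pct) / 20.0)
    broken_high = 1.0 - broken_low
    milling_high = clamp01(milling_pct / 100.0)
    milling_low = 1.0 - milling_high

    # Each rule: (firing strength using min as AND, crisp output score).
    rules = [
        (min(broken_low, milling_high), 1.0),   # few broken, well milled
        (min(broken_low, milling_low), 0.5),    # few broken, under-milled
        (min(broken_high, milling_high), 0.5),  # many broken, well milled
        (min(broken_high, milling_low), 0.0),   # many broken, under-milled
    ]

    # Defuzzification: weighted average of the rule outputs.
    total_strength = sum(strength for strength, _ in rules)
    return sum(strength * score for strength, score in rules) / total_strength
```

Because the memberships are complementary, at least one rule always fires, so the weighted average is well defined for any input. The continuous score could then be thresholded into discrete grades.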
In [78], an approach based on texture and color features was proposed for milled rice quality recognition. A total of 17 features were extracted for grading the rice grains into Premium, Grade 1, Grade 2, and Grade 3. These 17 features consisted of 7 textural features (correlation, energy, homogeneity, maximum probability, inverse difference moment, angular second moment and dissimilarity) and 10 color features (variance of hue, saturation, intensity, red, green and blue, and mean of hue, red, green and blue), which were fed to a Back-Propagation Neural Network (BPNN) that classified the quality of rice grains.
In 2013, an algorithm was proposed in [80] which performed grading and classification of rice using a multi-class SVM. The rice image was pre-processed through image enhancement operations. Unlike previous approaches, the proposed system also took into account the chalkiness of rice grains as a feature, in addition to geometric features.
In 2014, Selvaraju et al. presented a novel system for grading Basmati rice granules [81]. The rice image was smoothed with median filtering and segmented using adaptive thresholding, followed by edge detection using the Canny algorithm. Once the edges were identified, morphological features were computed and used to train a scaled conjugate gradient neural network (SCG-NN). After training, the network was evaluated on test data, resulting in an accuracy of 98.7%.
Patil et al. proposed a system that performed quality analysis and rice grading based on geometric features using a decision tree [84]. Three different rice varieties were selected for this system. Background noise was removed from the rice image, which was later converted to a binary image using the Otsu method [89]. Each rice grain in the resultant image was segmented, and its geometric features were extracted and then classified and graded.
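The Otsu method [89] used for binarization selects the gray-level threshold that maximizes the between-class variance of the background and foreground pixels. The following is a minimal sketch of that computation on a flat list of 8-bit gray values; the function names and the convention that pixels strictly above the threshold are foreground are assumptions of this example.

```python
def otsu_threshold(gray_values, levels=256):
    """Return the threshold t maximizing between-class variance (class 0: value <= t)."""
    hist = [0] * levels
    for v in gray_values:
        hist[v] += 1

    total = len(gray_values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0       # number of pixels with value <= t
    sum_bg = 0          # sum of gray values of those pixels
    best_t, best_var = 0, -1.0

    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue    # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break       # all pixels are background from here on

        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t

    return best_t


def binarize(gray_values, threshold):
    """Map each gray value to 1 (above threshold) or 0 (otherwise)."""
    return [1 if v > threshold else 0 for v in gray_values]
```

A single global threshold of this kind works best when grains and background form two well-separated intensity modes, which is one reason many of the reviewed systems rely on uniform backgrounds and controlled lighting.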
In 2018, Mandal et al. proposed a rice grading system based on geometric features which were fed to an adaptive neuro-fuzzy inference system (ANFIS) [86]. The proposed solution outperformed KNN and SVM based systems and exhibited an accuracy of 98.5%. The preprocessing was done using the morphological operation of opening to eliminate noise. The benefit of this approach was that unlike [72], it did not require the rice grains to have the same orientation for the classifier to work. Randomly scattered rice grains did not impact classifier accuracy.

VII. IDENTIFICATION OF RICE DISEASES, PESTS AND FOREIGN PARTICLES
Research objective 4: What are the different algorithms and techniques available for automatic identification and classification of rice diseases?
Various rice diseases and pests can lead to a loss in annual crop yield, which could be as high as 30% [90]. Misidentification usually leads to incorrect control measures, such as indiscriminate and untimely use of pesticides [91]. It has been observed that untrained farmers, relying on manual methods, often fail to detect rice crop diseases such as leaf blast, brown spot, sheath blight, and leaf scald in a timely manner and subsequently fail to take preventive or corrective measures [92]. For this reason, identifying rice diseases through an automated process becomes critical. In 2013, Zhou et al. [93] presented a rice planthopper classification algorithm based on fractal dimension values and Fuzzy C-means (FCM). The image preprocessing stage consisted of smoothing, denoising, color space conversion, frequency domain transformation, and segmentation of the ROI. Features based on fractal dimensions were extracted using the Box-Counting Dimension Method and were classified using FCM, achieving a classification accuracy of 63.5%. In 2017, an automated identification algorithm was proposed which identified rice planthoppers using image processing [94]. The rice planthopper image was pre-processed with a binarization operation and morphological filtering. A logical AND operation between the binary and grayscale images was then used for insect segmentation. The Fourier transform was used to convert the insect image into the frequency domain to quantify its color and texture features. These features were used along with SVM, yielding an accuracy of 92%. Table 4 shows the key research papers on various rice diseases, pests and foreign particles.
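The Box-Counting Dimension Method mentioned above estimates a fractal dimension by covering a binary shape with boxes of decreasing size s, counting the boxes N(s) that contain foreground, and fitting the slope of log N(s) against log(1/s). The sketch below is a generic version of this estimator, not the implementation of [93]; it assumes a square mask whose side is a power of two and uses box sizes 1, 2, 4, and so on.

```python
import math


def box_counting_dimension(mask):
    """Estimate the box-counting dimension of a square binary mask (rows of 0/1)."""
    size = len(mask)
    if size < 4 or any(len(row) != size for row in mask):
        raise ValueError("mask must be square with side >= 4")
    points = [(r, c) for r in range(size) for c in range(size) if mask[r][c]]
    if not points:
        raise ValueError("mask contains no foreground pixels")

    log_inv_sizes, log_counts = [], []
    box = 1
    while box < size:
        # Each foreground pixel falls in exactly one box of the current grid.
        occupied = {(r // box, c // box) for r, c in points}
        log_inv_sizes.append(math.log(1.0 / box))
        log_counts.append(math.log(len(occupied)))
        box *= 2

    # Least-squares slope of log N(s) versus log(1/s).
    n = len(log_inv_sizes)
    mean_x = sum(log_inv_sizes) / n
    mean_y = sum(log_counts) / n
    numerator = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(log_inv_sizes, log_counts)
    )
    denominator = sum((x - mean_x) ** 2 for x in log_inv_sizes)
    return numerator / denominator
```

As a sanity check, a filled square yields a dimension close to 2 and a straight line close to 1; irregular insect silhouettes fall in between, which is what makes the value usable as a shape feature.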
In [102], the authors presented a statistical approach for grading rice paddy using thermal technology. This approach graded rice paddy quality based on moisture content, maturity and chaff. The thermal rice images were gray-scaled to extract thermal index or pixel intensity. The moisture content was determined using Pearson's correlation coefficient with an accuracy of 92%. The authors also used thermal indexing to determine rice paddy maturity with 90% accuracy and to identify chaff with 100% accuracy. The high accuracies were achieved by trading off the classifier's execution time.
Yao et al. developed an application that classified rice diseases using shape, color and texture features of the rice leaf [91]. Rice bacterial blight (RBLB), rice sheath blight (RSB) and rice blast (RB) diseases were considered in this application. Noise removal and median filtering were applied to the rice images to avoid inaccurate segmentation of disease spots. The image was then segmented using Otsu's method [89], from which 4 shape features (compactness, roundness, rectangularity and elongation), 60 texture features and 3 spatial (H, S, V) features of the disease spots were extracted and given to the SVM. The classifier yielded an accuracy of 97.2% on a dataset of 216 images.
In 2012, Phadikar et al. presented a supervised method for rice disease classification based on morphological alterations [100]. This approach mainly focused on detecting leaf brown spot and blast diseases. The rice leaf image was preprocessed with a mean filtering technique, and Otsu's threshold algorithm was then used for segmentation on the hue plane. The segmentation separated the infected spot areas in the rice image, and a radial distribution function was then applied to the color values. These color features were used with Bayes and SVM classifiers on a data set of 500 images, achieving accuracies of 79.5% and 68.1% respectively. To overcome the limitation of selecting a specific threshold value for segmentation, the authors presented another rice disease detection method using rule and feature selection [97]. In addition to brown spot and rice blast, this work also classified sheath rot and bacterial blight. The rice leaf image was segmented with the proposed Fermi energy-based method, which identified the infected region. Fifteen color, one position and nine shape features were extracted from the segmented image and given to a rule-based decision system that yielded an accuracy of 92.29%.
In 2016, a new technique was proposed which classified rice diseases using image processing [99]. This technique addressed four diseases: rice blast, rice bacterial blight, rice brown spot and rice sheath rot. First, the infected rice leaf image was normalized with the mean values of the R, G and B components and multiplied by a scaling factor to remove the effect of outdoor illumination. The resultant image was converted to the YCbCr color space for segmentation. After segmentation, the mean and standard deviation of the R, G and B planes and ten shape features were calculated. These sixteen features were used with a minimum distance classifier (MDC) and kNN, yielding accuracies of 87.02% and 89.23% respectively. Phadikar S. and Goswami J. proposed another automated rice disease classification algorithm using vegetation indices based segmentation [103]. Unlike their previous work, the authors focused on just two diseases: brown spot and rice blast. The rice disease image was segmented using four different vegetation index metrics [104], namely the normalized difference vegetation index, green indexed vegetation index, enhanced vegetation index and soil adjusted vegetation index, to extract infected regions. These indices, along with five textural features, were used with Otsu's threshold algorithm to classify both rice diseases and achieved an accuracy of 84%.
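A minimum distance classifier of the kind referred to above assigns a feature vector to the class whose mean feature vector is closest. The sketch below shows the basic scheme with Euclidean distance; the distance measure, the toy two-dimensional features, and the function names are assumptions for illustration, since [99] worked with sixteen color and shape features and its exact distance measure is not stated here.

```python
def fit_class_means(samples, labels):
    """Compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [total / counts[label] for total in acc]
        for label, acc in sums.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(class_means, features):
    """Return the label of the nearest class mean."""
    return min(
        class_means,
        key=lambda label: squared_distance(class_means[label], features),
    )
```

Because features such as color means and shape measures have very different ranges, a practical system would usually scale each feature before computing distances.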
In 2017, a prototype system was proposed for classification and detection of rice plant diseases [98], including bacterial leaf blight, brown spot and leaf smut. The rice image was converted to the HSV color model, from which mainly the saturation component was considered. A mask was then applied to the saturation component of the image, isolating the leaf portion with disease spots from the background. The resultant image was then segmented, using K-means clustering, into 3 different clusters: background, diseased portion and green portion of the leaf. Three segmentation techniques (LAB color space based K-means clustering, Otsu's segmentation, and HSV color space based K-means clustering) were then applied separately to acquire the diseased area on the leaf. As a result, three different models were developed containing a combination of various color, shape and texture features. These images were then used to train an SVM with 10-fold cross-validation. The resulting model yielded 88.57% accuracy.
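Several of the reviewed studies report accuracy under 10-fold cross-validation. As a reminder of what this protocol involves, the sketch below partitions sample indices into k folds so that each sample is used for testing exactly once and for training in the remaining k - 1 rounds. It is a generic illustration with a hypothetical function name; it does not shuffle or stratify the data, which a real experiment normally would.

```python
def k_fold_indices(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")

    # Distribute the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds
```

The reported accuracy is then typically the mean of the k per-fold test accuracies, which gives a more stable estimate than a single train and test split on small data sets.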
A deep CNN based method was proposed for the identification of rice diseases [95]. This method dealt with 10 rice disease varieties, namely rice blast, rice false smut, rice brown spot, rice bakanae disease, rice sheath blight, rice sheath rot, rice bacterial leaf blight, rice bacterial sheath rot, rice seeding blight and rice blight wilt. The rice image was preprocessed with scale normalization and mean normalization techniques. PCA and whitening were then applied to the resultant image to acquire the required features, which were fed into the DCNN for training. Once trained, the DCNN classified rice diseases with an accuracy of 95.48%. Sengupta S. and Das A.K. proposed a supervised incremental classifier based on the particle swarm optimization (PSO) algorithm for the prediction of rice diseases [101]. A total of 17 shape and texture features of the diseased region were extracted from rice images and used to train the classifier. Once trained, the classifier yielded an accuracy of 84.02%. Yao et al. proposed a microscopy image based rice disease detection method using a decision tree and confusion matrix [105]. The microscopic rice disease image was preprocessed with several image enhancement techniques to locate the infected area. Those techniques included grayscale conversion, histogram equalization, image smoothing, image sharpening, threshold segmentation, inverse contour extraction and the distance transformation-Gaussian filtering-watershed (DT-GF-WA) algorithm. Four shape features (area, perimeter, ellipticity and complexity) and 3 texture features (entropy, contrast and homogeneity) were then extracted and used in the decision tree-confusion matrix (DT-CM) method. A total of 2000 microscopic rice disease images were used for training, while 500 images were used for testing. DT-CM classified rice smut and rice blast with an average accuracy of 94%. Hasan et al. introduced a hybrid rice disease detection method using SVM and DCNN [106]. In this method, Inception-V3, one of the deepest and most complex CNN architectures, was used to extract features from rice disease images, as it does not require a handcrafted feature selection algorithm. Those features were used to train a multi-class SVM. About 1080 images were used to train the SVM, whereas 216 images were used for testing. The SVM classified 9 diseases (bacterial leaf blight, rice blast, brown spot, false smut, leaf smut, red stripe, leaf scald, sheath blight and Tungro) with an accuracy of 97.5%.
The authors of [96] proposed another method which used textual features for rice disease classification [107]. Three diseases were considered in this research: brown spot, bacterial leaf blight and false smut. First, the rice disease image was converted to grayscale, and the SIFT transform was then applied to extract regions of various diseases from the image. Those sets of regions were forwarded to the Bag of Words (BoW) technique, and the descriptors were clustered using the K-means clustering algorithm. Brute Force matcher histograms and an SVM classifier were then used to classify these diseases, achieving an accuracy of 94.16% on a dataset of 400 images.
Rahman et al. presented another CNN based approach for the identification of rice diseases and pests [108]. Six classes of diseases (bacterial leaf blight, brown spot, sheath blight, sheath rot, neck blast, false smut) and three classes of pests (brown planthopper, hopper pest and stemborer) were considered in this paper. Different image transformations such as random rotation (-15° to 15°), random distortion, shear transform, vertical flip, horizontal flip, skewing and intensity changes were applied, and the transformed images were stored separately according to their classes. These transformed images were later used to train VGG16, ResNet50, InceptionV3, InceptionResNetV2, and Xception. Each of them was trained using fine-tuning, transfer learning and training methods. Once trained, they were all evaluated on the test data set, on which the VGG-16 architecture performed best, yielding an accuracy of 99.53%. Furthermore, the authors also introduced two-stage training for their memory-efficient stacked CNN architecture.
In 2019, Devi et al. proposed a feature-based rice disease classification method using various classifiers. Median filtering was applied to eliminate noise from the rice leaf image. K-means clustering was then used for segmentation. Thirteen features were extracted using a hybrid of SIFT, DWT and GLCM methods. The features were then used to train ANN, kNN, multiclass-SVM and Bayesian classifiers. These classifiers were trained with 350 images and evaluated on 150 images, yielding accuracies of 86.63%, 96.78%, 98.63%, and 85% respectively.
In 2020, Sharma et al. proposed a rice disease detection method using Bayes and minimum distance classifiers [109]. The rice disease image was preprocessed with grayscale conversion, median filtering and segmentation. Color, texture, morphological and structural features were extracted and later fed to the Bayes and MDC classifiers for training. A total of 140 images were used for training, while 60 images were used for testing. Bayes and MDC classified leaf blast, sheath blight, false smut, stem rot and brown spot with 69% and 81% accuracy, respectively.
Patidar et al. set forth a deep learning approach in which rice diseases were detected and classified using deep residual learning [92]. This approach mainly focused on leaf smut, brown spot, and bacterial leaf blight. The rice images were transformed and normalized so that every pixel attained the same mean and standard deviation. These images were then given to the ResNet-34 model for training. The classifier was trained with 86 images and evaluated on 36 images, yielding 95.83% accuracy.
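The normalization step mentioned for the residual-learning approach, in which pixel values are brought to a common mean and standard deviation, is commonly implemented as per-channel standardization. The sketch below shows this for a single channel given as a flat list of values. It is an assumption that [92] used exactly this form; deep learning pipelines often use fixed data set statistics rather than per-image ones.

```python
import math


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0:
        # A constant channel carries no contrast; map it to all zeros.
        return [0.0] * n
    return [(v - mean) / std for v in values]
```

Applied to each color channel, this removes global brightness and contrast differences between images, which helps when the training set is as small as the one reported above.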
The authors of [110] presented an automated rice plant disease classification algorithm. The method used color features of the diseased rice plant image to classify sheath blight, rice blast, bacterial leaf blight and healthy leaves. The diseased image was first preprocessed by converting the background to black, to minimize complexity and computational cost. The RGB color was fetched from the processed image and then converted to 13 different color spaces (normalized-RGB, YCbCr, HSV, HSI, CIE XYZ, CIE Lab, CIE Lch, CIE Luv, Hunter-Lab, SCT, opponent, CMY and CMYK). In doing so, a total of 172 features were extracted, including 4 statistical features (mean, standard deviation, skewness and kurtosis). These features were then used to train 7 different classifiers: SVM, DC, KNN, NB, DT, RF and LR. The classifiers were trained and evaluated with the features of 619 images using 10-fold cross-validation. The authors further conducted 10 more trials with different training and testing sets, and the resulting accuracy rates were averaged to obtain the final accuracy. As a result, SVM performed best with a classification rate of 94.65%, followed by RF (92.52%) and DC (92.34%). A deep learning method was introduced in [111] which detected rice disease using image processing. It was used to identify healthy and unhealthy (brown spot, leaf blast and hispa) rice leaves. The dataset was taken from the Kaggle 8 database, and some of the images were taken manually. About 70% of the total images were used to train the CNN, whereas the remaining images were used to evaluate the trained CNN. Based on a confusion matrix, which counts correct and incorrect predictions, an accuracy of 90% was attained.
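The four statistical color features named above (mean, standard deviation, skewness and kurtosis) can be computed for one color channel as in the following sketch. The population (biased) definitions and the non-excess form of kurtosis are assumptions of this example, as the source does not specify which variants were used in [110].

```python
import math


def color_statistics(values):
    """Return mean, standard deviation, skewness and kurtosis of one color channel."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")

    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population definitions).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)

    if m2 == 0:
        # Constant channel: higher-order shape statistics are undefined; use 0 by convention.
        skewness, kurtosis = 0.0, 0.0
    else:
        skewness = m3 / std ** 3
        kurtosis = m4 / m2 ** 2   # a normal distribution gives about 3

    return {"mean": mean, "std": std, "skewness": skewness, "kurtosis": kurtosis}
```

Repeating this for every channel of every color space is one plausible way in which a feature vector of the reported size could be assembled, with the leaf pixels selected by the black-background mask.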

VIII. CONCLUSION AND FUTURE WORK
This study provides a detailed analysis of existing rice grain classification and quality grading techniques in chronological order and categorizes them into geometric, statistical, supervised, unsupervised, and deep learning approaches. This paper reviews the history of rice grain algorithms starting from 1996 and segments them into three different eras.
Each era is demarcated on the basis of novel research and developments. More specifically, in Era 1 (1996-2010), supervised approaches were dominant, while Era 2 (2011-2016) also saw statistical and geometrical approaches gain significance. The focus of research took a new turn towards deep learning in Era 3 (2017-2020), a trend which is expected to strengthen significantly in the upcoming years. Keeping in view the negative impact of rice pests, diseases, and foreign particles, the study further surveys the automated classification techniques proposed in this area.
The study predicts that deep learning approaches will bring promising results in the future. They will also pose new challenges in meeting high computational requirements. Moreover, the paucity of datasets poses a big challenge to building accurate deep learning models. The development of large rice grain repositories, captured in uncontrolled environments, will gain importance in the coming years. Data sets with non-uniform lighting, occlusion among rice grains, and indistinguishable and camouflaged rice grains will remain a big challenge in future research.
In recent years, research focus has been shifting towards agricultural techniques that are much more efficient than traditional methods. In this direction, some methods do away with the need for soil as a growth material. These include techniques like hydroponic 9 and aeroponic 10 . The advantages of using these approaches include i) better utilization of water, ii) more production in less space, iii) better use of nutrients, and iv) protection against pests without the need for pesticides. These approaches are more eco-friendly and also maximize profits by reducing costs and increasing output. Other techniques, like drip irrigation 11 , need soil for growth, but the water delivery mechanism is much more precise, leading to water savings by minimizing wastage and evaporation. Although the operational and resource costs are low, the water distribution uniformity is poor in drip irrigation. All these techniques (hydroponic, aeroponic and drip irrigation) are managed and monitored by humans, which leaves a high chance of error. This is where machine learning approaches would come in handy, as they could not only reduce human error but also perform the same activities with greater precision and accuracy.