Towards Sustainable Agriculture: A Novel Approach for Rice Leaf Disease Detection Using dCNN and Enhanced Dataset

Rice is one of the foremost food grains, providing sustenance to about half of the world's population, and it is cultivated all over the world. Leaf disease detection for this crop is one of the chronic agricultural obstacles that farmers and planting experts have been struggling with for a long time. As a result of leaf diseases, producing the amount of rice required to feed the world's rising population has become very challenging. Hence, automatically detecting rice leaf diseases is an inevitable task to increase productivity. Numerous deep learning based methods have been proposed for rice leaf disease detection, which we found rather inefficient considering the size of the models. In this article, we introduce a lightweight deep Convolutional Neural Network (dCNN) based method for rice leaf disease detection that outperforms contemporary state-of-the-art methods and showcases competitive performance against 21 established benchmark architectures, including AlexNet, MobileNet, ResNet50, DenseNet121, ResNeXt50, ShuffleNet, ConvNext, EfficientNet, GoogLeNet, SwinTransformer, VisionTransformer, and MaxVit, to name a few, with significantly fewer trainable parameters. Notably, our method achieves an accuracy score of 99.81%, a precision score of 0.99828, a recall score of 0.99826, and an f1-score of 0.99827. Moreover, we enhance the rice leaf disease dataset by merging two existing datasets and supplementing them with an additional 95 manually annotated images gathered from publicly available sources on the internet. We also develop a comprehensive crop health monitoring system for farmers and an open API for the automatic annotation of new instances, benefiting the research community at large.


I. INTRODUCTION
(The associate editor coordinating the review of this manuscript and approving it for publication was Hazrat Ali.)

Rice serves as a vital staple, providing sustenance to billions of individuals across the globe. Remarkably, it is cultivated
in more than three-fifths, precisely 61.54%, of countries worldwide [1]. However, the confluence of a burgeoning population and dwindling arable land has led to an escalating reliance on rice, exacerbating the issue of gradual food scarcity. Moreover, rice production faces persistent challenges, including the annual onslaught of various maladies, particularly the detrimental rice leaf diseases. These afflictions disrupt the natural growth, structure, and coloration of rice leaves, arising from internal abnormalities often caused by fungi or viruses. Bangladesh, for instance, has a predominantly agrarian economy where rice is the staple food for the majority of the population [2]. Positioned as the largest delta globally, Bangladesh is bestowed with rivers cascading from the Himalayas. Despite having fertile soil and advantageous seasons, the country has been unable to fully harness its agricultural potential due to limited access to cutting-edge technology [3] as well as hesitancy to adopt new technologies owing to a wide range of factors including misconception, confusion, and uncertainty [4]. Currently, it produces approximately 35 million metric tons of rice annually, falling short of meeting the demands of its growing population [5]. The underlying cause of this shortfall can be attributed to rice leaf diseases, unequivocally responsible for diminished crop yield [6]. To enhance productivity, regular crop health monitoring and proper care are essential. However, manual detection of rice leaf diseases is a time-consuming, labor-intensive, and costly process, rendering it impractical [7].
Earlier efforts in rice leaf disease detection focused on machine learning algorithm-based, neural network-based, and hybrid methods. The conventional machine learning algorithms, while effective in some cases, often demonstrated limited performance as they heavily relied on manual, handcrafted feature engineering. This reliance on human-crafted features not only posed challenges in capturing complex patterns but also led to increased development time and effort [8], [9]. The recent emergence of neural networks has introduced CNN-based approaches, which incorporate transfer learning with pre-trained models and customized architectures. Unfortunately, these approaches often entail high asymptotic complexity due to their extensive trainable parameter size, making them challenging to deploy on resource-constrained devices and limiting their practicality in real-time applications. Our extensive study found that the development of an effective rice leaf disease detection method is hindered by two fundamental barriers: a large trainable parameter size of the model and a small dataset size. To address these limitations, we propose an advanced end-to-end method which aims to automatically and reliably detect rice leaf diseases, providing valuable assistance to farmers and contributing to the agricultural development of the country. Specifically, we present a lightweight deep Convolutional Neural Network (dCNN) architecture and an enhanced dataset.
The objective of this study is to develop and deploy deep learning models capable of accurately predicting rice leaf diseases. In particular, this article delves into the detection of the five frequently occurring rice leaf diseases, including bacterial leaf blight, blast, brown spot, sheath blight, and tungro, using a lightweight dCNN model. To aid farmers in effortlessly monitoring crop health, we have developed a comprehensive crop health monitoring system comprising a user-friendly website and an Android application. Moreover, we introduce an open API and enhance the rice leaf disease dataset, making it a valuable resource for the research community. The dataset is enhanced by collecting data from the internet and manually annotating them with the assistance of domain experts, resulting in a broader range of rice leaf disease variations. As for the API, which accepts input images and returns disease labels along with insights into the disease's etiology and suggested subsequent actions, it facilitates automatic annotation of new instances, benefiting the research community at large.
The key contributions of this article are summarized below:
• We propose a lightweight dCNN architecture for rice leaf disease detection that outperforms several contemporary state-of-the-art methods. For instance, it surpasses [10], [11], and [12] with 16, 811, and 152 times fewer parameters, respectively. It also demonstrates superior performance compared to [13] and [14].
• We compare the performance of our proposed method with 21 benchmark architectures, comprising 16 convolution-based and five transformer-based methods. It outperforms the majority of these methods and achieves competitive performance with the remaining ones, where the differences in performance are negligible, with a much lower trainable parameter size.
• We conducted extensive experiments covering a wide range of scenarios and varying environmental circumstances. These scenarios included images with natural backgrounds, diverse camera angles generated through random rotations, varying distances captured using zoom-out procedures, and alterations in image quality using both downsampling and upsampling approaches. Furthermore, the model was evaluated using datasets obtained from different geographical locations, including Indonesia, China, and Taiwan.
• We enhance the rice leaf disease datasets in [10] and [15] by gathering an additional 95 unique RGB images from the internet and having them annotated manually by domain experts to guarantee accurate and high-quality labeling.
• We develop a comprehensive crop health monitoring system for farmers, encompassing a user-friendly website, an intuitive Android app, and an accessible open API, with the aim of assisting both farmers and the research community.

The remainder of this article is structured as follows. Section II presents a comprehensive literature review on the detection of rice leaf diseases. Section III examines the limitations of the existing datasets found in the literature and proposes data preparation procedures to address these shortcomings. In Section IV, we delve into the challenges encountered, elaborate on our network architectures, and outline the proposed methodology. The experimental results are expounded upon in Section V, encompassing comparisons with benchmark architectures and state-of-the-art methods. Further details regarding model deployment can be found in Sub-section V-H. Section VI addresses the benefits and limitations of the proposed technique. Finally, Section VII concludes our study and outlines future possibilities for research in this domain.

II. RELATED WORKS
A wide range of approaches has been proposed for detecting rice leaf diseases. These methodologies primarily fall into three categories: machine learning algorithm-based, neural network-based, and a fusion of both. The performance of these methods largely depends on the dataset and the strategy employed for feature extraction.
Conventional machine learning algorithm based methods have been proposed in [5], [13], [14], [16], and [17]. For instance, [13] and [14] use XGBoost and Support Vector Machine (SVM), respectively, whereas [16] and [17] utilize a random forest classifier to detect rice leaf diseases. The performance of these methods is not up to the mark and relies solely on feature engineering. Due to the advent of deep learning, which tends to outperform machine learning algorithms, most of the studies propose a Convolutional Neural Network (CNN) based approach to address the problem. These studies can further be classified into transfer learning based [18], [19], [20], [21] and custom model based [22], [23], [24], [25], [26] approaches. Among transfer learning based methods, [11] and [12], [18] and [20], and [19] and [21] employ DenseNet, VGG, and Inception-ResNet pretrained on ImageNet, respectively. Recently, [18] presents a two-stage CNN architecture by adopting and fine-tuning VGG16 and InceptionV3. Likewise, [20] and [27] modify VGG16 and ResNet18, respectively, to reduce the model parameter size. Among custom model based approaches, [23] proposes a MobileNet-like architecture incorporating an attention mechanism, named ADSNN-BO. A custom CNN architecture has also been proposed in [24], [27], and [26]. Lately, [27] further utilizes a Generative Adversarial Network (GAN) to generate synthetic data. However, [28] tackled the same problem by utilizing edge computing concepts. A few studies, e.g., [29] and [30], propose hybrid approaches which are an amalgamation of CNN and machine learning algorithms. In [29], they explore two approaches: CNN with fully connected layers and CNN with SVM. They use a CNN architecture identical to LeNet as a baseline. The first approach consists of the base CNN followed by an SVM classifier. The second approach consists of the base CNN and two additional fully connected layers. In contrast, [5] and [14] propose methods that combine both CNN and machine learning
algorithms. One of these methods, [14], removes the background of an image based on a saturation threshold. Afterward, disease-affected regions are segmented using the threshold mask on the hue plane of the HSV images. Finally, an extreme gradient boosting decision tree ensemble (XGBoost) is used for classifying the diseases with the logistic loss function. In another work [5], the background is removed using a segmentation technique based on Otsu's threshold method, which determines the optimal value for the global threshold. Then, they use the features extracted from a CNN for classification with SVM. They consider three kernel functions, namely linear, polynomial, and radial basis function (RBF), and report the highest performance using SVM with a polynomial kernel and HOG.
Neural network based approaches tend to outperform statistical and typical machine learning algorithm based approaches [5], [13], [14]. Azim and colleagues [14] remove the leaf background depending on the color of the leaf, e.g., they remove the green part of the leaves and take only the affected portion of the image. In real-life scenarios, however, the color of the leaves does not always remain the same; in the case of dark or light green leaves, they may not be able to extract the features properly. "Histogram of Oriented Gradients (HOG)" is used to describe features in [5]. They use SVM with a polynomial kernel function as the dataset is tiny, but they do not consider all the features of the images. There might be some noisy leaf images in the dataset where SVM cannot perform well. Another study [30] uses ResNet50 and SVM on a dataset that consists of 5932 images of four rice leaf diseases. The SVM classifier is not suitable for large datasets; at the same time, ResNet50 requires extensive data, which creates an anomaly in the performance. Recently, [11] proposes a CNN architecture that uses the pretrained VGG16 backbone and transfer learning. VGG is a huge model and takes more time to process an image than other models like ResNet, and they do not consider other pretrained models. Reference [13] uses color features, exploring 14 distinct color spaces and extracting four features from each color channel. However, the color of the leaves is not always the same; it varies depending on the lighting. Additionally, they use an SVM classifier, which does not perform well with a large amount of data. Our proposed method resolves the limitations found in the literature by considering all the local features of an image with a much smaller model parameter size than existing methods.

III. DATASET PREPARATION

A. OVERVIEW OF EXISTING DATASETS
We collected several existing rice leaf datasets from various online sources [10], [15], [31], [32], [33]. However, after thorough scrutiny, we found that the field of rice leaf disease detection faces a significant challenge due to the lack of a publicly available, large enough dataset. It was also observed that many publicly accessible datasets (e.g., [31]) lack reliability due to the inclusion of identical or augmented versions of images from the train set in the test set. This phenomenon results in an artificial inflation of performance metrics when evaluating the model on the test set. Such models are unlikely to meet the expected performance when applied to real-world data. The lack of extensive publicly accessible datasets presents a major challenge in this field of study, which is worsened by the arduous task of collecting leaf data with subtle disease variations and diverse environmental conditions, and the tedious task of accurately annotating samples.
To tackle this issue, we merged two datasets obtained from sources [10] and [15], and combined them with 95 quality images that we collected from various internet sources. The datasets [10] and [15] were chosen based on their superior quality. There were a total of 3876 (augmented) and 120 raw images in [10] and [15], respectively. We selected 80 of the 120 images from [15] based on the image classes that we are interested in. The 95 images collected from the internet were annotated manually. These internet images and the 80 images from [15] were combined and then augmented to form a total of 1409 augmented images. Together with the 3876 images from [10], this forms an image dataset of 5285 images, which was used in this work. Considering the crucial role a large dataset plays in achieving notable performance from neural networks, our enhanced version holds immense potential in advancing the field. Hence, our curated set of 95 distinct RGB images, meticulously collected and annotated, serves as a substantial enhancement to the existing dataset.

B. DATA ACCUMULATION
As our extensive study found that the paucity of a publicly available, large enough dataset is the main barrier towards the development of efficient rice leaf disease detection, we amalgamate two datasets [10], [15] and further enhance them with images procured from the internet. The manual annotation technique employed in this study encompasses the following components. The symptoms of diseases, along with their corresponding visual representations, were acquired from BARI (Bangladesh Agricultural Research Institute), the largest agricultural research institute in Bangladesh. Subsequently, the collected samples were meticulously examined. Additionally, internet data that exhibited a clear resemblance to the visual representation of specific diseases were incorporated and annotated. Annotation was carried out by three members of the group, who studied the visual representations and independently classified the diseases. Only the images that received unanimous agreement from all three members were included. The inclusion of the additional images from the internet increased the number of disease classes.
Firstly, we collected data from the UCI Machine Learning Repository [15], which contains 120 images distributed among three classes: bacterial leaf blight, brown spot, and leaf smut, each containing 40 images. Next, we acquired an additional 95 images from the internet, denoted as $I = \{I_1, I_2, \ldots, I_{95}\}$, introducing three new disease classes: blast, sheath blight, and tungro, which are very relevant to the rice leaf diseases encountered in Bangladesh. We combined these collected images with the manually annotated images from the UCI dataset, resulting in a total of five classes. The leaf smut class was omitted from the dataset [15] due to its significantly lower number of images compared to the other classes. Then, we performed data augmentation (detailed in the following subsection) on the combined images, generating a set of 1409 images. Subsequently, we merged this augmented dataset with the one used in [10], which already contained 3876 augmented images. The resulting dataset comprises 5285 images, representing five disease classes: sheath blight, tungro, brown spot, blast, and bacterial leaf blight.

C. DATA AUGMENTATION
Convolutional Neural Network (CNN) models demand a substantial volume of training data to effectively discern underlying patterns and attain optimal performance during inference. In this context, image augmentation emerges as a pragmatic and widely adopted approach [34], [35], [36], [37] to construct a resilient image classifier with limited training data. By augmenting the dataset through various transformations, it substantially increases the number of images, thereby bolstering the capability of deep learning models to achieve better performance.
A significant amount of synthetic data was therefore generated using conventional data augmentation techniques, encompassing eight distinct transformations, namely cropping, horizontal and vertical shifting, horizontal and vertical flipping, zooming in and out, and rotation. Each of these transformations plays a crucial role in creating a unique representation of the original image. We ensure that augmented instances do not overlap across multiple sets, thereby eliminating any potential data leakage issues. To achieve this, the augmentation process commences with cropping each instance of our combined dataset, preserving their spatial dimensions while resizing them to 240 × 240 pixels, ensuring uniformity in image size. Subsequently, horizontal and vertical shifts are applied, with a width and height shift range of 0.2, respectively. This leads to the random truncation of the image within the selected negative or positive range, effectively creating shifts both horizontally and vertically. Furthermore, the original instances are flipped horizontally and vertically with a probability of 0.5. These horizontal and vertical flips produce unique images by reversing the order of columns and rows, respectively, thereby expanding the dataset's diversity. In addition, the rotation transformation is employed, randomly rotating the images clockwise within the range of 1 to 45 degrees, which introduces further variability to the dataset. Lastly, we adopt the zoom in and out transformation with a range of 0.3, allowing us to alter the aspect ratio of the resultant instances, further enriching the dataset with varied representations.
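As an illustration, the shift and flip transformations described above can be sketched in plain NumPy. This is a minimal sketch, not the authors' actual augmentation pipeline; the 240 × 240 size follows the cropping step above, and the function names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_flip(img, p=0.5):
    """Flip horizontally and/or vertically, each with probability p."""
    if rng.random() < p:
        img = img[:, ::-1]  # horizontal flip: reverse column order
    if rng.random() < p:
        img = img[::-1, :]  # vertical flip: reverse row order
    return img

def random_shift(img, max_frac=0.2):
    """Shift by up to max_frac of the image size, zero-filling the vacated region."""
    h, w = img.shape[:2]
    dy = int(rng.integers(-int(h * max_frac), int(h * max_frac) + 1))
    dx = int(rng.integers(-int(w * max_frac), int(w * max_frac) + 1))
    out = np.zeros_like(img)
    # Copy the overlapping window from source to shifted destination.
    out[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)] = \
        img[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return out

img = rng.integers(0, 256, size=(240, 240, 3), dtype=np.uint8)
aug = random_shift(random_flip(img))
print(aug.shape)  # (240, 240, 3)
```

Rotation and zoom would typically be delegated to an image library rather than hand-rolled, since arbitrary-angle rotation requires interpolation.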
By employing these data augmentation techniques, our dataset is substantially augmented, providing an extensive and diverse collection of instances for robust model training.

D. ENHANCED RICE LEAF DISEASE DATASET
The enhanced dataset contains 5285 images from five disease classes: sheath blight, tungro, brown spot, blast, and bacterial leaf blight. A few sample instances of the dataset are shown in Figure 1. We split the dataset into train, validation, and test sets containing 3158, 1277, and 850 images, respectively. The statistics of the dataset can be found in Table 1. The validation and test sets are balanced, whereas the training set exhibits some degree of imbalance. Notably, the training set contains the highest number of images for bacterial leaf blight, whereas sheath blight has the fewest instances. Specifically, bacterial leaf blight and sheath blight represent approximately 33% and 11% of the total images in the training set, respectively. Similarly, tungro, brown spot, and blast make up roughly 13%, 30%, and 13% of the training set images, respectively.

IV. METHODOLOGY

A. OVERVIEW
The proposed method takes an image as input and classifies it into one of the disease categories based on the local features of the image. It begins by taking an image as input and resizing it according to the input image size of the model. Then it extracts the features of the image with the help of convolution and pooling layers. Finally, it uses the extracted features to classify the image. Figure 2 depicts the approach for rice leaf disease detection.

B. APPROACH
Training the model involves two major steps: forward propagation and backward propagation. Firstly, we initialize the weights of the model randomly. In the case of forward propagation (equation 1), we pass a batch of images through the model. The input data moves forward and generates predictions. The calculations in the neurons of the hidden layers and the output layer are as follows:

$a_n^{(l)} = f\left(\sum_{j} W_{nj}^{(l)} \, a_j^{(l-1)} + b_n^{(l)}\right) \quad (1)$
where l refers to the hidden layer, n refers to a neuron of that hidden layer, j refers to a neuron of the previous hidden layer, W is the weight matrix, a is the output of a neuron, b is the bias, and f is the activation function.
To update the weights in backward propagation, the loss is calculated using the prediction from forward propagation and the actual label. The cross-entropy loss function is employed because our extensive scrutiny revealed that the Adam optimizer and categorical cross-entropy loss function together yield the best performance. The loss is calculated as the negative logarithm of the softmax output for a specific class, along with the true label. The equation for calculating the softmax probability is as follows:

$p_i = \frac{e^{z_i}}{\sum_{k=1}^{K} e^{z_k}} \quad (2)$

Softmax returns the probability of the input belonging to each of the classes. The probability value ranges between 0 and 1, and the sum of all probabilities equals 1. We then compute the cross-entropy loss, which is given in equation 3:

$L = -\sum_{i=1}^{K} t_i \log(p_i) \quad (3)$
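The softmax and cross-entropy computations above can be sketched as follows. This is a minimal NumPy illustration, not the training code used in this work; the logit values are arbitrary.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)  # shift logits for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(t, p, eps=1e-12):
    """Categorical cross-entropy: -sum_i t_i * log(p_i)."""
    return -np.sum(t * np.log(p + eps))

logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])  # raw scores for five disease classes
probs = softmax(logits)
true = np.array([1, 0, 0, 0, 0])               # one-hot true label
loss = cross_entropy(true, probs)
print(probs.sum(), loss)
```

Since only one entry of the one-hot label is 1, the loss reduces to the negative log-probability assigned to the true class, exactly as described above.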
The cross-entropy is the product of the true label of a certain class and the negative logarithm of the softmax of its prediction. Hence, the $t_i$ of the equation will be zero for all other classes, but it will be 1 for the true label or class. We calculate the goodness of fit using the prediction and the actual label by employing equation 3. Based on the value of the loss, we update the weights in backward propagation. Since we use mini-batches, the weights of the model get updated at each mini-batch. The Adam optimizer, an amalgamation of RMSprop and momentum, is used to minimize the cost function. The formulas for updating the weights (equation 4) and biases (equation 5) are as follows:

$W_t = W_{t-1} - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t}} \quad (4)$

$B_t = B_{t-1} - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t}} \quad (5)$
where W is the model weights, B stands for the bias, t is the current step, η is the step size, and $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected estimators of the first and second moments.
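A minimal sketch of the Adam update follows. This is illustrative only: the hyperparameter values are the standard defaults for Adam, not necessarily those used in this work, and the toy objective is our own.

```python
import numpy as np

def adam_step(w, grad, m, v, t, eta=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum plus RMSprop with bias correction."""
    m = beta1 * m + (1 - beta1) * grad       # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment (RMSprop) estimate
    m_hat = m / (1 - beta1 ** t)             # bias-corrected estimators
    v_hat = v / (1 - beta2 ** t)
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy example: minimize f(w) = w^2, whose gradient is 2w.
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, eta=0.05)
print(float(w))  # settles near the minimum at 0
```

In the actual training loop, `grad` would be the gradient of the cross-entropy loss with respect to each weight tensor, computed over a mini-batch.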

1) SHORTCOMINGS OF EXISTING ARCHITECTURES
We commence the experiment by utilizing established benchmark architectures, ensuring a robust foundation for our study. Our selection includes seven widely recognized architectures: AlexNet [38], MobileNetV2 [39], MobileNetV3 [40], ResNet50 [41], DenseNet121 [42], ResNeXt50 [43], and ShuffleNetV2 [44]. Each of these models is proficient at taking an image as input and accurately classifying it into its respective disease category. Their individual performances showcase high efficacy. The empirical outcomes of these models can be found in Table 3.
The vast majority of these well-known architectures perform admirably. However, the total number of trainable parameters in these models is massive. For example, ResNet50 contains 25.5M trainable parameters. Deploying these models is expensive and impractical for operational settings with constraints like low-resource devices or limited bandwidth, common in remote regions of the Global South. Therefore, we develop a tiny dCNN model for rice leaf disease detection that is efficient in terms of performance and practical for deployment. For instance, our rice leaf disease detection (RLDD) model is 150 times smaller than ResNet50, yet it performs similarly. The extensive experiments support the validity of the proposed method, which is successful and efficient in classifying rice leaf diseases.

2) PROPOSED ARCHITECTURE
Our proposed model consists of mainly two parts: one part of the model is used for feature extraction while the other is used for classification. The feature extraction component involves convolution and pooling layers to effectively capture relevant patterns in the data. On the other hand, the classification component consists of dense layers, also known as fully connected layers, which aid in making accurate predictions based on the extracted features. Moreover, a Dropout layer is used to avoid overfitting of the model. The proposed model is illustrated in Figure 3.
A non-linear mapping f(x, θ) is exploited by the model. In all of the hidden layers of the model, we use the Rectified Linear Unit (ReLU), R(z) = max(0, z). In the output layer, however, we use the Softmax activation function, $\sigma(z)_i = e^{z_i} / \sum_{k=1}^{K} e^{z_k}$ for $i = 1, \ldots, K$, to produce probabilistic predictions. To mitigate the risk of overfitting, a dropout ratio of 10% is consistently applied across the model. The details of all layers of the model are available in Table 2. Our proposed deep learning model is extremely lightweight, comprising only 0.17M parameters. This is remarkably small compared to other models such as AlexNet, ResNet50, DenseNet121, etc. Despite its significantly smaller size compared to the models mentioned in sub-section IV-B1, our model outperforms most of them. For instance, it is 150 and 364 times smaller than ResNet50 and AlexNet, respectively, while still delivering competitive performance with ResNet50 and surpassing AlexNet. Furthermore, it outperforms models [39] and [43], even though it has 20.5 and 147 times fewer parameters, respectively. This remarkable performance makes our lightweight model an excellent choice for efficient and effective rice leaf disease detection.
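To illustrate why such a stack stays lightweight, the standard parameter-count formulas for convolutional and dense layers can be applied to a small hypothetical configuration. The layer sizes below are assumptions for illustration only; the actual RLDD layout is given in Table 2.

```python
def conv2d_params(c_in, c_out, k):
    """Trainable parameters of a conv layer: (k*k*c_in + 1) * c_out (weights + biases)."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer: (n_in + 1) * n_out."""
    return (n_in + 1) * n_out

# Hypothetical lightweight stack, for illustration only:
total = (
    conv2d_params(3, 16, 3)     # RGB input -> 16 feature maps
    + conv2d_params(16, 32, 3)
    + conv2d_params(32, 64, 3)
    + conv2d_params(64, 64, 3)
    + dense_params(64, 128)     # classifier head after global pooling
    + dense_params(128, 5)      # five disease classes
)
print(total)                    # 69477: well under a million parameters
print(25_500_000 // total)      # roughly how many times smaller than ResNet50
```

Pooling and dropout layers contribute no trainable parameters, which is why a few narrow convolution blocks plus a compact head can stay in the 10^5 range while ResNet50 sits at 25.5M.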

V. EXPERIMENTAL ANALYSIS

A. HYPERPARAMETER TUNING
We fine-tune the model's hyperparameters, resulting in further performance improvement. We conduct empirical experiments with different dropout ratios, activation functions, optimizers, batch sizes, hidden layers, and loss functions. Through this experimentation, we determine that a specific combination, as shown in Figure 4, yields the best result.
Regarding the size of the model, we explore the impact of adding more convolution layers. We observe that increasing the convolution layers does not improve the model's performance but does increase the total number of trainable parameters. Conversely, reducing the number of convolution layers dramatically decreases the model's performance. Additionally, we experiment with dropout ratios of 10%, 20%, and 40% and discover that as we increase the dropout ratio, the model's accuracy gradually decreases. Figure 4(a) depicts the accuracy vs. dropout ratio graph, illustrating the changes in accuracy for different dropout ratios. In terms of activation functions, we utilize ReLU, ELU, TanH, and Sigmoid and find that ReLU delivers the best performance, while Sigmoid yields the lowest results.
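A grid search over these hyperparameters can be sketched as follows. Note that `train_and_score` is a placeholder of our own: its toy scores merely mirror the trends reported above (ReLU best, Sigmoid worst, accuracy dropping with higher dropout), not real measurements.

```python
from itertools import product

# Hypothetical search space mirroring the combinations discussed above.
dropouts = [0.10, 0.20, 0.40]
activations = ["relu", "elu", "tanh", "sigmoid"]
optimizers = ["adam", "sgd", "rmsprop"]

def train_and_score(dropout, activation, optimizer):
    """Stand-in: a real implementation would train the dCNN and return
    validation accuracy. The values below are illustrative only."""
    base = {"relu": 0.99, "elu": 0.97, "tanh": 0.95, "sigmoid": 0.90}[activation]
    return base - dropout * 0.05 + (0.005 if optimizer == "adam" else 0.0)

# Exhaustively evaluate every combination and keep the best one.
best = max(product(dropouts, activations, optimizers),
           key=lambda cfg: train_and_score(*cfg))
print(best)  # (0.1, 'relu', 'adam')
```

In practice each `train_and_score` call is a full training run, so the search is usually pruned or run on a validation subset rather than exhaustively.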

B. EVALUATION METRICS
We employ performance measures including Accuracy, Precision, Recall, and F1 Score to measure the performance of our method. The confusion matrix is utilized to determine the accuracy, precision, recall, and f1-score. Its entries are: True Positive (TP), where the model correctly predicts the positive class; True Negative (TN), where it correctly predicts the negative class; False Positive (FP) or Type-I error, where the model predicts positive but the actual value is negative; and False Negative (FN) or Type-II error, where the model predicts negative but the actual value is positive.
• Accuracy: Accuracy refers to the proportion of correct predictions made by our model among all cases. It is determined by dividing the total number of correct predictions by the total number of instances in the dataset:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

• Precision: Precision indicates how many of the instances predicted as positive are actually positive. It decides whether a model is trustworthy or not, and is beneficial in situations where a False Positive (FP) is more of a concern than a False Negative (FN):

Precision = TP / (TP + FP)

• Recall: Recall indicates how many of the actual positive cases our model was able to correctly anticipate. It is a useful metric in cases where a False Negative (FN) trumps a False Positive (FP):

Recall = TP / (TP + FN)

• F1 Score: The F1 Score, also known as F-Score or F-Measure, is the harmonic mean of Precision and Recall. It comes in handy when it is difficult to compare two models where one has low precision and high recall or vice versa:

F1 = 2 × (Precision × Recall) / (Precision + Recall)

We compare the performance of our proposed method with 21 well-known benchmark models and several recently published cutting-edge methods [10], [11], [12], [13], [14], [30], [45], [46], [47], [48], [49], [50], [51], [52]. The extensive comparison validates the effectiveness of our method.
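For reference, these four metrics can be computed directly from a multi-class confusion matrix, macro-averaging the per-class scores. The matrix below is a toy example of our own, not the paper's results.

```python
import numpy as np

def metrics_from_confusion(cm):
    """Macro-averaged accuracy, precision, recall, and F1 from a square
    confusion matrix cm, where cm[i, j] counts true class i predicted as j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as this class, but wrong
    fn = cm.sum(axis=1) - tp  # truly this class, but missed
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()

# Toy 3-class confusion matrix (illustrative only).
cm = [[50, 0, 0],
      [1, 48, 1],
      [0, 2, 48]]
acc, prec, rec, f1 = metrics_from_confusion(cm)
print(round(acc, 4))  # 0.9733
```

For the five-class problem in this paper, the same function applies to a 5 × 5 confusion matrix; classes with zero predicted or actual positives would need a guard against division by zero.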

1) COMPARISON WITH BENCHMARK ARCHITECTURES
We conducted a thorough comparison of our proposed method with 21 benchmark architectures to rigorously assess and validate the efficacy of our approach against a diverse set of established models. Specifically, our method was evaluated against 16 convolution-based and 5 transformer-based architectures. To do so, we train these models on our enhanced dataset and report their performance in Table 3, showcasing the effectiveness of our proposed dCNN. Our approach demonstrated superior performance in terms of accuracy, precision, recall, and F1 score when compared to 13 convolution- and transformer-based methods, including well-established models such as SwinTransformer, SwinTransformerV2, ResNet50, and ConvNext, to name a few. Moreover, our method achieved competitive performance across the remaining benchmark architectures while maintaining a significantly reduced parameter size. For instance, it outperforms [39], [43], and [38] with 20.5, 147, and 364.7 times fewer parameters, respectively. In contrast, some architectures [40], [41], [42], [44], [53] give slightly better performance than ours; however, when considering the size of the model parameters, our model outperforms all of them by a great margin.
2) COMPARISON WITH STATE-OF-THE-ART METHODS
Our proposed method outperforms each of these existing methods. It improves the accuracy by 1.8%, precision by 4.87%, recall by 4.66%, and F1 score by 5.1% over [10], which is second best to our method, with 21.7 times fewer parameters. It outperforms [11] and [12] with 7.2% and 5.6% higher accuracy and 811 and 152 times fewer parameters, respectively. It also outperforms [13] and [14] by attaining 6.3% and 13.1% higher accuracy scores, respectively.

D. PERFORMANCE ANALYSIS
Our proposed method, RLDD, achieves an impressive 99.65% accuracy on the test set, with a precision of 0.99667, a recall of 0.99657, and an F1 score of 0.99674. The performance of our method is explicitly illustrated through the confusion matrix in Figure 5, demonstrating its accuracy in classifying instances of leaf blight, bacterial blast, and tungro. However, it makes a few minor mistakes while classifying instances of sheath blight and brown spot. The accuracy vs. epoch graph in Figure 6(a) displays an upward trend, indicating that as the number of epochs increases, the model's accuracy also improves. In contrast, the loss vs. epoch graph in Figure 6(b) shows a downward slope, suggesting that as the epochs progress, the model's loss decreases. These trends indicate that the model is effectively learning to generalize to new test cases and improve its performance over time.

E. INTERPRETABILITY OF THE MODEL
Incorporating interpretability into our model enhances its reliability and trustworthiness. The Grad-CAM technique, introduced by Selvaraju et al. [54], is employed to pinpoint the specific area of the input image that most influences the model's prediction. The depiction of the prominent regions, as seen in Figure 7, enhances clarity and reinforces the reliability of our model.
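The core Grad-CAM computation can be sketched in NumPy: given a convolutional layer's feature maps and the gradient of the class score with respect to them (synthetic arrays here, as a stand-in for a real backward pass), the channel weights are the spatially averaged gradients, and the heatmap is the ReLU of the weighted sum of feature maps.

```python
import numpy as np

# Synthetic stand-ins for a conv layer's feature maps (C, H, W) and the
# gradients of the predicted class score w.r.t. those maps.
rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((8, 7, 7))
gradients = rng.standard_normal((8, 7, 7))

# Channel importance weights: global-average-pool the gradients.
alpha = gradients.mean(axis=(1, 2))                       # shape (8,)

# Weighted combination of feature maps, then ReLU to keep only
# regions with a positive influence on the class score.
cam = np.maximum((alpha[:, None, None] * feature_maps).sum(axis=0), 0.0)

# Normalize to [0, 1] so the map can be overlaid on the input image.
cam = cam / (cam.max() + 1e-8)
```

In a real pipeline the heatmap would then be upsampled to the input resolution and blended with the leaf image, as in Figure 7.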

F. FURTHER VALIDATION
We further validate the performance of our proposed dCNN across various environmental conditions and geographical datasets to assess its generalizability.

1) MODEL ROBUSTNESS IN DIFFERENT SCENARIOS
To bolster the resilience of the proposed model, a comprehensive series of experiments was conducted, encompassing various challenging scenarios encountered during testing. These included images placed against natural backgrounds, where the model demonstrated an impressive accuracy of 99.615% ± 0.156%. Furthermore, the model's performance was evaluated under diverse camera angles, showcasing its ability to maintain an accuracy of 86.5033% ± 2.054% even when subjected to random rotations. Additionally, the model's capability to handle varying distances was assessed, resulting in an accuracy of 85.8667% ± 1.646% for zoomed-out images. Moreover, the model's capacity to endure changes in image quality was scrutinized through downsampling and subsequent upsampling, achieving notable accuracies of 99.525% ± 0.0173% and 99.5233% ± 0.265%, respectively. These findings underscore the robustness and adaptability of the proposed model across diverse environmental conditions and scenarios.
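The image-quality probe described above, downsampling followed by upsampling, can be sketched with plain NumPy; this is a minimal nearest-neighbor version for illustration, not the paper's exact resampling procedure.

```python
import numpy as np

def degrade(image, factor=2):
    """Downsample by strided slicing, then upsample back to the
    original size by nearest-neighbor repetition, simulating a
    loss of image quality at the same resolution."""
    small = image[::factor, ::factor]                               # downsample
    restored = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return restored

img = np.arange(16.0).reshape(4, 4)
out = degrade(img)
# out keeps the original 4x4 shape but contains only the sampled values
```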

2) DIFFERENT GEOGRAPHICAL DATA
We assess the robustness of our model across diverse datasets obtained from various geographical locations. Our analysis involves three datasets sourced from distinct countries, namely China, Indonesia, and Taiwan. The performance metrics presented in Table 5 highlight our model's effectiveness on these geographically diverse datasets. Notably, we observe a considerable performance drop when training the model on our data and testing it on datasets from different geographical locations, emphasizing the nuanced differences in data distribution. However, through meticulous fine-tuning on diverse geographical data, our model successfully regains its performance, showcasing its adaptability and generalization capabilities.

3) COMPUTATIONAL COST ANALYSIS
We conducted a comprehensive analysis comparing the training, validation, and inference times of our model against the 21 benchmark models. Table 6 presents a clear demonstration of the superior performance and efficiency of our model, surpassing all other models listed by achieving exceptional results with minimal training, validation, and inference times.

G. VARYING LIGHT INTENSITY
We compared the performance of our model with all 21 benchmark architectures under varying light intensities. To adjust image brightness, we increased and decreased the intensity by 20%. Figure 8 illustrates examples of normal, brighter, and darker versions of an instance. To empirically validate the performance, we pursued two approaches. First, we trained all the models on normal data and tested them on lighter or darker data, resulting in lower performance across the board. This outcome was expected, as the models were not trained on similar data and therefore struggled with recognition. Second, fine-tuning the models on brighter and darker data enabled all models to resume effective operation.
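The ±20% intensity adjustment can be sketched as a simple pixel-scaling operation; the exact transformation used in the paper is not specified, so this NumPy version is an illustrative assumption.

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale pixel intensities: factor 1.2 brightens by 20%,
    0.8 darkens by 20%. Results are clipped to the valid
    8-bit range [0, 255]."""
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)

img = np.array([[100, 200], [50, 250]], dtype=np.uint8)
brighter = adjust_brightness(img, 1.2)   # 20% brighter
darker = adjust_brightness(img, 0.8)     # 20% darker
```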

H. EXPERT SYSTEM
We deploy our proposed model through a website and an Android app. To do so, an open API has been developed, which is essentially a gateway between the server and the web or app interface.

1) APPLICATION PROGRAMMING INTERFACE (API)
An API is a set of programming code that allows data to be exchanged between two or more software products, such as websites and apps. It is quick, flexible, secure, and responsive. We develop an open API that is accessible to everyone. It takes an image as a query and returns the crop name, disease name, and disease details. Figure 9 illustrates the endpoint of our API.
We deploy our proposed model and keep it on the server. The website and the Android app interact with the pre-trained model via the API. As a result, the user's device uses very little memory, which makes it very convenient from the user's perspective. For instance, when a user uploads an image to our website or app, the API takes the image as an input query. It then sends the image to the server and makes a prediction using the model. Finally, it returns the model's prediction from the server to the web or app interface.
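The request/response contract of such an API might look like the following sketch. The field names (`crop`, `disease`, `details`) and the stubbed predictor are hypothetical stand-ins, not the paper's actual implementation.

```python
import json

def predict_disease(image_bytes):
    # Stand-in for the deployed dCNN; a real deployment would decode
    # the image and run the pre-trained model on the server here.
    return "bacterial blight"

def handle_query(image_bytes):
    """Build the JSON response that the web and app clients consume:
    crop name, disease name, and disease details."""
    disease = predict_disease(image_bytes)
    return json.dumps({
        "crop": "rice",
        "disease": disease,
        "details": f"Detected {disease}; see the remedies section.",
    })

# A client would POST image bytes and parse the JSON reply:
response = json.loads(handle_query(b"\x89PNG..."))
```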

2) WEB INTERFACE
A web user interface, often known as a web app, allows a user to use a web browser to interact with data or software on a web server. A web app is also platform independent. As a consequence, web-based applications have seen a huge increase in popularity in recent years. Websites are flexible, as they can be accessed through a browser application (web browser) from both computers and mobile devices. Figure 10 depicts the interface of our developed website.
As previously stated, the website communicates with the server via the API. The website's interface is simple enough for someone with only a rudimentary understanding of browsing websites to use. A user simply uploads an image from the device by clicking the Choose file button, navigating to the desired image location, and clicking the Submit button. The predictions are displayed on the website, along with other pertinent information such as the crop name, disease details, and possible next steps. Figure 11 depicts the disease detection process for rice leaf disease.
The website also provides extra information pertaining to rice leaf diseases. For instance, it has a section that contains the causes of a disease and its remedies. A user can obtain this information from the website as well. In the navigation bar, an option for the API is included, where users can find useful information pertaining to the API that works behind the website.

3) APP INTERFACE
Due to their affordability and availability, smartphones have now become ubiquitous in developing countries. Consequently, an Android application can play a significant role in improving the model's usability. Figure 12 illustrates our developed Android app interface and the steps for predicting rice leaf diseases using it.
We developed the app considering low-resource devices and their users. We kept the interface simple so that users can use it without any supervision. The interface directly redirects a user to the main page, where the user gets an option to take a picture or select an image from the gallery. After capturing or selecting the appropriate image, the user can send it for prediction by clicking the Check button. Once the image is sent, the API receives it, and the results obtained are returned to the app interface. The app requires very little space, as the model does not run on the user's device. Hence, users with low-end devices will be able to use the app smoothly.

VI. DISCUSSION
We conducted a comprehensive evaluation of our model, comparing it with 21 established benchmark models utilizing a spectrum of convolution- and transformer-based architectures, each varying in trainable parameters from 73,000 to 303.3 million. The performance of our model was rigorously assessed under different environmental situations, including varying light intensity, camera angles, varying distances, different image qualities, and images with natural backgrounds. We also tested our model with images from different geographic locations. This analysis aimed to demonstrate the robustness and generalizability of the models across a broad range of scenarios. Our model demonstrates high performance under different situations. Moreover, our model shows much superior performance in terms of trainable parameters. This makes our model suitable to operate on edge devices and in offline mode, particularly in remote areas where internet connectivity is limited or unreliable. Furthermore, our developed API can be leveraged for annotation purposes, effectively addressing the challenge of data scarcity.
One limitation of our model is that it can predict only five rice leaf diseases. Other rice leaf diseases, as well as healthy rice leaves, cannot be predicted by this model. In the future, we plan to overcome these challenges by collecting more relevant data and training with the dCNN.
We encountered a number of challenges while developing and implementing the proposed technique and system. One significant constraint was the limited availability and quality of rich rice leaf disease datasets. This contributed to an imbalance in the class distribution of the training dataset. Another challenge we encountered was the need for computational resources during the training and testing stages.

VII. CONCLUSION AND FUTURE WORK
Rice leaf disease detection is an inevitable task to increase production. Early identification of rice leaf diseases will assist farmers in saving their harvest from being affected by diseases. Existing methods for rice leaf disease detection are found to be rather ineffective for several reasons. From a pragmatic viewpoint, the solution needs to be operational within resource-constrained environments. This means the model needs to work with fewer parameters and be as light as possible. This paper presents a lightweight dCNN-based model for detecting five of the most common rice leaf diseases: brown spot, tungro, bacterial blight, sheath blight, and bacterial blast. We compare the performance of our model with 21 benchmark architectures and 14 concurrent methods. The extensive experimental outcomes validate the effectiveness of our method and demonstrate its efficiency in detecting diseases, which in turn will help farmers avoid production losses at an early stage. Our proposed method is end-to-end in nature. It achieves competitive performance with benchmark architectures at much lower asymptotic complexity and shows superior performance over existing methods. We enhance an existing dataset by manually collecting data and having them annotated by experts. This study brings forward more varieties of rice leaf diseases and a fine-tuned dCNN model, following an end-to-end manner, which gives accurate performance with significantly lower asymptotic complexity. Additionally, the subsequent research develops an integrated application to fit low-end devices, which includes an API, an Android app, and a website.
We conducted a range of validation studies to enhance the credibility and relevance of our suggested method. We intend to conduct additional trials in the future to enhance its usefulness. Furthermore, we plan to broaden our research to identify additional crop diseases beyond rice leaf diseases.
The activities in the project build upon many of the findings generated from a field study conducted with agricultural communities in rural Bangladesh. A range of challenges were identified, and access to expertise was one of them. The activity described in this paper aims to address this need within the community, where farmers could benefit from the availability of expert guidance on diseases and treatments. Therefore, future work in the project will involve conducting evaluations with farmer communities to understand how the app performs in real-world contexts. This will also involve conducting user studies to understand the usability of the applications and the scalability of the API.
MEHEDI HASAN BIJOY received the B.Sc. degree (summa cum laude) in computer science and engineering from North South University (NSU), in 2021. He is currently pursuing the M.Sc. degree with Aalto University, specializing in computer, communication, and information sciences, with a primary focus on speech and language technology. Along his academic journey, he has held various roles, including a Lecturer with Bangladesh University of Business and Technology, a Research Assistant with United International University, and a Teaching Assistant followed by a Lab Instructor with NSU. Furthermore, he served as a Reviewer for the Bangla Language Processing Workshop at EMNLP 2023.
NIROB HASAN received the Bachelor of Science degree in computer science and engineering from North South University, Dhaka, Bangladesh, in 2021. Commencing his career as a Software Engineer, in 2022, he has demonstrated expertise in diverse domains of software development. His current responsibilities encompass the development of production-level APIs, which involve the integration of various business logic and AI models. Additionally, he specializes in designing responsive web applications. Previously, he was with Bangladesh Open-Source Network, where he served as a Programming Mentor, from 2018 to 2020. He diligently coordinated programming camps and actively engaged in educational outreach.

SUVODEEP MAZUMDAR is currently a Senior Lecturer in data analytics. His research explores developing techniques and mechanisms for reducing the barriers that impede user communities' understanding of vast, complex, multidimensional datasets. He conducts interdisciplinary research on highly engaging, interactive, and visual mechanisms in conjunction with complex querying techniques for seamless navigation, exploration, and understanding of complex datasets. He has applied his research in a wide range of application domains, such as aerospace engineering, sports informatics, crisis/emergency management, smart cities, and mobility planning. As a part of his research, he collaborates with large multidisciplinary teams of academics, industry partners, city councils, and planners. He has worked on several extensive research and industrial projects funded by the UKRI, European Union, Innovate U.K., and European Space Agency. His research interests include studying and developing data and visual analytic techniques to analyze massive volumes of dynamic data in near real-time; citizen science and crowdsourcing techniques for observing physical phenomena, events, and environments; user interface development, human-computer interaction, and user-centered design; and assistive technologies to support independent activities of daily living.

FIGURE 1. Sample images from the rice leaf disease dataset.

FIGURE 2. (Left) Image preprocessing begins by gathering image data for our project. It includes data unification and augmentation, where we amalgamate multiple datasets and generate synthetic data. (Middle) Our proposed deep learning model, which takes an image as input and classifies it into one of the disease categories based on the features of the image. (Right) The deployment of our model, which includes an API, an Android app, and a website.

FIGURE 3. Our proposed dCNN backbone for rice leaf disease detection. It consists of six convolution layers, five max-pooling layers, and two fully connected layers.

FIGURE 4. The experimental outcomes of our proposed rice leaf disease detection model using different dropout ratios, activation functions, optimizers, and batch sizes. Sub-figures (a), (b), (c), and (d) illustrate the empirical outcomes of different dropout ratios, activation functions, optimizers, and batch sizes, respectively.

Figure 4(b) presents the empirical outcomes of different activation functions. For optimizers, we experiment with RMSprop, Adam, and SGD. Although all three optimizers perform comparably (Figure 4(c)), Adam exhibits slightly better and faster performance than RMSprop and SGD.

Figure 4(d) displays the accuracy vs. batch size graph, highlighting the relationship between accuracy and batch size. It shows that the model's accuracy increases until the batch size reaches 32 and decreases afterward. Consequently, in the final training, we used the ReLU activation function, the Adam optimizer, a dropout ratio of 10%, a batch size of 32, and a learning rate of 0.001. Regarding the learning rate, we experimented with adding a decay rate, particularly cosine decay, which ultimately reduced the overall performance.

• Confusion Matrix: It is an N × N matrix used for evaluating the performance of a classification model, where N is the number of classes. It compares the actual labels to the model's predictions and gives us a holistic view of the performance of our model along with the type of errors it is making. The N × N matrix deals with two kinds of values: positive and negative. The columns and rows of the matrix represent the actual and predicted values, respectively. The four most important terminologies in a confusion matrix are True Positive (TP), True Negative (TN), False Positive (FP) or Type-I Error, and False Negative (FN) or Type-II Error. True Positive (TP): the model correctly predicted a positive outcome, where the predicted value matches the actual value. True Negative (TN): the model correctly predicted a negative outcome, where the predicted value matches the actual value. False Positive (FP) or Type-I Error: the actual value was negative, but the model predicted it as positive. False Negative (FN) or Type-II Error: the actual value was positive, but the model predicted it as negative.
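Building the N × N matrix and deriving TP, FP, FN, and TN for one class can be sketched as follows; the label lists are illustrative class indices, not the paper's data. Rows index predictions and columns index actual labels, matching the convention stated above.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes):
    """N x N matrix with rows = predicted labels and
    columns = actual labels."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        m[p, a] += 1
    return m

actual    = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(actual, predicted, 3)

k = 1  # per-class counts for class 1
tp = cm[k, k]                  # predicted k and actually k
fp = cm[k].sum() - tp          # predicted k but actually another class
fn = cm[:, k].sum() - tp       # actually k but predicted another class
tn = cm.sum() - tp - fp - fn   # everything else
```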

Figure 6 showcases the changes in accuracy and loss of our RLDD model with respect to epochs. The accuracy vs. epoch graph in Figure 6(a) displays an upward trend, indicating that as the number of epochs increases, the model's accuracy also improves. In contrast, the loss vs. epoch graph in Figure 6(b) shows a downward slope, suggesting that as the epochs progress, the model's loss decreases. These trends indicate that the model is effectively learning to generalize to new test cases and improve its performance over time.

FIGURE 6. Changes in accuracy and loss of our proposed rice leaf disease detection model with respect to epochs. Sub-figures (a) and (b) are the model accuracy vs. epoch and loss vs. epoch graphs, respectively.

FIGURE 7. The outcomes of the integrated Grad-CAM for clear model interpretation, revealing influential areas to strengthen confidence in decision-making.

FIGURE 11. (Left) The web interface for rice leaf disease detection. (Middle) An image uploaded from the device. (Right) Predictions for the uploaded image.

FIGURE 12. App interface.
MITHUN BISWAS received the B.Sc. degree in computer science and engineering from the University of Liberal Arts Bangladesh, in 2017. After graduation, he delved into software development, amassing two years of professional experience. He is currently a dedicated Research Engineer with FS Solution Company Ltd., South Korea. In 2022, he pivoted to the domain of machine learning engineering. He has a keen interest in machine learning, image processing, and computer vision research. His dedication to academia is evident through his notable contributions. He has published several papers in esteemed conferences and journals, including the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop, Expert Systems, and IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS.

TABLE 1. The statistics of our enhanced rice leaf disease dataset.

TABLE 2. The details of our proposed rice leaf disease detection architecture, where F, KS, S, and BS denote filters, kernel size, strides, and batch size, respectively.

TABLE 3. Comparison of the performance of various models in terms of accuracy (Acc.), precision (PR), recall (RE), F1 score (F1), and the number of parameters (#Param.).

TABLE 4. Comparison of the performance of our proposed method with other existing methods.

TABLE 5. Comparison of model performance before and after fine-tuning on different geographical data.

TABLE 6. Comparative analysis of training, validation, and inference times for our model and the 21 benchmark models.

TABLE 7. Comparison of model performance in terms of accuracy under different light intensity transformations.