FieldPlant: A Dataset of Field Plant Images for Plant Disease Detection and Classification With Deep Learning

The Food and Agriculture Organization of the United Nations suggests increasing the food supply by 70% to feed the world population by 2050, although approximately one third of all food is wasted because of plant diseases or disorders. To achieve this goal, researchers have proposed many deep learning models to help farmers detect diseases in their crops as efficiently as possible to avoid yield declines. These models are usually trained on personal or public plant disease datasets such as PlantVillage or PlantDoc. PlantVillage is composed of laboratory images captured under laboratory conditions, with one leaf each and a uniform background. The models trained on this dataset have very low accuracies when running on field images with complex backgrounds and multiple leaves per image. To solve this problem, PlantDoc was built using 2,569 field images downloaded from the Internet and annotated to identify the individual leaves. However, this dataset includes some laboratory images and the absence of plant pathologists during the annotation process may have resulted in misclassification. In this study, FieldPlant is suggested as a dataset that includes 5,170 plant disease images collected directly from plantations. Manual annotation of individual leaves on each image was performed under the supervision of plant pathologists to ensure process quality. This resulted in 8,629 individual annotated leaves across the 27 disease classes. We ran various benchmarks on this dataset to evaluate state-of-the-art classification and object detection models and found that classification tasks on FieldPlant outperformed those on PlantDoc.


I. INTRODUCTION
The global population is expected to reach 10 billion people by 2050. Therefore, food production must absorb this population growth, although the amount of available arable land is limited [1]. The Food and Agriculture Organization of the United Nations (FAO) suggests increasing the food supply by 70% to feed the future population by 2050 [2], while about one third of all grown food is wasted because of plant diseases or disorders [3], [4]. In terms of economic value, plant diseases alone cost approximately US$ 220 billion annually [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Gustavo Olague.
Loss of crop yield is a major research concern. Plants die if their leaves cannot produce chlorophyll via photosynthesis because of diseases or disorders. Artificial Intelligence (AI) has been extensively considered to solve the problem of crop yield loss, particularly in the areas of Computer Vision and Machine Learning. Therefore, many deep Convolutional Neural Networks (CNN) have been proposed by researchers for plant disease identification and classification; some of the most popular CNN are highlighted by Adi et al. [5]. The purpose of these solutions is to provide farmers with a way to identify diseases that attack plants as soon as possible and suggest countermeasures to avoid crop losses.
The PlantVillage [2], iBean [6], citrus [7], rice [8], cassava [9], and AI Challenger 2018 datasets [10] are among the most widely used plant disease datasets, with available laboratory images. These datasets have been widely used to train CNN for plant disease identification and classification. Neural networks trained on these datasets were able to achieve a high classification accuracy during training. However, when these systems were tested under real field conditions, their performance decreased sharply. This is because in contrast to laboratory images, field images have complex background features, including other leaves, stems, fruits, soil, and mulch. Studies have demonstrated that complex backgrounds in field images significantly contribute to this drop in performance, and that background removal can enhance disease recognition accuracy [11].
Therefore, plant disease classification systems trained on laboratory images are not usable in practice owing to the structural difference between laboratory and field images [11], [12], [13]. Laboratory images were captured under controlled lighting and uniform background conditions, with each image containing only one leaf. Field images typically have several interwoven leaves, stems, branches, flowers, non-leaf objects, and complex backgrounds, as illustrated in Fig. 1. Li et al. [14] highlighted in their study the need to establish a large dataset of plant diseases in field conditions for plant disease detection. Indeed, researchers need to test their models on datasets acquired from fields [15] to provide practical solutions to farmers to address crop losses.
To address this challenge, Singh et al. [4] proposed PlantDoc, a dataset of field images for visual plant disease detection containing 2,598 data points across 13 plant species and up to 17 classes of diseases. PlantDoc has been used in some studies on plant disease detection, but the models evaluated on it have achieved very low performance [13], and the dataset still contains many laboratory images. Because of the lack of extensive domain expertise, some images in this dataset may be incorrectly classified [4]. The major challenge in plant disease identification from field images is to build a sufficiently accurate model to identify both the plant in the image and the associated disease, which is a complicated task.
In this study, we propose FieldPlant¹ as a new dataset for the identification and classification of plant diseases from field images captured under different lighting conditions. The 5,170 original images captured in plantations were annotated using the RoboFlow online platform [16] to identify the individual leaves. Some images had only one leaf to annotate and others had multiple leaves to annotate, resulting in 8,629 individual leaf annotations across 27 disease classes. This dataset is intended for researchers to build models that offer practical solutions to farmers for plant disease identification and classification under real conditions.
¹ https://universe.roboflow.com/plant-disease-detection/fieldplant

II. RELATED WORK
A. PLANT DISEASE DETECTION DATASETS
Although there are several datasets related to plant diseases, PlantVillage and PlantDoc remain the two most widely used publicly available ones.

1) PlantVillage
PlantVillage [2] is the largest plant disease dataset. The initial data records from 2016 contained 54,309 images spanning 14 crop species: apple, blueberry, cherry, corn, grape, orange, peach, bell pepper, potato, raspberry, soybean, squash, strawberry, and tomato. These expertly curated images of healthy and infected crop leaves were made available through the existing online platform, PlantVillage (www.plantvillage.org). Diseases affecting these plants are divided into 17 fungal diseases, four bacterial diseases, two mold (oomycete) diseases, two viral diseases, and one mite disease. The dataset contains 38 classes of plant diseases and one class of background images, as shown in Fig. 2.
This initial data setup was the beginning of an ongoing crowdsourcing effort to enable computer vision approaches to solve the problem of yield losses in crop plants owing to infectious diseases.
From fields with crops infected with the disease, technicians collected leaves by removing them from the plant and placing them against a paper sheet that provided a grey or black background. All the images were captured under full illumination. Once the images were collected, they were edited by cropping away much of the background and orienting all leaves so that their tips pointed upward, as shown in Fig. 3. The images from this dataset are referred to as laboratory images.

2) PlantDoc
PlantDoc [4] is a dataset of 2,569 images across 13 plant species (apple, bell pepper, blueberry, cherry, corn, grape, peach, potato, raspberry, soybean, squash, strawberry, and tomato) and 30 classes (diseased and healthy) for image classification and object detection. The distribution of images of plant species and diseases is shown in Fig. 4.
PlantDoc contains field plant disease images downloaded from the internet and annotated to train models for detecting crop diseases from field condition images, as shown in Fig. 5. Because the images in the dataset were downloaded from the internet, they were generally of poor quality, and some images contained leaves that had not been photographed on plants and were more akin to laboratory images, as shown in Fig. 6. Moreover, because the images were annotated without the assistance of plant pathologists, it is very likely that annotation errors slipped into the dataset [4], since certain plant diseases are very similar in appearance. Finally, the number of annotated images in different categories is generally insufficient for training models that are capable of achieving high accuracy.

B. PLANT DISEASE DETECTION MODELS
In contrast to conventional machine-learning techniques, deep learning can automatically learn the hierarchical features of pathologies. This eliminates the need to separately design the morphological operations of feature extraction for future classification. Therefore, we present recent research on convolutional neural networks for plant disease detection and classification.

1) PRE-BUILT MODELS
In many cases, the authors used existing state-of-the-art convolutional neural networks to address the problem of plant disease identification and classification. When the models were not completely re-trained because of insufficient data or missing computational power, transfer learning was used to maintain the pre-trained weights and reduce computation time.
35400 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
To take advantage of existing neural networks trained on large datasets, stepwise transfer learning was used by Ahmad et al. [12] on pre-trained neural networks to avoid negative transfer learning. They used MobileNetV2 [17] pre-trained weights to build the model and achieved 99% and 99.69% accuracy on the Pepper and PlantVillage datasets, respectively.
Elfatimi et al. [18] presented a deep learning approach for classifying bean leaf diseases. The model was trained using the MobileNetV2 [17] architecture under controlled conditions to obtain faster training times, higher accuracy, and easier retraining. To achieve these goals, the authors tried different hyperparameters and optimization methods. The model achieved an accuracy of 97% for 1,296 field images taken from the iBean dataset [6].
The YOLO [19] neural network has been one of the greatest achievements in object detection in the field of Artificial Intelligence. The study in [20] used YOLOv3 [21] and achieved an accuracy of 79.19% in the detection and classification of six rice leaf diseases: blast, bacterial leaf blight, brown spot, narrow brown spot, bacterial leaf streak, and rice ragged stunt virus disease. The experiment was conducted using 6,330 self-collected images.

2) AUTHORS-BUILT MODELS
Often, specific constraints such as preprocessing steps, network architecture, or dataset structure have led the authors to suggest specific models for plant disease detection.
Khattak et al. [22] proposed a two-layer CNN model that extracts complementary discriminative features from citrus fruits and leaves by integrating multiple layers. The model differentiated healthy fruits and leaves from fruits or leaves with common citrus diseases, such as black spots, canker, scab, greening, and melanose, with a test accuracy of 94.55%. The dataset used in this study contained only 213 images from the PlantVillage [2] and Citrus [7] datasets.
The unwanted background and noise of the input image can have a significant negative impact on the model accuracy. To overcome this problem, the study in [23] used U²-Net by first producing a mask of the region of interest from the original image. Then, a bitwise operation was applied to the original image and the mask produced by U²-Net. The EfficientNetV2 [24] model was used for cardamom plant disease detection, achieving a detection accuracy of 98.26%.
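The masking step described above can be illustrated with a minimal sketch. It assumes that the bitwise operation is an AND between the image and a binary foreground mask, which the text does not state explicitly; the function name apply_mask and the toy arrays are hypothetical, and no segmentation network is involved here.

```python
import numpy as np


def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out background pixels of an (H, W, 3) uint8 image.

    `mask` is an (H, W) array whose nonzero entries mark the region of
    interest. Assumption: the mask is already binarized.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must have the same height and width as image")
    # Expand the binary mask to 0x00 / 0xFF so that a bitwise AND either
    # keeps every bit of a pixel or clears it entirely.
    full = np.where(mask > 0, 0xFF, 0x00).astype(np.uint8)
    return np.bitwise_and(image, full[..., None])


# Toy example: a 2x2 image with two foreground pixels.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
masked = apply_mask(image, mask)
```

In a full pipeline the mask would come from the segmentation model and the masked image would then be passed to the classifier.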
To reduce the number of parameters of the model, Amin et al. [25] used two pre-trained convolutional neural networks, EfficientNetB0 and DenseNet121, to extract deep features from corn plant images. The extracted deep features from each CNN were then fused using the concatenation technique to produce a more complex feature set, from which the model could learn the dataset better, achieving a classification accuracy of 98.56% on a subset of the PlantVillage dataset.
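As an illustration of feature fusion by concatenation, the following sketch joins two batches of feature vectors along the feature axis. It is not the implementation of [25]: the random arrays stand in for extracted deep features, and the widths 1280 and 1024 are assumed here as typical pooled output sizes of the two backbones.

```python
import numpy as np


def fuse_features(features_a: np.ndarray, features_b: np.ndarray) -> np.ndarray:
    """Concatenate two (batch, dim) feature arrays along the feature axis."""
    if features_a.shape[0] != features_b.shape[0]:
        raise ValueError("both feature batches must describe the same images")
    return np.concatenate([features_a, features_b], axis=1)


# Placeholder features for a batch of 4 images (assumed dimensions).
rng = np.random.default_rng(0)
feats_a = rng.normal(size=(4, 1280))
feats_b = rng.normal(size=(4, 1024))
fused = fuse_features(feats_a, feats_b)
```

The fused array has one row per image and 1280 + 1024 columns, and would be fed to a classification head.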
To reduce the neural network training time, Hassan and Maji [26] suggested a reduction in the model parameters based on the inception layer, residual connection, and depthwise separable convolution. The accuracies obtained for the PlantVillage, rice [8], and cassava [9] datasets were 99.39%, 99.66%, and 76.59%, respectively.
Zhou et al. [27] proposed a hybrid deep learning model that combines the advantages of deep residual networks and dense networks to reduce the number of training process parameters and improve calculation accuracy. The experimental results show that this model can achieve a top-1 average identification accuracy of 95% on the tomato test dataset in the AI Challenger 2018 dataset [10].
Wang et al. [13] suggested a dual-stream hierarchical bilinear-pooling model for the multi-task classification of plant diseases. The authors used fine-grained image recognition methods to extract discriminative fine-grained features, thereby enhancing the representation capability of the model. The PlantDoc dataset [4] was used for the experiment, and after optimizing multi-task learning using homoscedastic uncertainty, the plant and disease accuracies obtained were 84.71% and 75.06%, respectively.

3) PRACTICAL SOLUTIONS FOR PLANT DISEASE DETECTION
Based on deep learning models, two main mobile apps for plant disease detection have emerged in the community: Plantix and PlantVillage Nuru.
Plantix [28] is a smartphone application trained to identify a large range of plant diseases. Users take a picture of a crop with their phone, which sends the image to a server that analyzes it using an online deep learning model. The results are reported back to the phone with suggestions for suitable countermeasures. Although the image dataset and the deep learning model of Plantix are not available, Goncharov et al. [29] conducted a study showing that the model could identify plants with an accuracy of 87%. However, only 10% of the diseased images had the correct disease at the top of their suggestions. Plantix also requires an internet connection to use its image analysis features. This can be a limitation for farmers working in remote areas where internet access might not be available.
PlantVillage Nuru [30] is another smartphone app for plant disease detection that was developed under the PlantVillage [4] Project. It uses a single-shot multibox detector (SSD) with MobileNet to detect and classify plant diseases [31]. PlantVillage Nuru requests that the user submit six plant leaves for better classification and can run without an internet connection. SSD detectors localize the diseased areas using bounding boxes. However, various studies [32], [33] have reported that the accuracy of SSD detectors for plant disease detection is low. Mrisho et al. [34] showed that Nuru's accuracy for symptom recognition when using six leaves (74-88%, depending on the condition) was similar to that of experts, 1.5 times higher than that of agricultural extension agents, and two times higher than that of farmers.
A literature review of CNN used for plant disease identification and classification is presented in Table 1.

III. THE FieldPlant DATASET
We released FieldPlant, a plant disease dataset of 5,170 annotated field leaf images collected from plantations in Cameroon. The dataset focuses on various diseases in three tropical crops: corn, cassava, and tomato. To the best of our knowledge, this is the first publicly available dataset for plant disease detection that uses annotated cassava images. This dataset can be used to train efficient models for plant disease detection using field images and object-detection models.
The research mainly focused on diseases appearing on leaves even though it included some non-leaf disease classes such as Cassava root rot (78 images) and Corn charcoal (8 images).

A. DATASET CROPS AND DISEASES
The distribution of FieldPlant images across diseases is shown in Fig. 7.

1) CASSAVA
Cassava is a root vegetable that is widely consumed in many countries worldwide. It is extensively cultivated as an annual crop in tropical and subtropical regions because of its edible starchy tuberous root, which is a major source of carbohydrates. Cassava is the third-largest source of food carbohydrates in the tropics after rice and maize. Cassava is a major staple food in the developing world, providing a basic diet for over half a billion people [35].
The different diseases represented in the dataset for cassava crops are as follows: Cassava Bacterial Disease, Cassava Brown Leaf Spot, Cassava Healthy, Cassava Mosaic, and Cassava Root Rot.

2) CORN
Corn has become a staple food in many parts of the world, with the total production of corn surpassing those of wheat and rice. Corn is cultivated worldwide, and more corn is produced each year than any other grain. In 2021, total world production was 1.2 billion tonnes [36]. The

3) TOMATO
Numerous varieties of tomato are grown in temperate climates worldwide, with greenhouses allowing their production throughout all seasons of the year. Cameroon is the world's tenth largest tomato producer, with an estimated annual production of 1,279,853 tons [37].
Crop disease images were collected from Zones 3 and 5 of the five agro-ecological zones in Cameroon (Fig. 9). Images were collected under the supervision of plant pathologists at different periods of the year and stages of plant growth. The aim of this procedure was to capture as many diseases as possible, given that different agro-ecological zones do not have the same crop diseases and that these diseases often occur at different periods of the year. Images were captured using smartphones with 4608 × 3456 (4:3) pixel-resolution cameras. Smartphones were chosen for their flexibility and autofocus. Leaf images were captured directly from plants with a full plantation background, and usually had more than one leaf per image.
Once the images were collected in the field, they were made available to the plant pathologist, who grouped them into folders according to the plant and the disease identified on the leaf. Blurred images or images irrelevant to the study were ignored.
A data scientist used the RoboFlow online platform tool (https://roboflow.com/) to annotate images. Each image was annotated by specifying the disease class of its leaves. Only identifiable leaves in the image were annotated. If a leaf was not infected with a disease, it was assigned to the healthy class. During annotation, bounding boxes were used to specify the position of each leaf in the image. Before adding annotated images to the dataset, the annotations were first checked to ensure that all identifiable leaves in the image had been annotated and that no non-identifiable leaves had been annotated. If an error occurred, the image was returned for re-annotation.
The second level of validation was performed by an expert pathologist before the publication of the dataset. This is a crucial step in which experts check whether correct annotations have been assigned to the identifiable leaves of each image. If any error occurred, the image was sent back to the data scientist for annotation.
The images collected using this process are presented in Table 2. The total number of images obtained following direct collection in the plantations was 6,334, but only 5,170 were retained after the second phase of the process, in which inconsistent images were eliminated.

IV. BENCHMARKING FieldPlant DATASET
We ran various sets of experiments on the FieldPlant (FP) dataset to evaluate the performance of the state-of-the-art CNN in identifying leaves and diseases from the images. We trained and evaluated the performances of four CNN (MobileNet [17], VGG16 [39], InceptionV3 [40], and InceptionResNetV2 [41]) on our dataset. All the results presented for each evaluation are test accuracies.
To compare our dataset with the existing PlantVillage (PV) and PlantDoc (PD) datasets, we built another dataset called Cropped-FieldPlant (C-FP) by cropping the initial annotated images using the bounding box information. The cropped images contained only one leaf per image with varying backgrounds, and usually contained snippets from other leaves, as shown in Fig. 13, which was obtained from Fig. 12 by cropping the annotated leaves. After cropping the original 5,170 images, the total number of individual leaf images was 8,629.
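The cropping step can be sketched as follows. This is an illustrative sketch and not the tool used in the study: the function name crop_leaves, the (x_min, y_min, x_max, y_max) pixel box convention, and the blank test image are assumptions made for the example.

```python
import numpy as np


def crop_leaves(image: np.ndarray, boxes):
    """Return one crop per bounding box from an (H, W, C) image.

    Each box is (x_min, y_min, x_max, y_max) in pixels (assumed convention).
    Boxes are clamped to the image and empty boxes are skipped.
    """
    height, width = image.shape[:2]
    crops = []
    for x_min, y_min, x_max, y_max in boxes:
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 <= x0 or y1 <= y0:
            continue  # box lies outside the image or has no area
        crops.append(image[y0:y1, x0:x1].copy())
    return crops


# Toy example: two annotated leaves, the second partly outside the image.
image = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = [(10, 20, 60, 80), (150, 50, 250, 120)]
crops = crop_leaves(image, boxes)
```

Because each annotated leaf yields one crop, an image with several annotated leaves contributes several single-leaf images to the cropped dataset.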

A. SYSTEM CONFIGURATION
All the models in our experiments were trained on a server with the following characteristics: five Tesla T4 GPUs with 16 GB of RAM, a 4 TB HDD, and two AMD EPYC 7251 CPUs with 512 GB of RAM. Experiments were performed using a GPU for faster training.
To train the networks, we used the sparse_categorical_crossentropy loss and learning rates of 0.001 for training and 0.0001 for fine-tuning. Transfer learning was used to improve the accuracy of the models. We used the weights provided in Keras trained on ImageNet for the pre-trained models. All the images were resized according to the CNN models before being fed into the network.
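For readers unfamiliar with the loss named above, the following minimal sketch computes a sparse categorical cross-entropy in plain Python. The "sparse" variant takes integer class indices rather than one-hot vectors. The function below is a hand-written illustration, not the Keras implementation; the clipping constant eps and the toy probabilities are assumptions.

```python
import math


def sparse_categorical_crossentropy(labels, probabilities, eps=1e-7):
    """Mean negative log-probability assigned to the true class.

    `labels` holds integer class indices; `probabilities` holds one list of
    class probabilities per sample. `eps` (assumed value) avoids log(0).
    """
    if not labels or len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must be non-empty and aligned")
    total = 0.0
    for label, probs in zip(labels, probabilities):
        p = min(max(probs[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(labels)


# Two samples, three classes: true classes are 0 and 2.
labels = [0, 2]
probabilities = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
loss = sparse_categorical_crossentropy(labels, probabilities)
```

The loss decreases as the predicted probability of the correct class approaches 1.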

B. PLANT DISEASE IMAGE CLASSIFICATION
1) CLASSIFICATION FROM RAW IMAGES
First, we attempted to determine the suitability of raw FieldPlant images for classification tasks, as shown in Table 3. To achieve this objective, the annotated image dataset was converted into a Multi-Label Classification CSV dataset using RoboFlow.² In this representation, the CSV file contains the names of the disease classes identified in each image of the dataset. Therefore, the models were trained to recognize the disease(s) present in the image regardless of its (their) position(s).
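The conversion from per-leaf annotations to a multi-label representation can be sketched as follows. The CSV layout used here (a filename column followed by one 0/1 column per class) is an assumption for illustration, as are the function names and file names; the exact export format of the platform is not described in the text.

```python
import csv
import io


def to_multilabel_rows(annotations, class_names):
    """Build one row per image: filename followed by a 0/1 flag per class.

    `annotations` maps a filename to the list of class names of its
    annotated leaves; leaf positions are deliberately discarded.
    """
    known = set(class_names)
    rows = []
    for filename, labels in sorted(annotations.items()):
        present = set(labels)
        if not present <= known:
            raise ValueError(f"unknown class in {filename}: {present - known}")
        rows.append([filename] + [1 if name in present else 0 for name in class_names])
    return rows


def write_csv(rows, class_names):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["filename"] + list(class_names))
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical annotations: img_001 shows two diseases on three leaves.
class_names = ["Cassava Mosaic", "Cassava Healthy", "Cassava Brown Leaf Spot"]
annotations = {
    "img_002.jpg": ["Cassava Healthy"],
    "img_001.jpg": ["Cassava Mosaic", "Cassava Mosaic", "Cassava Brown Leaf Spot"],
}
rows = to_multilabel_rows(annotations, class_names)
csv_text = write_csv(rows, class_names)
```

Several leaves with the same disease collapse into a single flag, which matches the goal of recognizing which diseases are present regardless of position.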

2) CLASSIFICATION FROM CROPPED IMAGES
The second benchmark evaluates the classification accuracy on the cropped images of FieldPlant (C-FP). The cropping operation was easily performed from the original annotated dataset using the pascalvoc-to-image³ tool.
For comparison, we also ran the classification models against PlantVillage (PV) and Cropped PlantDoc (C-PD). C-PD was obtained from PlantDoc in the same way as described previously. The accuracies of the different models for various datasets are presented in Table 4.
² https://roboflow.com/formats/multiclass-classification-csv
³ https://pypi.org/project/pascalvoc-to-image/

C. PLANT LEAVES DETECTION
In our last set of experiments, we evaluated the performance of object detection models on our dataset using COCO pre-trained weights; the results are reported in Table 5. The aim of this experiment was to determine how well these models identified individual leaves in the field images of the dataset at 50% IoU. The TensorFlow Object Detection API of the TensorFlow Model Garden [42] was used for this evaluation.

V. RESULTS AND DISCUSSION
Table 3 shows how difficult it is for classification CNN models to identify diseases on raw field images. As expected, even when the models are trained and tested on the same dataset, the noisy backgrounds and the multiplicity of leaves in the raw images reduce the models' validation accuracies. These accuracies are further reduced when the models are trained on PlantVillage and tested on PlantDoc or FieldPlant because of the significant difference in structure between the training and test datasets. Recall that PlantVillage has a single leaf per image with a uniform background. However, we notice that the results obtained with FieldPlant are far better than those obtained with PlantDoc. This could be because FieldPlant has more data for model training. The fact that PlantDoc contains both field and laboratory images could also influence these results.
Experiments on the cropped images presented in Table 4 show that very good results were obtained when the models were trained and tested on PlantVillage, which had one leaf per image with a uniform background. The results were less accurate when the models were trained and tested on Cropped FieldPlant or Cropped PlantDoc. This is also due to the complex backgrounds of the cropped images, which cannot always match between two images even if they show the same disease. The results were worse when the models were trained on PlantVillage and tested on Cropped FieldPlant or Cropped PlantDoc because of the large gap between the image structures. Models fail to produce accurate results owing to background noise and scraps of other leaves in the images. Similar to the previous experiment, we found that the results obtained from evaluations involving FieldPlant were significantly better than those obtained from evaluations involving PlantDoc. The same justifications can be invoked here.
The learning curves in Fig. 14 show that the model is underfitting because the validation loss is higher than the training loss. Furthermore, the training accuracy and training loss improved slightly after fine-tuning, but the model began to overfit at that point, as the validation accuracy and loss deteriorated. The model reaches a low classification accuracy, as shown in the confusion matrix in Fig. 15. The best-classified class is Corn leaf blight with 51 correct occurrences, while 153 occurrences are classified as Tomato blight leaf. Many images are predicted as one of the Corn disease classes.
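To make the reading of a confusion matrix concrete, the sketch below counts (true class, predicted class) pairs and derives the fraction of each true class that was predicted correctly. The label names and predictions are hypothetical toy data and do not reproduce the values in Fig. 15.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true, predicted) pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted labels must be aligned")
    return Counter(zip(true_labels, predicted_labels))


def per_class_recall(counts):
    """Fraction of samples of each true class that were predicted correctly."""
    totals, correct = Counter(), Counter()
    for (true, predicted), n in counts.items():
        totals[true] += n
        if true == predicted:
            correct[true] += n
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical predictions for four test images.
true_labels = ["corn_blight", "corn_blight", "tomato_blight", "tomato_blight"]
predicted_labels = ["corn_blight", "tomato_blight", "corn_blight", "corn_blight"]
counts = confusion_counts(true_labels, predicted_labels)
recall = per_class_recall(counts)
```

Off-diagonal entries such as ("tomato_blight", "corn_blight") reveal which classes a model tends to confuse.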
The last set of experiments, on object detection with the PlantDoc and FieldPlant datasets, shows that the models perform better on PlantDoc in identifying individual leaves from raw images collected in the field. Object detection in complex backgrounds remains a challenging task [43], [44] because most current object detection deep learning models are based on high-level CNN features, which usually fail to capture fine-grained descriptions of objects. The presence of laboratory images with uniform backgrounds in the PlantDoc dataset significantly improves its performance in object detection. On the other hand, some plant leaves appear only partially in some images of FieldPlant, and their annotation could have a negative impact on object recognition and detection tasks.
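The leaf-detection benchmark scores predictions at 50% IoU. As a minimal sketch of that criterion, the code below computes the intersection over union of two axis-aligned boxes and applies a 0.5 threshold. The (x_min, y_min, x_max, y_max) box convention, the function names, and the example boxes are assumptions; the evaluation in the study relied on an existing detection framework rather than on this code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(predicted_box, true_box, threshold=0.5):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(predicted_box, true_box) >= threshold


truth = (0, 0, 10, 10)
good = (0, 0, 10, 8)    # overlaps most of the annotated leaf
poor = (5, 5, 15, 15)   # overlaps only one corner
```

A partially visible leaf tends to produce a smaller overlap with its annotation, which is one way the partial leaves mentioned above can lower detection scores.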
More generally, the benchmark results allow us to highlight that MobileNet CNN achieves relatively better accuracy in classification tasks for the FieldPlant or PlantDoc datasets.

VI. CONCLUSION AND FUTURE WORK
In this study, we made available to researchers FieldPlant, a dataset of 5,170 annotated plant disease images collected directly from plantations. In contrast to PlantDoc, this dataset is composed exclusively of field images classified by plant pathologists. However, the dataset can be enriched with more disease classes. FieldPlant has the potential to be widely used in plant disease research and management, and is the first plant disease dataset with annotated cassava images. We conducted a set of experiments to evaluate the performance of state-of-the-art classification and object detection models. The results show that the existing models are not sufficiently accurate for plant disease detection and classification of images collected directly from the field, although the classification task results for FieldPlant are better than those for PlantDoc. Therefore, suitable models should be established to help farmers identify the diseases that attack their crops and take appropriate countermeasures. Model ensembling with image segmentation applied to field images to isolate individual leaves from a global image may be a promising approach for solving this problem.
EMMANUEL MOUPOJOU received the master's degree in information systems and software engineering from the University of Yaoundé I, Cameroon, in 2014. He is currently pursuing the dual Ph.D. degree in artificial intelligence with the University of Yaoundé I and the University of Technology of Troyes, France.
From 2014 to 2018, he was a part-time Teacher of professional courses with the Computer Science Department, University of Yaoundé I. Since 2018, he has been a Lecturer with Institut Universitaire Saint Jean du Cameroun (IUSJC). He is the author of three articles and one conference paper. His research interests include artificial intelligence, configuration management, and application security.
FLORENT RETRAINT received the Engineering Diploma degree in computer science from the Compiegne University of Technology, in 1993, the M.S. degree in applied mathematics from ENSIMAG, in 1994, and the Ph.D. degree in applied mathematics from the National Institute of Applied Sciences of Lyon, France, in 1998.
He held a postdoctoral position with CEA Grenoble for one year. He was a Research Engineer with Thomson CSF for two years. Since 2002, he has been with the Laboratory of System Modeling and Dependability, Troyes University of Technology, where he is currently a Full Professor. His research interests include image modeling, statistical image processing, hypothesis testing theory, and anomaly detection and localization.
ANICET TADONKEMWA received the bachelor's degree in ICT for development (computer science) from the University of Yaoundé I, Cameroon, in 2017. He is currently pursuing the master's degree in data science with the Institut Universitaire Saint Jean du Cameroun (IUSJC).
Since 2014, he has developed solid experience in the field of IT, progressing from an intern to a digital consultant in a start-up software service provider. Through his numerous roles, including functional analyst, software developer, and IT manager, he has developed skills in management, leadership, and especially data management. He is passionate about the digital world, especially the study of enterprise data, contributing from requirements analysis to the deployment of solutions in a cloud environment. He is looking for a high-value-added project to quickly improve his skills and later position himself as a data architect on large-scale projects.