Crack Detection of Brown Rice Kernel Based on Optimized ResNet-18 Network

The occurrence of cracks in brown rice kernels has a substantial impact on grain quality. The timely and accurate detection of rice grains with cracks is crucial for enhancing the overall quality and flavor of processed rice. In this study, we developed an optical observation platform and optimized the original ResNet-18 neural network structure to improve the detection and classification of grain cracks. We established image datasets for japonica and indica rice varieties, and employed image augmentation and model migration techniques during training. In addition, we compared the performance of the optimized model with DenseNet-121 and GoogLeNet. The results demonstrate a notable enhancement in crack detection accuracy for japonica, reaching 96%, which is a 3.67% improvement over the original model. Furthermore, we achieved a substantial reduction in average training time, reduced by 58.66%. For indica rice, after model optimization and migration, the accuracy reached 96.67%. It’s important to note that the optimized model has limitations and is not suitable for mixed datasets with limited training data. This technology offers the capability to accurately identify and detect cracks in brown rice kernels under visible light conditions, presenting a promising solution for enhancing grain quality during processing.


I. INTRODUCTION
Rice is a staple food consumed by more than three billion people worldwide and is an important source of energy, protein, fat, and vitamins [1]. Before rice can be consumed, it must undergo standardized processing, including shucking, milling, and polishing. The husk is removed to obtain the brown rice kernel, which consists of bran, endosperm, and embryo, and white rice is obtained when the bran is properly milled [2]. During processing, the brown rice kernel is subjected to various mechanical forces that cause cracks to form inside the grain; these cracks reduce white rice quality and affect the classification of marketing grades, and should therefore be avoided [3]. The effective identification of cracks in rice grains will help further expand the use and economic value of rice grains in actual processing and production. For example, refining the grain removes bran (together with the germ) from brown rice, but as a byproduct of grain processing, the bran is also a product of nutritional value to humans [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Antonio J. R. Neves.
Crack detection is used to screen lower-value brown rice with cracks in advance during grain processing. Using such brown rice to extract by-products can provide a processing channel with lower cost but the same yield [5]. Therefore, the detection of cracked brown rice kernels during processing is a significant factor in optimizing grade classification and improving palatability.
In the field of detecting and classifying small agricultural products, such as grains and seeds, machine vision has many significant advantages, such as fast identification speed, high accuracy, and non-destructive testing. For example, migration training based on the ResNet-18 network was used to classify and detect Camellia sinensis seeds, and the final identification accuracy reached 96.21% [6]. After a multiscale feature extraction module was added to optimize the ResNet-18 network, the model's identification accuracy for soybean varieties was higher, at 97.36% [7]. In particular, the ResNet-18 network may be advantageous for crack detection. A useful reference is a study in which three recognition networks that are mainly used on large databases (AlexNet, ResNet-18, and SqueezeNet) classified 4333 road crack images in eight different categories. The results showed that the recognition accuracy of the ResNet-18 network was the highest, at 85.20% [8], [9]. In addition, AlexNet, VGGNet-13, and ResNet-18 have been used to detect and classify crack images in other tasks, and ResNet-18 produced the most satisfactory recognition effect [10]. The application of deep learning can further improve the intelligence, stability, and economy of automation systems in agricultural production management. For example, machine vision technology can be used to classify multiple rice varieties [11], [12], [13]. In other practical applications, such as rice grain type identification, crop and weed identification, and marble crack detection, the residual network represented by ResNet-18 and its variants has certain advantages in detection and recognition tasks [14], [15], [16]. Furthermore, with respect to the application object, convolutional neural networks (CNNs) have been applied with outstanding results to the classification and damage detection of rice grains. The R-CNN can be used to locate and classify rice grains, and the model exhibits excellent performance [17]. A machine vision system based on the Deep-CNN model was designed and applied to high-magnification milled rice images, and the main task was to divide the damage of milled rice into seven categories. Finally, five different Deep-CNN models were developed, and the average individual classification accuracy was above 95% [18].
In the field of general detection, a model with a good detection effect may be large and have many parameters, so its learning efficiency is low and its training time is long. Therefore, models are being optimized toward shorter learning times and higher recognition accuracy in a single field, with refined functions such as the detection of crack defects, the detection of color defects, or positioning. Taking the optimization of the original ResNet-18 as an example, how can a model be dedicated to a single task? Many optimization methods already exist, and they fall roughly into two types. The first type improves a certain attribute of the residual network by combining it with other characteristic modules. For example, ResNet with the CBAM module and ResNet-18 with random clipping have been designed, both of which slightly improved the classification accuracy for apple leaf disease [19]. The average recognition accuracy of wheat quality was improved by combining the residual network with an attention mechanism [20]. The second type changes the model structure of the residual network and fine-tunes the layer structure of the neural network. The large convolution kernel of the ResNet-18 network was decomposed, so that detailed image features were extracted more fully [21]. A 3D convolutional layer was used in the ResNet-18 network to learn spectral derivatives that cannot be learned by a 2D convolutional layer, and a 1 × 1 2D convolutional layer was applied to downsample the data dimensions, enabling the ResNet-18 network to perform more accurate regression analysis [20]. The model improvement method in this experiment integrates the advantages of the above two methods, mainly adjusting the model structure while adding new modules, with particular attention to the connection between the old and new modules.
In this study, japonica and indica rice datasets were created for model training and testing. The conventional neural network model ResNet-18 was chosen; a NiN block was added before the residual network to reduce the model parameter size, and a dropout layer was added afterward to suppress parameters and reduce the chance of overfitting. In addition, based on the training of a single variety of brown rice kernels, mixed brown rice kernel crack detection tests were used to explore the applicability and reliability of the optimized model. Meanwhile, in the three experiments (single-variety japonica rice crack detection, indica rice crack detection based on migration, and the mixed dataset test), four models (original ResNet-18, optimized ResNet-18, DenseNet-121, and GoogLeNet) were used to test and compare the model learning effects.

A. BROWN RICE KERNEL OBTAINED
In this study, two typical rice samples were selected (purchased at a local farmer's market in November 2022): Nanjing 5055 (short japonica rice) and Guiyu 11 (long indica rice), providing clear dimensional variability for the establishment of the training dataset [22]. Before the test, the rice hull was stripped to obtain the brown rice kernel, and kernels with a full body and no fracture were selected as the training samples.

B. PREPARING THE KERNEL WITH CRACK
The internal factors contributing to crack formation in brown rice kernels are the structure, rice composition, and humidity [23]. External factors include moisture and temperature. During milling, brown rice kernels are exposed to high levels of heat and pressure, resulting in stress gradient differences within the kernel. This stress causes cracks to form in areas with high stress, resulting in visible cracks that expand and ultimately fracture. The crack dimension and depth are affected by the moisture content inside the kernel. Therefore, based on the above principle, kernel samples with cracks can be artificially produced.
An adequate amount of brown rice kernels was placed in a glass vessel with pure water and soaked for 30 min, after which the kernels were removed and left to stand. After 5 min, the soaked kernels gradually started to present obvious cracks, and they were then collected as training samples, as shown in Figure 1.

C. IMAGE COLLECTIONS SYSTEM
To better photograph brown rice kernels with cracks, a shooting platform was built based on the light transmission principle. Figure 2 shows a schematic of the brown rice grain image-collection system. The system is composed of an enclosed metal support frame, a glass panel, a light source, white paper, an industrial camera (model: MV-CA060-10GC), a camera bracket, a computer, and data transmission lines. During shooting, the light source was placed inside the closed support frame, and light passed through the white paper and acted on the kernel, causing the internal crack to be presented (wave-particle duality). Kernel images showing the crack were captured by the industrial camera (Hangzhou Hikvision Digital Technology Co., Ltd., China) located directly above. The finished pictures were sent to the computer through the image acquisition channel and saved, and the subsequent image processing was then performed.
The reasons why cracks in rice become visible under light are briefly described as follows. Faults form at the structural weaknesses of the rice grain because of uneven stress dislocation on the surface and inside the grain as its volume changes under external impact or thermal expansion and contraction. Therefore, it is difficult for light to propagate in the direction perpendicular to the fault section, which produces a brightness discontinuity that appears as a crack.

D. BUILD THE KERNEL DATASET
An image with brown rice kernels has flaws, including pixels, redundant information, and an abnormal aspect ratio, so it needs to be processed and given the corresponding label before it becomes part of the dataset [24]. After the source images of the rice grains were obtained, pre-processing operations were performed, which included two fundamental steps: cropping and resizing. Precise cropping was employed to remove irrelevant background elements from the source pictures, resulting in uniformly cropped square images with dimensions of 512 × 512 pixels. Before the images were fed into the network, they were scaled to fit the input dimensions of the network. Using the ''resize'' function in the PIL library, an image processing library in Python, the nearest-neighbor interpolation method was employed to resize the images to 256 × 256 pixels. The images in the dataset were color images with a data shape of 256 × 256 × 3. The images were manually checked for cracks: when the overall brightness of a brown rice grain was consistent, the grain was considered to have no cracks, and when there were two or more areas with significantly different brightness values, the grain was considered to have cracks. A CSV file was then created, including serial numbers (image No.) and labels (intact kernels labeled as 0, kernels with cracks labeled as 1), to build a complete dataset. The schematic of the brown rice kernel dataset is shown in Figure 3.
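The resizing and labeling steps can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline: the study used the PIL ''resize'' function, whereas the hypothetical `resize_nearest` below uses a simplified index mapping that may differ from PIL's sampling convention, and the column names `image_no` and `label` are assumptions.

```python
import csv
import io


def resize_nearest(pixels, new_h, new_w):
    """Downscale a 2-D grid of pixel values with nearest-neighbor sampling.

    Simplified index mapping (assumption); PIL may pick different source pixels.
    """
    old_h, old_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def write_labels(rows):
    """Write (image number, label) pairs as CSV text; 0 = intact, 1 = cracked."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image_no", "label"])  # hypothetical column names
    writer.writerows(rows)
    return buffer.getvalue()


# A 4 x 4 toy "image" halved to 2 x 2, mirroring the 512 -> 256 step.
toy = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
small = resize_nearest(toy, 2, 2)
labels_csv = write_labels([(1, 0), (2, 1)])
```

Nearest-neighbor sampling copies existing pixel values rather than blending them, so brightness boundaries are not averaged during downscaling.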

III. NETWORK OPTIMIZATION AND MODEL PRE-TRAINING

A. ResNet-18, DenseNet-121 AND GoogLeNet
A residual network is a type of convolutional neural network that allows network depth to increase while restraining the accompanying rise in training error, which addresses the problems of vanishing gradients and network degradation in deep learning. The residual block is the basic block of the residual network; it provides the link that propagates input data across layers, making it possible to train an effective deep neural network. Meanwhile, in the actual training process, residual mapping is easier to optimize. When the target mapping learned by the model is close to the identity mapping, subtle deviations from the identity mapping are better captured by the residual mapping.
The object of optimization in this experiment was the original ResNet-18, which consists of four modules composed of residual blocks. Each module has four main convolutional layers; together with the convolutional layer at the beginning of the model and the fully connected layer at the end for classification, there are 18 layers in total, hence the name ResNet-18 [25].
DenseNet-121 and GoogLeNet were used to compare and validate the model optimization effect. DenseNet-121, GoogLeNet, and ResNet-18 are all classic convolutional neural networks originally designed for image classification tasks, and they play an important role in deep learning research and applications. As comparison models, DenseNet-121 and GoogLeNet have both similarities to and differences from ResNet-18, which helps highlight the advantages of the optimized model.
The DenseNet-121 network consists of 121 layers in total and comprises multiple dense blocks, transition layers, and a final output layer [26]. DenseNet-121 and ResNet have similarities, such as using skip connections to improve the flow of information through the network and avoid the vanishing-gradient problem, but they differ in their approach to information flow and parameter sharing within the network. ResNet directly sums inputs and outputs, whereas DenseNet-121 concatenates inputs and outputs along the channel dimension. A comparison of the training results of DenseNet-121 and ResNet-18 showed the role and advantages of the residual network in rice crack detection.
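The difference between the two skip-connection styles can be sketched with nested lists standing in for feature maps. This is an illustrative toy, not code from either network; the function names `residual_merge` and `dense_merge` are hypothetical.

```python
def residual_merge(x, fx):
    """ResNet-style merge: element-wise sum, so the channel count is unchanged."""
    return [[a + b for a, b in zip(cx, cf)] for cx, cf in zip(x, fx)]


def dense_merge(x, fx):
    """DenseNet-style merge: concatenate along the channel axis."""
    return x + fx


x = [[1.0, 2.0], [3.0, 4.0]]    # block input: 2 channels, 2 values each
fx = [[0.5, 0.5], [1.0, 1.0]]   # block output with the same shape
summed = residual_merge(x, fx)  # still 2 channels
stacked = dense_merge(x, fx)    # now 4 channels
```

Summation keeps the tensor size fixed from block to block, whereas concatenation makes the channel count grow with depth.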
GoogLeNet, also known as Inception v1, consists of 22 layers in total [27]. GoogLeNet is characterized by its use of Inception modules, which allow the network to learn features at multiple scales, together with auxiliary classifiers, which mitigate the vanishing-gradient problem, and global average pooling. Both GoogLeNet and NiN were designed to reduce the number of parameters in the model without sacrificing performance, using 1 × 1 convolutional layers as a means of dimensionality reduction.
In the optimized model, the NiN block is composed of a common convolutional layer and a 1 × 1 convolutional layer that acts like a fully connected layer, which reduces the parameter size of the optimized model and the overfitting during learning [28], [29]. Meanwhile, the data dimension is gradually increased in the ResNet blocks, and 20% of the data are randomly discarded in the dropout layer, further suppressing the overfitting that easily occurs in the residual network. Then, all channel elements are averaged in the global average pooling layer, whose output is used directly for classification, and a fully connected layer finally outputs a two-dimensional array representing the samples and features.
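The dropout and global average pooling steps can be sketched in plain Python as follows. The rescaling of kept values (inverted dropout) is an assumption based on common framework behavior; the text only states that 20% of the data are randomly discarded.

```python
import random


def dropout(values, p=0.2, rng=None):
    """Zero each value with probability p and rescale the rest (assumed variant)."""
    rng = rng or random.Random(0)
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


def global_average_pool(feature_maps):
    """Average every channel (a 2-D grid) down to a single number."""
    return [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]


maps = [[[1.0, 3.0], [5.0, 7.0]], [[2.0, 2.0], [2.0, 2.0]]]
pooled = global_average_pool(maps)  # one value per channel
dropped = dropout([1.0] * 1000, p=0.2, rng=random.Random(42))
zero_fraction = dropped.count(0.0) / len(dropped)  # close to 0.2
```

Because pooling leaves one value per channel, the fully connected layer that follows needs only as many inputs as there are channels.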

D. PERFORMANCE MEASURE INDICATOR
The performance measure (PM) is an evaluation indicator of a deep learning model's effectiveness that reflects the degree of difference between the model predictions and the actual data set. The whole task is guided by the PM, including the number of training iterations, the model structure, and the hyperparameters (the PM is not equivalent to the loss function of the training model in the test). In addition, because the task of the optimized model is the detection and classification of brown rice kernels with cracks, the detection accuracy of the model is a critical indicator.
The accuracy rate calculation equation is:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP is the true positive number, FN is the false negative number, TN is the true negative number, and FP is the false positive number during the training. The PR curve plots the precision (P) against the recall (R), and their calculation equations are as follows:

$$P = \frac{TP}{TP + FP}$$

where the denominator is the total number of positive samples in the predicted results, including those predicted correctly and incorrectly.

$$R = \frac{TP}{TP + FN}$$

where the denominator is the actual number of positive samples.
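The three indicators follow directly from the confusion counts, as in the minimal sketch below. The counts are hypothetical values chosen for illustration and are not results from this study.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Share of predicted-positive samples that are truly positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of actual-positive samples that were detected."""
    return tp / (tp + fn)


# Hypothetical confusion counts for a 100-image test group (not from the paper).
tp, tn, fp, fn = 60, 36, 2, 2
acc = accuracy(tp, tn, fp, fn)
p = precision(tp, fp)
r = recall(tp, fn)
```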

E. PRE-TRAINING OF NETWORK MODELS
The control variable method was adopted to optimize the critical hyperparameters, including the learning rate and learning rate attenuation coefficient, and 100 japonica rice images and 20 test images of rice were used for pre-testing. Hyperparameters refer to parameters that are not learned during the training process but need to be manually set before training. Before the test, based on the pre-test results, the optimization method SGD (stochastic gradient descent) was chosen, the momentum parameter was set to 0.5, the learning rate was set to 0.08, the decay rate was set to 0.5, the batch size was 25, and the number of iterations was 80.
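How these settings interact can be sketched with a toy example of SGD with momentum and a stepwise learning rate decay. The decay interval of 10 iterations, the classical momentum formulation, and the quadratic toy loss are assumptions made for illustration; only the values 0.5, 0.08, 0.5, and 80 come from the text.

```python
def step_decay(base_lr, decay, epoch, step=10):
    """Multiply the learning rate by `decay` every `step` epochs (step is assumed)."""
    return base_lr * decay ** (epoch // step)


def sgd_momentum_step(weight, grad, velocity, lr, momentum=0.5):
    """One classical-momentum update; frameworks may use an equivalent variant."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


# Toy problem: minimize f(w) = w**2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for epoch in range(80):
    lr = step_decay(0.08, 0.5, epoch)
    w, v = sgd_momentum_step(w, 2.0 * w, v, lr)
```

After the 80 iterations, `w` has moved from 1.0 toward the minimum at 0, while the learning rate has been halved every 10 iterations.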

A. JAPONICA RICE KERNEL TEST AND ANALYSIS
Then, the dataset with 1000 japonica rice images was used to train the models, and the training curves of the four models (original ResNet-18, optimized ResNet-18, DenseNet-121, and GoogLeNet) are shown in Figure 7, including the loss function, accuracy curves, average training time, and PR curves. Furthermore, regarding the training time for one epoch, the original model's average training time was 0.05108 s and the optimized model's was 0.02128 s, indicating that the optimized ResNet-18 model presents significant advantages, which can be explained by monitoring the changes in data dimensions. In the original ResNet-18 model, the data pass through a 3 × 3 convolutional layer and a batch normalization layer before entering the main residual network with dimensions (1,64,256,256). In the optimized ResNet-18 model, the data pass through the added NiN block and max-pooling layer before entering, the data dimension is (1,64,128,128), and the data to be processed in the main residual network are reduced by 75%.
In the NiN block, a two-dimensional convolutional layer reduces the size of the input data entering the main residual network, preserving the algorithm's advantage and improving the training speed under reduced data conditions. In the model, the NiN block influences the convolutional layers by enlarging the receptive field (reducing the input data does not mean losing a large amount of feature information). In addition, the dropout layer suppresses overfitting in the optimized ResNet-18 model, ensuring the consistency of the model training and test results. The advantage of the optimized model is not only that the reduction in training parameters increases the training speed, but also that the hardware requirements for training are reduced. When training with an Nvidia Tesla T4 (16 GB) graphics card, the optimized model requires 5.4 GB of video memory, whereas the original model requires 10.1 GB. The advantages of using the optimized model are significant.
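The 75% figure follows from the element counts of the two tensor shapes quoted above, as the short check below shows. The memory comparison uses only the two reported video memory values and is a rough ratio, not a measurement.

```python
from math import prod

original_shape = (1, 64, 256, 256)   # input to the main residual network (original)
optimized_shape = (1, 64, 128, 128)  # after the NiN block and max-pooling


def reduction(before, after):
    """Fractional reduction in element count between two tensor shapes."""
    return 1.0 - prod(after) / prod(before)


data_reduction = reduction(original_shape, optimized_shape)
memory_reduction = 1.0 - 5.4 / 10.1  # reported GB values for the two models
```

Halving both spatial dimensions leaves one quarter of the elements, so `data_reduction` is 0.75, and the reported memory figures correspond to a reduction of roughly 46%.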

B. INDICA RICE KERNEL TEST BASED ON MODEL MIGRATION
The long indica rice kernel differs obviously in dimensions from japonica, and the normal crack model building method would require 1000 indica images for retraining, resulting in wasted resources. With only a small batch of indica images, model migration can exploit the general knowledge of crack identification already learned, so that the further-trained model can handle crack classification for indica rice.
Therefore, the fine-tuning method from migration learning was used to implement the model migration task: all design parameters in the japonica crack detection model were replicated, and new data (indica images) were added. In contrast to model migration for cross-category recognition, where the source and target differ considerably, the rice crack images identified before and after this migration test are highly similar in their characteristics. As a result, the hyperparameter settings of the model trained on japonica rice images are equally applicable to the migration experiment involving indica rice images. For the indica rice migration experiment, only the number of training iterations was adjusted to match the smaller quantity of indica rice images, whereas the model's internal weights were inherited from the japonica rice image experiment. During migration training, the model

C. BROWN RICE KERNEL MIXING TEST
Reducing the model parameter size can shorten the learning time; however, the reliability and applicability of this optimization strategy need to be further investigated. Therefore, in this study, a mixed dataset of japonica and indica rice kernels was constructed. The training dataset contained 600 images and the test set contained 100 images (the mixed rice grains in the Group A test set), with each of the two varieties making up half. The momentum parameter was set to 0.5, the learning rate was set to 0.08, the learning rate attenuation coefficient was set to 0.5, the batch size was 25, and 72 iterations were performed. The recognition results of the four models for rice grain cracks were obtained on the Group A test set. Compared with the residual networks, DenseNet-121 and GoogLeNet have no advantages, and the gap is large. For the ResNet-18 model before and after optimization, although basic recognition accuracy results were available for the three test sets, two further test sets, Groups B and C, were evaluated specifically for the models before and after optimization to reduce experimental error. The number and composition of the test sets of Groups B and C were the same as those of Group A, and the test results are listed in Table 1.
On the japonica, indica, and mixed data sets, the average test accuracy rates of the optimized model were 96%, 96.67%, and 84%, respectively, and the average test accuracy rates of the original model were 92.33%, 96.67%, and 92%, respectively. Regardless of whether the original ResNet-18 neural network or the optimized model was used, there was a significant difference in the recognition accuracy of ground rice without basic pre-training parameters.
The results of the japonica rice test and the indica rice migration test can also be further explained based on changes in model capacity. In the test to identify cracks in japonica rice, the capacity of the original model was relatively low. Under the same training conditions (1,000 images of japonica rice), the model tended to oversimplify the representation of the data and did not fully capture the feature structure. Therefore, the training accuracy and recall rate of the original model were slightly lower than those of the optimized model, and the test set recognition accuracy was also 2% to 6% lower than that of the optimized model.
After the models before and after optimization had learned from 1,000 images of japonica rice and, through migration learning, from 300 images of indica rice, sufficient training data allowed the two models with different capacities to achieve a balance in their ability to identify rice grain cracks and obtain the best performance. With the increase in training data, the model can more easily capture the characteristics of rice grain cracks, which also reduces the impact of model capacity on the learning effect. Therefore, on the test set of the migration experiment, the test results of the optimized and original models were consistent. In summary, the optimized model is not suitable for mixed rice experiments with less training data; it requires more training data to realize its potential and advantages. Fast and accurate rice crack detection can reduce the circulation of defective rice grains and food safety risks. It can also reduce food waste caused by false alarms, accelerate the rice processing workflow, and enhance the efficiency of rice production and processing. Through the selection and exclusion of defective grains, this technology contributes to improving resource utilization efficiency and reducing negative environmental impacts. It provides higher-quality, safer, and more efficient rice products.
Looking ahead, it is possible to enhance the detection of rice grain cracks of different types and varying degrees by combining multiple image modalities, such as infrared and ultrasound.Additionally, further development can be undertaken to create a rice grain crack detection system suitable for real-time and embedded applications to meet the fast detection requirements of rice processing production lines.

V. CONCLUSION
In this study, a shooting platform was built to capture cracks in brown rice kernels. Under illumination, different regions of cracked rice kernels exhibit varying brightness, thereby enabling the clear detection of cracks in the rice grains. The experiment provides a low-cost and high-efficiency method for crack detection in rice grains, which can be extended to other transparent grain seeds. The conventional ResNet-18 network was optimized by adding NiN blocks and adjusting the positions of the max-pooling and dropout layers to reduce overfitting in model training. Meanwhile, under the same training conditions, the optimization strategy can reduce the model parameter size and maintain effective model training. This significantly reduces the resources required for model training. The results indicated that the training time of the optimized ResNet-18 network was significantly reduced, and the accuracy of single-variety kernel crack detection was improved. The improvement in rice crack detection accuracy will help improve the quality of finished rice and reduce waste caused by misjudgment. The reduction in training time and improvement in detection speed will help reduce labor costs in rice processing and improve the efficiency of rice crack detection. This study provides a promising method to reduce training time and increase the accuracy of single-variety rice crack detection.

FIGURE 1. The crack formation process within brown rice kernels.

FIGURE 2. Brown rice kernel image acquisition system.

Finally, two datasets were created: japonica rice and indica rice. The training set of the japonica rice dataset contained 1000 images, consisting of 497 images without cracks and 503 images with cracks. The ratio was close to 1:1 and the distribution was even. The test dataset contained 300 images, with 104 images without cracks and 196 images with cracks. The indica rice training dataset contained a total of 300 images, comprising 148 images without cracks and 152 images with cracks. The test dataset consisted of 150 images, with 55 images without cracks and 95 images with cracks. The test sets of both japonica and indica rice were divided into three groups (A, B, and C), but their amounts and compositions were different. The test set for the japonica rice experiment comprised 100 images of japonica rice per group, and the test set for the indica rice migration test comprised 50 images of indica rice per group. The test set for the mixed rice test consisted of 100 images per group, including 50 images of japonica rice and 50 images of indica rice.
ResNet and NiN modules were the main components of the optimized model. A comparison of the training results of GoogLeNet and ResNet-18 can further demonstrate the role of the NiN module in the optimized model and whether it affects the recognition accuracy of the residual network.
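The dataset composition stated above can be tabulated and checked with a few lines of Python. The dictionary layout and the helper name `summarize` are illustrative choices; the counts are those given in the text.

```python
datasets = {
    "japonica_train": {"intact": 497, "cracked": 503},
    "japonica_test": {"intact": 104, "cracked": 196},
    "indica_train": {"intact": 148, "cracked": 152},
    "indica_test": {"intact": 55, "cracked": 95},
}


def summarize(counts):
    """Return the total image count and the fraction of cracked kernels."""
    total = counts["intact"] + counts["cracked"]
    return total, counts["cracked"] / total


summary = {name: summarize(counts) for name, counts in datasets.items()}
```

The training sets are close to balanced, whereas in both test sets roughly two thirds of the images show cracked kernels.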

Figure 4 shows the model structure of the optimized ResNet-18 neural network. After color brown rice kernel images of size 256 × 256 × 3 enter the optimized model, batch normalization and the activation function are used to enhance the gradient in data backpropagation, and the NiN block and max-pooling layer improve information transmission (the NiN block is the basic block for constructing the network-in-network, NiN).

FIGURE 4. The structure of the optimized ResNet-18 model.

FIGURE 5. Schematic diagram of image augmentation of brown rice kernel image.
These parameters are used to control the behavior and performance of the model, rather than being directly learned from the training data. The selection and adjustment of hyperparameters are crucial for the model's performance and generalization ability. During training, the batch size was 10, the total number of training batches was 40, and the learning rate decayed to a certain value after every 10 training batches. The hyperparameter values were determined from the related curves, including the loss function, training accuracy, and precision-recall (PR) curves. After convergence, the average loss values and training accuracy of the last five training batches were used to quantify the learning effect of different hyperparameters; the results of pre-training are shown in Figure 6.

FIGURE 7. Loss function, accuracy curves, average training time, and PR curves.

FIGURE 8. Test results of the four models in three experiments (japonica rice crack detection, indica rice crack detection based on migration, and mixed dataset test) by using Group A test set.
learns from the newly added indica rice images and fine-tunes its internal weights to adapt to the new indica rice crack detection task. In training, the indica rice training dataset of 300 images and a test set of 50 images (the indica test set in the Group A test set) were used, and the focus of the model migration task was on the optimized ResNet-18. The model migration training curves are shown in Figure 9. The curves indicate that the optimized ResNet-18 network satisfies the crack detection task and retains the advantage of a short training time. After the optimized ResNet-18 model was fine-tuned, its classification test accuracy for indica rice cracks was 96%. The accuracies of DenseNet-121 and GoogLeNet in detecting cracks were slightly improved, but the final results were still worse than those of the residual network. The migration test results of the four models are shown in Figure 8.

FIGURE 9. Migration training curves for four models.

Figure 10 shows the training curves for the mixed dataset. The curves indicate that the training loss and accuracy of the optimized model are better than those of the original, but the test results show a disadvantage (the detection accuracy is 96% for the original model and 86% for the optimized model), indicating that the optimized model is more prone to overfitting for the same number of training sessions. The test results of the four models on the mixed rice grain dataset are shown in Figure 8. The training and test

FIGURE 10. Mixed data set training curves.

TABLE 1. Recognition accuracy of ResNet-18 in each training set before and after optimization.

E. SUMMARY AND DISCUSSION
The optimized model with the NiN module significantly reduces the average training time, by 58.66%. The reduction in training time can further save the labor costs and hardware resources required for training and improve detection efficiency. With relatively sufficient training data, the optimized model can reach the training equilibrium state faster and obtain the best performance, with a recognition accuracy of more than 96%, which is better than that of the original ResNet-18 model. The improvement in crack detection accuracy can help reduce the loss rate of rice grains during large-scale rice processing and improve the use efficiency of intact rice grains. In mixed-grain experiments with limited training data, the optimized model exhibits lower accuracy. However, as the volume of training data increases, the accuracy of the optimized model becomes similar to that of the original model. The higher training speed highlights the advantages of the optimized model. A sufficient amount of training data is essential to fully unlock the potential of the optimized model.