The Application of Information Classification in Agricultural Production Based on Internet of Things and Deep Learning

With China's increasing attention to technological innovation in agricultural production, agricultural information of all kinds has exploded on the internet, and agricultural informatization has developed rapidly. Because the related information is spread across the whole network, it has become increasingly difficult to extract useful information from it. To address the weak information classification ability of traditional agricultural information collection methods, the classification method for agricultural information is optimized so that the required agricultural information can be found quickly. First, the study introduces Deep Learning (DL) technology and the Internet of Things (IoT) and their advantages. Then, based on the Bayesian Network (BN) and Decision Tree (DT) algorithms, an agricultural information classification model is implemented and trained. Using various theories of agricultural economic development, the present situation of domestic agricultural informatization is analyzed. Finally, the advantages of IoT-based agricultural production informatization and economic management are put forward. The results show that the agricultural information classification model based on DL and IoT technology can accurately select the required effective information from the network, and that applying IoT technology to agricultural production big data plays an important role in production and economic management. Therefore, the model can make effective and accurate judgments on the classification of agricultural information and thereby provide a focus for agricultural production and economic development. This offers a new idea for the application of new technologies in agricultural production and management.


I. INTRODUCTION
Since the 14th National Congress of the Communist Party of China, the country has put forward the strategy of rural revitalization, which calls for standing on the new starting point of building a well-off society in all respects, deepening the rural reform in 2000, and realizing a comprehensive and organic connection between the achievements of poverty alleviation and rural revitalization, thus defining the new goals and new orientation of agriculture, rural areas, and farmers during the 14th Five-Year Plan period. The three rural issues are the issues of agriculture, rural areas, and farmers. The fundamental purpose of studying these issues is to increase agricultural income, promote agricultural development, and maintain rural stability, all of which depend on the improvement of agricultural production and management technology [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Alessandro Pozzebon.
Deep learning (DL) is a new field in machine learning research. Its motivation is to build neural networks that simulate the human brain for analytical learning, imitating the mechanism by which the human brain interprets data. DL is a kind of unsupervised learning [2]-[4]; namely, it is a method that can automatically learn features from objects. The Internet of Things (IoT) refers to collecting all kinds of information about things in real time through various information collection devices and technologies, connecting things with things and things with people, and thereby realizing intelligent perception, identification, and management of things. At present, DL and the IoT have strong development prospects in collecting and classifying agricultural information and in managing agricultural production and the agricultural economy [5], [6].
The application of new computer technology in agriculture has been studied by many scholars around the world. Xue et al. (2019) put forward a prediction model based on a convolutional neural network (CNN) for evaluating the maturity of agricultural compost, and realized rapid evaluation of compost maturity by analyzing images of different composting stages [7]. Jin et al. (2020) proposed an agricultural weather forecast model based on a DL predictor. Based on the agricultural IoT system, the weather data are decomposed into four components by a sequential two-level decomposition structure, and a gated recurrent unit (GRU) network is then trained as the sub-predictor of each component [8]. Debella et al. (2021) put forward a method for mapping seasonal agricultural land types based on DL and image time series, and mapped three agricultural land use categories: grain crops, forage crops (grass), and unused areas [9]. Garcia et al. (2019) proposed a method for detecting heterogeneous landscapes in large orthophoto images that automatically outlines the boundaries of agricultural plots. First, the boundary displacement error (BDE) index is measured, and then the CNN model in DL is applied to high-resolution remote sensing image recognition to obtain accurate and up-to-date information on the spatial and geographical characteristics of agricultural areas [10]. Chen et al. (2021) proposed a plant disease recognition method based on image recognition technology. To improve the learning of plant pathological features, a deep CNN was used for transfer learning, and the network structure was modified [11]. Chen et al. (2020) studied the application of DL in environmental monitoring, and put forward a method to classify agricultural information by constructing a DL model using neural networks [12].
In summary, new computer technology has been applied to agricultural production mainly through DL-based weather forecasting, remote sensing image recognition of agricultural land, crop disease recognition, and recognition of the status of agricultural supplies. There is, however, little research on using DL technology to collect and classify agricultural information, and this is where the innovation of the present work lies. Its contribution is to provide a new perspective for the future application of new computer technology in agricultural information and management.
In today's information society, new computer technologies emerge one after another, giving rise to information collection and analysis technologies that can be applied to agricultural information classification and agricultural production management. Here, a classification model for identifying and classifying agricultural information is implemented using DL technology and the IoT. Moreover, the application of IoT technology in agricultural production and economic management improves the timeliness and accuracy of obtaining effective agricultural information and lays a foundation for subsequent related research.

II. AGRICULTURAL INFORMATION CLASSIFICATION AND RELATED TECHNOLOGIES OF AGRICULTURAL PRODUCTION MANAGEMENT
A. DL AND IoT
1) ARTIFICIAL NEURON
An artificial neuron is a mathematical model that imitates the basic operation of a biological neuron and reproduces some of its computational effects [13]. Figure 1 displays the schematic structure of an artificial neuron. The artificial neuron receives signals from the preceding neurons, and each signal is attached with a weight. Under the joint action of all the weighted inputs, the neuron shows a corresponding activation state, which is expressed by Equation (1).
In Equation (1), $f(x)$ indicates the output state, $x_i$ represents an input signal, and $w_i$ denotes the weight corresponding to that input signal; there are $n$ such pairs in total.
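The weighted activation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming Equation (1) has the common form $f(\sum_{i=1}^{n} w_i x_i)$; the names `neuron_output` and `step`, and the example values, are hypothetical and not taken from the paper.

```python
from typing import Callable, Sequence


def step(z: float, threshold: float = 0.0) -> float:
    """Step activation (one possible choice): 1.0 above the threshold, else 0.0."""
    return 1.0 if z > threshold else 0.0


def neuron_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    activation: Callable[[float], float] = step,
) -> float:
    """Apply the activation to the weighted sum of the n input signals."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum of the input signals x_i with their weights w_i.
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)


# Hypothetical example values, chosen only for illustration.
print(neuron_output([1.0, 0.0, 1.0], [0.5, 0.9, -0.25]))  # weighted sum is 0.25
```

Passing a different `activation` (for example, an identity function) changes only the last step, which mirrors the idea that the transfer function is chosen per application.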
When a neuron receives a certain input signal, it produces a certain output, and each neuron has a corresponding threshold value. If the sum of the inputs received by the neuron is greater than the threshold value, the neuron turns into an active state; when the sum is less than the threshold value, the neuron shows an inhibitory state [14]. The transfer functions of artificial neurons are as follows.
The transfer function should be selected according to the specific application range. The linear function can amplify
VOLUME 10, 2022
2) DL
In 2006, DL appeared as a new research direction in machine learning, began to be studied by academia, and was gradually applied by industry [15]. In 2012, Stanford University took the lead in using a parallel computing platform with 16,000 CPU cores to build a training model called "Deep Neural Networks" (DNN) [16], which made great breakthroughs in speech and image recognition. From a statistical point of view, machine learning predicts the distribution of data: it learns a model from data and then uses this model to make predictions on new data [17]. Neural networks are the main algorithm and means of DL and are divided into CNN [18] and deep belief nets (DBN) [13]. Figure 2 illustrates the CNN schematically.

3) IoT
IoT technology was born at the Massachusetts Institute of Technology in 1999. It is a new information service architecture based on the internet and RFID communication technology. The main purpose of this architecture is to enable the information technology infrastructure to provide safe and reliable information about "goods" through the internet, and to create an intelligent environment that identifies and determines "goods" to promote information exchange within the supply chain. The IoT can be combined with actual business scenarios to construct an edge intelligent system and provide all kinds of required data to the application layer, thereby realizing the intelligent collaboration of people, machines, objects, and the environment. Intelligent IoT technology is usually deployed in four layers on the user side: the equipment domain, network domain, data domain, and application domain. This enables the equipment to have intelligent perception, automatically assemble connection and deployment strategies, solve problems in data concentration, and provide local business logic and intelligent services [19], [20].

B. BASIC INTRODUCTION OF TEXT CLASSIFICATION
Text classification treats a group of texts that have been classified by experts as a training set, preprocesses this training set, and then trains a classifier with a classification algorithm to classify the texts to be processed. It is widely used in important information processing systems such as search engines, question answering systems, and conversation systems. Text classification is found everywhere; Figure 3 shows its application scenarios [21].
The main problems of traditional text classification methods are that the text representation is high-dimensional and highly sparse, the ability of feature expression is weak, and neural networks are not good at processing such data. In addition, feature engineering has to be carried out manually, at very high cost. DL has achieved great success in image and speech processing mainly because the raw image and speech data are continuous, dense, and locally correlated. The most important problem to solve in large-scale text classification with DL is therefore text representation. Once this problem is settled, CNN and other network structures can be used to acquire feature expressions automatically and improve efficiency [22].

C. TEXT PROCESSING
At present, the main text classification method is to use standard data sets to make a general text classifier. In the extraction of agricultural information classification features, Bayesian network algorithm [23] and DT algorithm [24] are adopted to construct a classifier and implement a classification model. The first step of text classification is to preprocess the text information.
Text classification is the core technology of agricultural information search and agricultural information text mining. Its principle is to use a given data set to assign the data to one or more classes. In essence, text classification is a mapping between sets, and the process is represented by Equations (9)-(13).
In Equations (9)-(13), $\varphi$ refers to a functional relationship, $\hat{\varphi}$ represents the approximate function of $\varphi$, $D = (d_1, \ldots, d_i)$ denotes a collection of texts, $L = (l_1, \ldots, l_j)$ indicates a given collection of classes, $T$ means that the function mapping is true, and $F$ means that the function mapping is false.
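A common way to write this set mapping, which may differ in detail from the paper's own Equations (9)-(13), is given below. The hat on the approximating function is a notational assumption made here; $d$ and $l$ stand for an arbitrary text in $D$ and an arbitrary class in $L$.

```latex
\varphi : D \times L \rightarrow \{T, F\}, \qquad
\hat{\varphi} : D \times L \rightarrow \{T, F\}, \qquad
\hat{\varphi}(d, l) \approx \varphi(d, l) \quad \text{for } d \in D,\ l \in L
```

Under this reading, training a classifier amounts to choosing $\hat{\varphi}$ so that it agrees with the expert labelling $\varphi$ on as many text-class pairs as possible.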

1) TEXT INFORMATION PREPROCESSING
Generally speaking, text is unstructured data, which cannot be simply represented by a two-dimensional table structure. The first step of preprocessing is to use metadata to save the data in a structured way. The second step is to convert the unstructured text data into a form that computers can process. Before that, the text needs to be segmented [25]. Table 1 presents the text parts to be preprocessed and the processing measures.
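The segmentation and conversion steps can be illustrated with a minimal sketch. It assumes English text that can be split with a regular expression; Chinese text would need a dedicated word segmenter instead. The stop-word list and the function names `preprocess` and `bag_of_words` are hypothetical, and the measures actually listed in Table 1 may differ.

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "of", "and", "in", "is", "a", "to"}


def preprocess(text: str) -> List[str]:
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens: List[str]) -> Counter:
    """Count token occurrences to obtain a simple numeric representation."""
    return Counter(tokens)


tokens = preprocess("The price of wheat and the price of corn.")
print(tokens)
print(bag_of_words(tokens))
```

The resulting counts are one simple numeric form that a classifier can consume; they also show why such representations become sparse as the vocabulary grows.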

2) AGRICULTURAL INFORMATION DATA SET
Because there is little research on the application of artificial intelligence (AI) in agricultural information processing in China, the data provided by China's official agricultural information website are used to set up a small text data set. First, domains are divided according to the actual demand for agricultural information. Second, the target set is classified according to natural attributes, technical attributes, and market attributes. Information is crawled with web crawlers written in Python, using Visual Studio 2019 as the development software. Third, a data set of 10 categories is established, the specific information of which is shown in Figure 4.
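The parsing half of such a crawler can be sketched with Python's standard library alone. The example below is an illustration under stated assumptions, not the authors' crawler: the class name `LinkTextParser` and the HTML fragment are hypothetical, and the page download step is deliberately left out so that the sketch runs offline.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkTextParser(HTMLParser):
    """Collect (href, link text) pairs from anchor tags in an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Start collecting text when an anchor tag opens.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical page fragment standing in for a downloaded listing page.
SAMPLE_HTML = (
    '<ul><li><a href="/news/1">Wheat price forecast</a></li>'
    '<li><a href="/news/2">Pest control notice</a></li></ul>'
)

parser = LinkTextParser()
parser.feed(SAMPLE_HTML)
print(parser.links)
```

Each extracted title could then be stored with its category label to build the small text data set.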

3) PRETREATMENT OF AGRICULTURAL INFORMATION CLASSIFICATION
Bayesian, DT and CNN models are used to build the classifier, but before that, agricultural information needs to be preprocessed [26], and the process is illustrated in Figure 5.

D. TEXT CLASSIFICATION ALGORITHM
The DT algorithm and the BN algorithm are adopted here; their detailed processes are introduced as follows.

1) BAYESIAN NETWORK ALGORITHM
Bayesian network relies on the satisfying conditions found by Kyle's Wolff hypothesis. It summarizes the structural characteristics and reduces the difficulty of reasoning, decision-making, and learning. It establishes a model structure of independence constraints, is also called a belief network or causal probability network, and is used to express the dependence and independence relationships of random variables. It is expressed as a pair $BN = (G, \theta)$, in which $G = (V, E)$ denotes a directed acyclic graph [27], as presented in Figure 6. $G$ includes a set of node variables $V = \{V_1, V_2, V_3, \ldots\}$ and a set of connecting edges $E$. $\theta$ represents the set of conditional probability distributions $P$, in which $\theta_i \in \theta$ represents the conditional probability of node $V_i$ given its parent nodes. Together they define a unique joint probability distribution, which reflects that a Bayesian network is a natural and compact representation of a joint probability distribution. Equation (20) expresses the joint probability of the Bayesian network over its $n$ nodes $\{V_1, V_2, V_3, \ldots\}$.
Equation (21) shows that the actual meaning of a Bayesian network is to express the conditional independences of the joint probability distribution in graphical form.
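The standard factorization behind this joint probability is $P(V_1, \ldots, V_n) = \prod_{i=1}^{n} P(V_i \mid Pa(V_i))$, where $Pa(V_i)$ denotes the parents of $V_i$. The sketch below applies it to a small network; the three-node structure and all probability values are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical three-node network: V1 -> V2 and V1 -> V3 (all variables binary).
PARENTS: Dict[str, Tuple[str, ...]] = {"V1": (), "V2": ("V1",), "V3": ("V1",)}

# Conditional probability tables: CPT[node][parent values] = P(node = True | parents).
CPT: Dict[str, Dict[Tuple[bool, ...], float]] = {
    "V1": {(): 0.5},
    "V2": {(True,): 0.75, (False,): 0.25},
    "V3": {(True,): 0.5, (False,): 0.125},
}


def joint_probability(assignment: Dict[str, bool]) -> float:
    """Multiply P(V_i | Pa(V_i)) over all nodes for one full assignment."""
    prob = 1.0
    for node, parents in PARENTS.items():
        parent_values = tuple(assignment[p] for p in parents)
        p_true = CPT[node][parent_values]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


# P(V1=True) * P(V2=True | V1=True) * P(V3=False | V1=True) = 0.5 * 0.75 * 0.5
print(joint_probability({"V1": True, "V2": True, "V3": False}))
```

Only five numbers are stored here instead of the seven needed for an unrestricted joint table over three binary variables, which is the compactness the text refers to.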
Bayesian network structure learning needs to draw on previous research and then find the network topology that best fits the sample data in the data set $D$ of a given set of discrete variables $\{v_1, v_2, \ldots, v_{n-1}\}$. A scoring function is defined to judge how well the independence relationships of a specific structure fit the sample, a suitable search algorithm is selected to find the best network model, and Equation (22) is used for calculation and comparison. The function $f(G : D)$ measures the degree of fit between network $G$ and data set $D$, and $G_n$ denotes the directed acyclic graphs (DAGs) over the node set $V$. The scoring function is based on statistics of the data set and has the property of decomposability, as shown in Equation (23). In Equation (23), $N_{v_i, pa(v_i)}$ refers to the statistics of $V_i$ and $Pa(v_i)$ in a given data set.
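Score-based structure learning is commonly written as follows. This is a sketch of the usual formulation under the symbols defined above, and the exact form of the paper's Equations (22) and (23) may differ; $G^{*}$ is a symbol introduced here for the best-scoring structure.

```latex
G^{*} = \arg\max_{G \in G_n} f(G : D), \qquad
f(G : D) = \sum_{i} f\bigl(V_i, Pa(V_i) : N_{V_i, Pa(V_i)}\bigr)
```

Decomposability means that changing the parents of one node only changes that node's term in the sum, so a search algorithm can rescore a modified structure without recomputing the whole score.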

2) DT ALGORITHM
The principle of the DT algorithm is to make successive decisions according to the characteristics of the data set and finally classify the data records to achieve the learning effect [28], as shown in Figure 7.
$Ent(D)$ represents the information entropy, which is used to evaluate the degree of order of the data in the system. $D$ refers to the data set, $p_k$ stands for the proportion of data in the $k$-th class, and $y$ indicates the number of classes. The smaller $Ent(D)$ is, the higher the purity of the data; its value is greater than or equal to 0, and a value of 0 means that there is only one class of data in the data set.
$G_R(D, \alpha)$ denotes the information gain ratio, $IV(\alpha)$ stands for the ratio of $G(D, \alpha)$ to $G_R(D, \alpha)$, $\alpha$ refers to the attribute, and $V$ represents the number of values of $\alpha$. Splitting on $\alpha$ divides the data set into $V$ subsets $D^v$, and the entropy of each subset is computed. The information gain equals the entropy before the split minus the weighted entropy of the subsets [29].
Figure 8 illustrates the CNN network parameters constructed here.
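The entropy, information gain, and gain ratio used by the DT algorithm can be computed as in the sketch below. It assumes the standard definitions $Ent(D) = -\sum_k p_k \log_2 p_k$ and gain ratio = gain / $IV(\alpha)$; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def entropy(labels: Sequence[str]) -> float:
    """Ent(D) = -sum_k p_k * log2(p_k) over the classes present in D."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_and_ratio(values: Sequence[str], labels: Sequence[str]) -> Tuple[float, float]:
    """Return (information gain, gain ratio) of one attribute over the data set."""
    total = len(labels)
    subsets: Dict[str, List[str]] = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # Weighted entropy of the subsets D^v produced by splitting on the attribute.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder
    # Intrinsic value IV: entropy of the attribute's own value distribution.
    iv = entropy(values)
    ratio = gain / iv if iv > 0 else 0.0
    return gain, ratio


# Toy data: the attribute separates the two classes perfectly.
attribute = ["a", "a", "b", "b"]
classes = ["x", "x", "y", "y"]
print(gain_and_ratio(attribute, classes))
```

An attribute that leaves every subset as mixed as the original data set yields a gain of 0, so the tree would not split on it.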

4) SELECTION OF EVALUATION INDEX
Classification algorithms are often evaluated with a confusion matrix (CM), whose composition is shown in Figure 9.
Equation (28) displays the calculation of the accuracy rate (precision) $P$.
Equation (29) indicates the calculation of the recall rate $R$.
Equation (30) presents the calculation of the $F_1$ score. The accuracy rate $P$ indicates how accurate the model is when it assigns items to each category. The recall rate $R$ denotes the proportion of the items of a given category that the model correctly identifies. $F_1$ is the harmonic mean of the accuracy rate and the recall rate, and its value ranges from 0 to 1. The larger the $F_1$ value, the better the model performs.
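These three indices can be computed directly from confusion-matrix counts. The sketch assumes that $P$ and $R$ are the usual precision and recall, $P = TP / (TP + FP)$ and $R = TP / (TP + FN)$; the function name and the example counts are hypothetical.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute P, R, and F1 for one category from its confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts for one category: 3 true positives, 1 false positive,
# 3 false negatives.
p, r, f1 = precision_recall_f1(tp=3, fp=1, fn=3)
print(p, r, f1)
```

Because the harmonic mean is pulled toward the smaller value, a classifier cannot obtain a high $F_1$ by maximizing only one of the two rates.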

E. THEORY OF AGRICULTURAL PRODUCTION AND DEVELOPMENT
The theory of agricultural sustainable development was first put forward around 1990, defining the basic concept of agricultural sustainable development: adopting methods that protect the environment and resources to ensure that present and future demands for agricultural products can be met. Scholars in China have interpreted this definition in more detail, stating that the sustainable development of agriculture has the following characteristics: 1. Economic sustainability. 2. Ecological sustainability. 3. Social sustainability.
At present, a new efficient and sustainable agricultural model has emerged, which widely applies the IoT to agricultural production, realizing intensive production and modern management. Here, agricultural intelligent greenhouse is selected for illustration.

III. AGRICULTURAL INFORMATION CLASSIFICATION AND AGRICULTURAL PRODUCTION MANAGEMENT TEST RESULTS OF DL COMBINED WITH IoT TECHNOLOGY
A. TEST RESULTS OF BAYESIAN JOINT DT ALGORITHM CLASSIFIER
Bayes combined with the DT algorithm is used to construct a classifier for text information classification. Figure 10 illustrates the comparison with the classifiers constructed by the simple Bayesian network algorithm and the DT algorithm. Figure 11 presents the results of the iterated network model over 10 iterations for the ten categories of experiments. The loss rate decreases as the number of iterations increases, while the rate of decrease becomes smaller and smaller. At the tenth iteration the loss rate is about 0.3, which meets the application requirements of the algorithm, so the algorithm is qualified.
The traditional pure DT classification method has a limitation: when the data set is large, the DT grown by the algorithm becomes too large, which adversely affects the calculation and makes the tree hard to read. For example, when information on agricultural investment promotion is retrieved and classified, the classification accuracy drops to only 0.63 because of the excessive amount of information, whereas the accuracy for agricultural analysis and prediction is very good, almost reaching 1. When traditional Bayesian network classification is used, its classification accuracy for all kinds of text data is above 0.8, because most of the feature words are assumed to be independent of each other. However, the Bayesian-DT classifier used here reaches an accuracy of 0.85 in all categories, and in most of them more than 0.9, which is superior to the previous two classification algorithms.

B. APPLICATION OF IoT TECHNOLOGY TO AGRICULTURE
By using IoT technology and DL technology to carry out the information transformation of agriculture, the following applications are obtained: 1. Real-time monitoring with 24-hour continuous operation and accurate data on crop growth and environmental factors. 2. Control of the growth factors needed by crops at any time, adjusting the nutrient composition, concentration, pH, and other factors of the growth liquid. 3. Prevention and control of diseases and insect pests through automatic monitoring and detection and automatic, timed pesticide spraying. 4. Prediction models built from the extracted crop information to predict future crop growth and maturity time, rationally regulate crop growth, realize efficient management, and connect with the market in advance. 5. Unmanned agricultural production.
This will further increase production and income, raising the yield of greenhouse crops to more than ten times that of open-air planting, with a far-reaching impact on the future farmers' market. It saves resources and reduces overall energy consumption by 15-50%. It also widens the variety of crops, because crops that were previously difficult to plant obtain a suitable growth environment, which improves the agricultural economic value per unit area.

IV. CONCLUSION
First, DL technology and IoT technology are introduced, and then an agricultural information classification model based on the Bayesian network and DT algorithms is implemented. Next, the model is trained to determine its availability and advantages. Then, using various theories of agricultural economic development, the present situation of domestic agricultural informatization is analyzed. Finally, the benefits of agricultural production informatization and economic management development are put forward. This work is helpful to future research on the information transformation of agricultural production.
Although this study has basically achieved its expected research goal and obtained some valuable conclusions, it still has shortcomings owing to the authors' limited academic experience, mainly in the following two respects:
(1) The design of the agricultural information classification model cannot adapt to all use environments.
(2) A large amount of data on the achievements of agricultural production informatization transformation has not been obtained.
This also points out the direction for future research, which will focus mainly on two aspects:
(1) The agricultural information classification model should be further improved to make its structure more practical.
(2) Further investigation should be carried out worldwide, so that a large amount of actual data can be obtained for the transformation of intelligent agriculture in the IoT.