Online Sorting of the Film on Cotton Based on Deep Learning and Hyperspectral Imaging



I. INTRODUCTION
Cotton is one of the most important crops in the world. As the main cotton-producing region in China, Xinjiang has widely applied mulch film covering technology to retain soil moisture, maintain soil structure and prevent pests [1]. However, mulch film is often mixed with cotton during machine harvesting and machine processing, which reduces cotton quality. Several techniques have been developed for detecting foreign matter in cotton, such as electrostatic separation, ultrasonic detection and computer vision detection [2], [3]. Electrostatic separation is a rudimentary method that uses charge characteristics to distinguish film from cotton. However, it is affected by many uncertain factors, such as voltage, which results in poor stability and sensitivity and limits its applications. Ultrasonic sensors identify plastic film according to the density difference between plastic film and cotton. However, the relatively low speed of ultrasonic transmission makes the identification process slow [4]. With the development of computer technologies, computer vision techniques, which offer low cost, high speed and consistency, have been widely used in foreign matter detection [5]-[11]. Existing computer vision techniques depend on color differences to distinguish foreign matter from cotton. However, it is difficult to detect foreign matter such as plastic film, which is highly transparent.
The associate editor coordinating the review of this manuscript and approving it for publication was Mu-Yen Chen.
Hyperspectral imaging is an emerging technology that integrates spectroscopy and imaging to obtain both spectral and spatial information from objects simultaneously [12]. It can therefore detect chemical composition and structural features across the spatial domain. Hyperspectral imaging has been applied in agricultural and food inspection since the 1990s, but few studies have reported on cotton quality assessment using hyperspectral imaging. Guo [13], [14] reported that hyperspectral imaging in reflectance mode over the spectral range of 400-1000 nm was capable of detecting white and transparent polypropylene fiber, black human hair, and black and transparent PE mulching film in cotton with overall recognition accuracies of 73.2 % and 75.3 % for the training and testing sets, respectively. Moreover, most other studies employed a hyperspectral imaging system covering a wavelength range beyond 1,000 nm. Fortier et al. [15] applied Fourier transform near-infrared (FT-NIR) spectroscopy with a wavelength range of 1,100-2,400 nm to distinguish the individual types of cotton trash with a 97 % overall prediction accuracy for trash components. However, it is an offline algorithm that focuses on cotton trash samples (hulls, leaves, seed coats, and stems). Nevertheless, this result indicates that extending the detection wavelengths beyond 1,000 nm is necessary to obtain more useful information about samples that are difficult to distinguish from matter of similar color.
Currently, multivariate statistical methods and machine learning methods, such as partial least squares regression (PLSR) [16], multiple linear regression (MLR) [17], the support vector machine (SVM) [18] and the artificial neural network (ANN) [19], are frequently used to improve hyperspectral signal classification results. Because PLSR and MLR conduct a linear analysis between spectra and samples, they are not suitable for parsing complicated mapping relationships, such as the nonlinearity between spectra and samples. Although the SVM can establish a nonlinear relationship between samples and spectra, its results depend on the choice of kernel function. The ANN was developed to extract the nonlinear and complex features of samples. However, this method is generally considered a shallow learning approach whose model structure has one hidden layer [19]. Deep learning has been developed to improve on the conventional ANN [20]-[22]. It can better learn hierarchical feature representations and extract the input information layer by layer to represent different levels of nonlinearity [23]-[25].
In this research, a hyperspectral imaging technique coupled with deep learning was used to separate film from seed cotton. The proposed algorithm integrates an improved weighted stacked autoencoder, the grey wolf optimizer and an extreme learning machine (ELM) to build the classification models for recognizing seed cotton and film. Unlike the classic autoencoder architecture, the weighted stacked autoencoder weights the self-encoded features according to their correlation with the network output [20]. Next, the high-level features from the weighted stacked autoencoder are used as the input for the ELM. The ELM is a single-hidden-layer feedforward ANN. Instead of using a gradient descent algorithm, the ELM uses random mapping and the Moore-Penrose generalized inverse to determine the network weights. Compared to conventional gradient descent optimization, the established model can ensure not only the smallest training error but also better generalization ability, and the training time of the ELM is dramatically decreased [26]. However, as with any ANN, the number of hidden neurons of the ELM and its randomly mapped first-layer parameters affect the performance of the ELM model [27]. To select the best combination of these parameters, the grey wolf optimizer is applied, which has shown better ANN optimization results than other classic metaheuristics in previous studies [28], [29].
Specifically, the integrated model is used as the final classifier to identify the film and cotton. Therefore, the specific objectives of the current research include the following:
• Develop a sorting machine for the online detection of film in seed cotton based on a hyperspectral imaging system and deep learning;
• Develop a fuzzy factor to adjust the weights based on the correlation coefficient of the inputs (original signal) and the outputs of the stacked autoencoder to extract more representative features; and
• Use the grey wolf optimizer for the first time to determine the neurons and weights of the extreme learning machine to achieve higher classification accuracy.

A. MATERIALS AND DATA COLLECTION
A cotton ginning company in Xinjiang provided approximately 10 kg of seed cotton that was mechanically harvested in southern Xinjiang, China. Trained workers from the company picked the film out of the unginned cotton. In total, 49 pieces of film of various sizes were singled out. The mixture of film and seed cotton was fed into our sorting machine/system and used to construct the experimental and testing datasets in this paper. The schematic of our machine is shown in Figure 1. The seed cotton is loaded from the top inlet of the feeding room.
To improve the efficiency and accuracy of plastic film sorting, two kinds of rollers are designed for the seed cotton. The main function of the top rollers is load bearing and cotton feeding, and their shaft diameter is larger in order to achieve higher strength and stiffness. During feeding, their rotation speed is approximately 3-6 r/min, which gives a slow but stable conveyance. The aim of the bottom rollers is to disperse the clusters of entangled cotton, and their rotation speed is set to 200 r/min to achieve a better cleaning effect. Because the teeth of the adjacent winding rollers mesh with those of the top feeding rollers, the cotton is conveyed layer by layer. The teeth on the winding rollers pull seed cotton from the top feeding rollers once the fiber is hooked during high-speed rotation. Because of the centrifugal force, the winding rollers project the cotton onto the conveyor belt. In the same way, mixed film and cotton are scattered by the impact of the teeth. Then, the separated seed cotton is transported on the conveyor belt. The width of the belt is approximately 2 m, and the belt is made of black rubber to minimize the light reflected by the background. A servo motor drives the conveyor belt at 2 m/s, and an encoder produces the speed pulses. The hyperspectral imaging system is placed above the belt. A high-speed hyperspectral camera (Spectral Camera SWIR, SPECIM Spectral Imaging Ltd., Finland) was used to acquire the hyperspectral images. The camera has 384 spatial pixels, and its spectral range is 1000 to 2500 nm with 288 spectral bands. A 15-mm lens designed for optimized performance from 900 to 2500 nm was used to achieve a pixel resolution of about 5.2 mm, and the camera's field of view was approximately 2 m.
VOLUME 8, 2020
The external illumination consisted of two lines of dome halogen lamps, which provide omnidirectional lighting to avoid the darker areas that may result from occlusion. Before practical use, a white reference plate was placed on the belt to adjust the white balance and to fix the brightness value. The acquisition board in the computer was connected to the camera using a Camera Link cable. The board receives the encoder pulses and sends a trigger signal to synchronize the frame rate of the camera with the speed of the belt. Normally, the frame rate is approximately 390 fps at a belt speed of 2 m/s.
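The figures quoted for the camera and belt can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the 384 spatial pixels span the full 2 m field of view and that each scan line should cover the same distance along the belt as one pixel covers across it. The function names are hypothetical.

```python
# Values taken from the text; the derived quantities are plain arithmetic.
BELT_WIDTH_MM = 2000.0    # camera field of view across the belt (about 2 m)
SPATIAL_PIXELS = 384      # spatial pixels per scan line
BELT_SPEED_MM_S = 2000.0  # conveyor speed (2 m/s)


def pixel_width_mm(belt_width_mm: float, pixels: int) -> float:
    """Cross-belt width covered by one spatial pixel."""
    return belt_width_mm / pixels


def square_pixel_line_rate(belt_speed_mm_s: float, pixel_mm: float) -> float:
    """Scan lines per second at which the belt advances one pixel width per line."""
    return belt_speed_mm_s / pixel_mm


pixel_mm = pixel_width_mm(BELT_WIDTH_MM, SPATIAL_PIXELS)
line_rate = square_pixel_line_rate(BELT_SPEED_MM_S, pixel_mm)

print(f"pixel width: {pixel_mm:.2f} mm")
print(f"line rate:   {line_rate:.0f} lines/s")
```

Under these assumptions one pixel covers roughly 5.2 mm and the matching line rate is about 384 lines per second, which is close to the reported frame rate of approximately 390 fps.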
The sorting system is located at the front end of the belt. The high-speed valves with nozzles are arranged in a line that is exactly aligned with the belt. There are 48 nozzles, and the width of each nozzle is approximately 41.6 mm. Under normal conditions, the seed cotton flies past the inlet of the trash removal box into the storage box because of its inertia. In contrast, the separated film is drawn into the trash removal box because of its high air resistance and low inertia. Once the computer recognizes film on the belt, it counts the encoder pulses for synchronization. When the film is under the valves, the computer sends a trigger signal to the corresponding valves to eject the film, which is then sucked into the trash removal box by vacuum aspiration. The online recognition algorithm runs on an Nvidia GTX1060 GPU with 6 GB of memory.
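The synchronization between detection and ejection can be sketched as follows. This is a minimal illustration, not the controller used in the machine: it assumes the 48 nozzles evenly divide the 384 image columns (8 columns per nozzle), takes the camera-to-nozzle distance of about 1 m from the application section, and uses a hypothetical encoder resolution `PULSES_PER_METER`. All function names are hypothetical.

```python
SPATIAL_PIXELS = 384  # image columns across the belt (from the text)
NOZZLES = 48          # nozzles across the belt (from the text)

# Assumed values for illustration only.
PULSES_PER_METER = 1000   # hypothetical encoder resolution
CAMERA_TO_NOZZLE_M = 1.0  # scan line to nozzle distance (about 1 m in the text)


def nozzle_for_pixel(col: int) -> int:
    """Map an image column to the index of the nozzle that covers it."""
    pixels_per_nozzle = SPATIAL_PIXELS // NOZZLES  # 8 columns per nozzle
    return min(col // pixels_per_nozzle, NOZZLES - 1)


def nozzles_for_columns(cols) -> list:
    """Sorted, de-duplicated nozzle indices for a set of film columns."""
    return sorted({nozzle_for_pixel(c) for c in cols})


def pulses_until_ejection(distance_m: float, pulses_per_meter: int) -> int:
    """Encoder pulses to count between detection and firing the valve."""
    return round(distance_m * pulses_per_meter)


film_columns = [7, 8, 9, 100]  # columns flagged as film in one scan line
print(nozzles_for_columns(film_columns))
print(pulses_until_ejection(CAMERA_TO_NOZZLE_M, PULSES_PER_METER))
```

Counting encoder pulses rather than elapsed time keeps the ejection aligned with the film even if the belt speed drifts, which is presumably why the text describes pulse counting.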

B. SORTING ALGORITHM FOR FILM AND COTTON
The process of the proposed sorting algorithm is shown in Figure 2, and it includes three main parts: 1) feature extraction by the variable-wise weighted stacked autoencoder, 2) detection by the extreme learning machine with the grey wolf optimizer, and 3) postprocessing.

1) VARIABLE-WISE WEIGHTED AUTOENCODER
The basic structure of an autoencoder is an unsupervised neural network with one hidden layer, and it consists of an input layer, a hidden layer and an output layer. The goal of the autoencoder is to reconstruct the original input ($x_i$) as accurately as possible in the output layer ($\tilde{x}_i$). Here, we stack a group of autoencoders to construct a deep network [30] that reconstructs the inputs in order to extract the appropriate spectral features ($Y$) from hyperspectral images. However, in the spectrum analysis domain, it is well known that not all wavelength variables have the same importance for the output in the NIR spectra [31]. Some wavelength variables even have a negative influence on the regression result [32]. Wavelength selection can, to some degree, eliminate the negative influence of these wavelengths, but it has some disadvantages. First, wavelength ranges with comparatively high noise might still carry information that is relevant for prediction, and simply removing such wavelengths would spoil the multichannel advantage of the model to some extent. Second, some combinations of wavelengths might jointly represent information, and selecting individual wavelengths might cause the loss of useful information. Therefore, some researchers developed variable-wise weighted methods that assign continuous nonnegative values to wavelengths rather than directly eliminating unimportant wavelengths [33]. Such a method promises to preserve the useful information that is hidden among the noise, to retain the multichannel advantage and to reduce the influence of the negative features by giving them small positive or zero weights.
In this paper, a variable-wise weighted stacked autoencoder [20] is adopted to extract the high-level features and to reduce the dimensions of the data. To extract more representative features, a fuzzy factor, which is different from that in [20], is used to adjust the weights based on the correlation coefficient of the inputs (original signal, $x_i^{(j)}$) and outputs (reconstructed signal, $\tilde{x}_i^{(j)}$) of the weighted stacked autoencoder, which is named the VW-AE. The target (loss function, $J_\lambda$) of the VW-AE is given by equation (1), in which $\tilde{A}$ is a $d \times d_h$ weight matrix, $\tilde{b}$ is the bias vector for the output layer, and $\lambda^{(j)}$ is the weight of the $j$-th variable. $\lambda^{(j)}$ is set according to equation (2), in which $f(CC^{(j)})$ is a unipolar sigmoid function of $CC^{(j)}$ that assigns a fuzzy weight to adjust the scale of $CC^{(j)}$. $CC^{(j)}$ is the correlation coefficient of the $j$-th variable, and it is calculated according to equation (3), in which $\bar{x}_j$ and $\bar{y}$ are the means of the $j$-th variable and the output, respectively, and $Y = [y_1, y_2, \cdots, y_N]$ is the output that is connected with the input $X$.
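For orientation, one formulation that is consistent with the definitions above can be written as follows. It is a hedged reconstruction rather than the authors' exact equations: it assumes the variable-wise weighted reconstruction loss used for weighted autoencoders, a Pearson correlation coefficient taken in absolute value, and a sigmoid with a slope $a$ and an offset $c$. The symbols $a$, $c$, the hidden feature vector $h_i$ and the decoder activation $\tilde{f}$ are introduced here for illustration only.

```latex
% Weighted reconstruction loss over N samples and d variables (assumed form)
J_\lambda = \frac{1}{2N} \sum_{i=1}^{N} \sum_{j=1}^{d}
            \lambda^{(j)} \left( x_i^{(j)} - \tilde{x}_i^{(j)} \right)^2,
\qquad
\tilde{x}_i = \tilde{f}\!\left( \tilde{A} h_i + \tilde{b} \right)

% Fuzzy weight: unipolar sigmoid of the correlation coefficient (a, c assumed)
\lambda^{(j)} = f\!\left( CC^{(j)} \right)
             = \frac{1}{1 + \exp\!\left( -a \left( CC^{(j)} - c \right) \right)}

% Correlation coefficient between the j-th variable and the output
CC^{(j)} = \frac{\left| \sum_{i=1}^{N} \left( x_i^{(j)} - \bar{x}_j \right)
                 \left( y_i - \bar{y} \right) \right|}
                {\sqrt{ \sum_{i=1}^{N} \left( x_i^{(j)} - \bar{x}_j \right)^2
                        \sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2 }}
```

With this form, a steep sigmoid pushes weakly correlated variables toward a weight of zero and strongly correlated variables toward a weight of one, which matches the qualitative behavior described for the fuzzy factor.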
In equation (2), the variables that are highly related to the output are given large weights: the fuzzy weight assigns a large $CC^{(j)}$ a correspondingly larger weight and a small $CC^{(j)}$ a much smaller weight, which is different from the normalization in [20]. By training the VW-AE in this way, the reconstruction becomes more accurate for the output-relevant variables, and the hidden features are more relevant to the output.
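The weighting idea can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the sigmoid `slope` and `center` are hypothetical parameters, the absolute value of the correlation is an assumption, and the toy data are invented to show one relevant and one irrelevant variable.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant variable carries no correlation
    return cov / math.sqrt(vx * vy)


def fuzzy_weight(cc, slope=10.0, center=0.5):
    """Unipolar sigmoid of |cc|; slope and center are assumed values."""
    return 1.0 / (1.0 + math.exp(-slope * (abs(cc) - center)))


def variable_weights(X, y):
    """One weight per input variable (column of X)."""
    d = len(X[0])
    return [fuzzy_weight(pearson([row[j] for row in X], y)) for j in range(d)]


def weighted_reconstruction_loss(X, X_rec, weights):
    """Variable-wise weighted squared reconstruction error, averaged over samples."""
    total = sum(w * (a - b) ** 2
                for x, xr in zip(X, X_rec)
                for w, a, b in zip(weights, x, xr))
    return total / (2 * len(X))


# Toy data: variable 0 follows the label, variable 1 is unrelated to it.
X = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.8], [0.4, 0.2]]
y = [0, 0, 1, 1]
weights = variable_weights(X, y)
print([round(w, 3) for w in weights])
```

In this toy case the first variable receives a weight close to one and the second a weight close to zero, so reconstruction errors on the relevant variable dominate the loss.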

2) EXTREME LEARNING MACHINE OPTIMIZED BY THE GREY WOLF OPTIMIZER
Compared with traditional classification algorithms, an ELM has the advantages of a strong generalization ability and fast learning speed [34]. Some scholars have applied ELMs to hyperspectral image classification and achieved better performance than other classification algorithms [35]. An ELM usually uses a single-hidden-layer feedforward network. Its basic structure includes an input layer, a hidden layer and an output layer, as shown in Figure 3. A single-hidden-layer neural network is expressed in terms of the output weights $\beta_i$, the input weights $W_i$ and the biases $b_i$ of its hidden nodes. In other words, training finds the parameters, including $\beta_i$, $W_i$ and $b_i$, that make equation (5) hold.
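In the standard ELM formulation, which the description above appears to follow, the network and its training condition can be written as below. The symbols $g$ (activation function), $x_j$ (the $j$-th input sample), $o_j$ (network output), $t_j$ (target), $L$ (number of hidden nodes) and $N$ (number of samples) are assumptions made for this sketch.

```latex
% Output of a single-hidden-layer feedforward network with L hidden nodes
\sum_{i=1}^{L} \beta_i \, g\!\left( W_i \cdot x_j + b_i \right) = o_j,
\qquad j = 1, \dots, N

% Training seeks parameters that reproduce the targets with zero error
\sum_{j=1}^{N} \left\lVert o_j - t_j \right\rVert = 0
\quad \Longleftrightarrow \quad
\sum_{i=1}^{L} \beta_i \, g\!\left( W_i \cdot x_j + b_i \right) = t_j,
\qquad j = 1, \dots, N
```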

Assuming $H$ is the output matrix of the hidden nodes and $\beta$ is the matrix of output weights, equation (6) can also be expressed in matrix form as $H\beta = T$, where $T$ is the expected output.
Once the input weights $W_i$ and biases $b_i$ of the ELM are determined, the output matrix of the hidden nodes $H$ is fixed. In this way, the ELM model with one hidden layer can be transformed into a linear system $H\beta = T$, and the output weights $\beta$ can be calculated as $\beta = H^{+}T$, where $H^{+}$ is the Moore-Penrose generalized inverse of matrix $H$, and $T$ is the expected output. Since both the number of hidden nodes and the weights affect the overall performance of an ELM with one hidden layer, the grey wolf optimizer (GWO) is used to simultaneously optimize the number of hidden nodes and the weights of the ELM in order to achieve higher classification accuracy. The overall workflow of the GWO optimization process is shown in Figure 4.
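The closed-form training step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the input weights are drawn uniformly from [-1, 1] instead of being tuned by the GWO, the toy data are synthetic 36-dimensional clusters standing in for VW-SAE features, and the helper names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_elm(X, T, n_hidden, rng):
    """Random hidden layer, then output weights from the pseudoinverse."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases
    H = sigmoid(X @ W + b)                                   # hidden output matrix
    beta = np.linalg.pinv(H) @ T                             # beta = H^+ T
    return W, b, beta


def predict_elm(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta


rng = np.random.default_rng(0)
# Synthetic two-class data with 36 features per sample.
X = np.vstack([rng.normal(0.2, 0.05, (50, 36)),
               rng.normal(0.8, 0.05, (50, 36))])
labels = np.repeat([0, 1], 50)
T = np.eye(2)[labels]  # one-hot targets

W, b, beta = train_elm(X, T, n_hidden=14, rng=rng)
pred = predict_elm(X, W, b, beta).argmax(axis=1)
accuracy = (pred == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Because only `beta` is solved for, training reduces to one pseudoinverse, which is the source of the short training time attributed to the ELM; the remaining freedom in `W`, `b` and `n_hidden` is what the GWO is used to search.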
There are two main steps in the GWO, which imitates grey wolves in nature: 1) encircling prey and 2) hunting. When encircling prey, the first three attacking wolves ($GW_\alpha$, $GW_\beta$ and $GW_\delta$) can move to any place in order to guarantee that they are around the target. Here, the three attacking wolves are the transversal vectors in $P$ whose corresponding ELMs achieve the three best performances. Then, the other wolves update their positions according to the best three. In other words, the other transversal vectors (rows) in $P$ are updated and move closer to the three selected wolves. The details of the GWO are introduced in reference [36]. Note that the neuron existence flag vector in [36] has been simplified into a numerical variable $L$ in the array ($P$), because the performance of the ELM is related only to the number of hidden neurons, not to the ordering of the existence flags in the binary coding. The best three wolves are selected based on the objective function introduced in the next paragraph.
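The position update can be sketched with the standard continuous GWO rule. The example below is a generic, simplified illustration and not the GWO-ELM itself: it minimizes a toy sphere function instead of an ELM objective, recomputes the three leaders from the current population at every iteration, and uses a hypothetical function name and default settings.

```python
import random


def gwo_minimize(objective, dim, lower, upper, n_wolves=20, n_iter=100, seed=0):
    """Minimal grey wolf optimizer for a continuous objective."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_wolves)]
    for t in range(n_iter):
        # The three best wolves (alpha, beta, delta) lead the pack.
        leaders = [list(w) for w in sorted(wolves, key=objective)[:3]]
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 toward 0
        updated = []
        for wolf in wolves:
            position = []
            for k in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a   # step scale in [-a, a]
                    C = 2.0 * rng.random()           # random emphasis on leader
                    distance = abs(C * leader[k] - wolf[k])
                    estimate += leader[k] - A * distance
                # Average the three estimates and keep the wolf inside bounds.
                position.append(min(max(estimate / 3.0, lower), upper))
            updated.append(position)
        wolves = updated
    best = min(wolves, key=objective)
    return best, objective(best)


def sphere(v):
    """Toy objective with its minimum at the origin."""
    return sum(x * x for x in v)


best_position, best_value = gwo_minimize(sphere, dim=3, lower=-5.0, upper=5.0)
print(best_position, best_value)
```

In the setting described in the text, each wolf would instead encode the ELM input weights, biases and the number of hidden nodes $L$, and `objective` would train and score the corresponding ELM.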
The biggest advantage of the GWO is that its strategy for balancing exploration and exploitation in the search process ensures that the most appropriate parameters and number of hidden nodes of the ELM can be found at the same time. The objective function ($F$) is defined in terms of the following quantities: acc is the overall classification accuracy, $L$ is the number of hidden nodes, $L_U$ is the upper limit of the number of hidden nodes, $L_D$ is the lower limit of the number of hidden nodes, and $\gamma$ is the parameter for balancing accuracy and model complexity. $\gamma$, $L_D$ and $L_U$ are set to 0.76, 5 and 200, respectively, in our experiments. More specifically, the objective function consists of two parts: 1) the first term reflects the classification accuracy and 2) the second term reflects the complexity of the ELM, where fewer hidden nodes mean a simpler ELM. Therefore, the final target is to minimize $F$, which corresponds to higher accuracy with fewer hidden nodes at the same time. The ELM that is optimized by the GWO is named the GWO-ELM in this paper.
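One objective that matches this description is a weighted sum of the classification error and a normalized node count. The exact expression is not recoverable from the text, so the form below is an assumption made for illustration; it takes 5 and 200 as the lower and upper limits of the number of hidden nodes, and the function name is hypothetical.

```python
def gwo_elm_objective(acc, n_hidden, gamma=0.76, l_min=5, l_max=200):
    """Assumed form: gamma * error + (1 - gamma) * normalized complexity.

    acc      -- overall classification accuracy in [0, 1]
    n_hidden -- number of hidden nodes L, with l_min <= L <= l_max
    """
    complexity = (n_hidden - l_min) / (l_max - l_min)
    return gamma * (1.0 - acc) + (1.0 - gamma) * complexity


# Lower is better: at equal accuracy, the smaller network wins.
print(gwo_elm_objective(0.95, 14))
print(gwo_elm_objective(0.95, 100))
```

With a trade-off of this kind, a small drop in accuracy can be accepted in exchange for a much smaller hidden layer, which is consistent with a compact network being selected in the experiments.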

3) POSTPROCESSING
The GWO-ELM outputs four probability matrices, which represent the classification probabilities for 'Film on Background', 'Cotton', 'Film on Cotton' and 'Background'. Because of illumination variations, imaging noise, and dirt on the cotton, some pixels will be misclassified. In this postprocessing step, we combine the spatial information to achieve better classification performance than the pixelwise method. The detailed steps are as follows.
The probability map of each class is convolved with a 5 × 5 uniform kernel. This is based on the assumption that the classification probability of each pixel is highly related to the probabilities of its neighbors. Because no prior knowledge of cotton/film shapes is available in practice, a simple and fixed filter kernel was used in the application. The four updated probability maps are then normalized to ensure that the classification probabilities of each pixel sum to one. Finally, the pixel label is determined as the class with the maximum normalized probability.
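The three postprocessing steps can be sketched in pure Python on a small map. This is an illustration only: zero padding at the image borders is an assumption that the text does not specify, the example uses two classes instead of four for brevity, and the function names are hypothetical.

```python
def box_filter(prob_map, size=5):
    """Convolve a 2-D probability map with a size x size uniform kernel."""
    h, w = len(prob_map), len(prob_map[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside
                        total += prob_map[yy][xx]
            out[y][x] = total / (size * size)
    return out


def postprocess(prob_maps, size=5):
    """Smooth each class map, normalize per pixel, and take the arg max."""
    smoothed = [box_filter(m, size) for m in prob_maps]
    h, w = len(smoothed[0]), len(smoothed[0][0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            values = [m[y][x] for m in smoothed]
            s = sum(values)
            normalized = [v / s for v in values] if s > 0 else values
            labels[y][x] = max(range(len(normalized)), key=normalized.__getitem__)
    return labels


# A 7 x 7 scene of class 1 with one isolated pixel misclassified as class 0.
class1 = [[0.9] * 7 for _ in range(7)]
class0 = [[0.1] * 7 for _ in range(7)]
class1[3][3], class0[3][3] = 0.1, 0.9
labels = postprocess([class0, class1])
print(labels[3][3])
```

The isolated outlier is relabeled to match its neighborhood, which is the effect the spatial step is meant to achieve for pixels affected by noise or dirt.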

A. DETERMINATION OF MODEL PARAMETERS AND STRUCTURE
To construct and test the proposed sorting model, 107 hyperspectral images with the size of 538 × 384 × 288 (height × width × wavelength) were collected. Of these, 21 hyperspectral images were randomly selected to construct the training set, and the remaining images were used as the testing set. All the pixels in each image were labeled as one of four categories: background, film on background, seed cotton and film on seed cotton. Each pixel in a hyperspectral image corresponds to 288-dimensional spectral data, which was treated as one sample. In total, 223,872 samples were labeled, that is, 55,968 samples for each type in the training set, and they were used to determine the parameters and structure of the proposed sorting model. First, the weights and biases of the VW-SAE were randomly generated. The weights and biases of each layer of the VW-SAE were then updated using layer-by-layer pretraining and the gradient descent algorithm based on the root mean square error loss function, and their parameters were determined by 10-fold cross-validation. After the training of each layer of the VW-SAE was completed, a two-layer neural network was used as an optimizer to fine-tune the VW-SAE as a whole. Through the experiments, the best VW-SAE was found to consist of three VW-AEs with 144, 72 and 36 neurons, respectively. The sigmoid was used as the activation function in order to acquire the ideal spectral features for classification. Finally, 36 high-level features are extracted by the VW-SAE and set as inputs of the ELM. Table 1 lists the results of the GWO-ELM with the feature inputs extracted from the hyperspectral images by the VW-SAE and by the traditional minimum redundancy maximum relevance algorithm (mRMR) [37]. For the mRMR, the highest classification accuracy is obtained when 10 features are extracted from the 288 wavelengths, and these results are listed in the table. Detailed mRMR experiment results are listed in the supplementary material. In comparison, the VW-SAE can extract more appropriate information than the mRMR, especially for distinguishing film on background and cotton.
Here, the ELM adopted a single-hidden-layer neural network, and the activation function was the sigmoid in order to conduct nonlinear classification. The weights, biases and number of hidden nodes of the ELM were simultaneously optimized by the GWO algorithm. Recent studies show better optimization performance of the GWO for training perceptrons [28]. In the experiment, the number of hidden layer neurons of the ELM was set to 14. A comparative experiment was carried out to further validate the optimization results, and Table 2 lists the results with different numbers of hidden nodes of the ELM. It is clear that the ELM with 14 hidden nodes achieves the best classification results, especially for film on background. To further demonstrate the effectiveness of the GWO method on our specific dataset, we compared its results with two classical optimization algorithms, the genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm. The final comparison results are shown in Table 3, and the GWO performs the best among the three optimization algorithms.
Figure 5 shows the relative reflectance for the four types of objects (film on cotton, background, cotton and film on background), where the dotted lines represent the actual spectral range of each type and the solid line represents its mean. The patterns of the relative mean reflectance for film on cotton and cotton, and for background and film on background, were similar throughout the entire spectral region because of the weak reflectance of the transparent film. There is a noticeable difference between the relative mean reflectance for the cotton and the background, which suggests that it may be easy to recognize cotton on the black belt. However, it is clear that the spectral range of film on cotton covers the whole spectral range of cotton, and that there is overlap between the spectral ranges of film on background and background in Figure 5. Therefore, it is hard to sort the film using the reflectance at specific wavelengths alone.

B. SPECTRAL ANALYSIS AND PROCESSING
Since the spectra of film on cotton and film on background differ greatly, the samples are first classified into four types in our work: film on cotton, film on background, cotton and background. Table 4 lists the classification results of the different models for the four types and for three types (film, cotton and background). Clearly, the results of the different models for the four types are better than those for the three types. Table 5 summarizes the classification results for background, film on background, cotton and film on cotton using five combinations of mathematical models and a traditional machine learning model (ANN). Overall, both the grey wolf optimization extreme learning machine (GWO-ELM) and the artificial neural network (ANN) coupled with the combination of variable weighting and the stacked autoencoder (VW-SAE) can recognize the objects very well, except for film on background, which had a recognition rate of 0.8628 for the VW-SAE+GWO-ELM and 0.8506 for the VW-SAE+ANN. When variable weighting was not used, for either the GWO-ELM or the ANN models coupled with stacked autoencoders, each object except the background obtained a lower recognition rate. Notably, the recognition rate for film on background dropped by almost 6 % with the SAE+GWO-ELM model and by about 18 % with the SAE+ANN model, which suggests that variable weighting has a significant effect for different models but no influence on the background classification results. In addition, the identification of film on background achieved a classification accuracy of 0.4178 for the GWO-ELM model and 0.5794 for the ANN model. In other words, when variable weighting coupled with a stacked autoencoder was used, the classification results for each of the objects improved over those of the GWO-ELM model and the ANN model alone. Meanwhile, compared to the ANN model, the variable-weighted stacked autoencoder has a greater effect on improving the classification accuracy of film on background for the GWO-ELM model. In addition, Table 5 also shows that the single ANN model achieved the optimal recognition rate for film on background compared to other models. Overall, for all combinations of discrimination models, the recognition rate for background and film on cotton could reach approximately 0.99, while for the recognition of film on background, the variable-weighted stacked autoencoder could improve the classification results over those of the single GWO-ELM and ANN models.

C. ANALYSIS OF MODEL PERFORMANCE
To observe the recognition results intuitively, the established models in Table 5 are used to classify each pixel of a hyperspectral image with the size of 538 × 384 × 288 (height × width × wavelength). Figure 6(a) shows that three pieces of film can be observed in the raw picture. Each model can recognize the background very well, which may be due to the large spectral differences. As shown in Figure 6(h) and 6(g), the ANN and GWO-ELM cannot recognize the film on the background well, and the GWO-ELM model also cannot reliably identify the film on cotton. After using the stacked autoencoder, the recognition rates for film on background and film on cotton both improve significantly, as shown in Figure 6(e) and 6(f). Meanwhile, combining the VW-SAE with the GWO-ELM and ANN models can enhance the recognition performance for film on background, as shown in Figure 6(c) and 6(d). The generated pseudocolor maps lead to the same conclusion as Table 5, which indicates that the established models are reliable for online detection. Table 6 presents the cumulative recognition time and overall classification accuracy for each model. The stacked autoencoder improves the overall classification accuracy by 4.3 % for the ANN and by 11.2 % for the GWO-ELM, while the combination of variable weighting and the stacked autoencoder further improves the classification results by 4.25 % and 2.48 % over the SAE+ANN and SAE+GWO-ELM, respectively. These results demonstrate that the algorithm that combines variable weighting and the stacked autoencoder provides a significant and positive improvement over both the ANN and GWO-ELM models. Although the classification accuracy improved using the combination of variable weighting and the stacked autoencoder, the processing time increased by 45.5 % for the GWO-ELM, giving a cumulative recognition time of 1.44 s for an image with the size of 538 × 384 × 288. However, compared with the ANN, the cumulative recognition time decreased by 35.4 % for the combination of variable weighting and the stacked autoencoder coupled with the GWO-ELM, which can therefore be used for online detection.
Finally, the classification results of the VW-SAE+GWO-ELM are combined as film and nonfilm, as shown in Figure 7(c). These results will trigger the valve to separate the film from cotton.
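Merging the four classes into film and nonfilm amounts to a lookup on the label map. The sketch below assumes a hypothetical class index order (0 background, 1 film on background, 2 cotton, 3 film on cotton); the actual encoding used by the authors is not stated, and the function names are illustrative.

```python
# Hypothetical class indices; the order is an assumption for illustration.
CLASS_NAMES = ("background", "film on background", "cotton", "film on cotton")
FILM_CLASSES = {1, 3}


def film_mask(labels):
    """True where a pixel belongs to either of the two film classes."""
    return [[lab in FILM_CLASSES for lab in row] for row in labels]


def columns_with_film(mask):
    """Image columns containing at least one film pixel."""
    return sorted({x for row in mask for x, is_film in enumerate(row) if is_film})


labels = [[0, 1, 2],
          [2, 2, 3]]
mask = film_mask(labels)
print(mask)
print(columns_with_film(mask))
```

The column indices of the film pixels are what a valve controller would need in order to select which nozzles to fire.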
The study demonstrates that the variable-weighted stacked autoencoder coupled with the GWO-ELM can achieve better classification results with a limited increase in computational time, which could meet the online detection requirement. As shown in Figure 5, the spectra that were obtained from hyperspectral imaging had absorption peaks similar to those in other studies [38]. Nevertheless, it is difficult to visually distinguish cotton from film on cotton, as well as background from film on background, since the patterns of their spectra are similar, which encouraged us to develop a new algorithm to recognize these four objects. The stacked autoencoder can enhance the classification results for the four types of objects relative to a single GWO-ELM or ANN model, and the variable-weighted stacked autoencoder further improves the overall recognition results. The recognition rates for film on background were relatively low for all of the models, which could be due to the strong absorption of the black background. Although the variable-weighted stacked autoencoder coupled with the GWO-ELM provided results similar to those of the variable-weighted stacked autoencoder coupled with the ANN, its recognition time for each pixel was shorter, which allows cotton to be assessed more efficiently and saves costs. Moreover, the average recognition rate for the variable-weighted stacked autoencoder coupled with the GWO-ELM model was also comparable to those of Mengyun Zhang's [32] and Ruoyu Zhang's [38] laboratory studies. These studies applied hyperspectral imaging in the transmittance and reflectance modes over the spectral range of 900-1700 nm to inspect foreign matter on the surface of cotton. However, the thin films in our study are much harder to detect than foreign matter such as plant bracts and leaves.

D. APPLICATION OF PROPOSED SORTING SYSTEM
The proposed method has been integrated into the sorting system with an Nvidia GTX1060 GPU, tested by two companies from Shandong Province, China, and put into production in Xinjiang, China. The algorithm is implemented with the deep learning library Keras, supported by the Nvidia parallel computing platform CUDA and the GPU-accelerated library cuDNN. The mixture of film and seed cotton was fed into our proposed sorting machine/system (Figure 8(a)), and as shown in Figure 8, the machine can separate the film and seed cotton in real time. It should be noted that the 384 pixels of a hyperspectral image correspond to about 2 m of the conveyor belt in Figure 8(b). In practice, because the hyperspectral camera is a line-scan camera, the position of its illumination lights is movable, as shown in Figure 8(b). We can easily reduce the distance between the lights and the nozzles and then adjust the height of the hyperspectral images accordingly to achieve a better sorting result. After several field experiments, the distance between the camera and the sorting nozzles was set to about 1 m, as shown in Figure 8(b). Figure 8(c) and (d) show the clean cotton and the plastic film separated by the sorting system. In-field running results show that the proposed machine can process 3 tons of mixture per hour, and the sorting accuracy can reach up to 95 %.

IV. CONCLUSION
This study has developed a new sorting algorithm for the online detection of film on cotton using hyperspectral imaging over the spectral range of 1000-2500 nm. The results showed that the single ANN and GWO-ELM cannot reliably recognize film on background, and that using variable-weighted stacked autoencoders to extract features has a positive effect for both the GWO-ELM and the ANN, with film on background recognized at up to 86 % accuracy. The recognition rates for film on background are relatively low for all of the models, which may be due to the absorption of the dark background. The combination of variable weighting and the stacked autoencoder coupled with the GWO-ELM is determined to be the optimal model for online detection, with an overall classification accuracy of 95.58 % and a lower recognition time of 1.44 s per image. The proposed method has great potential for the online recognition of film on cotton.
On the other hand, the equipment cost of the proposed sorting system is relatively high because of the integrated GPU for computation acceleration. With future hardware developments, the equipment cost is expected to fall further. The proposed algorithm has good application prospects, and a similar idea can be applied in other fields, such as the separation of wheat and stalks.