Improved Feature Selection Method for the Identification of Soil Images using Oscillating Spider Monkey Optimization

Precision agriculture uses information and communication technology in farming and cultivation to improve overall productivity and resource utilization. Soil prediction is one of the primary phases of precision agriculture and leads to good-quality crops. In general, farmers perform soil prediction manually. However, the efficiency of soil prediction may be enhanced by current digital technologies. One effective way to automate soil prediction is through image processing, in which soil images are analyzed to determine the soil type. This paper presents an efficient image analysis technique to predict the soil type. For this purpose, a robust feature selection technique has been incorporated into the analysis of soil images. The developed feature selection technique uses a new oscillating spider monkey optimization (OSMO) algorithm to select features that are relevant and non-redundant. OSMO uses an oscillating perturbation rate to improve its precision and convergence behavior. A set of standard benchmark functions was deployed to visualize the performance of the new optimization technique, and results were compared in terms of mean and standard deviation. Furthermore, the soil prediction approach is validated on a soil dataset having seven categories. The proposed feature selection method selects 41% of the features as relevant, which provides the highest accuracy of 82.25%, an increase of 2.85%.


I. INTRODUCTION
Worldwide, agriculture is one of the significant sources of food and income. The economy of various countries highly depends on the outcome of agriculture. According to the geographical locations and soil quality, different types of plants are harvested. There is a direct relationship between the soil and the plants [1]. For each plant, a specific soil is required. Soil texture affects agriculture, selection of crops, the requirement of nutrients and water, and growth of crops. Therefore, it is essential to predict the soil before farming.
Furthermore, the selection of plants according to the soil will increase productivity. In general, the soil is predicted manually by farmers, as experts are not easily available at all times. Moreover, this manual process may be biased and inaccurate, as it depends on the expertise and knowledge of individual farmers [2]. Therefore, automating soil prediction with precision may improve this selection and help increase productivity without the need for an expert. Furthermore, precision agriculture tries to bring farmers and soil closer to each other and make them work in synchronization. It makes use of technology and intelligent devices that are interactive and easy to use in farming.
In general, the soil may be classified according to its texture into seven classes, namely, silty sand, clay, sandy clay, clayey sand, clayey peat, humus clay, and peat [3], [4]. The pipette and hydrometer methods are the primary methods of predicting the soil, but they are time-consuming and require an extensive workforce. Moreover, the most common soil analysis approach for studying the soil sub-surface with soil surface depth information is cone penetration testing (CPT) [5], [6]. However, Zhang et al. [7] observed that the CPT method may give uncertain results due to the complex composition and varying mechanical properties of soil, which results in class overlapping of different soils.
Furthermore, CPT and hydrometer methods are not easily approachable to each farmer. Recently, digital image processing provides very effective classification techniques that can be utilized to select soil according to its texture. Therefore, in this work, a soil prediction method has been developed using image analysis techniques that use the texture images of soil for its classification.
The past several years have witnessed different soil classification methods [3], [8], [9]. Chung et al. [10] used RGB histogram techniques to represent the paddy soil series. Sharma and Kumar [11] presented traditional mid-level and high-level image classification methods for the classification of soil. Bhattacharya et al. [3] extracted features using the boundary energy technique after the segmentation of signals. For the classification task, various classifiers have been used and validated, such as neural networks, decision trees, support vector machines (SVM), and many others. Furthermore, Gordon reviewed SVM classifiers for soil classification using image-based features and observed that the efficiency of these techniques is highly affected by how features are extracted.
Since the classification accuracy is heavily dependent on the quality of the extracted features [12], [13], various feature extraction methods for soil classification have been presented. These feature extraction techniques are either general statistics-based or learning-based. Learning-based feature extraction methods use machine learning to extract features from soil images. Some popular machine learning approaches in this category are convolutional neural networks (CNN) [14], restricted Boltzmann machines [15], and auto-encoders [16]. Feature selection is a combinatorial optimization problem; recently, Shi et al. [17] proposed a collaborative approach for dimensionality reduction, and Li and Du [18] deployed the Laplacian method for analyzing hyperspectral imagery. Padarian et al. [19] used a CNN model to predict soil organic carbon and showed that it reduces the error by 30%. Lu et al. [20] presented a four-layer deep CNN for soil detection. For this, they used a combination of 80 synthetic hyperspectral bands and eight multispectral bands to improve the soil prediction accuracy by 7.42%. Furthermore, Yu et al. [21] proposed a three-dimensional CNN for soil classification and accelerated the feature discriminator's ability. Thuy and Wongthanavasu [22], [23] proposed two new approaches for feature selection; the first approach is based on D-stripped quotient sets, and the second is based on stripped neighborhoods. In the literature, these learning-based feature extraction methods perform well. Swarm-based techniques [24], [25] have also been successfully deployed for solving different real-world problems. However, these approaches induce a high computational cost.
The statistical techniques extract different size, shape, and structural features from the soil images, which are then supplied to one of the classifiers. These methods generally do not perform well due to the complexity of image texture. Recently, different texture feature-based methods have been implemented to extract the features of an image, such as the local binary pattern (LBP) [26], the histogram of oriented gradients (HOG) [27], speeded-up robust features (SURF) [28], and the scale-invariant feature transform (SIFT) [29]. These feature extraction methods show better performance for images with complex structure. However, they generate high-dimensional feature maps that may contain irrelevant and redundant features. These redundant and irrelevant features are responsible for the degraded performance of a classification system [30]. Therefore, this work uses a powerful feature selection approach to reduce such redundant and irrelevant features.
In the literature, many feature selection methods are present, which can be categorized into wrapper, filter, and embedded methods [31]. Filter methods are computationally efficient because they rank features using only the relationship between the features and the class variable. However, the features selected by filter methods do not always perform well when supplied to a classifier [32]. Wrapper methods use predictive models to find the feature subset, such as the sequential backward selection (SBS) [33] method. These methods are more promising than filter methods [31] and iteratively remove irrelevant features. Finally, embedded methods use classifiers to find a good set of features, such as SVM with recursive feature elimination (SVM-RFE) [34]. The primary concern with the wrapper and embedded methods is their very high computational complexity [12].
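As an illustration of how a wrapper such as SBS iteratively removes irrelevant features, the following sketch greedily drops the feature whose removal hurts the score the least; the hypothetical scoring function stands in for the predictive model that a real wrapper would train on each candidate subset.

```python
def sequential_backward_selection(n_features, score_fn, k_target):
    """Greedy SBS sketch: repeatedly drop the feature whose removal
    hurts the score the least, until k_target features remain."""
    selected = list(range(n_features))
    while len(selected) > k_target:
        best_subset, best_score = None, float("-inf")
        for f in selected:
            candidate = [g for g in selected if g != f]
            s = score_fn(candidate)
            if s > best_score:
                best_subset, best_score = candidate, s
        selected = best_subset
    return selected

# Hypothetical scorer standing in for a trained model's accuracy:
# pretend features 0 and 2 carry all of the signal.
score = lambda subset: float(sum(1 for f in subset if f in (0, 2)))
print(sequential_backward_selection(4, score, 2))  # [0, 2]
```

Because every elimination step re-evaluates all remaining candidates with the model, the cost grows quickly with dimensionality, which is the computational drawback noted above.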
Graph-based techniques have also been successfully deployed for feature selection. Roffo et al. [35] proposed a framework for feature selection using Markov chains and power series of matrices. Hashemi et al. [36] introduced a multi-label graph-based approach using the PageRank algorithm. Henni et al. [37] deployed the PageRank algorithm with subspaces for feature selection in a graph-based unsupervised algorithm. Roffo et al. [38] introduced a probabilistic latent graph-based ranking approach. Zhang and Hancock [39] introduced hypergraph clustering for feature selection. Chen et al. [40] used these algorithms for object-oriented classification.
Classical search strategies are single-solution-based optimization techniques, and the result is likewise a single optimized solution. These classical optimization methods are unable to solve problems that are non-differentiable, discrete, or multi-modal. Thus, other approaches are required to handle these complexities. Population-based approaches are among the best and most emerging techniques. These algorithms use a swarm of solutions in every cycle and return a population of solutions, and they share numerous mutual concepts. The use of meta-heuristic methods generally removes the drawbacks of the aforementioned feature selection methods, and different researchers have presented various meta-heuristic algorithms for the efficient selection of features from high-dimensional feature maps. For example, the spider monkey optimization (SMO) [41] algorithm is a nature-inspired algorithm based on the behavior of spider monkeys. Due to its capability of handling a large class of problems, this research work considers SMO as the center of research. SMO has been proven effective for handling high-dimensional feature spaces, which is helpful in feature selection. The algorithm explores the search area in its initial phases and then exploits it iteratively, using the social organization of spider monkeys. SMO has been used successfully in various real-world optimization problems. Hybrids of SMO with other nature-inspired algorithms [42] have also been deployed for information retrieval and performed well. However, its performance may be improved by modifying its parameters.
In SMO, different phases are used, such as the global and local leader phases, which are highly affected by a parameter known as the perturbation rate. It is an essential parameter of SMO that decides its convergence behavior. In standard SMO, a linearly increasing perturbation rate is used. Although this linearly increasing function performs well in general, it is less effective for non-linear real-world problems; the inclusion of non-linearity in the perturbation rate may increase the performance of new variants. Considering this, different researchers have used different non-linear perturbation rates. For example, Kumar et al. [43] used a chaotic SMO in a bag-of-words framework for soil prediction, in which a chaotic perturbation rate is used.
Moreover, an exponential perturbation rate has been used in SMO for plant leaf disease identification [44]. However, these variants may also converge towards local optima. Hence, improvement of the perturbation rate is still an open area. Therefore, this paper introduces a new approach (OSMO) in which the perturbation rate oscillates within a range and improves the balance between exploitation and exploration. Furthermore, the proposed OSMO has been used to select the optimum set of features from the high-dimensional feature vectors of soil images.
The major research contributions of this paper are as follows:

1) A new oscillating perturbation rate is introduced in SMO, and the resulting algorithm is named the oscillating SMO (OSMO) algorithm.
2) Texture features are obtained from the images using the SURF method.
3) OSMO-based feature selection is used to enhance the classification accuracy.
4) Soil images are classified by SVM, LDA, kNN, and RF classifiers.
5) The RF classifier achieved the best accuracy (82.25%) for the OSMO-based feature selection approach, an increase of 2.85%.
The rest of the paper is organized as follows. The standard SMO is briefed in Section II, followed by the OSMO algorithm in Section III. Section IV contains the result discussion and statistical validation of the achieved results. Section V concludes the paper with future scope.

II. SPIDER MONKEY OPTIMIZATION (SMO)
The SMO algorithm is driven by the social and foraging behavior of spider monkeys. The fission-fusion social structure (FFSS) is used to model the SMO algorithm, in which monkeys divide themselves into groups from large to small and vice-versa. The following are the essential characteristics of FFSS in spider monkeys [41].
1) Spider monkeys live in groups of 40 to 50 monkeys, which correspond to the individuals in SMO.
2) There is a global leader (GL) among the monkeys who divides the group into smaller subgroups of three to eight members if food is insufficient. Each subgroup then starts foraging independently.
3) Each subgroup also has a local leader (LL) under which food is searched.
4) The group members use unique sounds for social interaction with other members of the group.
Based on this foraging behavior of spider monkeys, the SMO algorithm has been mathematically designed and developed. SMO has six phases, discussed in the subsequent subsections. Algorithm 1 illustrates the pseudo-code of the standard SMO.

Algorithm 1 Spider Monkey Optimization
Randomly initialize the swarm of N monkeys, where each monkey X_i denotes a vector of D decision variables and i indexes the i-th individual.
Randomly initialize pr and the limits for the local and global leaders.
Measure the fitness of each individual.
Elect local and global leaders using the greedy selection process.
while Stopping Criteria do
    Obtain new positions for all individuals using the local leader phase.
    Apply greedy selection using the fitness values of each group member.
    Apply the global leader phase to obtain new positions for all group members.
    Update the locations of the global and local leaders based on fitness.
    If any local leader has not changed for a predefined limit, apply the LLD phase.
    If the global leader has not been updated for a predefined limit, apply the GLD phase to divide the group into smaller subgroups.
    Maintain a minimum size of four for each group.
end while

Let the i-th individual X_i of the N-member population be a D-dimensional vector, represented by Eq. (1) and initialized by Eq. (2):

X_i = (X_i^1, X_i^2, ..., X_i^D)    (1)

X_i^j = X_min^j + φ(0,1) × (X_max^j − X_min^j)    (2)

where X_max^j and X_min^j are the upper and lower bounds of X_i in the j-th dimension, and φ returns an arbitrary number in the range [0, 1].
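The swarm initialization of Eq. (2) can be sketched as follows (a minimal NumPy illustration; the swarm size, dimension, and bounds are arbitrary example values):

```python
import numpy as np

def initialize_swarm(N, D, x_min, x_max, seed=0):
    """Eq. (2): X_i^j = X_min^j + phi * (X_max^j - X_min^j), phi ~ U(0, 1)."""
    rng = np.random.default_rng(seed)
    phi = rng.random((N, D))  # one uniform draw per individual per dimension
    return x_min + phi * (x_max - x_min)

# Example: 5 monkeys in a 3-dimensional search space bounded by [-5, 5].
swarm = initialize_swarm(5, 3, np.full(3, -5.0), np.full(3, 5.0))
print(swarm.shape)  # (5, 3)
```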
Local Leader Phase (LLP): This phase updates the location X_i^j of each member using the experience of the LL (XL_k^j) and the members of the local group, as in Eq. (3), based on a probability pr known as the perturbation rate:

X_i^j(new) = X_i^j + φ(0,1) × (XL_k^j − X_i^j) + φ(−1,1) × (X_r^j − X_i^j)    (3)

where XL_k^j is the position of the LL of the k-th group and X_r^j is a randomly selected r-th individual from the same group. The position is updated only if the new solution has higher fitness than the existing one.
Global Leader Phase (GLP): After the LLP, every individual updates its location using Eq. (4), which incorporates the experience of the GL (XG^j) and the members of the local group:

X_i^j(new) = X_i^j + φ(0,1) × (XG^j − X_i^j) + φ(−1,1) × (X_r^j − X_i^j)    (4)

Global Leader Learning (GLL) phase: In the GLL phase, the individual with the best fitness is declared the global leader (XG^j). If the GL fails to update her position, the global limit counter is incremented.
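The LLP and GLP position updates described above can be sketched as below, following the standard SMO update rules of [41]; the greedy selection that keeps a new position only when it is fitter is assumed to happen outside these helpers.

```python
import numpy as np

rng = np.random.default_rng(1)

def llp_update(x, local_leader, x_rand, pr):
    """Local Leader Phase, Eq. (3): a dimension j is perturbed when
    U(0,1) >= pr, using the local leader and a random group member."""
    new = x.copy()
    for j in range(len(x)):
        if rng.random() >= pr:  # dimension selected for perturbation
            new[j] = (x[j]
                      + rng.random() * (local_leader[j] - x[j])
                      + rng.uniform(-1, 1) * (x_rand[j] - x[j]))
    return new

def glp_update(x, global_leader, x_rand, j):
    """Global Leader Phase, Eq. (4): a single chosen dimension j is
    updated using the global leader and a random group member."""
    new = x.copy()
    new[j] = (x[j]
              + rng.random() * (global_leader[j] - x[j])
              + rng.uniform(-1, 1) * (x_rand[j] - x[j]))
    return new

x = np.zeros(3)
print(llp_update(x, np.ones(3), np.full(3, 2.0), pr=1.0))  # [0. 0. 0.]
```

With pr = 1.0 no dimension passes the perturbation test, so the position is returned unchanged, which illustrates how a larger perturbation rate leaves more dimensions intact.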
Local Leader Learning (LLL) phase: The monkey (solution) with the highest fitness in a group is elevated to the LL (XL_k^j) of that group during the LLL phase. As in the previous phase, if the position of the LL does not change, the local limit counter is incremented by one.
Local Leader Decision (LLD) phase: Based on the local limit counter, this phase either randomly re-initializes all the group members or modifies their positions using Eq. (5):

X_i^j(new) = X_i^j + φ(0,1) × (XG^j − X_i^j) + φ(0,1) × (X_i^j − XL_k^j)    (5)
Global Leader Decision (GLD) phase: If the global limit counter of the GL (XG^j) reaches a threshold, the GL splits the population into smaller subgroups until the maximum number of groups (MG) is reached. This phase also selects the LLs (XL_k^j) using the LLL phase. On the other hand, if the location of the GL still does not change after MG groups have been formed, all the small groups are fused into one large group.

III. PROPOSED APPROACH
The proposed approach presents a new soil prediction method using soil images. The method identifies the category of soil in three simple steps, as given in Fig. 1. The first step is the extraction of features from the considered soil images, followed by the second step, the selection of features. The primary research outcome of this work is the development of the second step, i.e., feature selection, where a new approach (OSMO) has been introduced to select optimum features. The identified prominent features are then made available to the classification phase, where a classifier is trained to recognize the soil category. Each phase of the proposed methodology is described in the upcoming subsections.

A. FEATURE EXTRACTION
The first step of the proposed soil classification system is the extraction of features. For this, texture features are extracted from the images using the SURF technique, developed by Bay et al. [28], which extracts local features and their corresponding descriptors. Generally, SURF is used to extract texture features in various application areas of computer vision. The features extracted by SURF are invariant to rotation, illumination, scale, and noise. The technique works in three phases: detection of interest points, neighborhood description, and keypoint matching. First, a Hessian matrix approximation finds the interest points in the image. Then, for the feature descriptor, the sum of Haar wavelet responses within the neighborhood of each interest point is measured. Finally, keypoint matching is performed between the descriptors.
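The first SURF phase, interest-point detection via the determinant of the Hessian, can be illustrated with a minimal NumPy sketch; this uses finite differences on a synthetic blob rather than the box-filter and integral-image machinery of real SURF, so it is a conceptual illustration only.

```python
import numpy as np

def hessian_det_response(img):
    """SURF phase 1 (sketch): determinant-of-Hessian response at every
    pixel; large magnitudes mark candidate interest points. Real SURF
    approximates these derivatives with box filters on an integral image."""
    dy, dx = np.gradient(img)
    dyy, _ = np.gradient(dy)
    dxy, dxx = np.gradient(dx)
    return dxx * dyy - (0.9 * dxy) ** 2  # 0.9 corrects the box-filter approximation

# Synthetic image: a single Gaussian blob centred at row 20, column 30.
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((yy - 20.0) ** 2 + (xx - 30.0) ** 2) / 20.0)
resp = np.abs(hessian_det_response(img))
peak = np.unravel_index(resp.argmax(), resp.shape)
print(peak)  # the strongest response lies at (or next to) the blob centre
```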

B. OSCILLATING SMO BASED FEATURE SELECTION
After SURF feature extraction, a high-dimensional feature map is generated. Due to its high dimension, the computational cost of a classification system increases. Furthermore, some of these features may be redundant and irrelevant, reducing the efficiency of the deployed classifier. Therefore, to minimize these two effects, a new feature selection approach has been presented in this paper. The presented feature selection approach uses the OSMO algorithm to select the optimum set of features. OSMO is an optimization algorithm that finds the optimum solution through a guided search and is a variant of the existing SMO algorithm. The steps of the proposed feature selection method are as follows.
1) Obtain the SURF feature map of the soil images.
2) Randomly initialize a swarm of N individuals, each of dimension D equal to the number of extracted features.
3) Initialize the OSMO parameters, i.e., the perturbation rate and the local and global limits.
4) Compute the fitness of each individual as follows:
   a) Convert each dimension of the individual from real to binary using a threshold value (Th) and Eq. (6). The value of Th can be selected empirically.
   b) Use the accuracy returned by K-fold cross-validated SVM as the fitness value of the individual. The input to the SVM classifier is those features whose X_i^j value is 1, along with the corresponding image labels.
5) Apply the OSMO algorithm to find the best individual.
6) Select those features whose corresponding X_i^j value is 1 in the best individual returned by the OSMO algorithm.
The proposed OSMO algorithm is similar to the SMO algorithm except for the perturbation rate. In basic SMO, the perturbation rate increases linearly, while an oscillating perturbation rate is introduced in the newly proposed OSMO algorithm, as explained in the following subsection.
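The real-to-binary conversion and the wrapper fitness evaluation above can be sketched as follows; the K-fold SVM of the paper is replaced here by a hypothetical stand-in accuracy function, so this illustrates the wrapper structure, not the exact experimental setup.

```python
import numpy as np

TH = 0.5  # threshold Th; the paper selects it empirically

def binarize(individual, th=TH):
    """Eq. (6) (sketch): feature j is selected iff X_i^j exceeds Th."""
    return (np.asarray(individual) > th).astype(int)

def fitness(individual, subset_accuracy):
    """Wrapper fitness: classification accuracy on the selected columns.
    The paper uses K-fold cross-validated SVM; subset_accuracy is a
    user-supplied stand-in for that evaluation."""
    mask = binarize(individual)
    if mask.sum() == 0:  # empty subsets are worthless
        return 0.0
    return subset_accuracy(np.flatnonzero(mask))

# Hypothetical accuracy function: subsets containing feature 0 do better.
acc = lambda cols: 0.9 if 0 in cols else 0.5
print(binarize([0.8, 0.2, 0.6]))      # [1 0 1]
print(fitness([0.8, 0.2, 0.6], acc))  # 0.9
```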

1) Oscillating Perturbation rate
The perturbation rate, an essential parameter of SMO, dramatically impacts the precision and rate of convergence. The basic SMO has a linearly increasing perturbation rate. However, as most real-world problems are non-linear, non-linearity in the perturbation rate can improve SMO's performance. Recently, Kumar et al. proposed chaotic SMO [43] and exponential SMO [44] for soil classification and leaf image classification, respectively. These two modifications of SMO take advantage of a non-linear perturbation rate. The chaotic map used by [43] to decide the perturbation rate is illustrated by Eq. (7),
where t and max_it represent the current iteration and the maximum number of iterations, respectively. The value of z is decided by the logistic map of Eq. (8):

z_{t+1} = µ z_t (1 − z_t)    (8)

where z_t ∈ [0, 1] represents the chaotic number for the t-th generation. The value of µ is fixed at 4, chosen after exhaustive experiments. In exponential SMO [44], the perturbation rate is increased exponentially as illustrated in Eq. (9):

pr_new = (pr_init)^(max_it/t)    (9)

where t and max_it represent the current iteration and the maximum number of iterations, respectively, and pr_init is the initial perturbation rate, randomly initialized in the range of 0 and 1.
Keeping these modifications in mind, this paper proposes a new oscillating perturbation rate, inspired by the oscillating inertia weight in PSO [48]. In OSMO, the perturbation rate is updated according to its oscillating behavior, implemented by Eqs. (10) and (11):

Pr(t) = (Pr_min + Pr_max)/2 + (Pr_max − Pr_min)/2 × cos(2πt/T),  t ≤ S_1    (10)

T = 2S_1 / (4k + 6)    (11)

where Pr_min and Pr_max define the oscillating range of Pr, t represents the t-th iteration, T is the oscillation period, and k is a constant integer in the range [1, 7]. S_1 is the number of iterations for which Pr oscillates; for the remaining iterations, its value is kept constant. In this way, Pr oscillates for the first S_1 iterations and then remains the same for the rest of the run.
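Under these definitions, the oscillating perturbation rate can be sketched as below; the split S_1 = 3·max_it/4 and the constant value used after iteration S_1 (the midpoint of the range) are assumptions for illustration, as the paper leaves them to the experimental setup.

```python
import math

def oscillating_pr(t, max_it, pr_min=0.1, pr_max=0.9, k=3):
    """Oscillating perturbation rate, sketch of Eqs. (10)-(11),
    following the oscillating inertia weight of [48]: Pr swings
    between pr_min and pr_max for the first S_1 iterations and is
    then held constant."""
    S1 = 3 * max_it // 4          # oscillate for 3/4 of the run (assumed split)
    T = 2 * S1 / (4 * k + 6)      # Eq. (11): oscillation period
    if t <= S1:
        return ((pr_min + pr_max) / 2
                + (pr_max - pr_min) / 2 * math.cos(2 * math.pi * t / T))
    return (pr_min + pr_max) / 2  # held at the midpoint (assumption)

print(round(oscillating_pr(0, 100), 2))   # 0.9 (starts at pr_max)
print(round(oscillating_pr(90, 100), 2))  # 0.5 (constant phase)
```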

C. SOIL IMAGE CLASSIFICATION
In the final step of soil image classification, a classifier is trained using the selected features and the corresponding image labels, where the features are selected by the proposed OSMO algorithm. This paper uses four classifiers: k-nearest neighbors (kNN), linear discriminant analysis (LDA), SVM, and random forest (RF). The SVM classifies the data with the help of a defined hyper-plane; this paper uses a multi-class SVM classifier for the classification of soil images. LDA uses a linear combination of features to discriminate between the classes. The kNN classifier stores all available cases and categorizes new objects based on a similarity measure. RF is an ensemble learning method for classification in which multiple decision trees are used to predict the classes.
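As an illustration of one of these classifiers, a minimal k-nearest-neighbors sketch over toy two-dimensional features (standing in for the selected SURF descriptors) is given below:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """kNN sketch: label a new sample by majority vote among the k
    closest training samples under the Euclidean distance."""
    d = np.linalg.norm(X_train - x, axis=1)          # distance to every case
    nearest = y_train[np.argsort(d)[:k]]             # labels of the k closest
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[counts.argmax()]                   # majority vote

# Two toy clusters standing in for two soil classes.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([4.8, 5.0])))  # 1
```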

IV. EXPERIMENTAL RESULTS
The performance evaluation of the oscillating SMO-based feature selection algorithm for the prediction of soil images has been conducted in two phases. Phase one presents the result analysis of the new OSMO algorithm, followed by the effectiveness analysis of the OSMO-based feature selection method on the soil image dataset. The results of both phases are illustrated in the subsequent subsections.

A. RESULTS OF OSCILLATING SMO
For the analysis of OSMO performance, 15 representative benchmark functions are used. These functions are taken from Kumar et al. [43], who have also worked on soil classification and presented chaotic SMO (CSMO). The proposed OSMO is compared with CSMO [43], exponential SMO (ESMO) [44], the whale optimization algorithm (WOA) [49], the intelligent gravitational search algorithm (IGSA) [50], and PSO [51]. The parameter settings of the proposed OSMO algorithm are given in Table 2; for all the other algorithms, the parameter settings are kept as reported in their respective works. The results of the Wilcoxon rank-sum test on the mean and standard deviation are reported in Tables 4 and 5, respectively. For this test, the NULL hypothesis is that, at a significance level of 5%, two methods are the same for a benchmark function. In the tables, '+' and '=' signify the rejection and acceptance of the NULL hypothesis, respectively. Moreover, '+' specifies that the OSMO algorithm performs better than the corresponding method. From the Wilcoxon rank-sum test tables, it is discernible that OSMO either performs better on the benchmark functions or performs equally to the existing methods. The values for the F_6, F_10, and F_14 functions with respect to IGSA are '=', which indicates that the proposed algorithm gives output similar to IGSA and CSMO, respectively. Therefore, these results validate that the performance of OSMO is better than that of the other considered methods. Figure 3 shows the comparison of the features selected by the proposed OSMO algorithm and the other considered algorithms, while Figure 4 shows the accuracy comparison for the considered methods. The performance of the OSMO-based feature selection mechanism on the soil image dataset is compared with the CSMO, ESMO, IGSA, WOA, and PSO-based feature selection approaches. The classification accuracy and the number of selected features are considered as the performance parameters, and the results are illustrated in Table 6. It is discernible from the table that SURF extracted 127054 features from the images, as illustrated in Fig. 3.
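The Wilcoxon rank-sum decision rule behind the '+'/'=' labels in Tables 4 and 5 can be sketched as follows; this is a plain normal-approximation implementation without tie handling, sufficient for illustrating the test at the 5% significance level.

```python
import math
import numpy as np

def ranksum_test(a, b):
    """Wilcoxon rank-sum test (normal approximation, no tie correction):
    two-sided p-value for the hypothesis that a and b come from the
    same distribution."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n1, n2 = len(a), len(b)
    ranks = np.empty(n1 + n2)
    order = np.argsort(np.concatenate([a, b]))
    ranks[order] = np.arange(1, n1 + n2 + 1)
    W = ranks[:n1].sum()                       # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (W - mean) / sd
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Clearly separated result samples: the NULL hypothesis is rejected ('+').
p = ranksum_test([1, 2, 3, 4, 5, 6, 7, 8], [11, 12, 13, 14, 15, 16, 17, 18])
print('+' if p < 0.05 else '=')  # '+': the two methods differ significantly
```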
Moreover, the table also depicts the number of features selected by the proposed and considered feature selection methods. The proposed OSMO-based feature selection method selects 52092 features, which is 41% of the total features. Regarding the feature elimination process, OSMO eliminates the maximum number of features, as illustrated in Table 6 and Fig. 4. The RF classifier gives 82.25% accuracy for the OSMO-based approach, which is the highest. The remaining classifiers also perform better on the feature sets selected by OSMO. Therefore, it can be said that the OSMO-based feature selection approach for soil image classification improves upon the other algorithms and can be utilized for other classification applications.

V. CONCLUSION AND FUTURE SCOPE
This work presented a new approach, named the oscillating SMO (OSMO) algorithm, for feature selection from soil images. An oscillating perturbation rate is proposed in the OSMO algorithm to take advantage of non-linearity and achieve better convergence. The OSMO algorithm was tested on 15 benchmark functions and compared against the CSMO, ESMO, IGSA, WOA, and PSO algorithms. The results show that the OSMO algorithm has the best convergence behavior. In addition, the proposed algorithm was tested on a soil image dataset that has seven classes. The proposed OSMO-based feature selection algorithm takes SURF-extracted features and eliminates the irrelevant and redundant ones. The work was compared with five other well-known meta-heuristic techniques, namely the CSMO, ESMO, IGSA, WOA, and PSO-based approaches. The proposed OSMO-based algorithm eliminates the maximum number of features, i.e., 59%. Four classifiers were tested to classify the soil images, namely RF, LDA, kNN, and SVM. Table 6 demonstrates that all the classifiers give good results with OSMO, but RF gives the best accuracy. Future work includes the applicability of OSMO to other real-world datasets. Furthermore, a parallel version of OSMO may be implemented for use with big data. The perturbation rate may also be combined with other non-linear search strategies to exploit the non-linear nature of the problem.