Single-Point Crossover and Jellyfish Optimization for Handling Imbalanced Data Classification Problem

Imbalanced datasets and their classification have attracted considerable research attention over the years, with applications in fields such as security, finance, and health. Such datasets are typically balanced by applying resampling, and various solutions that mainly focus on class-distribution issues have been designed to handle them. This paper introduces a two-stage technique for balancing data: first, an oversampling method based on the single-point crossover operator generates new minority-class samples to rebalance the imbalanced dataset; second, the Jellyfish Search (JS) optimization method searches for an optimal subset of the imbalanced and balanced datasets. Experiments are performed on 18 real imbalanced datasets, and the results are compared with well-known oversampling methods and the recently published ACOR (Ant Colony Optimization Resampling) method in terms of several evaluation measures. The proposed method records higher performance and is competitive with well-known and recent techniques.


I. INTRODUCTION
Machine learning (ML) techniques play a vital role in gaining insights from data in repositories that are growing exponentially. Data and data mining are crucial for the coming era; to exploit the world's ever-increasing data, many companies build large data centres. The data becomes usable once ML methods analyze it and decision-making results are generated [1]. In many real-world applications, skewed sample distributions affect the classification process, as instances of some classes appear rarely. From a learning point of view, the minority class is typically the more interesting one, since failing to classify it well carries a high cost. In supervised machine learning, significant variations in the prior probabilities of classes make the classification of minority or rare classes difficult; this is called the class imbalance problem [2]. The class imbalance problem is a data quality issue that impacts classification in many applications and appears in several real-world fields, for example fraud detection [3], text classification [4], software quality prediction [5], medical applications [6], risk management [7], and biological data analysis [8]. To overcome this problem, it is important to obtain a classifier with high accuracy on the minority class without seriously degrading accuracy on the majority class. Over more than a decade, different techniques have been devised to mitigate the class imbalance problem, falling into three approaches: (1) data-level methods, which sample the instances of the majority and minority classes to balance the distribution; (2) algorithm-level techniques, which adapt existing learners to mitigate their bias toward the majority class; and (3) hybrid approaches, which combine the advantages of the two. For class-imbalanced datasets, the data-level approaches are the most widely used [4].
The data-level approaches target minimizing the imbalance ratio between the majority and minority classes by either under-sampling the majority-class data or oversampling the minority-class data. Since dataset sizes have been rising steadily, under-sampling may be a better option than oversampling [3]. In this paper, two data-level methods are introduced, based on resampling with the single-point crossover operator and Jellyfish Search, to address class-imbalance classification. The proposed approach operates in two stages: first, it rebalances an imbalanced training dataset by an oversampling algorithm that uses the single-point crossover to generate new training data with oversampled minority classes; second, it finds an optimal subset of the balanced and imbalanced datasets by Jellyfish Search (JS). Experiments are applied to eighteen real datasets using the Support Vector Machine (SVM) classifier and cross-validation. The results are compared with the best results obtained by ACOR [2], SMOTE [12], BSO [13], ROS [14], and ADASYN [15], and show that the proposed method performs better than many state-of-the-art methods. The remainder of the paper is structured as follows: Section II briefly reviews the most related work on sampling methods, jellyfish behavior and its search algorithm, and the evaluation metrics used. The proposed system's phases are described in Section III. Section IV presents the system specifications, discussion, and comparative analysis of the experimental results. Finally, the conclusion is given in Section V.

II. RELATED WORK

A. SAMPLING METHODS
Sampling (or resampling) methods are one form of state-of-the-art solution for class-imbalanced datasets. They concentrate on producing, from a given class-imbalanced dataset, a new dataset with a more balanced distribution between the majority and minority classes; it has been asserted that a balanced distribution yields better classification results than an imbalanced one [2]. The primary benefit of sampling methods is that they are independent of the classifier construction process. Sampling methods are usually categorized into three groups: under-sampling, oversampling, and hybrid methods. Under-sampling methods concentrate on removing data samples from the majority class, while oversampling methods replicate minority-class samples or generate new ones from existing samples. Hybrid methods combine oversampling and under-sampling to balance the given dataset [11]. Several sophisticated oversampling techniques have been developed for imbalanced-dataset learning, among them the Synthetic Minority Oversampling Technique (SMOTE). SMOTE is a renowned method suggested by Chawla et al. [12]; it produces artificial examples based on feature-space similarities between existing minority samples. It finds the k intra-class nearest neighbors of every minority sample and then produces synthetic samples toward some or all of those neighbors. However, SMOTE has a main disadvantage: it produces the same number of synthetic samples for each original minority sample without considering the majority examples located near the minority examples, so the overlap between classes increases. The Borderline-SMOTE technique is an enhancement of SMOTE, since it applies SMOTE only to minority-class samples lying on the border between the majority and minority classes. Bunkhumpornpat et al.
[16] proposed a new method called Safe-Level SMOTE. This technique assigns a safe-level value to each minority data point before producing synthetic samples; the safe level is determined by the number of minority samples among its k nearest neighbors. Synthetic patterns are generated nearer to the larger safe level, guaranteeing that they lie in safer regions. When the safe level is 0, a minority pattern is considered noise, while patterns with a safe level close to k lie in a safe area. In this strategy, synthetic samples are produced along the line to a k-nearest neighbor; in contrast to SMOTE, they are placed closer to the minority class than to the majority class. The safe-level ratio of a minority sample is defined as the ratio of its safe level to the safe level of its nearest neighbor [17]. Another similar method, ADASYN [15], uses the density distribution as a criterion to adaptively determine the number of synthetic samples to produce for each minority-class sample. Thus, ADASYN makes a learner focus on the problematic regions of the decision boundary. However, in some cases noisy examples in the minority class can cause multiple synthetic instances to be created, degrading the classifier's performance. The ACOR algorithm was proposed to enhance the performance of oversampling methods for class-imbalance classification. ACOR is a general preprocessing framework that increases the efficiency of existing oversampling techniques for imbalanced datasets. Unlike other oversampling techniques, ACOR does not depend on the mechanics of producing new samples; its significant difference is that existing oversampling algorithms can be used as-is, and an optimal training set is then obtained through ant colony optimization. ACOR efficiently reduces the number of samples and makes the training set more appropriate for a given classifier [2].
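As a concrete illustration, SMOTE's interpolation step can be sketched in a few lines of Python. This is a minimal sketch only; the function and parameter names are illustrative and do not come from any cited implementation.

```python
import numpy as np

def smote_sketch(minority, k=5, n_new=10, seed=0):
    """Generate synthetic samples on line segments between a minority
    sample and one of its k nearest minority-class neighbors (SMOTE sketch)."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        x = minority[i]
        # rank all minority samples by distance to x; index 0 is x itself
        order = np.argsort(np.linalg.norm(minority - x, axis=1))
        neighbors = order[1:k + 1]
        j = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(x + gap * (minority[j] - x))
    return np.array(synthetic)
```

Because every new point lies on a segment between two existing minority samples, synthetic points near the class boundary can fall among majority examples, which is the overlap problem noted above.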
This paper introduces two methods to rebalance an imbalanced dataset: an oversampling algorithm that uses the single-point crossover to generate new minority-class data, and JS to select the optimal instances from the training set before and after oversampling. The results are then compared to select the best method for handling the class-imbalance problem.

B. SINGLE-POINT CROSSOVER
Crossover is the random recombination of individuals that produces offspring whose genetic material differs between them (an exchange of fragments of the chromosomes' code sequences). Crossover occurs with some probability, referred to as the crossover probability, which is the basic parameter of the operator.

FIGURE 1: Swapping genetic information after a crossover point.

The popular crossover methods are single-point, multi-point, heuristic, and arithmetic crossover [21]. Crossover operators produce a child by combining the genetic data of two or more parents. Single-point crossover is the most fundamental crossover method. It selects a random crossover point; the first child is generated from the first parent by concatenating its first part with the last part of the second parent. Similarly, the second child combines the first part of the second parent with the last part of the first parent, as shown in Fig. 1. Each child of the reproductive process is assumed to carry the same biological genes as its parents [22], [23].
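The operation just described can be sketched as follows (a minimal sketch; names are illustrative). Applied to two minority-class samples encoded as vectors, one cut yields two new synthetic samples:

```python
import random

def single_point_crossover(parent1, parent2, rng=random):
    """Cut both parents at one random point and swap the tails (cf. Fig. 1)."""
    assert len(parent1) == len(parent2) and len(parent1) >= 2
    point = rng.randrange(1, len(parent1))  # cut position, never at either end
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

p1 = [1, 2, 3, 4, 5]
p2 = [6, 7, 8, 9, 10]
c1, c2 = single_point_crossover(p1, p2)
```

Each position of a child carries the gene of exactly one parent at that position, so the two children together preserve all of the parents' genetic material.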

C. THE JELLYFISH SEARCHING ALGORITHM
Jellyfish live in water at different depths and temperatures around the world. They are bell-shaped; some are very big while others are less than a centimetre in diameter, and they differ in color, shape, and size. Some jellyfish use tentacles to bring food to their mouths, while others are filter feeders that eat whatever the current brings them. Still others deliberately catch prey by stinging it with their tentacles and emitting a venom that paralyzes it [24]; they attack creatures that swim against or touch them. The stings of certain jellyfish are painful but not fatal: a sting can cause pain, itching, red marks, tingling, or numbness. However, stings from some species, for example the box jellyfish (also referred to as the sea wasp), are very dangerous and may even be fatal. Most such jellyfish are present in Australia's coastal waters, the Philippines' coastal waters, the Indian Ocean, and the central Pacific Ocean [25]. Jellyfish are weak swimmers that drift with water currents and tides, so they can form swarms in the water; a large aggregation of jellyfish is called a jellyfish bloom. Many factors control the formation of a swarm, including the available nutrients, the availability of oxygen, temperature, and water currents, the last of which is the most critical. The ecosystem affects jellyfish swarms significantly, and the amount of food varies among the places jellyfish visit, so the best place, the one containing the most food, is eventually found. A new algorithm, the Jellyfish Search (JS) optimizer [26], has therefore been developed based on the movement of jellyfish in the water and their search for food. Fig. 2 shows the behavior of jellyfish in the ocean, and the steps of the algorithm are presented below. The JS optimization algorithm is based on three idealized rules [26]:
• Jellyfish either follow the ocean current or move within the swarm; a "time control mechanism" governs the switching between these two forms of movement.
• Jellyfish swim in the ocean searching for food and are more drawn to locations where a larger amount of food is available.
• The location determines the quantity of food found and its corresponding objective function.

1) Ocean current
The ocean current contains large amounts of food, so it attracts jellyfish. The direction of the ocean current is determined by Eq. 1:

trend = X* − β × rand(0, 1) × µ    (1)

where X* is the jellyfish currently at the best location in the swarm, β > 0 is a distribution coefficient related to the length of trend, and µ is the mean location of all jellyfish. The new location of each jellyfish is then given by Eq. 2:

X_i(t + 1) = X_i(t) + rand(0, 1) × trend    (2)

2) Jellyfish swarm
The motion of jellyfish within a swarm is either passive (type A) or active (type B). Most jellyfish exhibit type A motion initially, when the swarm has just been formed, and gradually switch to type B motion over time. Type A motion is movement of a jellyfish around its own location, and Eq. 3 gives the corresponding updated location:

X_i(t + 1) = X_i(t) + γ × rand(0, 1) × (U_b − L_b)    (3)

where U_b and L_b are the upper and lower bounds of the search space and γ > 0 is a motion coefficient related to the length of motion around a jellyfish's own location. To simulate type B (active) motion, a jellyfish j other than jellyfish i is randomly chosen, and the vector from jellyfish i to jellyfish j determines the direction of motion. If the quantity of food at the location of jellyfish j exceeds that at the location of jellyfish i, jellyfish i moves toward j; if the quantity of food available at j is lower, jellyfish i moves directly away from j. In this way, every jellyfish moves in a better direction to find food within the swarm. Eqs. 4 and 5 give the direction of motion and the updated location, respectively:

Direction = X_j − X_i  if f(X_i) ≥ f(X_j);  X_i − X_j  otherwise    (4)

X_i(t + 1) = X_i(t) + rand(0, 1) × Direction    (5)

where f is the objective function of location X.
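Assuming a minimization objective, the three update rules described above (Eqs. 1-5) can be sketched as follows; the function names are illustrative, not from the original JS implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def ocean_current_move(X, i, best, beta=3.0):
    """Eqs. 1-2: drift with the ocean current toward the best location."""
    mu = X.mean(axis=0)                        # mean location of all jellyfish
    trend = best - beta * rng.random() * mu    # Eq. 1
    return X[i] + rng.random() * trend         # Eq. 2

def type_a_move(X, i, lb, ub, gamma=0.1):
    """Eq. 3: passive motion around the jellyfish's own location."""
    return X[i] + gamma * rng.random() * (ub - lb)

def type_b_move(X, i, f):
    """Eqs. 4-5: step toward a better random neighbor, away from a worse one."""
    j = rng.choice([k for k in range(len(X)) if k != i])
    direction = X[j] - X[i] if f(X[i]) >= f(X[j]) else X[i] - X[j]  # Eq. 4
    return X[i] + rng.random() * direction     # Eq. 5
```

Here `X` is the population matrix (one row per jellyfish), `best` is X*, and `f` plays the role of the food quantity at a location (smaller is better under minimization).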

3) Time control mechanism
The time control mechanism determines the type of motion (type A or type B within a swarm) over time and also controls the motion of jellyfish toward the ocean current. It is introduced to regulate the switching between following the ocean current and moving inside the swarm, and it consists of a time control function c(t) and a constant C_0. The time control function is a random value that fluctuates between 0 and 1 over time and is calculated by Eq. 6. When its value exceeds C_0, the jellyfish follow the ocean current; when it is smaller than C_0, they move inside the swarm. Since the ideal value of C_0 is not known and the time control varies randomly between zero and one, C_0 is set to 0.5, the average of zero and one.
c(t) = |(1 − t / Max_iter) × (2 × rand(0, 1) − 1)|    (6)

where Max_iter is the maximum number of iterations (an initialization parameter) and t is the time, specified as the iteration number.
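Eq. 6 translates directly into code (a sketch):

```python
import random

def time_control(t, max_iter, rng=random):
    """Eq. 6: a random value in [0, 1] whose envelope decays to 0 as t -> max_iter."""
    return abs((1 - t / max_iter) * (2 * rng.random() - 1))
```

In the main loop, a jellyfish follows the ocean current when `time_control(t, max_iter) >= 0.5` (the constant C_0) and moves inside the swarm otherwise; as t grows, the value shrinks toward 0, so swarm motion dominates late in the search.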

4) Boundary conditions
When a jellyfish travels outside the bounds of the search space, it returns from the opposite bound according to Eq. 7:

X'_i,d = (X_i,d − U_b,d) + L_b,d   if X_i,d > U_b,d
X'_i,d = (X_i,d − L_b,d) + U_b,d   if X_i,d < L_b,d    (7)
where X_i,d is the location of the i-th jellyfish in the d-th dimension and X'_i,d is the updated location after checking the boundary constraints; U_b,d and L_b,d are the upper and lower bounds of the d-th dimension of the search space, respectively. The JS algorithm is summarized in Algorithm 1.
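The re-entry rule of Eq. 7 can be sketched per dimension as follows (assuming the overshoot is smaller than the width of the search range):

```python
def wrap_bounds(x, lb, ub):
    """Eq. 7: a jellyfish leaving one bound re-enters from the opposite bound."""
    if x > ub:
        return (x - ub) + lb   # overshoot past the upper bound re-enters at the lower
    if x < lb:
        return (x - lb) + ub   # undershoot past the lower bound re-enters at the upper
    return x
```

Applied componentwise to each dimension of a jellyfish's position, this keeps the whole swarm inside [L_b, U_b] after every move.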

D. EVALUATION METRICS
Algorithm 1: Jellyfish Searching Algorithm
Step 1: Start. Initialize the population.
Step 2: for i = 1 : n_pop (population size) do
    Calculate the objective function f(x_i) for each x_i
end for
Step 3: for i = 1 : n_pop do
    Calculate the time control c(t) using Eq. 6
    if c(t) ≥ 0.5 then (the jellyfish follows the ocean current)
        Determine the ocean current using Eq. 1
        The new location of the jellyfish is given by Eq. 2
    else (the jellyfish moves inside the swarm)
        if rand(0, 1) > (1 − c(t)) then (type A motion)
            The new location of the jellyfish is given by Eq. 3
        else (type B motion)
            Determine the direction of the jellyfish using Eq. 4
            The new location of the jellyfish is given by Eq. 5
        end if
    end if
end for
Step 4: Check the boundary conditions and calculate f(x) at the new locations.
Step 5: Update the location of each jellyfish x_i and the location of the jellyfish with the best f(x) (X*).
Step 6: Update the time: t = t + 1. If t ≤ Max_iter, go to Step 3; otherwise print the output.
Step 7: End.

In a two-class classification problem the outcome is either negative or positive, and the classifier assigns a positive (P) or negative (N) label to each sample. False negatives, false positives, true negatives, and true positives are the only possible outcomes of a binary classifier. A true positive occurs when the predicted and actual values are both positive; a false positive occurs when the predicted value is positive but the actual value is negative; if the actual and predicted values are both negative, the result is a true negative; and if the actual value is positive while the predicted value is negative, it is termed a false negative. The performance measures are defined accordingly [19]. Accuracy is the most straightforward and intuitive way to evaluate a classifier, yet it is well known to be an unacceptable measure for evaluating a classification model under class imbalance. Precision is the proportion of predicted positives that are relevant, while the classifier's ability to find the correct samples is measured by recall, or TPR (True Positive Rate); recall and precision tend to suppress each other. TNR (True Negative Rate) is the proportion of actual negatives that are correctly identified as such. The F-measure is an index that calculates the harmonic mean of recall and precision. Precision, recall, and F-measure express the classifier's performance on a single class; moreover, they are inadequate indicators of performance for datasets with rare target classes [14], since datasets containing a small percentage of minority-class samples can produce an unstable precision. G-mean is defined as the geometric mean of TPR and TNR. AUC is the area under the Receiver Operating Characteristic (ROC) curve [20], plotted on a two-dimensional graph of the true positive rate (TPR) against the false positive rate (FPR).
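From the four confusion-matrix counts, the measures just described follow directly (a sketch with illustrative names):

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall (TPR), TNR, F-measure, and G-mean
    from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)   # TPR
    tnr = tn / (tn + fp)      # TNR
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "tnr": tnr,
        "f_measure": 2 * precision * recall / (precision + recall),
        "g_mean": math.sqrt(recall * tnr),
    }
```

Note how a classifier that predicts only the majority class gets recall 0 on the minority class, which drives G-mean to 0 even though accuracy stays high; this is why G-mean and AUC, not accuracy, are used as the main measures here.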
G-mean and AUC are reliable performance measures; they have been commonly used to measure classifier performance regardless of the degree of class imbalance. Besides the above metrics, the Wilcoxon signed-rank (WSR) test is applied to assess how significant the performance difference between any two of the used methods is. In the WSR test, a value T is compared with the corresponding critical value in the WSR table [27], where T is the minimum of the sums of positive and negative ranks, min{+Rank, −Rank}, as shown in Section IV.
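The T statistic used in Section IV can be computed as follows (a simplified sketch that ignores ties in the absolute differences):

```python
def wilcoxon_t(diffs):
    """T = min(sum of positive ranks, sum of negative ranks)
    over paired per-dataset differences (zero differences dropped)."""
    nonzero = sorted((d for d in diffs if d != 0), key=abs)
    pos = sum(rank for rank, d in enumerate(nonzero, start=1) if d > 0)
    neg = sum(rank for rank, d in enumerate(nonzero, start=1) if d < 0)
    return min(pos, neg)
```

With 18 paired datasets, the first method is judged significantly better at the 0.05 level when T falls at or below the tabulated critical value of 40, which is the comparison made throughout Tables 6-19.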

III. THE PROPOSED METHOD
The proposed methods are based mainly on single-point crossover and jellyfish search, and they address class-imbalance classification by resampling the training data. They consist of two stages: first, an imbalanced dataset is rebalanced by an oversampling algorithm that uses the single-point crossover to generate new minority-class data; second, an optimal subset of the balanced and imbalanced datasets is found by jellyfish search. One way to alleviate the imbalanced classification problem is to minimize the difference between the classes. Oversampling is utilized in this study as the first proposed method, reproducing minority-class instances using the single-point crossover. The second proposed method uses the JS algorithm not to under-sample the majority class but to select the optimal instances of both classes from the training set. The proposed method starts the search with a randomly generated population for selecting optimal instances. The representation scheme uses binary encoding: each search particle is represented as a vector of binary elements, and each data instance is marked either '1' (present) or '0' (absent), so ones represent retained instances while zeros represent removed instances. The JS-based model is applied to the dataset both before and after oversampling to find the more powerful method for dealing with the imbalance problem, as shown in Fig. 3.
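The binary instance-selection encoding described above can be sketched as follows; the names are illustrative and this is not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(42)

def random_particle(n_instances):
    """One search particle: a 0/1 vector over training instances (1 = keep)."""
    return rng.integers(0, 2, size=n_instances)

def select_instances(X, y, particle):
    """Apply a particle to a training set, keeping only the instances marked 1."""
    mask = particle.astype(bool)
    return X[mask], y[mask]
```

In the JS-based stage, each particle's fitness would be a classifier's performance (for example G-mean) on the subset it selects, and the jellyfish updates search over these binary vectors for the best-performing training subset.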

IV. EXPERIMENTAL RESULTS
The experimental results of this paper proceed in two directions. The first proposes the Crossover for oversampling data to rebalance the imbalanced datasets and compares its performance with other well-known oversampling methods, SMOTE and SL-SMOTE. The second direction pairs the JS optimization algorithm with different oversampling techniques (Crossover, SL-SMOTE, and SMOTE) to check whether it can improve the performance of existing oversampling algorithms; it is also applied to the original imbalanced data. The focus of this work is to differentiate between resampling methods, so the classification method was unified to obtain a more specific result: this framework uses the SVM for classification. The proposed framework is applied to eighteen real datasets selected from the UC Irvine Machine Learning Repository [28]. These datasets differ in the number of features, size, and imbalance proportion. Since some of the original datasets have more than two classes, they are converted to two-class datasets to fit our focal point. Table 1 shows the minority and majority classes of the 18 datasets along with other characteristics, including the number of samples, the number of features, and the imbalance proportion [2].

A. SYSTEM SPECIFICATIONS
The experiments on the proposed methods were conducted on a laptop with an Intel(R) Core(TM) i7-3632QM CPU @ 2.20 GHz and 6 GB of RAM, running Windows 10 and MATLAB R2020. Table 2 holds the performance measures (sensitivity, specificity, precision, and f-measure) of applying the SVM classifier to the original datasets and the Crossed-over ones. The better values are highlighted in bold and show that the proposed method gives the better performance; in addition, the outcomes on the original dataset before oversampling are given as a reference. Table 3 presents the results of the SVM classification algorithm on data balanced with different oversampling techniques, including the proposed method (Crossover) as well as Original, SMOTE, and SL-SMOTE. The table reports average accuracy (Accuracy), G-mean, and AUC; accuracy is mentioned only as a reference and is not considered an assessment for imbalanced data classification. The better results are highlighted in bold and indicate that the proposed method has the stronger influence. Fig. 4 compares the proposed oversampling method (Crossover) with the other oversampling methods of Table 3 by counting the datasets with the highest metric values (Accuracy, G-mean, and AUC); the proposed method gives the best results, with the highest dataset counts of 12, 14, and 14 for Accuracy, G-mean, and AUC, respectively. Table 4 holds the performance measures (sensitivity, specificity, precision, and f-measure) of applying the JS optimization algorithm with the SVM classifier to the Original, Crossed-over, SMOTE, and SL-SMOTE data; again, the better measure values are highlighted in bold.
Table 5 presents the comparison between applying the JS optimization algorithm with the SVM classifier to the Original, Crossed-over, SMOTE, and SL-SMOTE data (JS-Original, JS-Crossover, JS-SMOTE, and JS-SL-SMOTE). The best metric values are bolded and indicate that applying the Crossover mostly gives the higher performance, as explained below using the WSR test. Table 6 displays the average AUC test results for Crossover vs. JS-Crossover and Crossover vs. JS using the SVM classifier. Crossover has a better positive variance than JS-Crossover in 15 datasets, whereas JS-Crossover has a better negative variance than Crossover in three datasets. The sum of positive ranks is +Rank=140 and the sum of negative ranks is −Rank=31; the T value should be ≤ 40 at the 0.05 level since 18 datasets are used. Hence, Crossover performs better than JS-Crossover, as T=min{+Rank, −Rank}=min{140, 31}=31<40. By the same token, Crossover performs better than JS, as T=min{+Rank, −Rank}=min{133, 38}=38<40. Table 7 displays the average AUC test results for Crossover vs. SMOTE and Crossover vs. SL-SMOTE using the SVM classifier. Crossover has a better positive variance than SMOTE in 17 datasets, whereas SMOTE has a better negative variance than Crossover in one dataset. The sum of positive ranks is +Rank=165 and the sum of negative ranks is −Rank=6. Hence, Crossover performs better than SMOTE, as T=min{+Rank, −Rank}=min{165, 6}=6<40. By the same token, Crossover performs better than SL-SMOTE, as T=min{+Rank, −Rank}=min{154, 17}=17<40; specifically, Crossover has a better positive variance than SL-SMOTE in 16 datasets, while SL-SMOTE has a better negative variance than Crossover in two datasets. Table 8 shows the average AUC test results for Crossover vs. JS-SMOTE and Crossover vs. JS-SL-SMOTE using the SVM classifier. Crossover has a better positive variance than JS-SMOTE in 13 datasets, whereas JS-SMOTE has a better negative variance than Crossover in five datasets.
The sum of positive ranks is +Rank=132 and the sum of negative ranks is −Rank=39; the T value should be ≤ 40 at the 0.05 level since 18 datasets are used. Hence, Crossover performs better than JS-SMOTE, as T=min{+Rank, −Rank}=min{132, 39}=39<40. By the same token, Crossover performs better than JS-SL-SMOTE, as T=min{+Rank, −Rank}=min{162, 9}=9<40; specifically, Crossover has a better positive variance than JS-SL-SMOTE in 17 datasets, while JS-SL-SMOTE has a better negative variance than Crossover in one dataset. Table 9 summarizes the AUC test results for Crossover vs. JS-Crossover, JS, SMOTE, JS-SMOTE, SL-SMOTE, and JS-SL-SMOTE using SVM; it illustrates that applying Crossover for oversampling data gives better classification results. Table 10 presents the average G-mean test results for Crossover vs. JS-Crossover and Crossover vs. JS using the SVM classifier. Crossover has a better positive variance than JS-Crossover in 12 datasets, while JS-Crossover has a better negative variance than Crossover in three datasets. The sum of positive ranks is +Rank=124 and the sum of negative ranks is −Rank=31; the T value should be ≤ 40 at the 0.05 level since 18 datasets are used. Hence, Crossover performs better than JS-Crossover, as T=min{+Rank, −Rank}=min{124, 31}=31<40. By the same token, Crossover performs better than JS, as T=min{+Rank, −Rank}=min{145, 26}=26<40; specifically, Crossover has a better positive variance than JS in 14 datasets, while JS has a better negative variance than Crossover in four datasets. Table 11 shows the average G-mean test results for Crossover vs. SMOTE and Crossover vs. SL-SMOTE using the SVM classifier. Crossover has a better positive variance than SMOTE in 17 datasets, while SMOTE has a better negative variance than Crossover in one dataset. The sum of positive ranks is +Rank=158 and the sum of negative ranks is −Rank=13. Hence, Crossover performs better than SMOTE, as T=min{+Rank, −Rank}=min{158, 13}=13<40.
By the same token, Crossover performs better than SL-SMOTE, as T=min{+Rank, −Rank}=min{162, 9}=9<40. Table 12 displays the average G-mean test results for Crossover vs. JS-SMOTE and Crossover vs. JS-SL-SMOTE using the SVM classifier. Crossover has a better positive variance than JS-SMOTE in 14 datasets, whereas JS-SMOTE has a better negative variance than Crossover in four datasets. The sum of positive ranks is +Rank=131 and the sum of negative ranks is −Rank=40. Hence, Crossover performs better than JS-SMOTE, as T=min{+Rank, −Rank}=min{131, 40}=40≤40. By the same token, Crossover performs better than JS-SL-SMOTE, as T=min{+Rank, −Rank}=min{162, 9}=9<40; specifically, Crossover has a better positive variance than JS-SL-SMOTE in 17 datasets, while JS-SL-SMOTE has a better negative variance than Crossover in one dataset. Table 13 summarizes the G-mean test results for Crossover vs. JS-Crossover, JS, SMOTE, JS-SMOTE, SL-SMOTE, and JS-SL-SMOTE using SVM; it illustrates that applying Crossover for oversampling data gives better classification results. Table 14 displays the average AUC test findings for JS-Original vs. Original and JS-Crossover vs. Crossover using the SVM classifier. JS-Original has a better positive variance than Original in 16 datasets, whereas Original has a better negative variance than JS-Original in two datasets. The sum of positive ranks is +Rank=168 and the sum of negative ranks is −Rank=3. Hence, JS-Original performs better than Original, as T=min{+Rank, −Rank}=min{168, 3}=3<40. In contrast, Crossover performs better than JS-Crossover, as T=min{+Rank, −Rank}=min{40, 131}=40≤40: Crossover has a better positive variance than JS-Crossover in 15 datasets, whereas JS-Crossover has a better negative variance than Crossover in three datasets. Table 15 shows the average AUC test results for JS-SMOTE vs. SMOTE and JS-SL-SMOTE vs. SL-SMOTE using the SVM classifier.
JS-SMOTE has a better positive variance than SMOTE in 15 datasets, while SMOTE has a better negative variance than JS-SMOTE in three datasets. The sum of positive ranks is +Rank=155 and the sum of negative ranks is −Rank=16. Hence, JS-SMOTE performs better than SMOTE, as T=min{+Rank, −Rank}=min{155, 16}=16<40. By the same token, JS-SL-SMOTE performs better than SL-SMOTE, as T=min{+Rank, −Rank}=min{141, 30}=30<40; specifically, JS-SL-SMOTE has a better positive variance than SL-SMOTE in 13 datasets, whereas SL-SMOTE has a better negative variance than JS-SL-SMOTE in five datasets. Table 16 summarizes the AUC test findings for JS-Original vs. Original, JS-Crossover vs. Crossover, JS-SMOTE vs. SMOTE, and JS-SL-SMOTE vs. SL-SMOTE using SVM; it illustrates that applying JS over the oversampling methods improves the classification results. Table 17 displays the average G-mean test findings for JS-Original vs. Original and JS-Crossover vs. Crossover using the SVM classifier. JS-Original has a better positive variance than Original in 13 datasets, whereas Original has a better negative variance than JS-Original in three datasets. The sum of positive ranks is +Rank=135 and the sum of negative ranks is −Rank=17. Hence, JS-Original performs better than Original, as T=min{+Rank, −Rank}=min{135, 17}=17<40. In contrast, Crossover performs better than JS-Crossover, as T=min{+Rank, −Rank}=min{28, 143}=28<40: Crossover has a better positive variance than JS-Crossover in 15 datasets, whereas JS-Crossover has a better negative variance than Crossover in three datasets. Table 18 presents the average G-mean test results for JS-SMOTE vs. SMOTE and JS-SL-SMOTE vs. SL-SMOTE using the SVM classifier. JS-SMOTE has a better positive variance than SMOTE in 15 datasets, whereas SMOTE has a better negative variance than JS-SMOTE in three datasets. The sum of positive ranks is +Rank=152 and the sum of negative ranks is −Rank=19.
Hence, JS-SMOTE performs better than SMOTE, as T=min{+Rank, −Rank}=min{152, 19}=19<40. By the same token, JS-SL-SMOTE performs better than SL-SMOTE, as T=min{+Rank, −Rank}=min{126, 33}=33<40; specifically, JS-SL-SMOTE has a better positive variance than SL-SMOTE in 13 datasets, whereas SL-SMOTE has a better negative variance than JS-SL-SMOTE in four datasets. Table 19 summarizes the G-mean test results for JS-Original vs. Original, JS-Crossover vs. Crossover, JS-SMOTE vs. SMOTE, and JS-SL-SMOTE vs. SL-SMOTE using SVM; it illustrates that applying JS over the oversampling methods improves the classification results.

B. ANALYSIS AND DISCUSSION
According to the Wilcoxon signed-rank test, the overall results indicate that using Crossover for oversampling imbalanced data produces higher classification results. The test results also illustrate that the JS optimization method enhances the results of the other oversampling techniques and of the original data, but not those of the Crossover technique.

C. COMPARATIVE ANALYSIS
The experiment is applied to eighteen real datasets using the support vector machine (SVM) for classification and cross-validation for evaluation. The results are compared with the best results obtained in [2], whose authors proposed a novel hybrid algorithm named Ant Colony Optimization Resampling (ACOR) to improve class-imbalance classification. In their algorithm, four traditional oversampling methods (SMOTE [12], BSO [13], ROS [14], and ADASYN [15]) are first used to rebalance the imbalanced datasets; the ACO algorithm is then applied to the resampled datasets to find an optimal subset of the obtained balanced training dataset. To validate their results, they used three classifiers: naive Bayes [10], C4.5 [11], and a support vector machine with a radial basis function kernel (RBF-SVM) [13]. Table 20 compares the performance of the proposed method with the best results of the state-of-the-art methods in [2]: ACOR, SMOTE, BSO, ROS, and ADASYN. The proposed method achieves the higher accuracy on 12 datasets, while ACOR, SMOTE, and BSO are higher on 3, 3, and 1 datasets, respectively. The proposed method has the higher G-mean on 14 datasets and the higher AUC on 8 datasets. Fig. 5 visualizes this analysis.

V. CONCLUSION
To handle the imbalanced data classification problem, several strategies based on sampling preprocessing have been introduced. The essential key of these strategies is to rebalance the imbalanced data using robust techniques. This research proposes two approaches: the first introduces JS as an algorithm for sampling imbalanced data and as a technique for improving the performance of existing oversampling techniques such as SMOTE and SL-SMOTE; the second proposes the Crossover as an oversampling technique. The experimental results demonstrate that JS-SMOTE, JS-SL-SMOTE, and JS-Original give higher performance than SMOTE, SL-SMOTE, and Original, respectively. Also, according to the results, the Crossover as an oversampling technique gives the highest performance.