In Internet traffic classification, the class imbalance problem is mainly addressed by adjusting the class distribution. However, feature selection is also a key factor contributing to this problem. Therefore, a new filter feature selection method called balanced feature selection (BFS) is proposed. Every feature is measured both locally and globally, and an optimal feature subset is then selected by our search model. A certainty coefficient is presented to measure the correlation between a feature and a particular class locally. The symmetric uncertainty is utilised to measure the correlation between a feature and all classes globally. Through experiments on two real traffic traces using three classification algorithms, BFS is compared with five existing feature selection methods. Results show that it improves g-mean by more than 15.29% over the other methods. Averaged over all datasets and classifiers, BFS achieves 59.54% g-mean, 86.35% Mauc and 91.42% overall accuracy.
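To make the global measure and the headline metric concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard definition of symmetric uncertainty for discrete variables, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and assumes that g-mean denotes the geometric mean of the per-class recalls, as is common in the class imbalance literature. The function names are illustrative. The certainty coefficient and the search model are specific to BFS and are not reproduced here.

```python
from collections import Counter
from math import log2, prod


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(feature, labels):
    """Standard SU between a discrete feature and the class labels.

    Assumed definition: SU = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1].
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        # Both variables are constant, so there is nothing to measure.
        return 0.0
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy              # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed meaning of g-mean)."""
    recalls = []
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return prod(recalls) ** (1 / len(recalls))
```

Under these assumptions, a classifier that labels every flow as the majority class can reach high overall accuracy while its g-mean is zero, because the recall of each minority class is zero. This is why g-mean is reported alongside overall accuracy when classes are imbalanced.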