Given the importance of organizing and managing the rapidly growing body of Arabic electronic content, this study introduces the Weirdness Coefficient (W) as a new feature selection method for special-domain Arabic text classification. The proposed method was used to classify a dataset covering five Islamic topics with Naive Bayes (NB) and K-nearest neighbor (K-NN) classifiers and three text representation schemes. The results were compared with those of a well-known feature selection method, Chi-squared. In addition to being simple to compute, the Weirdness Coefficient showed promising classification accuracy.
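The abstract does not state the formula for W. As an illustration only, the sketch below uses the commonly cited definition of the weirdness coefficient: the relative frequency of a term in a special-domain corpus divided by its relative frequency in a general corpus. The function names `weirdness` and `top_k_features`, the add-one smoothing for terms missing from the general corpus, and the toy tokens are all assumptions made for this example and may differ from what the study actually did.

```python
from collections import Counter


def weirdness(special_tokens, general_tokens, smoothing=1.0):
    """Score each special-corpus term by its weirdness ratio.

    Assumed definition: (f_special / n_special) / (f_general / n_general).
    The smoothing term is an assumption of this sketch; it avoids division
    by zero for terms that never occur in the general corpus.
    """
    special = Counter(special_tokens)
    general = Counter(general_tokens)
    n_special = len(special_tokens)
    n_general = len(general_tokens)

    scores = {}
    for term, count in special.items():
        rel_special = count / n_special
        rel_general = (general[term] + smoothing) / (n_general + smoothing)
        scores[term] = rel_special / rel_general
    return scores


def top_k_features(scores, k):
    """Keep the k terms with the highest weirdness as selected features."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy, transliterated tokens (hypothetical data, not from the study).
special_corpus = ["zakat", "zakat", "the", "prayer"]
general_corpus = ["the", "the", "the", "market", "the", "news"]

scores = weirdness(special_corpus, general_corpus)
selected = top_k_features(scores, 2)
```

In this toy run, domain-specific terms outrank a function word that is frequent in both corpora, which is the behavior a feature selector of this kind relies on. Each score needs only two frequency counts per term, which fits the abstract's remark about computational simplicity.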