Text classification has attracted rapidly growing interest over the past few years. As a simple, effective, and nonparametric classification method, the k-nearest-neighbor (KNN) method is widely used in document classification. However, an uneven distribution in the training set degrades KNN classification results, and such uneven distributions are very common among documents on the Web. To address this problem, this paper proposes an improved KNN method, denoted DBKNN. Experimental results show that the DBKNN algorithm can better serve classification requests for large sets of unevenly distributed documents.