Abstract:
Text classification is the process of assigning categories or tags to a document based on its content. Although it is a well-known process, it involves many steps that require tuning to obtain better mathematical models. Turkish in particular, as an agglutinative language, requires additional tuning and preprocessing steps. This paper proposes a methodology and identifies key points for tuning the Turkish text classification process using supervised machine learning algorithms. To this end, we perform extensive experiments on an open Turkish news dataset. Our study shows that the methodology improves categorization results as measured by F1-score.
As a sub-category of information retrieval, text classification is an automated process that relies on natural language processing and is applied in diverse domains.