Abstract:
Text classification, a well-known Natural Language Processing (NLP) task, can be defined as the process of categorizing documents according to their content. In this process, the choice of classification algorithm and the selection of the right features are both critical to classifying efficiently. The texts to be classified in this study are first preprocessed using the Information Gain (IG) method, taking into account their term frequency (TF) and inverse document frequency (IDF) values, and are then divided into categories using the Density Peaks Clustering (DPC) algorithm, which is a semi-supervised algorithm. The study uses the TTC-3600 dataset, which contains texts collected from six well-known Turkish news portals across six categories. The proposed approach achieved better results than those previously reported on this dataset.
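The pipeline named in the abstract (TF-IDF weighting, IG-based term selection, then density peaks clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names (`tf_idf`, `information_gain`, `density_peaks`), the cutoff distance `d_c`, and the number of clusters `k` are all assumptions made for illustration. IG is computed here on term presence or absence, the IDF is unsmoothed, and the labels are used only for term selection, so the semi-supervised aspect of the paper's DPC step is not reproduced.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (unsmoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: (c / len(doc)) * math.log(n / df[t])
             for t, c in Counter(doc).items()} for doc in docs]

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """H(class) minus H(class | term present or absent)."""
    present = [y for doc, y in zip(docs, labels) if term in doc]
    absent = [y for doc, y in zip(docs, labels) if term not in doc]
    cond = sum(len(part) / len(labels) * entropy(part)
               for part in (present, absent) if part)
    return entropy(labels) - cond

def density_peaks(points, d_c, k):
    """Density peaks clustering with a hard cutoff d_c and k centers (assumed)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of neighbours closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])  # decreasing density
    delta, parent = [0.0] * n, [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(dist[i])  # densest point: farthest distance
        else:
            # Nearest point among those ranked denser than i.
            parent[i] = min(order[:pos], key=lambda j: dist[i][j])
            delta[i] = dist[i][parent[i]]
    # Centers are the k points with the largest rho * delta.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:k]
    cluster = [-1] * n
    for c, i in enumerate(centers):
        cluster[i] = c
    for i in order:  # remaining points inherit the cluster of their parent
        if cluster[i] == -1:
            cluster[i] = cluster[parent[i]]
    return cluster

# Hypothetical toy corpus; the paper itself uses TTC-3600.
docs = [["goal", "match", "team"], ["team", "goal", "league"],
        ["match", "league", "team"], ["stock", "market", "bank"],
        ["bank", "market", "rate"], ["rate", "stock", "bank"]]
labels = ["sport"] * 3 + ["economy"] * 3

vocab = sorted({t for doc in docs for t in doc})
selected = sorted(vocab, key=lambda t: -information_gain(docs, labels, t))[:2]
weights = tf_idf(docs)
points = [[w.get(t, 0.0) for t in selected] for w in weights]
clusters = density_peaks(points, d_c=0.1, k=2)
print(selected, clusters)
```

Ranking points by the product of density and distance to the nearest denser point is the usual way to pick centers in density peaks clustering; in practice `d_c` and `k` would need to be tuned on the real data rather than fixed as above.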
Date of Conference: 05-08 July 2023
Date Added to IEEE Xplore: 28 August 2023
ISBN Information:
Print on Demand (PoD) ISSN: 2165-0608