Classification plays a vital role in many information management and retrieval tasks. This paper studies the classification of text documents. Text classification is a supervised technique that uses labeled training data to learn a classifier and then automatically classifies the remaining text using the learned classifier. We propose a mining model consisting of sentence-based, document-based, and corpus-based concept analysis. The model analyzes the terms that contribute to sentence semantics at the sentence, document, and corpus levels, rather than relying on the traditional document-level analysis alone. After a feature vector is extracted for each new document, feature selection is performed, followed by K-Nearest Neighbour classification. The approach enhances text classification accuracy.
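The pipeline described above (term weighting at three levels, feature selection, then K-Nearest Neighbour voting) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give its concept-extraction method or weighting formulas, so the sketch assumes plain lowercase word tokens in place of concepts, a simple additive combination of sentence-level and document-level frequencies scaled by a smoothed inverse document frequency, feature selection by total weight, and cosine similarity for the neighbour search. All function names and parameters are hypothetical.

```python
import math
import re
from collections import Counter


def sentences(text):
    # Naive sentence splitter; an assumption, not the paper's method.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokens(text):
    # Plain lowercase words stand in for the paper's "concepts".
    return re.findall(r"[a-z]+", text.lower())


def term_weights(doc, doc_freq, n_docs):
    """Weight each term using sentence-, document-, and corpus-level statistics."""
    sent_tokens = [tokens(s) for s in sentences(doc)]
    all_tokens = [t for st in sent_tokens for t in st]
    weights = {}
    for term, count in Counter(all_tokens).items():
        # Sentence level: mean relative frequency in sentences containing the term.
        ratios = [st.count(term) / len(st) for st in sent_tokens if term in st]
        sent_score = sum(ratios) / len(ratios)
        # Document level: relative frequency in the whole document.
        doc_score = count / len(all_tokens)
        # Corpus level: smoothed inverse document frequency.
        corpus_score = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        weights[term] = (sent_score + doc_score) * corpus_score
    return weights


def select_features(weighted_docs, n_features):
    # Assumed selection rule: keep the terms with the largest total weight.
    totals = Counter()
    for w in weighted_docs:
        totals.update(w)
    return {term for term, _ in totals.most_common(n_features)}


def cosine(a, b):
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(train_vecs, labels, query, k):
    # Rank training documents by similarity and take a majority vote.
    ranked = sorted(zip(train_vecs, labels),
                    key=lambda pair: cosine(pair[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def fit_predict(train_docs, train_labels, new_doc, n_features=50, k=3):
    n_docs = len(train_docs)
    doc_freq = Counter(t for d in train_docs for t in set(tokens(d)))
    weighted = [term_weights(d, doc_freq, n_docs) for d in train_docs]
    keep = select_features(weighted, n_features)

    def project(w):
        # Restrict a weighted document to the selected features.
        return {t: v for t, v in w.items() if t in keep}

    train_vecs = [project(w) for w in weighted]
    query = project(term_weights(new_doc, doc_freq, n_docs))
    return knn_classify(train_vecs, train_labels, query, k)
```

Calling `fit_predict(train_docs, train_labels, new_doc)` with a list of labeled training texts and one unlabeled text returns the majority label among the `k` most similar training documents. Sparse dictionaries are used for the vectors so that feature selection reduces to filtering keys.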
Date of Conference: 22-24 Feb. 2012