Abstract:
In this paper, we propose a method for Thai text summarization by paragraph extraction, based on extracted Thai compound nouns and a term weighting scheme using term frequency inverse document frequency (TF·IDF). Because Thai compound nouns are highly frequent and highly productive in Thai text, they play an important role in summarization. Morphological analysis is used to identify Thai compound nouns, and all paragraphs are ranked by the sum of their term weighting scores. The cosine similarity between paragraphs is calculated in order to select the important paragraphs among all paragraphs. The results show that the proposed method is the most effective approach among all experiments, achieving an F-score of 0.469 for a 45% summary.
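The ranking step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each paragraph has already been reduced to a list of terms (standing in for the extracted compound nouns, since the morphological analysis is not reproduced here), it computes IDF over the paragraphs of a single document, and the function names `tfidf_vectors`, `cosine`, and `rank_paragraphs` are hypothetical. How the cosine similarity feeds into the final paragraph selection is not specified in the abstract, so the sketch only exposes the similarity function.

```python
import math
from collections import Counter


def tfidf_vectors(paragraphs):
    """Return one {term: weight} dict per paragraph.

    Assumption: IDF is computed across the paragraphs themselves,
    i.e. each paragraph is treated as a "document".
    """
    n = len(paragraphs)
    # Number of paragraphs in which each term occurs at least once.
    df = Counter(term for p in paragraphs for term in set(p))
    vectors = []
    for p in paragraphs:
        tf = Counter(p)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_paragraphs(paragraphs):
    """Rank paragraph indices by the sum of their term weights, highest first."""
    vectors = tfidf_vectors(paragraphs)
    scores = [sum(v.values()) for v in vectors]
    order = sorted(range(len(paragraphs)), key=lambda i: scores[i], reverse=True)
    return order, vectors
```

Under these assumptions, `rank_paragraphs` gives the score-based ordering, and `cosine` can then be applied to pairs of the returned vectors, for example to compare a candidate paragraph with those already selected.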
Published in: 2005 International Conference on Natural Language Processing and Knowledge Engineering
Date of Conference: 30 October 2005 - 01 November 2005
Date Added to IEEE Xplore: 27 February 2006
Print ISBN: 0-7803-9361-9