Analyzing Extractive Text Summarization Techniques and Classification Algorithms: A Comparative Study
IEEE Conference Publication


Abstract:

The growing volume of text data makes it difficult for individuals to keep up with the available information. Text summarization and classification are two natural language processing (NLP) techniques that can help address this challenge. This paper presents a system that performs both tasks on news articles. The system first collects a large corpus of news articles and removes noise and irrelevant information. Each article is then classified into one of a set of predefined categories based on its content. Next, an extractive summarization algorithm selects the most important sentences from the input text. Finally, an NLP model converts the summary into meaningful text. For classification, the system achieves high accuracy with Random Forest. For summarization, it uses three extractive algorithms: TextRank, word-frequency-based scoring, and TF-IDF clustering. Of the three, TextRank performs best in terms of ROUGE score, a metric for evaluating the quality of a text summary; the higher the ROUGE score, the better the summary. The system is implemented in Python using an NLP library and is available as an open-source project on GitHub.
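
The word-frequency approach and the ROUGE metric named in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, the stopword list, or the ROUGE variant used, so everything below is an assumption: the tiny STOPWORDS set, the helper names (tokenize, split_sentences, summarize, rouge1_recall), the choice to average word frequencies per sentence, and the use of ROUGE-1 recall alone. It is not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "it", "that", "on", "for", "was", "with", "as"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order.

    A sentence's score is the mean normalized frequency of its non-stopword
    tokens (an assumed scoring rule; averaging avoids favoring long sentences).
    """
    sentences = split_sentences(text)
    freq = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    if not freq:
        return sentences[:n_sentences]
    max_freq = max(freq.values())

    def score(sentence):
        words = [w for w in tokenize(sentence) if w not in STOPWORDS]
        if not words:
            return 0.0
        return sum(freq[w] / max_freq for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: fraction of reference unigrams also found in the candidate."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())
```

For example, summarize("Cats chase mice. Cats love cats. Dogs bark.", 1) selects "Cats love cats." because "cats" is the most frequent word, and rouge1_recall("the cat sat", "the cat sat on the mat") is 0.5, since three of the six reference unigrams are matched. A fuller evaluation would typically also report precision and F-measure, and possibly ROUGE-2 or ROUGE-L.
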
Date of Conference: 27-29 January 2024
Date Added to IEEE Xplore: 30 April 2024
Conference Location: Bhubaneswar, India
