Multi-Document Summarization Made Easy: An Abstractive Query-Focused System Using Web Scraping and Transformer Models | IEEE Conference Publication | IEEE Xplore


Abstract:

The paper proposes a web-based, abstractive, query-focused multi-document summarization system that aims to simplify summarizing multiple documents on a given topic. The system combines web scraping, natural language processing, and transformer models to automate summarization and make information more accessible to users. It takes three user inputs: a query, the number of words to be summarized, and the number of documents to consult. It then uses Google search engine API integration to retrieve the most relevant webpages by ranking, and scrapes their HTML tags using the Beautiful Soup (bs4) and Selenium frameworks. The scraped data is pre-processed with stop-word removal and tokenization using AutoTokenizer, and frequency-matrix and word-cloud plots are visualized with seaborn and Matplotlib. The system employs the pretrained transformer model ‘mt5-small’ as the pipeline summarizer. The transformer model ranks the words based on frequency and generates a summary of the text that is coherent, concise, and relevant to the user’s query. The system delivers the output as a well-structured summary that captures the essential information from multiple documents. The experimental results demonstrate the potential of integrating these technologies and techniques to automate summarization and provide users with high-quality summaries of multiple documents on a given query.
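The scrape, pre-process, and summarize flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps Beautiful Soup and Selenium for Python's standard-library `html.parser` so that it runs without dependencies, it assumes paragraph (`<p>`) elements are the ones of interest, and the names `ParagraphExtractor`, `extract_paragraphs`, `content_words`, `word_frequencies`, `build_summary`, and the tiny `STOP_WORDS` set are all hypothetical. The summarizer is passed in as a plain callable rather than tied to a specific model.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list (an assumption; the abstract does not give one).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "on", "for"}


class ParagraphExtractor(HTMLParser):
    """Collects the text found inside <p> elements (tag choice is an assumption)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._buffer = []
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.flush()  # tolerate an unclosed previous <p>
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.flush()
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def flush(self):
        # Join buffered text fragments and normalize whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []


def extract_paragraphs(html):
    """Return the text of each paragraph in one scraped page."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return parser.paragraphs


def content_words(text):
    """Lowercase, split into word tokens, and drop stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def word_frequencies(pages):
    """Count content words across all scraped pages."""
    counts = Counter()
    for html in pages:
        for paragraph in extract_paragraphs(html):
            counts.update(content_words(paragraph))
    return counts


def build_summary(pages, summarize, max_words):
    """Concatenate paragraph text, summarize it, and cap the word count."""
    text = " ".join(p for html in pages for p in extract_paragraphs(html))
    return " ".join(summarize(text).split()[:max_words])
```

Here `pages` stands for the HTML of the retrieved webpages and `max_words` for the user's requested length. A transformer summarizer, such as the mt5-small model the abstract names, could be wrapped in a function and passed as `summarize`, and the counts from `word_frequencies` could feed the frequency and word-cloud plots.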

Date of Conference: 23-25 June 2023
Date Added to IEEE Xplore: 07 August 2023
Conference Location: Hubli, India
