Feature Based Automatic Text Summarization Methods: A Comprehensive State-of-the-Art Survey

With the advent of the World Wide Web, numerous online platforms, including social networks, blogs, and magazines, generate huge amounts of textual material. This content contains useful information, but its sheer volume makes locating the most relevant information time-consuming. Text summarization has long been a significant area of research in natural language processing (NLP). Because it is impractical to summarize petabytes of data manually, automatic text summarization is rising in popularity. This study presents a comprehensive overview of the current state of text summarization approaches, techniques, standard datasets, evaluation criteria, and future research directions. The summarization approaches are classified according to several characteristics, including the approach used, the number of documents, the summarization domain, the document language, and the nature of the output summary. The study concludes with a discussion of open challenges and research opportunities that may be relevant to future researchers in this field.


I. INTRODUCTION
The World Wide Web (WWW) has become an immense information resource. Today, some websites generate more data every day than was produced in the previous ten years combined. However, the majority of the data generated by these websites is irrelevant, redundant, and noisy, masking the most pertinent information. In addition, users must explore several files and web pages to find the information they seek, which wastes their time. A good document summary can alleviate this problem. If every web page provided a concise summary of its content, it would save time for many users and boost website engagement. However, it is not possible to manually summarize each web page on the World Wide Web. Automatic text summarization (ATS) technologies can resolve this issue. Consequently, ATS has become a focus of NLP research.
The associate editor coordinating the review of this manuscript and approving it for publication was Seifedine Kadry.
ATS systems are designed to accomplish objectives such as extracting the most important and relevant information from a document and generating summaries that are much shorter than the original content. ATS systems can be broadly divided into the following categories: a. Single-document summarization system: This type generates a single summary for a single document. b. Multi-document summarization system: This type generates a single summary for multiple documents.
Multi-document systems are more susceptible to inaccuracy and redundancy because different documents may contain identical sentences conveying different information (inaccuracy) and different sentences conveying identical information (redundancy).
There are three primary approaches to generating summaries: a. Extractive approach: Important sentences from a document are selected and combined to generate the final summary. Major steps in an extractive approach include: i. Pre-processing the document ii. Creating an intermediate representation (IR) of the document iii. Scoring sentences according to their importance iv. Selecting the sentences with the highest scores. b. Abstractive approach: This strategy seeks a much deeper comprehension of the document. Instead of selecting meaningful sentences directly, it generates new sentences that convey the same information using natural language processing algorithms. Important steps in an abstractive approach include: i. Pre-processing the document ii. Creating an intermediate representation of the document iii. Generating new sentences based on the IR. c. Hybrid approach: This approach combines the abstractive and the extractive approaches to generate the summary.
Automatic text summarization is one of the most challenging areas of text and data mining. Developing high-quality automatic summaries involves numerous obstacles, including: (i) Redundancy: Most ATS systems generate sentences with similar informational content. Because the length of a summary is limited, more valuable and diverse sentences may be left out, which may result in the loss of crucial information. (ii) Time periods in multi-document summarization: Different documents in a dataset can belong to different time periods. Hence, they might use the same temporal words to convey different meanings. This is a major challenge in multi-document summarization. (iii) Long documents: Generating short summaries for very large documents such as novels and books. (iv) Coherence: Generated summaries may not maintain a proper flow. This is more significant in extractive text summarization. These challenges are the focus of intense research.
Certain models handle some of these challenges better than others; for example, abstractive summarizers maintain a better flow and reduce repetition, but they do not solve the remaining problems.
Numerous research articles have been published on this subject. Survey papers are vital for introducing the core concepts to newcomers and for presenting current trends and future directions in a single document. Some survey papers cover a specific subdomain of text summarization: Jain et al. [1] surveyed legal document summarization; Al-Saleh and Menai [2], Arabic text summarization techniques; and Kumar et al. [3], multilingual text summarization. Other studies ([4], [5]) attempted to provide an overview of the entire field of text summarization.
Existing survey articles, however, have limitations. Either the information covered is minimal ([1], [2], [6]), the articles examined are outdated and do not address the most recent developments in the field, or the information supplied is difficult to comprehend. By presenting a succinct, up-to-date, and comprehensible overview of text summarization, this survey addresses these drawbacks of prior publications.
In this paper, we explore the various classifications of text summarization approaches based on several parameters such as methodology, document count, and language. We also briefly describe the studies undertaken within each classification and list the outcomes, benefits, and drawbacks of each study. Finally, we review the performance of various approaches on prominent datasets. A detailed analysis of each individual study is outside the scope of this work; likewise, the paper concentrates on the most popular and effective methodologies, as a treatment of all approaches would exceed its scope.
The main contributions of this study are as follows:
• Provides a tabular and comprehensive analysis of different studies, making it easy for the reader to compare and evaluate various methodologies.
• Describes the benefits and drawbacks of each study analyzed in this paper.
• Offers a comprehensive analysis of numerous strategies and their performance on popular datasets.
• Provides a comprehensive discussion of future horizons, recommended methods, and research directions.
The flow of this paper is illustrated in FIGURE 1. The body of the paper is divided into several sections. Section II provides a brief bibliometric analysis of the growing interest in the topic and the trends identified. Section III enumerates the evaluation metrics employed by various studies to compare their systems with those of others. Section IV lists the significant datasets used in the research surveyed. Section V discusses the classification of text summarization approaches, and Section VI presents alternative criteria for classifying ATS systems. In Section VII, we examine the prevalent techniques for text summarization and provide some observations on the enhancements and results obtained by other investigations. The subsequent section discusses the challenges of text summarization, followed by a conclusion in the final section.

II. A BRIEF BIBLIOMETRIC STUDY ON THE EVOLUTION OF THE FIELD
The following is a brief literature overview showing how interest in the topic has evolved (FIGURE 2). It is followed by a classification of the approaches adopted by the various summarization systems.
Regarding the academic interest generated by reputable publications, a study of the works published in the past few years is informative. FIGURE 2 depicts the primary approach used to classify and analyze scientific papers. This diagram's sequence is based on the principles provided in [7] and [8].
A search was conducted in the Web of Science (WoS) and Scopus databases to trace the evolution of the works published in the field. These databases were selected because they are the data sources with the most extensive coverage and the greatest prevalence in bibliometric research. The two resources are complementary because their geographical scopes and journal collections are distinct [9]. In addition, the journals included in these databases are chosen based on their quality and influence. Given that we aim to study the current trajectory in computing, we restricted our search to the years 2011 through 2021. The executed queries are listed in TABLE 1. FIGURES 3 and 4 reveal a strong upward trend that has become more evident since 2018. In the last two years of the period, there has been a slowdown, although this may be related to the time required for the databases to index recent publications. Citations show a steadily rising trend, indicating sustained attention to the advances made in ATS during the period. The h-index is 39 in WoS and 56 in Scopus.
Most systems are language-dependent, and the dearth of native speakers or digital resources in certain languages impedes research. An analysis of the abstracts, titles, and keywords in Scopus shows that most language-specific studies address the most widely spoken languages in the world (TABLE 2).
Several observations can be made regarding the number of works published on the various languages: 1) the number of works does not reflect the number of speakers; for example, Nigerian Pidgin is the 14th most spoken language, but it does not appear in the results; 2) some of the 30 most spoken languages have no studies at all, such as Cantonese, Tagalog, Hausa, Swahili, Nigerian Pidgin, and Javanese; 3) Indian languages are well represented: Bengali (28), Hindi (17), Punjabi (8), Kannada (8), Telugu (7), Konkani (5), Assamese (4), Tamil (2), and Marathi (2). However, the representation of Hindi, the third most spoken language, is inadequate, and other languages, such as Nepali, are not mentioned.

III. EVALUATION METRICS
Like all other methods, automatic text summarization approaches are evaluated using performance metrics. These metrics are discussed in this section.

A. ROUGE (RECALL-ORIENTED UNDERSTUDY OF GISTING EVALUATION)
ROUGE is the most popular evaluation metric in the field of text summarization. ROUGE has four types: a) ROUGE-N: In this metric, N refers to the length of the n-grams whose co-occurrence statistics are computed. It measures the quality of a summary using the n-gram recall between the summary and a set of manually generated summaries, as shown in Eq. (1).
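As an illustration of n-gram recall, the following minimal Python sketch computes a ROUGE-N style score. It follows the commonly used clipped-count definition and is not the official ROUGE toolkit; lowercasing, whitespace tokenization, and the function names `ngrams` and `rouge_n` are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, references, n=1):
    """Clipped n-gram recall of a candidate summary against reference summaries."""
    cand = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        total += sum(ref_counts.values())
        # A reference n-gram is matched at most as often as it occurs in the candidate.
        matched += sum(min(count, cand[gram]) for gram, count in ref_counts.items())
    return matched / total if total else 0.0


if __name__ == "__main__":
    print(rouge_n("the cat sat on the mat", ["the cat is on the mat"], n=1))
    print(rouge_n("the cat sat on the mat", ["the cat is on the mat"], n=2))
```

In this example, five of the six reference unigrams and three of the five reference bigrams are recovered by the candidate summary.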

3) F-MEASURE
It is computed as the harmonic mean of precision and recall, as shown in Eq. (4).
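For reference, the standard balanced F-measure, which is assumed here to be the form intended by Eq. (4), can be written as:

```latex
F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```

For example, a summary with a precision of 0.5 and a recall of 0.4 has F = 0.4 / 0.9 ≈ 0.44.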
Summarization evaluation with Pseudo references and BERT (SUPERT) is an unsupervised evaluation metric for multi-document summaries that measures the semantic similarity between a summary and a pseudo-reference summary. SUPERT was proposed in [81]. A limitation of ROUGE is that it needs manual summaries to judge the quality of a summary. SUPERT can be used to evaluate summaries on a dataset that does not have manual summaries.

IV. DATASETS FOR TEXT SUMMARIZATION
In this section, we discuss the popular datasets used by researchers for text summarization.

A. DOCUMENT UNDERSTANDING CONFERENCES (DUC)
The National Institute of Standards and Technology (NIST) provides these groups of datasets. DUC is part of a Defense Advanced Research Projects Agency (DARPA) program, Translingual Information Detection, Extraction, and Summarization (TIDES), explicitly calling for major advances in summarization technology. The datasets consist of the following parts:
• Documents
• Summaries, results, etc.: manually created summaries; automatically created baseline summaries; submitted summaries created by the participating groups' systems; tables with the evaluation results; and additional supporting data and software.
DUC distributed seven datasets from 2001 to 2007. DUC 2002 is the most popular dataset for extractive summarization ([23], [38], [56], [70]). These datasets are available at https://duc.nist.gov/data.html.

B. CNN/DAILY MAIL
It contains over 300,000 news articles from CNN and the Daily Mail. The dataset is generated using a Python script available at CNN [71]. The processed version of this dataset is available on GitHub [72]. It is a very popular dataset among both extractive ([58], [73]) and abstractive ([44], [66]) summarization studies.

C. OPINOSIS
It is a dataset constructed from user reviews on a given topic. It is well suited to semantic analysis and has been used by multiple studies for that purpose. It consists of 51 topics, each with hundreds of review sentences. It also comes with gold-standard summaries and scripts to evaluate the performance of a summarizer using the ROUGE metric. The dataset and related material can be downloaded from Opinosis [74]. This dataset was prepared by [45], [75], and [76] for their research.

D. GIGAWORD
This dataset consists of more than 4 million articles. It is part of the TensorFlow Datasets collection and is highly popular among abstractive summarization studies [77]. The source code for this dataset is available at Gigaword [78].

E. MEDLINE CORPUS
The MEDLINE corpus is provided by NLM (National Library of Medicine). NLM produces this dataset in the form of XML documents on an annual basis. This dataset can be downloaded from [79]. Shang et al. [59] used this dataset to develop an extractive summarizer.

This Chinese text summarization dataset consists of 2 million short texts from the Chinese microblogging website Sina Weibo. Each text comes with a short summary written by its author. It is a very suitable choice for Chinese abstractive summarization systems, as the dataset is large enough to train neural networks effectively. Li et al. [77] used this dataset to develop an encoder-decoder based abstractive text summarizer.

G. BC3 (BRITISH COLUMBIA UNIVERSITY DATASET)
The corpus is composed of 40 email threads (3222 records) from the W3C corpus. Each thread is annotated by three different annotators. The dataset consists of: (i) Extractive summaries (ii) Abstractive summaries with linked sentences. Yousefi-Azar and Hamey [36] used this dataset to develop a deep learning based extractive text summarizer.
VOLUME 10, 2022

H. EASC (ESSEX ARABIC SUMMARY CORPUS)
This dataset consists of Arabic articles and extractive summaries generated for those articles. It is one of the most popular Arabic datasets used in text summarization. Alami et al. [37] and Elayeb et al. [80] used this dataset for Arabic text summarization.

I. GEOCLEF
GeoCLEF is used in geographical studies. It consists of 169,447 documents; each document consists of stories and newswires from the Los Angeles Times newspaper (1994) and the Glasgow Herald newspaper (1995). It is used by Perea-Ortega et al. [55] for developing a geographical information retrieval system.

V. CLASSIFICATION OF SUMMARIZATION APPROACHES
Based on the summarization approach, text summarization can be divided into three main types: a. Extractive approach b. Abstractive approach c. Hybrid approach. The distribution of these approaches in the bibliometric study described above shows a growth of the abstractive type over the last decade (TABLE 3). A selection of relevant papers was made based on quality criteria. For each approach described below, and for each technique applied within that approach, we selected the articles that most clearly and illustratively describe the practical application of the technique.
In the remainder of this section, we discuss the classifications of text summarization methods based on different classification parameters. The different classifications of a text summarization system are represented in FIGURE 5.
In the following subsections, each of these approaches will be discussed.

A. EXTRACTIVE TEXT SUMMARIZATION
In this approach, the most important sentences are selected from the documents and then assembled to produce the summary. The typical workflow of an extractive approach is: i. Preprocessing ii. Intermediate representation iii. Sentence scoring iv. Summary construction and post-processing. The preprocessing and summary construction stages are common to most extractive text summarizers, which differ mainly in their techniques for intermediate representation and sentence scoring. Most of the research on extractive text summarization is also focused on these two steps.
In the following subsections, we review the methods employed in each of the main approaches to text summarization. First, we discuss the extractive methods: statistical, topic-based, clustering, graph, semantic, machine-learning, deep-learning, optimization, fuzzy-logic, and discourse-based (RST) methods. Next, we discuss the abstractive methods: graph-based, tree-based, domain-specific, and deep-learning methods. Finally, we discuss the hybrid methods.

1) STATISTICAL-BASED METHODS
In these methods, statistical features are used to compute a sentence's importance. Statistical features include sentence position [10], sentence length, the number of proper nouns in the sentence, and term frequency [10]; cosine similarity can also be used for computing sentence scores [11], as shown in TABLE 4.
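To make the idea of feature-based sentence scoring concrete, the following Python sketch combines two of the features mentioned above, term frequency and sentence position. It is an illustrative simplification rather than the method of any cited study: the naive sentence splitter, the equal weighting of the two features, and the function names are assumptions, and stop words are not removed.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption of this sketch)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(text):
    """Score each sentence by its average term frequency plus a position bonus."""
    sentences = split_sentences(text)
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    tf = Counter(word for tokens in tokenized for word in tokens)
    scores = []
    for index, tokens in enumerate(tokenized):
        # Term-frequency feature: mean document-level frequency of the sentence's words.
        tf_score = sum(tf[w] for w in tokens) / len(tokens) if tokens else 0.0
        # Position feature: earlier sentences receive a larger bonus (1, 1/2, 1/3, ...).
        position_score = 1.0 / (index + 1)
        scores.append(tf_score + position_score)
    return sentences, scores


def summarize(text, k=1):
    """Return the k highest-scoring sentences in their original order."""
    sentences, scores = score_sentences(text)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    print(summarize("Cats chase mice. Dogs bark. Cats and mice are animals.", k=2))
```

In practice, the individual features are usually normalized and weighted before being combined; the unweighted sum is used here only for brevity.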

2) TOPIC-BASED METHODS
In this approach, the main topics of a document are extracted, and the sentences are then scored based on their coverage of these topics. TF-IDF [6], term frequency, and document titles [12] can be used to find document topics. N-gram co-occurrence and semantic sentence similarity can also be used to identify document topics [13], as shown in TABLE 5.

3) CLUSTERING-BASED TECHNIQUES
In this method, the sentences are clustered based on some similarity measure. Then a summarizer extracts the most central sentences from each cluster and processes them to generate a summary. Clustering algorithms like k-means ( [14], [15], [16], [17]), k-medoids [18], etc. are used for sentence clustering as shown in TABLE 6.

4) GRAPH-BASED TECHNIQUES
In these methods, the document is represented as a graph in which the nodes represent sentences and the edges represent the similarity between them. This similarity can be computed using measures such as cosine similarity ([6], [19], [20]). The sentences are then scored based on the properties of the graph. Graph-based techniques are prevalent among extractive summarizers; popular summarizers such as TextRank [21], LexRank [19], and [22] use a graph-based approach. A summary of such methods is shown in TABLE 7.
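The following Python sketch illustrates graph-based sentence scoring in the spirit of TextRank and LexRank, without reproducing either algorithm exactly. Sentences are nodes, cosine similarity over bag-of-words vectors gives the edge weights, and a PageRank-style iteration produces the scores. The damping factor of 0.85, the fixed number of iterations, and the function names are assumptions of this sketch.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """PageRank-style scores over a sentence-similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    # Edge weights: cosine similarity between distinct sentences (no self-loops).
    weights = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each sentence receives a share of its neighbours' scores, weighted by similarity.
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j] for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stock prices fell sharply today",
    ]
    print(rank_sentences(docs))
```

Sentences that are similar to many other sentences accumulate higher scores, while an isolated sentence keeps only the baseline score.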

5) SEMANTIC-BASED TECHNIQUES
In these methods, sentence semantics are also taken into consideration. LSA (Latent Semantic Analysis), ESA (Explicit Semantic Analysis), and SRL (Semantic Role Labeling) are some ways of performing semantic analysis of textual data. Of the three, LSA is the most common and is used by most of the studies ([12], [24], [25], [26], [27]), as shown in TABLE 8. The common steps in semantic analysis using LSA are:
• Creating a matrix representation of the input.
• Applying SVD (Singular Value Decomposition) to capture the relationship between individual terms and sentences.
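The two steps above can be sketched in Python as follows. A full SVD would normally be computed with a numerical library; to keep this example dependency-free, only the leading right singular vector of the term-sentence matrix is approximated, using power iteration on A^T A. This substitution, the raw-count weighting, and the function names are assumptions of the sketch, which also assumes a non-empty list of non-empty sentences.

```python
import math
import re


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences, entries are raw term counts."""
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    return [[tokens.count(term) for tokens in tokenized] for term in vocab]


def leading_right_singular_vector(matrix, iterations=100):
    """Approximate the first right singular vector by power iteration on A^T A."""
    n = len(matrix[0])
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)] for i in range(n)]
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


def most_representative_sentence(sentences):
    """Index of the sentence with the largest weight in the leading latent topic."""
    v = leading_right_singular_vector(term_sentence_matrix(sentences))
    return max(range(len(sentences)), key=lambda i: abs(v[i]))


if __name__ == "__main__":
    docs = ["cats chase mice", "cats chase mice and birds", "stocks fell"]
    print(docs[most_representative_sentence(docs)])
```

Selecting one sentence per leading singular vector is only one of several possible selection strategies; the sketch covers just the first latent topic.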

6) MACHINE-LEARNING-BASED TECHNIQUES
Machine learning approaches have gained popularity in recent years. These techniques convert the text summarization problem into a supervised classification problem, in which each sentence is classified as either a 'summary' or 'non summary' sentence. In the end, 'summary' sentences are collected to generate the summary. Rather than defining rules manually, the model is trained on a training set. The set consists of documents and their respective human-generated summaries. Various classification techniques like SVM ( [27], [28], [29]), Naive-Bayes ( [27], [29], [30]), Decision-Trees [30], Ensemble methods ( [27], [31], [32]) and neural-network ( [33], [34], [35]) have been used for text summarization as shown in TABLE 9.

7) DEEP-LEARNING BASED METHODS
Deep learning techniques are becoming increasingly popular for text summarization. Seq2seq and encoder-decoder based models [36] are used for extractive text summarization. Alami et al. [37] developed a deep learning and clustering-based model for Arabic text summarization. Feed-forward neural networks are also used for extractive summarization [33]. A brief overview of these methods is shown in TABLE 10.

8) OPTIMIZATION BASED METHODS
In these techniques, the summarization problem is formulated as an optimization problem. The steps involved in an optimization-based technique are as follows:
• Preprocessing the document and converting it to an intermediate representation (IR).
• Using an optimization algorithm to extract summary sentences from the IR.
The Multi-Objective Artificial Bee Colony (MOABC) algorithm is the most common optimization algorithm, used by many studies ([28], [38], [39], [40]), as discussed in TABLE 11.

9) FUZZY-LOGIC BASED TECHNIQUES
In these techniques, fuzzy-logic based systems are used to compute the sentence scores. Fuzzy-logic techniques are popular because they allow scores to be represented more precisely. The typical workflow of a fuzzy-logic based system is as follows:
• Extracting meaningful features from a sentence, e.g., sentence length and term weight.
• Using a fuzzy system to assign scores to those features. The score ranges between 0 and 1.
Babar and Patil [12], Abbasi-ghalehtaki et al. [28], Azhari and Jaya Kumar [41], and Goularte et al. [42] developed fuzzy-system based text summarizers. Some studies integrated techniques from other domains, such as cellular learning algorithms [28] and neural networks [41], with fuzzy systems to further improve the results, as shown in TABLE 12.
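The workflow above can be illustrated with a small Python sketch of a fuzzy scoring step. The triangular membership functions, the two rules, and the weighted-average defuzzification are hypothetical choices made for this example and are not taken from the cited studies; both input features are assumed to be normalized to the range [0, 1].

```python
def triangular(x, left, peak, right):
    """Triangular membership function returning a degree in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_sentence_score(length_ratio, term_weight):
    """Combine two features in [0, 1] using two illustrative fuzzy rules."""
    # Fuzzification: degree to which each feature belongs to a fuzzy set.
    length_medium = triangular(length_ratio, 0.0, 0.5, 1.0)
    weight_high = triangular(term_weight, 0.3, 1.0, 1.7)   # peaks at term_weight == 1
    weight_low = triangular(term_weight, -0.7, 0.0, 0.7)   # peaks at term_weight == 0
    # Rule 1: IF length is medium AND term weight is high THEN importance is high (1.0).
    important = min(length_medium, weight_high)
    # Rule 2: IF term weight is low THEN importance is low (0.0).
    unimportant = weight_low
    total = important + unimportant
    # Defuzzification: weighted average of the rule outputs.
    return (important * 1.0 + unimportant * 0.0) / total if total else 0.0


if __name__ == "__main__":
    print(fuzzy_sentence_score(0.5, 0.8))
```

The resulting score lies between 0 and 1 and can be used to rank sentences in the same way as any other sentence score.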

10) DISCOURSE BASED
Discourse-based studies analyze larger language structures, such as lexemes, grammar, and context, and their effect on sentence weights. Rhetorical structure theory (RST) has been widely used for discourse analysis and text summarization ([34], [43]), as shown in TABLE 13.
In recent years, machine-learning, deep-learning, rhetorical structure theory, and fuzzy-system based techniques have become more popular for extractive text summarization. Hence, these techniques merit extensive exploration in future research.
The main advantages and disadvantages of extractive text summarization are as follows:
• Extractive summarizers are easier to implement than abstractive summarizers.
• They capture more accurate information, as sentences are extracted directly from the document without altering their contents.
• They generate less natural summaries, as this is not how humans generate summaries.
• Multi-document extractive summarization suffers from sentence redundancy.
• They can mix information from different timelines, resulting in incorrect summaries.

B. ABSTRACTIVE TEXT SUMMARIZATION
In this approach, the summary is generated in the same way humans summarize documents. The summary does not consist of sentences taken from the documents; rather, new sentences are generated by paraphrasing and merging the sentences of the original document. Abstractive text summarization requires a deeper understanding of the input document, its context, and its semantics. It also requires some deeper understanding of
In the following subsections, the techniques and methods used in abstractive text summarization are discussed.

1) GRAPH-BASED METHODS
In these methods, the individual words are taken as the nodes of the graph, and the edges represent the structure of the sentence. AMR (Abstract Meaning Representation) graphs are a popular graph-based text representation. Various sentence generators are integrated with AMR graphs for abstractive text summarization [44]. Ganesan et al. [45] developed a popular text summarizer, Opinosis. A brief overview of these methods is shown in TABLE 14.
The processing steps of the Opinosis model are as follows:
• A path in the intermediate representation is considered a candidate summary.
• The goal is to find the best path.
• To do this, all the paths are ranked and sorted by decreasing score.
• A similarity measure (e.g., cosine similarity) is used to remove redundant paths.
• The best path is chosen for the summary.
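The ranking and redundancy-removal steps listed above can be sketched in Python as follows. The sketch does not reproduce the Opinosis graph construction or its path-scoring function: candidate paths and their scores are assumed to be given as input, and the similarity threshold of 0.5, the `limit` parameter, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_paths(scored_paths, max_similarity=0.5, limit=2):
    """Rank candidate paths by score and drop near-duplicates of selected paths."""
    ranked = sorted(scored_paths, key=lambda item: item[1], reverse=True)
    selected = []
    for path, _score in ranked:
        bag = Counter(path.lower().split())
        # Keep the path only if it is not too similar to any path already selected.
        if all(cosine(bag, Counter(kept.lower().split())) <= max_similarity for kept in selected):
            selected.append(path)
        if len(selected) == limit:
            break
    return selected


if __name__ == "__main__":
    candidates = [
        ("the battery life is very good", 0.9),
        ("battery life is good", 0.8),
        ("the screen is too dim", 0.7),
    ]
    print(select_paths(candidates))
```

In this example, the second candidate is discarded because it largely repeats the highest-scoring path, so the selected paths cover two distinct opinions.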

2) TREE-BASED METHODS
In these techniques, parsers convert text documents into parse trees. Various tree-processing methods, such as pruning and linearization, are then used to generate summaries from the trees. Deep learning models like encoder-decoder neural networks can also be used to generate meaningful information from the parse trees [46]. Techniques like sentence fusion are also used to eliminate redundancy in the generated summary [47]. Further details about these methods are shown in TABLE 15.

3) DOMAIN-SPECIFIC METHODS
Many studies focus on domain-specific text summarizers. These studies can benefit from knowledge dictionaries unique to each domain. In addition, sentences that do not hold much importance in general text summarization can be essential depending on the domain. Sports news may contain sport-specific keywords that are important for conveying the necessary information about a game; e.g., "out" in cricket is an important word that is more significant than other words like "high". Okumura and Miura [48] developed a sports news summarization system utilizing these domain characteristics. Lee [...] models are being explored for abstractive text summarization ([50], [51]). Pre-trained transformers are also used for abstractive text summarization [51], as shown in TABLE 17.
The main advantages and disadvantages of abstractive text summarization are as follows:
• They generate better quality summaries, as the sentences are not directly extracted from the document.
• Summaries are safe from plagiarism.
• They are more complex to implement than extractive summarizers.
• They capture less information, as some of the information can be lost while rephrasing the sentences.

C. HYBRID TEXT SUMMARIZATION
In this approach, a hybrid of extractive and abstractive summarizers generates the summary. Generally, hybrid text summarizers generate better quality summaries than extractive summarizers and are less complex than abstractive text summarizers. Lloret et al. [52] developed a hybrid summarization system called Compendium, Gupta and Kaur [53] developed a machine learning-based model, and Binwahlan et al. [54] developed a fuzzy-system based hybrid text summarization model. Details of some of these methods are shown in TABLE 18. Some of the advantages and disadvantages of hybrid text summarization are as follows:
• Generates better quality summaries than pure extractive models.
• Is easier to implement than abstractive text summarizers.
• Produces summaries of lower quality than pure abstractive summarizers.

VI. OTHER CLASSIFICATION CRITERIA
The following criteria can also be used to classify text summarization systems: a. The number of documents: single or multiple. b. The summarization domain. c. The language(s) involved. d. The nature of the output summary. These classifications are discussed and exemplified below.

A. BASED ON THE NUMBER OF DOCUMENTS
Based on the number of documents, text summarization methods are classified into the categories discussed below.

1) SINGLE-DOCUMENT
In this type, the summary is generated for a single document. It is easier than multi-document text summarization, as a single document generally covers only one topic and is written in a single period. It is also less prone to redundancy than multi-document text summarization. Perea-Ortega et al. [55], Sankarasubramaniam et al. [56], Abbasi-ghalehtaki et al. [28], and Alguliyev et al. [14] developed single-document text summarizers, as shown in TABLE 19.

2) MULTI-DOCUMENT
In this type, a single summary is generated for multiple documents. It is more complex than single-document text summarization, as the documents may refer to different periods. In addition, different documents may cover different topics, which makes multi-document text summarization more challenging. Ferreira et al. [23], Nguyen et al. [57], Barzilay and McKeown [47], Xu and Durrett [58], and Patel et al. [20] developed multi-document text summarizers, as discussed in TABLE 20.

B. BASED ON THE SUMMARIZATION DOMAIN
Based on summarization domain, text summarization is of two types: generic domain-based text summarization and specific domain based text summarization as discussed below:

1) GENERIC DOMAIN TEXT SUMMARIZATION
This type of text summarization is not tied to a specific domain. The importance of a sentence, keyword, or key phrase depends on its grammatical properties; e.g., proper nouns, numerical terms, and references can be given higher importance. It is more common than domain-specific summarization, as these algorithms tend to perform well across different domains, but they may end up losing some important domain information in the summary. Ferreira et al. [23], Babar and Patil [12], and Al-Maleh and Desouki [50] worked on generic text summarizers, as shown in TABLE 21.

2) SPECIFIC DOMAIN TEXT SUMMARIZATION
This type of text summarization is concerned with a specific domain. In this type, the importance of a sentence, keyword, or key phrase depends not only on its grammatical properties but also on its relation to the domain of study. This approach can produce better domain-specific summaries, as some keywords and key phrases that are important in one domain may not hold much importance in others. Shang

C. BASED ON LANGUAGE
Based on language, text summarization methods are classified into the categories discussed below:

1) MONOLINGUAL
In this type of summarization, the document and the summary are in the same language. Perea-Ortega et al. [55] and Sankarasubramaniam et al. [56] worked on summarizers for the English language, and Al-Maleh and Desouki [50] worked on Arabic text summarization, as shown in TABLE 23.

2) MULTILINGUAL
In this type of summarization, the document and the summary are written in multiple languages. Rani

3) CROSS-LINGUAL
In this type of summarization, the document is in one language and the summary is generated in another language. Linhares Pontes et al. [64] developed a French-to-English text summarizer, as shown in TABLE 25.

D. BASED ON NATURE OF OUTPUT SUMMARY
Based on the nature of the output summary, the summarization methods are classified into two categories, as discussed below:

1) GENERIC
The output is not influenced by external factors: the generated summary is not controlled by external queries. Babar and Patil [12], Gupta and Kaur [53], Sankarasubramaniam et al. [56], and Chatterjee and Sahoo [65] developed non-query-based text summarizers, as shown in TABLE 26.

2) QUERY-BASED
The summary can be controlled using user-defined queries. The summary is generated based on the user requirements. This approach is prevalent among search engines. Depending on the query, some sentences can have more importance than others. Shang et al. [59], He et al. [66], Salton et al. [67], and Van Lierde and Chow ([68], [69]) developed query-based models for text summarization, as shown in Table 27.
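A minimal sketch of query-dependent sentence importance, assuming bag-of-words vectors and cosine similarity as the relevance measure. The function names are hypothetical and the cited studies use more elaborate models.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count dictionaries."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_by_query(sentences, query):
    """Return sentence indices ordered from most to least similar to the query."""
    query_bag = Counter(query.lower().split())
    scored = [(cosine(Counter(s.lower().split()), query_bag), i) for i, s in enumerate(sentences)]
    # Higher similarity first; ties keep the original document order.
    return [i for _, i in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


if __name__ == "__main__":
    docs = [
        "neural networks summarize long documents",
        "the weather was sunny today",
        "graph methods rank sentences",
    ]
    print(rank_by_query(docs, "neural networks"))
```

The same document therefore yields different summaries for different queries, which is the defining property of this category.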

VII. ANALYSIS OF POPULAR TEXT SUMMARIZATION TECHNIQUES
In this section, we perform a detailed analysis of various popular text summarization techniques. These techniques have been a popular choice among researchers because they are well researched, efficient, and readily extended. We also analyze studies incorporating these techniques and their results, and discuss ideas for enhancement.

A. K-MEANS CLUSTERING
In this algorithm, an unlabeled dataset is divided into 'k' number of clusters. Items in each cluster have properties similar to each other.
For text summarization, k-means can be used to cluster sentences containing similar information. This can be helpful in removing redundant sentences and improving overall summary quality.
Alguliyev et al. [15] used the k-means algorithm on the DUC 2002 dataset and got a ROUGE-1 score of 0.4727. Mohd et al. [16] employed a k-means-based model on the DUC 2007 dataset and got a ROUGE-1 score of 0.34. These results indicate that k-means is a promising technique for text summarization.
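A minimal sketch of k-means-based sentence selection, assuming bag-of-words sentence vectors and squared Euclidean distance. It keeps one representative sentence per cluster, which is how redundant sentences are dropped. The function names and the choice of representative are assumptions, not the setup used in [15] or [16].

```python
import random
from collections import Counter


def bag_of_words(sentence, vocab):
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm with randomly sampled initial centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def summarize(sentences, k):
    """Keep the sentence closest to each cluster centroid, in document order."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = [bag_of_words(s, vocab) for s in sentences]
    labels, centroids = kmeans(vectors, k)
    picked = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            picked.append(min(idx, key=lambda i: sq_dist(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(picked)]
```

With two near-duplicate sentences about one topic and two about another, `summarize(sentences, 2)` returns one sentence per topic.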

B. LSA (LATENT SEMANTIC ANALYSIS)
In this method, a document is first converted into a term-to-sentence matrix. This representation can then be used to collect information about the words that commonly occur together. That information can then be used to generate quality summaries. The performance of LSA-based models is further improved using SVD (Singular Value Decomposition).
Babar and Patil [12] used LSA with a fuzzy system model to get a precision of 0.8654. Priya and Umamaheswari [24] used LSA with TF-IDF on a hotel review dataset to get an accuracy of 0.54. Although LSA-based models can produce significant results, most modern studies are shifting towards neural network-based models. However, combining an LSA model with a neural network-based model may achieve interesting results.
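A minimal sketch of the LSA pipeline, assuming raw term counts in the term-to-sentence matrix. To stay dependency-free, the full SVD is replaced here by power iteration, which recovers only the leading singular value and the corresponding sentence-space singular vector; this is a simplification, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_sentence_matrix(sentences):
    """Rows are terms, columns are sentences, entries are raw counts."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    counts = [Counter(s.lower().split()) for s in sentences]
    return [[float(c[w]) for c in counts] for w in vocab]


def top_singular_pair(matrix, iterations=100):
    """Leading singular value and right singular vector via power iteration on A^T A."""
    n = len(matrix[0])
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)] for i in range(n)]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n))
    return math.sqrt(eigenvalue), v


if __name__ == "__main__":
    docs = ["cats chase mice", "cats chase birds", "stocks fell today"]
    sigma, weights = top_singular_pair(term_sentence_matrix(docs))
    # Sentences with large weights belong to the dominant latent topic.
    print(sigma, weights)
```

In this example the first two sentences share the dominant topic and receive almost all of the weight, while the third sentence receives a weight close to zero.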

C. TEXTRANK
In this method, a document is represented in the form of a graph. Each node of the graph represents a word, and the edges between two nodes represent the relationship between two words. It also applies a voting mechanism such that nodes having more incoming edges are given higher ranks. Also, while ranking a node the ranks of the nodes casting the vote are taken into consideration [21].
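A minimal sketch of the voting mechanism, assuming an undirected word graph in which adjacent words are linked, and a PageRank-style update with a damping factor of 0.85. The window size and function names are assumptions for illustration.

```python
from collections import defaultdict


def build_word_graph(words, window=2):
    """Link each word to the words that follow it within the window."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1:i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Each node splits its score evenly among its neighbours (its 'votes')."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            votes = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * votes
        scores = new_scores
    return scores


if __name__ == "__main__":
    tokens = "ranking text graph text node text edge".split()
    ranks = textrank(build_word_graph(tokens))
    print(max(ranks, key=ranks.get))  # the most connected word
```

Because a vote is weighted by the score of the node casting it, a word linked to a few highly ranked words can outrank a word linked to many weakly ranked ones.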

D. LEXRANK
Like TextRank, it is also a graph-based voting algorithm. In this algorithm, the nodes of the graph represent the sentences of the document and the edges represent the similarity between two sentences. It employs a recommendation-based mechanism to compute sentence ranks [19].
Unlike TextRank, the edge weights are computed based on a similarity metric (e.g., cosine similarity), producing better output in some scenarios.
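A minimal sketch of sentence ranking over a similarity graph, assuming raw word counts, cosine similarity as the edge weight, and a damped power iteration. It omits the similarity threshold and IDF weighting that a full implementation would typically apply; the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def lexrank(sentences, damping=0.85, iterations=50):
    """Score sentences by a damped random walk over the similarity graph."""
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    # Normalise outgoing weights; isolated sentences keep a divisor of 1.
    row_sums = [sum(row) or 1.0 for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / row_sums[j] * scores[j] for j in range(n))
            for i in range(n)
        ]
    return scores
```

A sentence that is similar to many other sentences accumulates recommendations from them and rises in rank, while an off-topic sentence only retains the baseline score.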

E. MOABC
This algorithm is an enhancement of the popular ABC (Artificial Bee Colony) algorithm. The ABC algorithm is inspired by the natural food searching behavior of honeybees. In the ABC algorithm, the optimization is done in three phases:
i. Employed bees: These bees exploit the food source, return to the hive, and report to the onlooker bees.
ii. Onlooker bees: These bees gather data from employed bees, then select the food source to gather data from.
iii. Scout bees: These bees try to find random food sources for the employed bees to exploit.
This algorithm converts the text summarization problem into an optimization problem, with the best summary representing the global minimum. Sanchez-Gomez et al. [40] used MOABC on the DUC 2002 dataset to get a 2.23% improvement on ROUGE-2 scores over state-of-the-art methods. Abbasi-Ghalehtaki et al. [28] implemented a MOABC + cellular automation theory-based algorithm on the DUC 2002 dataset to get significant results.
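A minimal sketch of the three ABC phases on a toy problem. This is the basic single-objective ABC, not the multi-objective MOABC used in [40], and the continuous sphere function stands in for a summary quality measure. All names, bounds and parameter values are assumptions for illustration.

```python
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def abc_minimize(objective, dim=2, n_sources=10, limit=5, cycles=100, seed=0):
    rng = random.Random(seed)

    def random_source():
        return [rng.uniform(-5.0, 5.0) for _ in range(dim)]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])
    history = [best_cost]

    def try_neighbor(i):
        # Perturb one coordinate of source i relative to a random other source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        candidate = list(sources[i])
        candidate[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        cost = objective(candidate)
        if cost < costs[i]:
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):  # employed bees exploit their own source
            try_neighbor(i)
        weights = [1.0 / (1.0 + c) for c in costs]  # assumes non-negative costs
        for _ in range(n_sources):  # onlooker bees favour better sources
            try_neighbor(rng.choices(range(n_sources), weights=weights)[0])
        if min(costs) < best_cost:  # record the best before scouts reset anything
            best_cost = min(costs)
            best = list(sources[costs.index(best_cost)])
        for i in range(n_sources):  # scout bees replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i], trials[i] = objective(sources[i]), 0
        history.append(best_cost)
    return best, best_cost, history
```

For summarization, a food source would encode a candidate summary (for example, a selection of sentences) and the objective would combine criteria such as coverage and redundancy.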

F. MACHINE LEARNING TECHNIQUES

1) LOGISTIC REGRESSION
Logistic regression is a classification algorithm that is very useful in binary classification, e.g., predicting whether the gender of an author is male or female. Unlike linear regression, it models the data using a non-linear function such as the sigmoid function. It can also be used for classification problems where the number of output classes is more than 2. The mathematical expression for the sigmoid function is $\sigma(z) = \frac{1}{1 + e^{-z}}$. Neto et al. [30] used the logistic regression classifier on the TIPSTER collection and got a precision of 0.34 for the model. Alami et al. [32] used logistic regression on the EASC (Essex Arabic Summary Corpus) and got a ROUGE-1 score of 0.129.
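A minimal sketch of logistic regression as a binary sentence classifier (include in summary or not), assuming the standard sigmoid $\sigma(z) = 1 / (1 + e^{-z})$ and plain stochastic gradient descent. The two features (position score, term-frequency score) and the training data are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that a sentence with features x belongs in the summary."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def train_logistic(features, labels, lr=0.5, epochs=500):
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            # Gradient of the log-loss with respect to the linear output.
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


if __name__ == "__main__":
    # Hypothetical features: [position score, term-frequency score].
    X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
    y = [1, 1, 0, 0]
    w, b = train_logistic(X, y)
    print(predict(w, b, [0.85, 0.9]), predict(w, b, [0.1, 0.2]))
```

Sentences whose predicted probability exceeds a chosen threshold (commonly 0.5) are selected for the extractive summary.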

2) SVM
The main idea behind an SVM classifier is to choose a hyperplane that can segregate n-dimensional data into different classes with minimum overlap. Support vectors are used to create the hyperplane, hence the name 'Support vector machines'. In an SVM model, the distance between a point x and the hyperplane, represented by (w, b), is $d = \frac{|w^{T}x + b|}{\|w\|}$, where $\|w\|$ is the Euclidean norm of w. Shen et al. [27] used SVM on the LookSmart web directory along with LSA and achieved significant results. Neto et al. [30] used SVM on the TIPSTER collection to get a precision of 0.34.
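A small worked example of the point-to-hyperplane distance, assuming the standard formula $(w^{T}x + b) / \|w\|$ with the sign retained so that it also indicates the predicted class. The hyperplane parameters are invented and are not the result of training.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return (dot + b) / math.sqrt(sum(wi * wi for wi in w))


def classify(w, b, x):
    """Predict +1 or -1 from the side of the hyperplane on which x lies."""
    return 1 if signed_distance(w, b, x) >= 0 else -1


if __name__ == "__main__":
    w, b = (3.0, 4.0), -5.0  # hypothetical hyperplane
    print(signed_distance(w, b, (3.0, 4.0)))  # (9 + 16 - 5) / 5 = 4.0
    print(classify(w, b, (0.0, 0.0)))         # origin lies on the negative side
```

Training an SVM amounts to choosing (w, b) so that the smallest such distance over the training points, the margin, is as large as possible.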

3) RANDOM FOREST
Random forest classifiers are a part of ensemble-based learning methods. Their main features are ease of implementation, efficiency, and strong performance in a variety of domains. In the Random Forest approach, many decision trees are constructed during the training stage. Then, a majority voting method is used among those decision trees during the classification stage to get the final output. Alami et al. [32] used a Random Forest classifier on the EASC collection and got a ROUGE-1 score of 0.129. John and Wilscy [82] used Random Forest and Maximal Marginal Relevance (MMR), achieving significant results. The MMR coefficient selects the sentences that have the highest relevance, with the least redundancy with respect to the sentences already selected for the summary.
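A minimal sketch of the two mechanisms described above: majority voting over the outputs of individual trees, and greedy MMR selection. The relevance scores, the similarity matrix and the trade-off parameter `lam` are assumptions for illustration and are not taken from [82].

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Final class is the label predicted by the most decision trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedily pick k items that are relevant yet dissimilar to earlier picks."""
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:

        def mmr_score(i):
            # Redundancy is the highest similarity to any already selected item.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    relevance = [0.9, 0.85, 0.5]
    similarity = [[1.0, 0.95, 0.1], [0.95, 1.0, 0.2], [0.1, 0.2, 1.0]]
    # Sentence 1 is nearly a duplicate of sentence 0, so sentence 2 is preferred.
    print(mmr_select(relevance, similarity, k=2, lam=0.5))
```

Setting `lam` to 1.0 reduces MMR to ranking by relevance alone, while lower values penalize redundancy more strongly.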
Machine learning-based methods achieved significant results in the text summarization domain; however, due to limited dataset sizes, the models could not learn efficiently and thus could not compete with the state-of-the-art graph-based models. Neural network-based models, in turn, overcame the limitations of machine-learning-based models and produced even better results than the state-of-the-art graph-based models.

G. NEURAL NETWORK-BASED APPROACHES
Text summarization can be formalized as a seq2seq problem, where the input sequence is the input document and the output sequence is the output summary. Since the input size varies, a traditional fixed-input neural network cannot be used for this task. Seq2seq models have become very popular in recent times. The most popular seq2seq models used for text summarization are RNN, LSTM and GRU.

1) RNN
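RNNs (Recurrent Neural Networks) belong to a class of neural networks that can use the previous outputs as input for the next state. The structure of a basic RNN model is given in FIGURE 6.
The activation vector ($a_t$) is computed as shown in Eq. (7), given here in the standard form for a basic RNN:
$$a_t = g_1\left(W_{aa}\, a_{t-1} + W_{ax}\, x_t + b_a\right) \tag{7}$$
The output value ($y_t$) is computed as shown in Eq. (8):
$$y_t = g_2\left(W_{ya}\, a_t + b_y\right) \tag{8}$$
where $x_t$ is the input at time step $t$, $W_{aa}$, $W_{ax}$ and $W_{ya}$ are weight matrices, $b_a$ and $b_y$ are bias vectors, and $g_1$ and $g_2$ are activation functions.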
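A minimal sketch of one step of a basic RNN, assuming the standard recurrence in which the new activation is tanh of a weighted sum of the previous activation and the current input, and the output is a softmax over a linear map of the activation. The parameter names follow that convention and the values are invented.

```python
import math


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(values):
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_step(x_t, a_prev, W_aa, W_ax, b_a, W_ya, b_y):
    """Return the new activation a_t and the output distribution y_t."""
    pre = [p + q + b for p, q, b in zip(matvec(W_aa, a_prev), matvec(W_ax, x_t), b_a)]
    a_t = [math.tanh(v) for v in pre]
    y_t = softmax([p + b for p, b in zip(matvec(W_ya, a_t), b_y)])
    return a_t, y_t


if __name__ == "__main__":
    # One hidden unit, one input feature, two output classes (hypothetical sizes).
    a, y = rnn_step([0.5], [0.0], [[0.1]], [[1.0]], [0.0], [[1.0], [-1.0]], [0.0, 0.0])
    print(a, y)
```

Processing a document means calling `rnn_step` once per token and feeding each returned activation into the next call, which is how earlier tokens influence later outputs.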

2) LSTM
Although RNNs can generate significant results for text summarization, they suffer from the 'vanishing gradient' problem during backpropagation. This limits the learning abilities of the model. To counter this, LSTM (Long short-term memory) models were introduced. In an LSTM model, a gate-based mechanism is employed in each LSTM cell that is used to memorize the relevant information. This mitigates the vanishing gradient problem of RNNs. The cell of an LSTM model is shown in Figure 7.
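A minimal sketch of the gate-based mechanism in a single LSTM cell step, assuming the common formulation with forget, input and output gates and a tanh candidate. The parameter layout (a dictionary of `(W, U, b)` triples) is a hypothetical choice made for brevity.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, U, b, x, h):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x)) + sum(u * hi for u, hi in zip(U_row, h)) + bi
        for W_row, U_row, bi in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps a gate name to its (W, U, b) triple."""
    f = [sigmoid(v) for v in affine(*params["forget"], x, h_prev)]       # what to keep
    i = [sigmoid(v) for v in affine(*params["input"], x, h_prev)]        # what to write
    o = [sigmoid(v) for v in affine(*params["output"], x, h_prev)]       # what to expose
    g = [math.tanh(v) for v in affine(*params["candidate"], x, h_prev)]  # new content
    # The cell state is updated additively, which helps gradients survive.
    c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c
```

The additive cell-state update is the part that lets information, and gradients, pass through many time steps with less attenuation than in a basic RNN.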

3) GRU
Gated Recurrent Units (GRU) are another modification over standard RNNs that can solve the vanishing gradient problem. Similar to LSTM units, GRU units have a gate-based mechanism to store the relevant data for backpropagation training. The construction of a GRU cell is given in FIGURE 8.
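A minimal sketch of a single GRU step, assuming the common formulation with an update gate and a reset gate, and the convention that the new state is `(1 - z) * h_prev + z * candidate`. Some references swap the roles of `z` and `1 - z`. The parameter layout is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, U, b, x, h):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x)) + sum(u * hi for u, hi in zip(U_row, h)) + bi
        for W_row, U_row, bi in zip(W, U, b)
    ]


def gru_step(x, h_prev, params):
    """One GRU step; params maps a gate name to its (W, U, b) triple."""
    z = [sigmoid(v) for v in affine(*params["update"], x, h_prev)]  # how much to refresh
    r = [sigmoid(v) for v in affine(*params["reset"], x, h_prev)]   # how much past to use
    gated = [rv * hv for rv, hv in zip(r, h_prev)]
    candidate = [math.tanh(v) for v in affine(*params["candidate"], x, gated)]
    # Interpolate between the old state and the candidate state.
    return [(1.0 - zv) * hv + zv * cv for zv, hv, cv in zip(z, h_prev, candidate)]
```

Compared with an LSTM cell, the GRU has no separate cell state and one gate fewer, so it has fewer parameters per unit.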

VIII. CHALLENGES AND FUTURE SCOPES
Even with these advancements in text summarization, multiple challenges still exist, and researchers are working to overcome them. These challenges can also act as future research directions for new studies. They span many areas, such as multi-document summarization, applications of text summarization, and some user-specific summarization tasks. A few of the challenges are discussed below:

A. CHALLENGES RELATED TO MULTI-DOCUMENT SUMMARIZATION
Multi-document text summarization is more complex than single-document text summarization due to the following issues:
i. Redundancy
ii. Temporal dimension
iii. Co-references
iv. Sentence reordering
Some approaches for multi-document summarization can also generate improper references. For example, assume one sentence in a document contains a proper noun, and the following sentence contains a reference to that noun. If the summarizer ranks the second sentence higher than the first and does not include the first, the summary will contain a reference whose antecedent is missing. This is a major challenge in multi-document summarization.

B. CHALLENGES RELATED TO APPLICATIONS OF TEXT SUMMARIZATION
Most current studies focus on a specific text domain, e.g., news or biomedical documents, and some of these domains do not have significant economic value. Focusing on long texts, such as essays, dissertations or reports, may be more economically profitable. However, since the processing of long text requires high computational power, it remains a major challenge.

C. CHALLENGES RELATED TO USER-SPECIFIC SUMMARIZATION TASKS
Summarizing semi-structured resources like web pages and databases is an important application of text summarization, since most textual data is present in a semi-structured format. This type of summarization is more complex than simple text summarization, since there is much more noise in the data. Hence, developing efficient summarizers for these domains is a major challenge.

D. CHALLENGES RELATED TO FEATURE SELECTION, PREPROCESSING AND DATASETS
For any natural language processing problem, the performance of the selected methods depends heavily on the selection of features, and this also holds for text summarization techniques. Irrespective of the methods used at a large scale in recent times for such problems (machine learning, statistical, fuzzy, deep learning, etc.), selecting appropriate features for the documents to be summarized is still a significant challenge for researchers. There is therefore much scope in solving the feature selection problem, such as determining the most appropriate features to summarize a dataset, discovering new features, optimizing the commonly used features, and using semantic, grammatical and linguistic features. Preprocessing a dataset using appropriate methods also affects the performance of the summarization methods, so it also needs attention in the future. One can explore appropriate stemming approaches, stop word removal techniques, tokenizers, and suitable POS taggers to categorize tokens into classes such as nouns, verbs, adjectives and adverbs. The creation of a new dataset is also a demanding task. Many little-explored domains, such as legal, tourism and health, need new datasets to be created and used to advance the summarization work.

IX. CONCLUSION
Text summarization is an exciting research topic in the NLP community that helps to produce concise information. The idea of this study is to present the latest research and progress made in this field with a systematic review of relevant research articles. In this study, we consolidated the research works from different repositories related to various text summarization methods, datasets, techniques, and evaluation metrics.
We have also added a section on ''Analysis of Popular Text Summarization Techniques'', which articulates the most popular techniques in the text summarization domain, gives the strengths and limitations of each technique, and hints at future research directions. We have presented the information in a tabular format, covering the advantages and disadvantages of each research paper, which can make it easier for readers to use our review paper as a base paper for text summarization domain knowledge. We presented a detailed discussion on the different types of text summarization studies based on approach (extractive, abstractive and hybrid), the number of documents (single-document and multi-document), summarization domain (generic domain and domain-specific summarization), language (monolingual, multilingual, cross-lingual), and nature of the output summary (generic and query-based summarizer). We also presented a detailed analysis of various studies in a tabular format, which saves readers the time of reading through long texts. We also gave a detailed review of various datasets used in this domain and provided references to the datasets. We discussed various standard evaluation metrics (ROUGE, F-measure, recall, precision, etc.), which can be used to measure the quality of a text summarization model. Finally, we discussed various challenges faced in text summarization that can guide future studies in the domain.
RISHABH KATNA received the bachelor's and master's degrees in computer science and engineering from the National Institute of Technology Hamirpur (NIT Hamirpur), Hamirpur, in 2021 and 2022, respectively. He has been working with Standard Chartered GBS as an Intern Software Engineer. He is currently working as a Software Engineer at Qualcomm. His research interests include social networking, automation for web developers, and natural language processing.
ARUN KUMAR YADAV received the Ph.D. degree in computer science and engineering, in 2016. He is currently an Assistant Professor with the Department of Computer Science and Engineering, National Institute of Technology Hamirpur (NIT Hamirpur). He is also working on government-sponsored projects and has supervised many students. He has published more than 20 research papers in reputed international/national journals and conference proceedings. His research interests include information retrieval, machine learning, and deep learning.
JORGE MORATO received the Ph.D. degree in library science from the Universidad Carlos III de Madrid, Spain, on the topic of knowledge information systems and their relationship with linguistics. He is currently a Professor of information science with the Department of Computer Science, Universidad Carlos III de Madrid. His research interests include NLP, information retrieval, web positioning, and knowledge organization systems. VOLUME 10, 2022