Complex Network and Source Inspired COVID-19 Fake News Classification on Twitter

In the COVID-19 infodemic, social media has become a medium for wrongdoers to spread rumors, fake news, hoaxes, conspiracies, astroturf memes, clickbait, satire, smear campaigns, and other forms of deception. This puts a tremendous strain on society by damaging reputation, public trust, freedom of expression, journalism, justice, truth, and democracy. It is therefore of paramount importance to detect and contain unreliable information. Multiple techniques have been proposed to detect fake news propagation in tweets based on tweet content, propagation over the network of users, and the profiles of the news generators. Generated human-like content deceives content-based methods. Network-based methods rely on the complete propagation graph, resulting in late detection. User profile-based techniques are effective for detecting bots or fake accounts but are not suited to detecting fake news from genuine accounts. To deal with these shortcomings, we introduce a source-based method focusing on the community of news propagators, including posters and re-tweeters. Propagators are connected using follower-following relations. A feature set combining the connectivity patterns of news propagators with their profile features is used in a machine learning framework to perform binary classification of tweets. Complex network measures and user profile features are also examined separately. We perform an extensive comparative analysis of the proposed methodology on a real-world COVID-19 dataset, exploiting various machine learning and deep learning models at the community and node levels. Results show that the hybrid features outperform the network features and the user features taken alone. Further optimization demonstrates that the ensemble boosting model CatBoost and the deep learning model RNN are the most effective, with an AUC score of 98%.
Furthermore, preliminary results show that the proposed solution can also handle fake news in the political and entertainment domains using a small training set.


I. INTRODUCTION
Social media platforms (e.g., WhatsApp, Sina Weibo, Twitter, Facebook) have completely changed information dissemination. They are free and easy to access, support fast dissemination, and are highly engaging. These digital platforms have therefore gained great popularity, attracting large numbers of users [1]. They allow following a variety of events and breaking news across the world [2]. As the variety, volume, and velocity of information transmission have substantially increased, the impact and visibility of both real and misleading information have increased as well [3]. A survey states that 65% of news on social media is fake news [4]. On social media, fake information spreads wider, deeper, and faster [5]. False information has severe and wide consequences for society [6], the economy [7], freedom of expression [5], journalism [8], democracy [9], reputation, public trust [10], and peace [11], [12]. Unfortunately, people now get news from social media more than from traditional news media [13], [14]. Consuming news from social media platforms has associated pros and cons. Using them to spread misleading information is straightforward, and it can influence large audiences. This can be done for political, financial, or other gains by organizations or individuals. The 2016 US presidential election campaign is a notable instance where the audience was manipulated using fake news spread by almost 19 million malicious bot profiles [9], [15]. It may have been due to the influence of fake news that a candidate who, according to the polls, had no chance got elected [15]. Fake news is becoming a severe problem because it negatively impacts individuals and society as a whole. Economies worldwide are vulnerable to fake news, which seriously affects the stock market, massive trades, and business-related activities. For example, fake news claiming the injury of Barack Obama in an explosion wiped out 130 billion dollars in the stock market [16].
Fake information may invoke false feelings of fear and surprise [5], which ultimately create social panic [17], [18]. It is quite difficult to correct someone's cognition after fake information has gained their trust [10], [19]. Therefore, earning money by influencing opinions [20], [21] through false information is also easy.
It is essential to understand a few terms related to misleading information. Indeed, false or deceptive information comes in various flavors: fake/false news, misinformation, disinformation, hoaxes, propaganda, satire, rumors, clickbait, junk news, etc. [22]-[26]. Although standardized definitions are missing, it is generally agreed that misinformation is inaccurate and misleading information that may spread unintentionally, in contrast to disinformation, which is false information spread deliberately to deceive people. Developing a mechanism to defeat the different forms of misinformation and disinformation is required. However, detecting misleading information in its different forms on social networks and social media microblogs presents unique challenges. First, it is written (intentionally or unintentionally) to mislead readers, making it difficult to detect based on content alone. Second, social media data is large-scale, multi-lingual, multi-modal, mostly user-generated, often free of grammatical and spelling conventions, and sometimes anonymous and noisy. Finally, information spreads fast on social media, so detection mechanisms need to flag misleading information early enough to stop its dissemination and its severe repercussions. Detecting such information on social media is therefore an extremely important and technically challenging problem [27]. One can distinguish four main approaches to deal with this issue: content-based methods, network/propagation-based methods, hybrid methods, and source-based methods.
Content-based methods try to identify deception directly through the content [28]. They present several limitations: content is multi-lingual, requiring separate models per language, and it is intentionally written to mislead users by mimicking the truth, so even style-based techniques are ineffective.
Propagation/network-based methods exploit the properties of the news propagation network (the tweet and retweet network) [29] to feed supervised machine learning classifiers for the deception detection task. As they need the complete, or at least a substantial, propagation graph, detection comes very late, with the associated grave repercussions. Hybrid methods [30] combine news content with news propagation network information to predict deception through supervised classifiers. Finally, source-based methods [31] focus on the news producers to identify common patterns of unreliable users.
Considering the above challenges, neither content-based methods nor network/propagation-based methods are the right choices. Therefore, we propose a novel source-based solution where the information propagators are identified through their context, detecting deception indirectly.
To the best of our knowledge, this is the first attempt where complex network measures and user profile-based features are extracted from the propagators of each news article. Each news article, categorized as either legitimate or misleading, is posted/re-posted by many users. These users form a corresponding legitimate or misleading community. The above features are computed for the whole community and for every news propagator participating in it. Supervised machine learning and deep learning models are used for the prediction task. Figure 1 illustrates the major phases of our approach; a brief overview follows.
First, a real-world COVID-19 dataset is constructed. Second, four approaches exploiting user profile features and complex network measures are evaluated at the node and community levels. Third, various models are investigated, and the top four models from the best two levels are selected for feature selection using each model's feature importance. Fourth, the best model is further improved using a final set of features.
The main contributions of this work are as follows.
1. We build a Twitter-based real-world COVID-19 dataset for fake news detection. This dataset includes distinct features: its communities carry follower-following links between re-tweeters and posters in addition to their context. It provides a crucial resource for COVID-19 fake news detection studies and could also be of prime interest for influence estimation and epidemic spreading analysis based on various propagation models.
2. We discuss several fundamental theories that could help the research community develop stronger theory-guided solutions. These theories also informed this study, and a novel theory-driven model is developed. The proposed model is source-based and enriched with hybrid features.
3. We explore complex network measures to identify news labels through the interconnection patterns of their communities. To the best of our knowledge, this is the first attempt to use these features in COVID-19 related fake news detection.
4. We also examine user profile features for the same purpose; only short-listed features are carried into the next phase. Two groups of features are examined at two levels (node level and community level) across four approaches. Eleven ML and two deep learning models are examined for each approach; the two best approaches are kept, and the top four models among them are selected for feature selection using each model's feature importance through mean decrease in impurity (MDI). Using the final set of features, the best models are further improved and evaluated.
5. In this exploratory study, we investigate the effectiveness of both user profile features and complex network measures, exploring different combinations of both feature types at the whole-community level and at the individual level.
6. We identify the dominant complex network measures and user profile-based features of such communities' users.
7. We give recommendations about the best features and the best approaches. To our knowledge, no similar exploratory study has been conducted using source-based methods.
8. We perform a comprehensive comparative study with alternative methods.
The remainder of the paper is organized as follows. Section II presents preliminaries on automatic fake news detection. Section III briefly discusses related work. Section IV reports the composition of the COVID-19 dataset. The detailed methodology of the research is discussed in section V. Feature analysis is conducted in section VI. Section VII discusses the adequacy of various classification approaches to the problem. Section VIII presents fake news classification modeling, including model selection and feature selection. Section IX examines the best two approaches in further detail. The comparative study with the baselines follows in section X. Finally, the conclusion is presented in section XI.

II. LITERATURE REVIEW
Methods are briefly discussed to provide a basic understanding of the major approaches used in deception detection, to highlight their gaps/shortcomings, and to justify the method selected for our solution. There are many surveys on automatic fake news detection in which authors discuss different approaches (e.g., ML features, latent features, graph-, IR-, and NLP-based) with their strengths and weaknesses [19], [32]-[35]. Perspectives and new challenges on who creates fake news, how, and why, how it propagates, and how it could be detected are also presented [36].
One can consider four major categories of fake news detection studies. Their main characteristics follow. 1. Content-based methods: They focus on content only, and no auxiliary information is needed; they do not wait for the content to propagate. These methods are well explored. Exploiting limited information, they are suited to early detection but suffer from limitations of language and domain (e.g., politics, entertainment). They can be further classified as a. knowledge-based, b. style-based/linguistic, and c. stance-based.
a. Knowledge-based methods check the authenticity of news against external sources. They can be expert-based [37], crowd-sourced [38], crowd-sourced sites [39], IR-based [40], or Semantic Web/SPO-based [41]. They are ineffective when news items correspond to new events. Knowledge-based methods are either manual or automatic; automatic methods are scalable, in contrast to manual ones. b. Style-based [48]/linguistic [28] methods (NLP: [42], [43], ML: [44], [45], DL: [46]) are best for identifying intention but fail when the domain changes because each domain differs in style. They are also weak at bot/malicious profile detection because bots masquerade their style as well. It is also very challenging for these methods to adapt to evolving styles when language is informal (e.g., social media microblogs). Detecting malicious information through content alone is very hard because it mimics real news to mislead users. c. Stance-based methods [47] use a dominant mechanism named stance classification. News content propagates in tweets, and the replies to a tweet are used for stance identification: once the news is posted through a tweet, each reply takes a different stance. Four labels are commonly used for news-related stances: Support, Deny, Query, and Commenting (SDQC). A tweet is classified as fake if a majority of its replies carry Deny (D), or Deny with Query (Q), stances. There are many successful studies on fake news detection based on stance classification [47].
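The SDQC majority rule described above can be sketched in a few lines (a minimal illustration; the function name and the exact voting rule are our assumptions, not taken from [47]):

```python
from collections import Counter

def classify_by_stance(reply_stances):
    """Label a tweet from the SDQC stances of its replies.

    Hypothetical rule: the tweet is flagged as fake when Deny (D)
    stances, together with Query (Q) stances, form a majority.
    """
    counts = Counter(reply_stances)
    deny_like = counts["D"] + counts["Q"]
    return "fake" if deny_like > len(reply_stances) / 2 else "real"
```

In practice, the per-reply stance labels themselves would come from a trained stance classifier rather than being given.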
2. Propagation/network-based methods: They use information exposed through news cascades [29], [49]. They may also use custom networks constructed from different system entities. They can cover different variations of misleading information (e.g., rumors, spam). Combining style-based methods with propagation-based methods makes malicious profile detection much easier. Propagation-based solutions are domain-specific (e.g., politics, entertainment, health) because each domain has different propagation patterns. These methods can hardly work before misleading information has fully propagated on social media [19], which means late detection.

3. Hybrid methods:
In addition to content-based and propagation-based methods, hybrid methods are very popular. They combine functionalities of other methods to overcome their limitations. Examples include propagation-based + content-based [30] and content-based + social context-based [50], [51].
4. Source-based methods: Although not much studied, they have high potential. Quite an open area for exploration, they indirectly focus on content through the identification of the sources of misleading information [31]. They can assess suspicious behavior and identify a variety of forms of deceptive information. The common idea behind such methods is that one can identify many common patterns of unreliable users, helping to assess the reliability of unknown users in advance. Since misleading information mostly comes from malicious sources [52], identifying such sources amounts to malicious information detection. Theories such as social validation theory [53] and, most popular, homophily [54] indicate that users in a network/community share similar behaviors or attributes; for example, they tend to post similar news articles. Therefore, on social media, users mostly form connections or groups with like-minded people, resulting in an echo chamber effect [55]. Fake news spreaders/malicious users have quite distinct connectivity patterns: they mostly form denser networks [56] and perform collective actions [57] to create more polarized groups and fully exploit the echo chamber effect. Many theories [58] identify distinct attributes at the user level and at the network/community level as a whole, which further classify source-based methods.

A. MODEL'S APPROACH AND MOTIVATION:
Let us now focus on the essential aspects of the model's motivation. Some recent studies inspired us, leading us towards the proposed direction and enriching the model with their findings. The proposed source-based method combines user profile-based features with complex network measures, exploiting the social media users who post legitimate or misleading content.
Several studies highlight the importance of profile-based features to distinguish between legitimate and misleading content. The exploratory study conducted in [62] shows that implicit features like IsVerified, account age, status count, and favorite count are good clues, and that explicit features like personality, gender, age, and the ratio of followers to following (TFF) are highly discriminating. In [63], the authors develop an efficient machine learning system and recommend implicit features (status count, account age, verified) and explicit features (location, political bias, profile image). The study conducted in [28] concludes that account age, location, bot score, list count, and other post-related context features are discriminating. Related studies highly recommend user-based features such as verified account, numbers of followers and followings, and their ratios, named reciprocity and reputation [64], [65]. In [66], the authors note that no attention has been directed towards modeling the properties of the online communities interacting with the posts; they show that it is essential and obtain exciting results. This is an important motivation for focusing our model on such community modeling [67]. In [68], the authors found that the social network structure and the propagation network carry critical features allowing highly accurate fake news detection. Therefore, the posters' social network context and their propagation network/community are also exploited in our investigations.
Many social psychology theories [53]-[58] strongly support our model's foundations. Additionally, the important profile-based features highlighted in prior studies further motivated the development of the proposed model.

III. RELATED WORK:
In disseminating fake news, internet media has done more harm than mass media. YouTube, WhatsApp, Twitter, Sina Weibo, and Facebook are the five social media sites generating the highest share of fake news [69]. It would be helpful to check the source of messages/posts and isolate phony messages from valid ones. The data that individuals share on web-based media is to a great extent affected by their groups of friends and the structure of connections on the web [70]. Therefore, different post-level and poster profile features can successfully be used for COVID-19 related deception detection [71]. In [72], the authors show that social media users share more counterfeit news on COVID-19 than substantial, fact-based news [73]. The authors also dissect data sourced from different social networking platforms: analyzing forged and groundless publications about COVID-19, they reveal that Twitter carries more unsubstantiated and false news than Sina Weibo [72]. There are seven prevailing subjects in COVID-19 unverified news on media platforms: politics, terrorism, faith, health or fitness, religion, and miscellaneous cultures. Among these subjects, the highest ranked is health-related counterfeit news revolving around medical prescriptions and clinical services [74]. The study in [75] concerns the effect of COVID-19 related news shared over social media; it shows peaks of negative reactions and fear. Deep neural networks have been found to exhibit high accuracy and precision in fake news classification [76]. Deep diffusive neural networks developed for fake news detection, which rely on Recurrent Neural Networks (RNN), have gained popularity as they perform well by investigating associations between news stories, their subjects/topics, and their writers/spreaders [77].
In [70], the authors implement a framework combining Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), a recurrent neural network model, to identify and classify fake news messages in Twitter posts.

IV. COMPOSITION OF COVID-19 DATASET
We took 75 news articles related to COVID-19 (35 false and 40 true) from the Poynter [78], PolitiFact, and Lead Stories websites for this study. These news pieces are categorized into various types of disinformation, such as false news, fake news, incomplete, misleading, and so on, by the respective websites. All COVID-19 news items labeled false, mostly false, pants on fire, misleading, or inaccurate by the fact-checking websites are considered fake news in our study, whereas all news items labeled true or mostly true by the fact-checking websites are labeled true news.
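The harmonization of fact-checking verdicts into binary labels can be expressed as a simple mapping (a sketch; the set constants and function name are ours, and only the verdict strings listed above are handled):

```python
# Verdicts listed in the text, normalized to lowercase (illustrative).
FAKE_VERDICTS = {"false", "mostly false", "pants on fire",
                 "misleading", "inaccurate"}
TRUE_VERDICTS = {"true", "mostly true"}

def binarize_verdict(verdict):
    """Map a fact-checking verdict to the study's binary label."""
    v = verdict.strip().lower()
    if v in FAKE_VERDICTS:
        return "fake"
    if v in TRUE_VERDICTS:
        return "true"
    return None  # unmapped verdicts would be excluded from the dataset
```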

A. GETTING TWEET INFORMATION USING HOAXY
The labeled news items collected from the fact-checking websites are provided to Hoaxy [79], an API that helps users extract fact-checking-based Twitter data. Of its two options, tweet search and article search, article search is utilized in the data extraction process. All the articles matched by keyword search are used to extract data for the specific news items. The most relevant articles are then selected, and their spread on Twitter is visualized. Hoaxy shares limited details of tweets and retweets in accordance with Twitter's data-sharing policy: tweet IDs, retweet IDs, publishing date-time, retweet date-time, author user IDs, retweeter user IDs, and the bot score of each user. The graph extracted by Hoaxy provides a tree-like network structure with a one-level hierarchy.

B. CREATING FOLLOWERS AND FRIENDS RELATIONSHIPS USING TWEEPY
Tweepy [80] provides access for extracting Twitter data. For further experimentation and exploratory analysis, Tweepy is used for the remaining data extraction. We extract the profile features and the lists of followers and followings of all tweet and retweet users. The extracted data is not only helpful in our experiments; it also facilitates analyzing the propagation rate, the influence ratio, and epidemic spread based on the SIR model [81]-[85]. It is essential to mention that the data extracted by Hoaxy does not provide the users' relations. Therefore, Tweepy is used to build user relations as directed edges using the followers and followings information. For each news item extracted from Hoaxy, we derive three types of user relationships (edges), exploiting the followers and followings information:

1) A-A: relationship (edge) between two tweet authors
2) A-R: relationship (edge) between an author and a retweeter
3) R-R: relationship (edge) between two retweeters
The above news-related community network of posters and re-posters is created from these user relationships (edges). We build a directed graph and carry out an exploratory social network analysis based on the network attributes. Many features have been explored to shortlist the initial set of complex network measures (see section V-C). We calculated node-level (propagator) and network-level (community) features for our model construction.
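The directed-graph construction can be sketched with NetworkX, the library used in this study (a minimal sketch; the edge orientation, from follower to followed, is our assumption):

```python
import networkx as nx

def build_community_graph(edges):
    """Build the directed community graph of one news item from
    (user, user) pairs covering the A-A, A-R, and R-R relations."""
    G = nx.DiGraph()
    G.add_edges_from(edges)
    return G

# Toy community: users 2 and 3 follow author 1; user 2 follows user 3.
G = build_community_graph([(2, 1), (3, 1), (2, 3)])
```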

C. CONSTRUCTION OF TRUE AND FAKE NEWS COMMUNITIES
The true news networks and fake news networks have been constructed from the extracted dataset. The relationships between tweeters and re-tweeters provide the edge list for network construction. The directed networks for both the true and fake news articles are loaded using Python's NetworkX library. Each of the 75 news items has a CSV file with all the relevant features/attributes. The same number of graph files are generated using Cytoscape, showing the links between the users who tweeted/retweeted particular news. Sample snapshots of two such graphs, showing the community structure of a real and a fake news item from our dataset, are presented in Appendix A.

V. METHODOLOGY:
This section discusses the methodology of the research in detail. The framework shown in Figure 2 is implemented at two levels of each news propagator's community: at each poster/re-tweeter (node level) and at the whole community (network level). At both levels, we implement user profile-based features and news community/network-related features. These features are then used to train a variety of machine learning and deep learning models. A few complex network parameters have been selected for the community networks of COVID-19 real and fake news articles; these features capture distinct structural patterns found in both types of COVID-19 communities. Each news item has a network of all the users involved in the news either through posting or re-posting; the news community is constructed by connecting each such user through directed follower and friend connections. Our approach is to explore and analyze the dominant complex/social network features and user profile features, and then combine them to train different machine learning and deep learning models.

B. DATASET
A total of 75 news items (35 fake and 40 true) about COVID-19 have been extracted. We managed to extract 33,248 tweets, posted by 18,940 users. Table 1 shows the breakdown of news and tweets. It is worth mentioning that the communities of fake news and real news users (considering authors only) overlap only marginally, with just 1% of users in common. All the other authors are fully involved in either fake or real information dissemination. Almost half of the users are engaged in retweeting/re-posting both types of news; these users are not authors/posters.

C. PROPOSED FEATURES
The proposed set of features includes complex network measures (network-based features) and user profile-based features. We investigate various machine learning and deep learning models to classify news articles as fake or true based on these features.

1) Network-Based Features:
A complex network is a graph with non-trivial topological features that do not occur in simple networks. We compute important complex network properties for each news article of the COVID-19 dataset. Each news item is posted and re-posted by many users, forming the news article's propagator community as a network with many nodes; the users posting or re-posting a news article are present in the graph of that particular news community. The relationships between these tweeters and re-tweeters provide the edge list for network construction (see section IV). The directed networks constructed for the true and fake news articles are loaded using Python's NetworkX library, and all the proposed complex network measures are computed with the same library. We also propose many user profile-based features to categorize the news articles as real or fake; therefore, we extracted the user profile-based features of each user present in a specific news community.
Features are calculated at two levels: the node (propagator) level and the community (network) level.
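At the community level, such measures can be computed directly with NetworkX (a sketch showing an illustrative subset of measures; the full feature list appears in section V-C):

```python
import networkx as nx

def community_measures(G):
    """A few community-level complex network measures for one news item."""
    n = G.number_of_nodes()
    degrees = [d for _, d in G.degree()]
    return {
        "nodes": n,
        "edges": G.number_of_edges(),
        "avg_degree": sum(degrees) / n if n else 0.0,
        "density": nx.density(G),
    }

# Toy directed community with three users and three edges.
m = community_measures(nx.DiGraph([(1, 2), (2, 3), (1, 3)]))
```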

VI. FEATURE ANALYSIS
We present the feature analysis through: 1. mean value distributions, 2. correlations, and 3. Empirical Cumulative Distribution Function (ECDF) plots. Details of these analyses are discussed below. 1. Mean Value Distribution: Real and fake news have different characteristics that can be important when analyzing a tweet. Figure 6 shows the difference in the distributions of network features for both sets of news through ECDF plots. Similarly, an analysis of the mean values of all features under both news labels is presented in table 2. For example, the numbers of nodes and edges and the average degree are much higher in fake news than in real news, whereas the number of communities is higher in true news. The profile-based feature analysis shows some strongly dominating trends: bot scores (bots are considered malicious profiles only) are high in fake news, whereas listed count values are much higher in real news. Column A of table 2 presents the decision based on the mean value distribution.
Finally, it appears that only 17 features allow discriminating between the two types of news. Two network-based features (max. degree centrality and density) and one user profile-based feature (Is Verified) are dropped from the selected list (see table 2 for the analysis summary).
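The mean-value screening can be reproduced with pandas (a sketch; the relative-gap threshold is an illustrative assumption, not the paper's exact decision rule):

```python
import pandas as pd

def mean_gap_selector(df, label_col="label", rel_gap=0.10):
    """Flag features whose per-class means differ by more than
    rel_gap relative to the larger mean."""
    means = df.groupby(label_col).mean()
    gap = (means.max() - means.min()).abs()
    scale = means.abs().max().replace(0, 1)
    return (gap / scale) > rel_gap

# Toy data: one separated feature and one identical feature.
toy = pd.DataFrame({
    "label": ["fake", "fake", "real", "real"],
    "bot_score": [0.9, 0.8, 0.1, 0.2],
    "is_verified": [1.0, 1.0, 1.0, 1.0],
})
selected = mean_gap_selector(toy)
```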

2. Correlation Analysis:
We measure the Pearson correlations between features to determine their degree of association, for both categories of features, and use Python's Seaborn library to visualize them. Figure 3 presents the correlations between network-based features, whereas figure 4 presents the correlation matrix of the user profile features. Higher correlation values appear in darker shades, and lighter colors indicate weaker correlations. Based on these correlations, we selected our best features [87], [88] (see column B in table 2) from both feature categories: when one or more features are correlated with some feature, only that feature is retained in place of its correlated counterparts. In our data, Listed Count vs Followers Count are highly correlated, and Status Count vs Favorite Count are moderately correlated [87], [88]. The features at S.No. 1, 3, 4, 8, and 19 in table 2 are highly correlated with other features and are therefore replaced with their correlated counterparts. The initial feature selection decision, combining correlation and mean value distribution, is shown in the decision column (the last column) of table 2. All retained features are selected in column A (mean value distribution) or column B (correlation analysis); three features are dropped because they are not selected in the mean value distribution analysis.
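The correlation-based pruning can be sketched with pandas (a greedy rule we use for illustration; the threshold and the exact replacement choices in the paper may differ):

```python
import pandas as pd

def drop_correlated(df, threshold=0.9):
    """Keep each feature only if its absolute Pearson correlation with
    every already-kept feature is below the threshold."""
    corr = df.corr(method="pearson").abs()
    keep = []
    for col in corr.columns:
        if all(corr.loc[col, k] < threshold for k in keep):
            keep.append(col)
    return keep

# Toy data: listed_count is perfectly correlated with followers_count.
toy = pd.DataFrame({
    "followers_count": [1.0, 2.0, 3.0, 4.0],
    "listed_count":    [2.0, 4.0, 6.0, 8.0],
    "bot_score":       [1.0, 0.0, 1.0, 0.0],
})
kept = drop_correlated(toy)
```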

3. Exploratory Data Analysis: ECDF Plots:
Python libraries such as statsmodels offer an ECDF class to fit an empirical cumulative distribution function and compute cumulative probabilities for data attribute selection; this allows efficiently plotting the ECDF and applying percentile thresholds for data exploration. Figure 5 presents ECDF plots of our user profile features, and figure 6 shows ECDF plots of the network-based features for both the true and false news labels. The majority of the 17 features previously shortlisted through mean value distribution and correlation are also discriminating in the ECDF plots. It also appears that the network features are slightly more dominant than the user profile-based features.
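An ECDF can also be computed by hand in a few lines (a plain-Python stand-in for the library class, for illustration):

```python
def ecdf(values):
    """Return sorted values and cumulative probabilities P(X <= x)."""
    xs = sorted(values)
    n = len(xs)
    ps = [(i + 1) / n for i in range(n)]
    return xs, ps

# Plot, e.g., with matplotlib: plt.step(*ecdf(bot_scores), where="post")
xs, ps = ecdf([0.3, 0.1, 0.2, 0.4])
```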

VII. CLASSIFICATION APPROACHES:
We conduct a complete exploratory study implementing both categories of features. First, we explore the community level, where each news article is associated with its community comprising all its posters and re-tweeters with their interconnections through follower-followings. Second, we consider the individual/node level, where each node represents a poster or a re-tweeter. We investigate four approaches through different combinations of levels and feature categories; two of them are competitive, and these receive further exploration. We use the features retained from the feature analysis phase (a total of 17 across both feature categories). Machine learning (ML) algorithms from different classes are applied: SVM, linear regression, Random Forest, AdaBoost, decision tree, ensemble learning, CatBoost, LightGBM, gradient boosting, KNN, and Naive Bayes. All these models are evaluated under the following four approaches:
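The evaluation loop can be sketched with scikit-learn on synthetic data (a sketch only: the data is a random stand-in for the 17 selected features, and just three of the eleven model classes are shown):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 17 shortlisted features.
X, y = make_classification(n_samples=400, n_features=17, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(random_state=0),
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
}
auc = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```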

A. APPROACH I: NODE LEVEL-USER FEATURES
In this approach, only user profile-based features are implemented at each participating node within the community of a news article, where nodes are either posters or re-posters. The same set of ML models is trained and tested. The results of the experiments related to Approach I are shown in table 3. This approach performs poorly, with a best AUC score of 0.70 obtained by EVC and KNN.

B. APPROACH II: NODE LEVEL-COMBINED FEATURES
In this approach, we apply all network-based and user-based features at each participating node within the community.

E. SUMMARY OF APPROACHES COMPARISONS:
The node-level combined feature approach (Approach II) performs best. The community-level approach using only aggregated network features (Approach IV) is the second-best performer. In the following, we only consider these two approaches for further analysis, e.g., best model and feature selection.
The following essential findings help interpret the outcome of the two successful approaches. First, network-based features are more discriminating than user profile-based features (see figure 6). Second, in the particular case of COVID-19, the disease is poorly understood, and the authenticity of a message is known only to its author/producer, while it is challenging to assess for the receiver and re-tweeter. Consequently, half of the users are involved in re-tweeting/forwarding both types of news, fake and true, which creates an overlap between the two communities. The remaining half are either pure true-news propagators or pure fake-news propagators. They are the authors/posters of the specific news items, and their feature values differ. In a community-based approach, where these two separate groups of users are combined, their user profile-based features are aggregated, reducing their discriminating ability. These aggregated features produce many similar values for both communities, affecting the performance of the ML models. Therefore, Approach IV, based on network features alone, gives better results (see table 6) than Approach III, which combines user profile-based features with network-based features at the community level (see table 5). Approach II produces the best results, with the features applied at the node level (see table 4). We also trained the deep learning models RNN and CNN with the Approach II and IV configurations, but the results (see table 7) are very similar to those of our best-performing machine learning models presented earlier (tables 3, 4, 5, 6).

VIII. CLASSIFICATION MODELS:
This section discusses the candidate Machine Learning and Deep Learning models used for solving the required supervised classification problem.

A. MACHINE LEARNING CLASSIFICATION MODELS
We test the proposed set of 17 features, classified in two categories, under four approaches. The same learning algorithms are used in each approach, and the best approach with its best-performing features is chosen. The machine learning models used in all experiments are discussed in this section. We apply algorithms from different classes, including linear, non-linear, tree-based, non-tree-based, non-parametric, probabilistic, large-margin, and Ensemble (boosting, voting) models: SVM, Logistic Regression, Random Forest, AdaBoost, Decision Tree, Ensemble learning, CATBoost, LightGBM, Gradient Boosting, KNN, and Naive Bayes. All these models are evaluated under the four approaches, and their results are reported in tables 3, 4, 5, and 6, respectively. Logistic Regression is applicable here since the dependent variable is binary; its main application is to predict and compute the probability of success. For the Logistic Regression classifier, we use a random state of one and set the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm as the solver. A Support Vector Machine creates a hyperplane that separates the data points into two distinct classes; SVM maps the training data into kernel space [89]. Out of many possibilities, we use the linear kernel for the SVM classifier. Naive Bayes classifiers, which are quite scalable, require a number of parameters linear in the number of features. The k-Nearest Neighbors algorithm stores all available cases and classifies new cases using a distance function. Random Forest is a meta-estimator that fits several decision tree classifiers on various sub-samples of the dataset; averaging improves the predictive accuracy and controls overfitting.
The maximum depth of the trees is one, and the split criterion is Gini impurity. Decision Trees classify instances based on feature values: each node of the tree represents a feature of the instance to be classified, and each branch represents a value the node can take. The EVC classifier is used with three estimators (Logistic Regression, Decision Tree, and SVM) to predict the class. Gradient Boosting classifiers are a group of machine learning algorithms that combine many weak learners into a strong predictive model; fifty estimators with a depth of 2 are used. LightGBM handles huge amounts of data with ease; it uses the gradient boosting framework with a tree-based algorithm that grows trees vertically, with a default learning rate of 0.1. CATBoost is a top-rated classifier that requires little parameter tuning and offers a scalable GPU implementation; it helps reduce overfitting and increases the accuracy of the model.
As per the framework, we divide the dataset into two parts corresponding to two phases. In the training phase, the model learns how to classify data (i.e., a model is constructed) from the training set. In the testing phase, the trained model predicts on unseen examples that were not part of the training set. The predicted class labels are then used for performance evaluation, based on several measures.
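A minimal sketch of this train/test protocol, using a few of the classifiers and hyper-parameters named above (lbfgs solver with random state one, linear-kernel SVM, Gradient Boosting with 50 estimators of depth 2). The synthetic data stands in for the real 17-feature tweet dataset, and the 25% test split is an assumption:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the 17-feature tweet dataset.
X, y = make_classification(n_samples=400, n_features=17, n_informative=8,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1)

models = {
    "logreg": LogisticRegression(solver="lbfgs", random_state=1, max_iter=1000),
    "svm": SVC(kernel="linear", probability=True, random_state=1),
    "rf": RandomForestClassifier(random_state=1),
    "gbc": GradientBoostingClassifier(n_estimators=50, max_depth=2,
                                      random_state=1),
}
auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)          # training phase: build the model
    scores = model.predict_proba(X_test)[:, 1]
    auc[name] = roc_auc_score(y_test, scores)  # testing phase: evaluate
```

Each model is fitted on the training split only and scored by AUC on the held-out test split, mirroring the two-phase framework described above.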

B. EVALUATION METRICS
Since the dataset is slightly imbalanced, we use the AUC to validate the performance of the machine learning algorithms; AUC is statistically more consistent and more discriminating than accuracy [90]. Other evaluation metrics, namely accuracy, precision, recall, and F1 score, are also calculated, but only accuracy and AUC are used to report the results [91]. The formulae for these evaluation measures follow.
The Receiver Operating Characteristic (ROC) curve plots the performance of a classification model using two parameters, TPR vs. FPR, at all classification thresholds; TPR and FPR are defined below. The Area Under the ROC Curve (AUC) measures the entire two-dimensional area underneath the ROC curve.
True Positive Rate (TPR) = TP / (TP + FN), where TPR = Sensitivity = Recall
False Positive Rate (FPR) = FP / (FP + TN)
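These count-based measures can be computed directly from the confusion counts; the following self-contained sketch uses toy labels (1 = fake news, 0 = true news):

```python
def binary_metrics(y_true, y_pred):
    """TPR/FPR, precision, F1, and accuracy from raw confusion counts,
    matching the formulae above (TPR = sensitivity = recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn)            # recall / sensitivity
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * tpr / (precision + tpr)
    accuracy = (tp + tn) / len(y_true)
    return {"tpr": tpr, "fpr": fpr, "precision": precision,
            "f1": f1, "accuracy": accuracy}

# Toy example with 8 tweets: 3 true positives, 1 false negative,
# 1 false positive, 3 true negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
m = binary_metrics(y_true, y_pred)
```

Sweeping a score threshold and collecting the resulting (FPR, TPR) pairs yields the ROC curve, whose area is the AUC used in the results.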

C. DEEP LEARNING CLASSIFICATION MODELS
After testing the data with several conventional supervised methods, we test deep learning models, which often yield strong results. The two most popular deep learning methods, Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN), are implemented to see how well our data fit these models. We use Keras as the neural network library; it provides easy and reliable high-level APIs and follows best practices to reduce the cognitive burden on users. Unlike ordinary neural networks, RNNs rely on prior output knowledge to predict upcoming information, which is extremely helpful with sequential data. An LSTM network is a recurrent neural network with LSTM cell blocks in place of the usual neural network layers. Regarding the CNN model, the architecture is shown in figure 8 with the number of parameters; comparing the feature-learning part of the network with the fully connected part, most of the parameters come from the fully connected component. In terms of computing cost, CNN is very effective. Hyper-parameters for both RNN and CNN are optimized with a grid search, and the optimal values are derived (see figures 8, 10). The parameters are set as follows for both models: batch size = 45, epochs = 10, learning rate = 0.1, loss = 'mse', activation function = 'ReLU', optimizer = 'adam', and a dropout regularization rate of 20%. In the CNN, we place a dropout layer after the embedding layer to improve the generalization of the model, followed by a 1D convolution layer. We further add a max-pooling layer to reduce the dimensionality of the input while preserving the depth and avoiding over-fitting to the training data. The pooled feature map is then flattened into a single column that is passed to the fully connected layer.
Finally, a dense layer adds the fully connected layer to the neural network. In the RNN model, we add an embedding layer after the input layer. The ReLU function is used as the activation function on the hidden layers, and a dropout layer is added to improve the generalization of the model. Batch normalization re-centers and re-scales the layer inputs; it makes the neural network faster and more stable and reduces the number of epochs needed to train a deep neural network. In this model, we use an LSTM layer. Since regular dropout cannot regularize the activations without decreasing the learning rate, we use a spatial dropout layer instead. Finally, a dense layer adds the fully connected layer to the neural network. The main objective is to maximize the accuracy of the model's predictions. The architecture diagram for the RNN is shown in figure 7 and for the CNN in figure 8. The layer configurations of the RNN and CNN models are summarized in figures 9 and 10, respectively.
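The grid search over these hyper-parameters can be sketched with scikit-learn's ParameterGrid. The search ranges below are assumptions placed around the reported optima (batch size 45, 10 epochs, learning rate 0.1, 20% dropout), not the paper's actual grid:

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical search ranges bracketing the reported best values.
grid = ParameterGrid({
    "batch_size": [32, 45, 64],
    "epochs": [5, 10, 20],
    "learning_rate": [0.01, 0.1],
    "dropout": [0.2, 0.5],
})
combos = list(grid)
# Each combination would be used to build, train, and score one RNN/CNN
# variant; the best-scoring configuration is then retained.
```

With 3 x 3 x 2 x 2 values, 36 candidate configurations are trained and compared before fixing the final hyper-parameters.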

D. MODEL SELECTION
We consider the two shortlisted approaches for further exploration: Approach II (network-based and user profile-based features) and Approach IV (network features). For each approach, the best-performing features must be identified; these features are then used to train different machine learning and deep learning models for better optimized performance. Identifying the top-performing models is the precursor to best feature selection, and this selection is made in this section. We determine the optimal set of hyper-parameters by testing the performance of our models on the training set for different parameter combinations. After tuning the models with their optimal hyper-parameters, the four best-performing models across the two selected approaches are CATBoost, LightGBM, AdaBoost, and Gradient Boosting.
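The hyper-parameter tuning step can be sketched with scikit-learn's GridSearchCV. GradientBoostingClassifier stands in here for the boosting family (CATBoost, LightGBM, AdaBoost may not be interchangeable in practice), and the parameter grid and synthetic data are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the 17-feature dataset.
X, y = make_classification(n_samples=300, n_features=17, random_state=1)

# Cross-validated search over an illustrative parameter grid,
# scored by ROC-AUC as in the evaluation protocol above.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=1),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="roc_auc",
    cv=3,
)
search.fit(X, y)
best_params = search.best_params_
best_auc = search.best_score_
```

The same search pattern, repeated per model and per approach, yields the tuned candidates from which the four best performers are chosen.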

E. FEATURE SELECTION
In this section, we analyze the importance of features using the best-performing machine learning models. It is important to know the final set of highly discriminating features: using only the best set of features, the models can be tuned for much better performance. We use Mean Decrease Impurity (MDI) for feature importance and feature engineering. To evaluate it, we use the feature importance module of the scikit-learn library, which computes the relative weight of each feature in the model; the weights sum to 1. We had 20 features, reduced to 17 after feature analysis. These 17 features, comprising both network-based and user-based features, are used to train the machine learning models under the specified approaches. Among all ML models, only the top four are retained, with Approaches II and IV, for best feature selection using Mean Decrease Impurity. Table 8 shows the top features of the four best-performing models with respect to the approaches. Remember that Approach II includes both types of features, whereas Approach IV only uses network-based features; the best features are therefore relative to their approach. A few features can be eliminated from the best features list because they do not appear in the top essential feature list of any of the best models (favorite count, number of nodes, max PageRank centrality). The user profile-based feature user account age is kept in the selected feature list as a particular case: it is ranked 11th by two of the best models and therefore does not appear in the top-10 list of table 8. The other two models rank max closeness centrality 11th, which LightGBM already selects in its top-10 list. The remaining 14 features are the best features for model tuning and performance improvement.

FIGURE 7. Architecture diagram of RNN: In this RNN, an embedding layer is added after the input layer. Embeddings are used for purposes such as finding nearest neighbors, and low-dimensional representations are also learned. A ReLU layer is added, where the ReLU function acts as the activation function on the hidden layers. A dropout layer is added to improve the generalization of the model. Batch normalization is used to train very deep neural networks, making the network faster and more stable and reducing the number of training epochs. The long short-term memory (LSTM) layer is added; unlike standard feedforward neural networks, LSTM has feedback connections and is best suited for sequential data. Since regular dropout is unable to regularize the activations without decreasing the learning rate, a spatial dropout layer, which drops entire 1D feature maps instead of individual elements, is used before the LSTM. Finally, dense layers are added for the fully connected layer and label predictions.

FIGURE 8. Architecture diagram of CNN: A normal dropout layer is used after the input and embedding layers in this CNN. Embeddings are used for purposes such as finding nearest neighbors, and low-dimensional representations are also learned. The dropout layer improves the generalization of the model; then a 1D convolution layer is added to extract high-level features. To highlight the most prominent features, a max-pooling layer reduces the dimensionality of the input while preserving the depth and avoiding over-fitting to the training data. The same 1D convolution and max-pooling pair is added again. The pooled feature map is then converted to a single column passed to the GRU (gated recurrent unit) layer; the GRU network replaces the CNN's fully connected layers, transforming the classification task into a sequence task. Finally, dense layers are added for the fully connected layer and label predictions.
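The MDI-based ranking used in this section can be sketched with scikit-learn's `feature_importances_` attribute; the synthetic 17-feature data and the RandomForest choice are illustrative stand-ins for the actual models and features:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# 17 synthetic features stand in for the network + user profile features.
X, y = make_classification(n_samples=400, n_features=17, n_informative=6,
                           random_state=1)
model = RandomForestClassifier(random_state=1).fit(X, y)

# Mean Decrease Impurity: scikit-learn normalizes the weights to sum to 1.
importances = model.feature_importances_
top10 = np.argsort(importances)[::-1][:10]  # indices of the top-10 features
```

Repeating this per model and intersecting the top-10 lists reproduces the kind of cross-model ranking summarized in table 8.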

IX. BEST PERFORMING APPROACHES II AND IV:
Following the best feature selection reported in subsection VIII-E, 14 features are retained (eight network-based and six user-based). These selected features are used for Approach II, while the eight network-based features are used for Approach IV, to further optimize the models' performance. This section discusses the results of the best-performing approaches after a final tuning with these features.

TABLE 8. The best model features. For the node-level combined feature Approach II: top 10 features across the four best-performing models. Likewise, for the community-based network feature Approach IV: top 7 essential features across the same best-performing models. Mean Decrease Impurity (MDI) is used for feature importance. Avg degree, avg clustering coefficient, and max eigenvector centrality have the highest importance in all models. Max closeness centrality only appears in the LightGBM model's list. The dominating user features are user listed count and user bot score, found in the best three models. Features such as favorite count, number of nodes, and max PageRank centrality could be eliminated because they are not in the top important feature list of any of the best models.

A. MODEL RESULTS:
We determine the optimal set of hyper-parameters by testing the models' performance on the training and test sets for different parameter combinations. Table 9 reports the parameter values associated with the best results for the different algorithms using Approach IV; table 10 shows similar results for Approach II. These results are obtained using the best features selected and discussed in section VIII-E. One can observe that Gradient Boosting achieves the highest test accuracy score in Approach IV: 88% (see table 9). CATBoost performs best in Approach II, with a test accuracy score of 98.4% and a ROC-AUC score of 0.98 (see table 10). Figure 11 shows the testing accuracies of all machine learning and deep learning models following Approach II (the best approach).
We also explore Deep Learning models using the best approaches, II and IV; the results reported in table 7 are in the same vein. Approach IV relies on network-based features, whereas Approach II uses both user and network-based features to optimize the ML models. As Approach II outperforms its alternatives, we discuss in detail only the models in the Approach II context. Table 10 reports the results of the investigations using the 14 best features mentioned in section VIII-E. KNN and Naive Bayes exhibit accuracies of 76% and 79%, respectively. SVM and Logistic Regression reach 72.4% and 81% accuracy, respectively. Tests on the Decision Tree and EVC models show slightly better performance, with an accuracy of 80% for both algorithms. A performance enhancement of 6% over the previous results is observed for Random Forest. AdaBoost was then tested for further enhancement and performs much better, with an accuracy score of 91%. Tests with the most recent boosting algorithms (Gradient Boosting, LightGBM, and CATBoost) have also been conducted. These algorithms perform well, with further enhancements: the testing accuracy reaches 97.92%, 98.1%, and 98.4% for Gradient Boosting, LightGBM, and CATBoost, respectively. Table 10 shows the overall comparison of machine learning algorithms for Approach II; CATBoost presents the highest accuracy and ROC-AUC score. State-of-the-art deep learning techniques are also used: RNN and CNN achieve accuracies of 98% and 97%, respectively, as shown in figure 12 together with a comparison of all machine learning algorithms. One can observe that CATBoost and RNN are the most effective models in the combined features approach.

FIGURE 11. AUC-ROC curves: There are four approaches, and the same set of eleven ML models is tested in each approach. The best-performing ML model in Approach IV (the second-best among the four approaches) is Gradient Boosting, with 92% accuracy. The top-performing ML model, with 98% accuracy, is CATBoost in Approach II, the best score among the four approaches.

X. COMPARISONS OF FINAL PROPOSED MODEL WITH BASELINES
The model implemented in Approach II is the best among all approaches; further gains are obtained through best feature selection. We consider it the final proposed model. In this section, its performance is compared with baseline deception detection solutions. The baseline algorithms cover the various categories presented in the fake news taxonomy of section II: 1) a propagation/network-based model utilizing information exposed through news cascades [29]; 2) a hybrid model [30] exploiting both news content and news propagation network information to predict deception; 3) a content-based model [28] fully focused on linguistic information from news contents for early fake news detection; 4) a style-based model [48] that depends on news contents through the extraction of writing style and can detect fake news in the early phase (latent representation methods, e.g., [92], [93], are not considered because their performance is not competitive); and 5) a source-based model [62] that indirectly focuses on propagators to detect deceptive contents. The performance of the best proposed model is compared with the baselines on three datasets (PolitiFact, GossipCop, COVID-19). Table 11 reports the results of the comparative analysis: the proposed model outperforms all its alternatives regardless of their type, i.e., content-based (linguistic), propagation/network-based, hybrid, style-based, and source-based models.
One can see that the network/propagation-based model and the content-based (linguistic) model are far behind, with accuracy scores ranging from 64% to 80%. The style-based and source-based models perform better, and the hybrid model is the best alternative, although content-based models are better suited for early detection. Remember that in the final phase of experimentation, conducted on the publicly available PolitiFact and GossipCop datasets [28] and our COVID-19 dataset, only the best set of features is used, and a supervised classifier is trained for deception detection. It is very encouraging that the proposed model outperforms all baselines. It is trained on datasets covering three domains (politics, entertainment, and the COVID-19 pandemic) and is implemented with minor modifications to better suit the political and entertainment domains. Table 11 shows that the accuracy scores of the proposed model are much higher than all the others: 98.4% on the COVID-19 dataset, 92% on PolitiFact, and 91% on GossipCop. The results show that the proposed deception detection model outperforms all baselines and performs well in different domains. It also works with little training data (i.e., 75 news items) and is well suited for COVID-19 related deception detection.

XI. CONCLUSION
This paper tackles the novel and challenging problem of fake news detection during the COVID-19 pandemic on social media microblogs. While many studies deal with fake news detection in politics and entertainment, there are few reports on the COVID-19 pandemic. Four main approaches allow coping with this issue. Content-based methods focus on the structure of the information itself. Propagation/network-based methods exploit the propagation network. Hybrid methods combine features of both approaches. Finally, recently introduced source-based methods rely on user information to identify fake news; this type of method can overcome many limitations of earlier solutions. We therefore propose a source-based method that uses information about the source and the propagators of the information. The model detects fake news by analyzing the context of its spreaders' community on social networks or social media microblogs. The solution analyzes the connectivity patterns of such communities together with their users' profile features. We investigate four approaches combining features of the propagation network with user profile-based features: two at the node level (user features alone; user and network features) and two at the community level (aggregated user and network features; aggregated network features). The experimental results show that two approaches are very effective in fake news detection. The first, Approach II, combines network and user profile features at the user level; the second, Approach IV, uses network-based features at the community level. In this exploratory study, we perform a comparative analysis of different machine learning and deep learning models. The best results are produced by Approach II, with the features applied at the node level. CATBoost and RNN are the most effective algorithms, with an AUC score of 98%; all non-linear models perform better than the others.
The performance of the deep learning models is similar to that of modern machine learning models. The best set of features is also identified and used for further model optimization. A comprehensive comparative study with baselines covering the full spectrum of alternative approaches (content-based, propagation/network-based, hybrid, and source-based) is also conducted using three datasets from various domains (COVID-19, politics, entertainment). The results are very encouraging: the proposed approach always produces the best results. Additionally, the proposed model works with little training data. This work paves the way for further development of stronger solutions combining network and user features.

XII. ACKNOWLEDGEMENT
We thank Sumaiyah Zahid (National University of Computer & Emerging Sciences, Karachi, Pakistan) for data fetching and compilation.

DR. RAUF AHMED SHAMS MALICK received his PhD at the University of Karachi. He has been a visiting scholar at NIG (Japan) and UCLA (USA). He has founded several companies with state-of-the-art products related to social media, location-based analytics, and organizational networks. He is currently involved in complex systems research, pursuing problems in the areas of biological networks, networked economics, and personality traits, and he has a distinguished background in designing novel solutions for complex systems. He is currently affiliated with the Department of Computer Science, National University of Computer and Emerging Sciences, as an Assistant Professor, continuing research in computer science, complex networks, social computing, bioinformatics, and integrated systems. He has authored several articles along with chapters in different books.
DR. MUHAMMAD SABIH received the B.E. degree in industrial electronics from the IIEE-NED University, Pakistan, in 2000, and his M.S. and PhD degrees in Systems Engineering from KFUPM, Dhahran, Saudi Arabia, in 2009 and 2014, respectively. During his research period at KFUPM, he worked on several applied research collaborations between KFUPM and MIT. His data-driven research work expanded into a funded industry-academia collaboration project between KFUPM and Yokogawa-Saudi Arabia and turned into a US patent. He also worked as an Algorithm Specialist, developing anomaly detection algorithms in Python on pipeline inspection data at the Research and Technology Center (RTRC) of the German ROSEN Group at Dhahran Techno-Valley (DTV), Saudi Arabia. He has been working in the field of Computer and Electrical Engineering since 2009 and is a professional member of the International Society of Automation (ISA). He is currently an Assistant Professor at DHA Suffa University, actively engaged in developing solutions from industrial data utilizing machine learning methods for estimation, modeling, and compensation. He has one US patent and around 10 peer-reviewed papers. His current research interests include Industry 4.0, data science, modeling, and estimation for real-world problems.
DR. HOCINE CHERIFI has been a professor of Computer Science at the University of Burgundy, Dijon, France, since 1999. He completed his Ph.D. degree at the National Polytechnic Institute, Grenoble, in 1984. Prior to moving to Dijon, he held faculty positions at Rouen University and Jean Monnet University, France. He has held visiting positions at Yonsei University, Korea; the University of Western Australia, Australia; National Pingtung University, Taiwan; and Galatasaray University, Turkey. His recent research interests are in computer vision and complex networks. He has published more than 200 scientific papers in international refereed journals and conference proceedings. He has held leading positions (General Chair, Program Chair) in more than 15 international conference organizations and has served on more than 100 program committees. He is the founder of the International Conference on Complex Networks and their Applications. He is currently a member of the editorial boards of Computational Social Networks, PLOS One, IEEE Access, Journal of Imaging, Complex Systems, Quality and Quantity, and Scientific Reports, and the Founding Editor-in-Chief of the Applied Network Science journal.

The community structures are completely visible, which fully supports our research hypothesis. The real community graph has fewer communities of large size, and the majority of them are not connected, whereas the fake community graph has more communities that are smaller in size, and most of them are also connected. The fake community structure is far more spread out than the real one.