Assessing Information Quality of Wikipedia Articles through Google’s E-A-T Model

Along with the emergence of Web 2.0, User Generated Content (UGC) is becoming increasingly important for knowledge sharing. Wikipedia, the world's largest community-based collaborative encyclopedia, is also one of the biggest UGC databases in the world. Because of its open and collaborative nature, Wikipedia faces a significant problem of Information Quality (IQ). When carrying out attacks such as link spamming, malicious users take advantage of Wikipedia's popularity on the WWW. As a result, Wikipedia is generally not recommended for academic work. There are, however, some articles that are rich in both information and quality. Existing approaches for assessing Wikipedia's IQ involve statistical models and machine learning algorithms; however, these models do not produce satisfactory results. In this study, a novel theoretical model based on Google's E-A-T framework is introduced to assess Wikipedia's IQ. The model comprises three IQ constructs: Expertise, Authority and Trustworthiness. Based on the empirical findings and study results, a set of IQ dimensions that influence the above three IQ constructs, as well as 45 IQ attributes to measure the IQ dimensions, were identified. The IQ attributes were automatically and inexpensively extracted from the content and meta-data statistics of Wikipedia articles using a Selenium 3.14 web automation script. A sample of 2000 articles comprising 1000 Featured Articles (FA) and 1000 non-FA articles from six WikiProjects was used for the data analysis. The proposed model was compared with three previously published models in terms of classification and clustering accuracy. It achieved classification and clustering accuracies of 95% and 93% respectively, a substantial improvement over the existing models. Furthermore, an average inter-rater agreement of 84% was observed. Thus, the proposed model's effectiveness is fairly validated by this extensive experiment.
This study contributes to the related knowledge area by introducing a novel framework to assess Wikipedia articles’ IQ. The study’s limitations include the domain specificity of the chosen dataset and focusing solely on the English language. However, the results can be generalized by improving the dataset by size and replicating the study for the other domains and languages supported by Wikipedia.


I. INTRODUCTION
Wikipedia stands as the world's largest and most popular community-based collaborative multilingual encyclopedia to date. As of October 2021, it had over 54 million articles in over 300 languages [1]. Wikipedia has become the default site for many internet users looking for information. The content of traditional encyclopedias is created by the relevant domain experts, and therefore its Information Quality (IQ) meets the highest standards. The content of collaborative platforms like Wikipedia, on the other hand, is improved with the help of numerous contributors. These contributors can be either registered or anonymous users. As a result, Wikipedia's IQ has always been an open topic of discussion.
Malicious users exploit Wikipedia's popularity on the WWW and its open editing model when mounting attacks [2]. Wikipedia vandalism is one such issue. The Wikipedia community has defined vandalism as "any change made to content to compromise its integrity" [3]. Insertion of obscenities and crude humor within the content is one such major attack. This adversely affects not only Wikipedia's integrity but also the image of the relevant personnel or organization which the content is about [4]-[6]. Another type of attack is link spamming, defined as the placement of inappropriate and promotional external links in Wikipedia materials [2]. It can drive significant traffic to a landing site and result in monetary benefits to the spammer. Other sorts of vandalism include blanking the page or removing the content, changing the article's formatting, and adding offensive edit summaries. Furthermore, certain facts are not cited properly, the references' destination links are broken or unavailable, and images, graphs, and maps placed within the content lack proper copyright status.
As a result, academics advise students not to use Wikipedia as a research or academic information source [7], [8]. Furthermore, several universities and secondary schools have policies prohibiting students from using Wikipedia in their academic writings [9]. It is worth noting, however, that not all Wikipedia content is unreliable or lacking in quality standards. Even if Wikipedia suffers from IQ issues in the big picture, certain articles are rich in information and quality. Because of Wikipedia's overall IQ problem, such high-quality, credible articles are also overlooked.

A. WIKIPEDIA'S INFORMATION QUALITY
The term information is defined as "useful data that have been processed in such a way as to increase the knowledge of the person who uses the data" [10]. As defined by Gustavsson and Wänström [11], IQ is the "ability to satisfy stated and implied needs of the information consumer". In other words, it is the "measure of the value which the information provides to the user of that information" [12]. IQ is perceived subjectively. Britannica [13] has defined the encyclopedia as "a reference work that contains information on all branches of knowledge or that comprehensively treats a particular branch of knowledge". Wikipedia is a collaborative community-based encyclopedia. Unlike a typical encyclopedia, which is compiled by domain experts, a community-based encyclopedia is compiled by a broad group of people who may or may not be domain experts. It is therefore challenging to produce accurate and reliable content in a collaborative encyclopedia. Accordingly, IQ of community-based collaborative encyclopedias could be defined as providing more reliable, accurate and comprehensive information on many knowledge areas in a way that a layman can comprehend and understand.
The English Wikipedia currently has 6,291,946 articles, with an average of 594 new articles every day [14]. Quality on Wikipedia, however, is determined by the human judgment of a limited number of experts. The existing quality grading system tags articles into seven ordinal classes, from best to worst: Featured Articles (FA), A-Class, Good Articles (GA), B-Class, C-Class, Start, and Stub [15]. The FA class is considered the highest quality class among these. However, it is obvious that the community cannot constantly monitor and evaluate the quality of Wikipedia articles against the numerous edits and revisions made every day. As a result, a fully or semi-automated method of evaluating IQ in Wikipedia is required to facilitate Wikipedia's collaboration.
Prior studies [16], [17] have presented diverse quality determinant factors and theoretical frameworks for assessing Wikipedia content's IQ. However, the complexity and preciseness of certain frameworks and algorithms do not deliver sufficient classification performance. Certain methods are not automated; in particular, the data collection phase requires a substantial amount of time and effort. Furthermore, proper attempts to investigate how certain facts in Wikipedia articles affect IQ, such as content authority, verifiability, and maturity, are lacking [18]. Therefore, this research aims to introduce a novel theoretical model based on Google's E-A-T quality framework [19] to better classify Wikipedia articles by quality. Based on an extensive literature review and evaluation process, the authors define E-A-T, i.e., Expertise, Authority and Trustworthiness, as the key constructs in assessing the IQ of Wikipedia articles. Thus, this study attempts to understand (1) the influencing IQ factors for the E-A-T model and (2) the reasons the E-A-T model yields a better quality assessment for Wikipedia than previous algorithms.
Accordingly, this paper's major contributions can be summarized as follows. Firstly, the study introduces a novel theoretical model based on Google's E-A-T framework to assess the IQ of Wikipedia. Expertise, Authority and Trustworthiness are presented as the novel constructs for Wikipedia's IQ assessment. Secondly, a set of IQ dimensions that influence the E-A-T constructs was identified. Thirdly, a comprehensive set of 45 IQ attributes (14 new and 31 existing attributes) was presented to measure each dimension. To extract features from Wikipedia, an efficient automated web scraping strategy was used. The scraping strategy is hybrid, with features extracted from both article content and meta-data. The dataset was analyzed using a statistical analysis followed by a machine learning approach. A total of 2000 articles were extracted from six WikiProjects for the dataset. Afterwards, the model was thoroughly evaluated in various ways to ensure its validity. Assessing the IQ of Wikipedia articles would be valuable for both readers and collaborators (reviewers, authors and editors). Since Wikipedia publishes millions of articles, it might be difficult for readers to extract high-quality information; therefore, directing people towards high-quality articles would be beneficial. Furthermore, because manually assessing each article is impractical, reviewers would benefit from an automated process for identifying flaws in the articles. As a result, this would aid in achieving more efficient knowledge collaboration and strengthening the role of online communities.
The rest of the paper is organized as follows. Section 2 is about the literature review. The adopted methodology is explained in Section 3. Section 4 is about data analysis and results, and Section 5 is about Evaluation. Finally, Section 6 and Section 7 are about the discussion and conclusion of the study, respectively.

II. LITERATURE REVIEW
The literature review is discussed under existing approaches for IQ assessment of Wikipedia and the chosen theoretical model, i.e., E-A-T.

A. EXISTING APPROACHES FOR WIKIPEDIA IQ ASSESSMENT
The literature has presented three major approaches for determining Wikipedia's IQ. These are: (1) the meta-data-based approach, which exploits the meta-information of the articles such as edits, editors and revision history (the studies presented in [20]-[24] are some examples); (2) the content-based approach, which considers content-related features such as article length in words [25]-[28]; and (3) the hybrid approach, a combination of the above two. Studies such as [17], [29]-[31] have adopted this hybrid approach. Based on these approaches, various theoretical frameworks, formulas, and statistical models have been proposed to measure Wikipedia's IQ. Table I describes these previous works and their limitations. Most of these approaches require extensive human effort in the feature extraction process. Machine learning models, such as K-Nearest Neighbour (KNN) and Support Vector Regression (SVR), have, however, been adopted by [29], [32], [33].
IQ is a multi-dimensional construct [34], [35] and thus, aspects of IQ can be split into various dimensions which can be measurable [13]. Accordingly, a set of measurable quality attributes together represent a single IQ dimension. Drawing from Google's E-A-T model and following the hybrid approach of article content and meta-data, the authors reviewed the accumulated literature to understand various dimensions and attributes adopted by previous studies in assessing Wikipedia's IQ.

B. THEORETICAL FRAMEWORK -E-A-T MODEL
Since Google's algorithm update in August 2018, the E-A-T quality framework has been a major focus of attention. The model consists of three constructs: E-Expertise, A-Authority and T-Trustworthiness. Expertise is defined as the knowledge and skills of the Main Content (MC) creator. Authority is the "authoritativeness of the creator of the MC, the MC itself, and the website", and Trustworthiness is defined as the "trustworthiness of the creator of the MC, the MC itself, and the website" [19]. Thus, through the E-A-T framework, Google aims to deliver the highest quality content and experience to the online search community. Since Wikipedia is an online source of information, the authors believe that E-A-T is an appealing model for assessing the IQ of collaborative platforms such as Wikipedia. To the best of the authors' knowledge, none of the previous studies has adopted E-A-T as the base model for a systematic review of online collaborative content IQ assessment.

1) EXPERTISE
A high-quality website requires enough expertise to be authoritative and trustworthy on the respective topics. An "Expert Wikipedia article" is exclusively knowledgeable and informative and is composed by contributors with relevant expertise. Prior studies have attempted to evaluate IQ through expertise. For example, certain studies [20]-[22], [36], [37] have analyzed how various editor characteristics such as editor experience, concentration, coordination and coordination patterns of editors impact the quality of articles on Wikipedia. Studies such as [38], [39] have adopted graph techniques and networking concepts to identify the behavior of editors. The studies [25], [28] argue that article length is a precise measure of article quality despite its simplicity. Some studies have focused on article lifecycle-based approaches [40]. Further, Lipka and Stein [27] have analyzed the articles' writing styles. The frameworks introduced by [16], [17], [41] included a separate indicator for informativeness. The term "informativeness" can be defined as the ability to provide useful information and comprehensive content. Graphical representation of information improves informativeness [42], [43]. Authors in [44] have analyzed the language quality of Wikipedia. Works in [45]-[47] have adopted readability metrics. Readability scores reflect how easy a given text is to read. From a general encyclopedia's perspective, the information presented should be easily understood by a layperson [13]. Beyond Wikipedia, studies such as [48] have also used readability as an index for assessing the quality of web content.
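The readability metrics mentioned above ([45]-[47]) are not specified in detail here; as an illustration, the widely used Flesch Reading Ease score can be computed as 206.835 − 1.015·(words/sentences) − 84.6·(syllables/words). The sketch below is a minimal Python implementation; the syllable counter is a crude vowel-group heuristic (an assumption, not the paper's method), so scores are approximate:

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count groups of consecutive vowels; at least one per word.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

simple = flesch_reading_ease("The cat sat on the mat. The dog ran fast.")
dense = flesch_reading_ease(
    "Epistemological considerations necessitate interdisciplinary collaboration.")
```

On these two hypothetical snippets, the short monosyllabic text scores far higher than the polysyllabic one, matching the intuition that an encyclopedia entry should be readable by a layperson.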

2) AUTHORITY
When a source of knowledge is cited by others more frequently, it becomes more reputable and authoritative within its verticals [49]. A more authoritative source implies a higher level of quality. The authoritativeness of the contributors, the content, and the website itself may all contribute to the authority of a Wikipedia page. Certain studies have attempted to assess IQ based on the authors' reputation. Studies [5], [23] have found that high-rated articles have been authored and edited by highly reputed authors and editors. Similarly, Nemoto et al. [24] have found that articles initiated by reputed authors progress rapidly to high-quality status. These studies [5], [20], [23] have mainly focused on the edit history, concentration and coordination of the contributors in assessing the IQ of Wikipedia articles. Authors in [50] have used editor quality as a quality indicator, and the study [51] has used article-editor networks and PageRank-based models. However, this approach requires collecting large amounts of meta-data about the articles. Further, the IQ prediction is indirect, since it is based on external factors such as contributors rather than on article content-related factors [42].
The content-based approach, in contrast, assesses IQ using features derived from the article content itself. Article length measured in word count is a simple yet precise measure for IQ assessment of Wikipedia [25]. High accuracy has been achieved in classifying articles into FA and non-FA classes using various lexical features [52]. The study [27] has used the article's writing style, exploiting binarized character trigram features as an IQ determinant factor. Furthermore, features such as completeness, informativeness, number of headings, number of images, references and readability scores have also been used [16], [18], [32], [41]. Beyond Wikipedia, certain studies have found that web page content positively impacts the web user's quality of experience [53].
The hybrid approach adopts both meta-data and article-internal features to assess the IQ of Wikipedia. The pioneering study related to this approach was presented by [17]. Informativeness, Completeness, Complexity, Consistency, Currency, Volatility, Revision History, Popularity, Citation Network and Readability are some of the hybrid IQ measures introduced under this approach [31], [48], [54]-[56]. Moreover, the Objective Revision Evaluation Service (ORES) automates the tasks of vandalism detection and removal of untrustworthy edits; for certain language versions (currently only 9), ORES evaluates articles on a scale between 0 and 1 [6].
However, there is a lack of studies focusing on the authority of the content and the website in evaluating Wikipedia's IQ. Only Lewandowski and Spree [57] have adopted Wikipedia article rankings in search engines to assess IQ.

3) TRUSTWORTHINESS
The term "reliability" refers to the ability to provide accurate and trustworthy information. In collaborative contexts, reliability and trustworthiness are important factors. A trustworthy Wikipedia article contains factually accurate and verifiable information. Zeng et al. [58] have proposed a revision-history-based technique for evaluating an article's trustworthiness, creating a Bayesian network trust model based on article revision history information. Halfaker and Taraborelli [6] have created the online Objective Revision Evaluation Service (ORES), whose single input parameter is the revision ID of a Wikipedia article. The authors of [18] argued that an article's content should always be verifiable, meaning that a reader should be able to confirm the accuracy of the information regardless of the authority of the source. Nicholson et al. [59] evaluated Wikipedia articles' reliability through their references. Lewoniewski et al. [43] have conducted an interesting study on popularity and reliability modelling. Moreover, in the hybrid frameworks introduced by [16], [17], [41], separate indicators have been adopted to represent the reliability of the articles. In addition, these hybrid studies have presented comprehensive features, including content, network, structure, edit history, style, readability and review features, that can be extracted from both content and meta-data statistics [18], [60]. However, most of these are hand-engineered features.
According to the above review, the notable limitations in this area of knowledge are as follows. First, authority has not been considered a prominent indicator in assessing the IQ of Wikipedia. A higher web presence leads to a higher web reputation, as other experts or influencers cite the specific website/page as a source of information. Correspondingly, a good reputation improves the content's quality [61], [43].
Furthermore, the authors observed a drastic reduction in classification accuracy (from 95% to 79%) when the authority component was removed from the proposed model. Therefore, the authority of the presented content should be considered a key factor for assessing Wikipedia's IQ. Second, based on the literature, there is a dearth of studies on how the verifiability and maturity of the content affect IQ [18]. Finally, the previously presented models and algorithms [17], [21], [42] do not provide satisfactory performance in classifying articles into the correct quality classes [62]. Considering all these facts, it can be concluded that the Expertise, Authority and Trustworthiness of Wikipedia articles precisely represent Wikipedia's IQ. Therefore, Google's perception through the E-A-T model can be drawn upon to assess the IQ. Aligning with the empirical findings, the authors present Informativeness, Readability and Understandability as the dimensions that influence Expertise, and Maturity, Verifiability and Reliability as the dimensions that influence Trustworthiness. Moreover, the authors present 45 IQ attributes that can be automatically and inexpensively extracted from the content and meta-data of the articles (see Appendix A). The conceptual framework, the operationalization of the above IQ dimensions using attributes, and the synthesis of the three constructs Expertise, Authority and Trustworthiness are presented in the next sections.

III. METHODOLOGY
A. DATA COLLECTION
This study is quantitative research in which statistical analysis and experimentation were used to analyse the data. The IQ attributes were extracted from the articles' content and from meta-data about the articles; the data collection technique was therefore secondary. An algorithm was written using Selenium 3.14 to automatically and inexpensively scrape these features from articles using their URLs. Selenium is a free and open-source cross-browser web application testing and validation framework [63]. The majority of previous studies were based on manual feature extraction approaches, which are time-consuming [42]. To overcome these limitations, this research proposes an efficient and inexpensive data gathering technique. This technique can be used to assess the quality of not just Wikipedia but also other online collaborative repositories.
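The paper's Selenium 3.14 script operates on live article URLs and needs a browser driver; as a self-contained illustration of the kind of content-statistic extraction involved, the sketch below uses only Python's standard-library `html.parser` on a hypothetical static snippet. The tag structure and the `references` class name only loosely mirror Wikipedia's markup and are assumptions, not the paper's actual scraping code:

```python
from html.parser import HTMLParser

class ArticleStats(HTMLParser):
    """Collects simple IQ-attribute counts from article HTML:
    word count, number of images, and number of reference list items."""
    def __init__(self):
        super().__init__()
        self.words = 0
        self.images = 0
        self.references = 0
        self._in_reflist = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            self.images += 1
        if tag == "ol" and attrs.get("class") == "references":
            self._in_reflist = True
        if tag == "li" and self._in_reflist:
            self.references += 1

    def handle_endtag(self, tag):
        if tag == "ol":
            self._in_reflist = False

    def handle_data(self, data):
        # Whitespace-only chunks between tags contribute zero words.
        self.words += len(data.split())

# Hypothetical article fragment standing in for a scraped page.
html = """
<p>Alan Turing was a pioneering computer scientist.</p>
<img src="turing.jpg">
<ol class="references"><li>Ref one</li><li>Ref two</li></ol>
"""
parser = ArticleStats()
parser.feed(html)
```

In the real pipeline these counters would be accumulated per article URL and written out as rows of the 45-attribute dataset.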
The articles in the Wikipedia repository form the study's population. Wikipedia has categorized articles into WikiProjects that share a common knowledge base, such as Medicine, Politics, History, Geography, Sports, Plants and Computer Science. Accordingly, for the data analysis, articles from six WikiProjects were chosen: Medicine [64], Politics [65], Sports [66], History [67], Science [68] and Biographies [69]. These WikiProjects were chosen because they are considered the most saturated and active WikiProjects [64]. They contain a significant number of FA articles that have undergone an extensive review process. Even though almost all WikiProjects contain plenty of non-FA articles, most contain few FA articles; for example, WikiProject Medicine contains only 62 FA articles. The authors had to collect 1000 FA articles, and these six WikiProjects provided that number. Aligning with the community assumption about the FA articles' high quality, the authors utilized FA status as the basis for assessing the IQ of Wikipedia articles [16], [17]. The sample comprised 2000 articles, including 1000 FA and 1000 non-FA articles extracted from the above six WikiProjects. Then, the 45 IQ attributes (see Appendix A) were extracted from each of these 2000 articles. Thus, the unit of analysis was a single article.

B. CONCEPTUAL FRAMEWORK AND OPERATIONALISATION OF THE CONSTRUCTS
Based on the E-A-T framework and empirical findings on Wikipedia IQ assessment, the study proposes the conceptual framework given in Figure 1. According to the conceptual framework, the study defines three IQ constructs, seven dimensions that influence the constructs and 45 attributes to measure each dimension. The three constructs are (1) Expertise, (2) Authority and (3) Trustworthiness.
Following the data collection phase, a statistical analysis was conducted using SPSS 26.0. SPSS was specifically used because of its ease of use and ability to process critical data in simple steps. For the evaluation, a machine learning-based approach was adopted. The algorithms Decision Tree, Logistic Regression, Support Vector Machine (SVM), Naïve Bayes, KNN and K-Means were used because they are capable of handling both numerical and categorical data. Further, these algorithms have been adopted by previous studies to assess the IQ of Wikipedia [17].
Initially, an Exploratory Factor Analysis (EFA) was conducted to discover the factor structure of Wikipedia's IQ and to examine its internal reliability. Traditionally, EFA has been used to explore the possible underlying latent structure of a set of observed variables without imposing a preconceived structure on the outcome, whilst Confirmatory Factor Analysis (CFA) has been used to confirm the latent structure [70]. Since it was required to identify the possible factor structure for the 45 IQ attributes, an EFA was conducted. This methodology has been adopted by previous well-known studies in this knowledge area, such as [17]; therefore, the authors followed this procedure when defining the IQ constructs and IQ dimensions. By conducting two EFAs, the IQ dimensions that influence each of the constructs were identified, and the seven dimensions were operationalized using the 45 IQ attributes. In Figure 1, A to G indicate the relevant IQ attributes that affect each IQ dimension. Accordingly, seven metrics (1st order IQ functions) to measure the IQ dimensions and three metrics (2nd order IQ functions) to measure the E-A-T constructs were derived.
Subsequently, the authors conducted a regression analysis to observe the association of each quality construct with IQ.

FIGURE 1. Proposed conceptual framework and operationalization of the study variables
The suggested model's performance was then evaluated using three approaches. They are (1) classification performance, (2) clustering performance and (3) Fleiss-Davies Kappa [71] inter-rater reliability test.

IV. DATA ANALYSIS AND RESULTS
The dataset was first subjected to a cleaning process using SPSS. When extracting meta-data from the articles, several values (0.8%) were empty for some IQ attributes; therefore, a series mean was used to replace the missing values to avoid loss of data points [72].
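The series-mean replacement was done in SPSS; the same imputation can be sketched in a few lines of NumPy (the toy matrix below is hypothetical):

```python
import numpy as np

# Toy attribute matrix with missing entries (np.nan marks empty meta-data values).
X = np.array([[1.0,    np.nan, 10.0],
              [3.0,    4.0,    np.nan],
              [np.nan, 8.0,    30.0]])

# Series-mean imputation: replace each missing value with its column mean.
col_means = np.nanmean(X, axis=0)
rows, cols = np.where(np.isnan(X))
X[rows, cols] = col_means[cols]
```

Each attribute column keeps its mean unchanged, so no data points are lost and downstream statistics are minimally disturbed.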

A. 1ST ORDER IQ FUNCTIONS
As the first step, the first EFA was conducted for the 45 IQ attributes to identify their underlying structure. The analysis suggested seven components. Accordingly, the following seven 1st order IQ functions were defined to derive the values for each of the IQ dimensions, based on the nature of the suggested variable groupings and empirical findings (refer to equations 1-7).

B. 2ND ORDER IQ FUNCTIONS
After identifying the variable grouping through the first EFA, the dataset was tested for the four parametric assumptions, since these metrics are subjected to regression analysis. These assumptions are normality (Skewness and Kurtosis values lay around zero), linearity (all the scatter plots showed linear behavior), multicollinearity (VIF was less than 5), and homoscedasticity (residual plots with the regression line method were used, and homoscedasticity was observed in each measured variable). The descriptive statistics of the 1st and 2nd order IQ functions are given in Table II. Satisfying the four assumptions indicated that the data were ready for further analysis [73]. Next, the dataset with values for all the IQ dimensions was fed to the second EFA to identify the IQ dimensions affecting each E-A-T construct. The analysis suggested exactly three components, as the authors expected. The three constructs, Expertise, Authority and Trustworthiness, were derived by observing the nature of the factor loadings along with the findings from the literature. Accordingly, the IQ dimensions that affect the E-A-T model were identified (see Figure 1). The following 2nd order IQ functions were defined to measure each construct (refer to equations 8-10). For both EFAs, Principal Component Analysis (PCA) was used as the extraction method, and Varimax with Kaiser Normalization was used as the rotation method.
The R² value of 0.75 indicates that the regression model has high explanatory power. Thus, it can be concluded that Expertise, Authority and Trustworthiness reasonably predict IQ. According to the results, the Expertise of a Wikipedia article is positively associated with its IQ (β = 0.673, p < 0.001). Similarly, the Authority of a Wikipedia article is positively associated with its IQ (β = 0.175, p < 0.001), and the Trustworthiness of a Wikipedia article is also positively associated with its IQ (β = 0.154, p < 0.001) (refer to Table III).
Thus, the study results also support that Expertise, Authority and Trustworthiness improve Wikipedia articles' IQ.
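The regression step above can be sketched with ordinary least squares in NumPy. The construct scores and coefficients below are synthetic stand-ins (chosen near the paper's reported β values purely for illustration), not the study's data:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
# Hypothetical standardized construct scores: Expertise, Authority, Trustworthiness.
E, A, T = rng.normal(size=(3, n))
# Synthetic IQ generated with known coefficients plus noise (illustrative only).
iq = 0.67 * E + 0.18 * A + 0.15 * T + 0.3 * rng.normal(size=n)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), E, A, T])
beta, *_ = np.linalg.lstsq(X, iq, rcond=None)

# Coefficient of determination R^2 from the residuals.
residuals = iq - X @ beta
r2 = 1 - residuals.var() / iq.var()
```

With enough observations, `beta[1:]` recovers the generating coefficients and `r2` reflects the share of IQ variance the three constructs explain, mirroring the interpretation of Table III.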

V. EVALUATING THE PROPOSED MODEL
The proposed model was evaluated and validated in three ways: (1) classification performance, (2) clustering performance and (3) an inter-rater reliability test.

A. CLASSIFICATION PERFORMANCE
As the first evaluation method, the model's classification performance was tested using the same dataset used in the study. The independent variables were Expertise, Authority and Trustworthiness, and the dependent variable was the article's quality class (FA vs. non-FA). In a Decision Tree, the tree can be inspected to learn how specific features influence building complex models. Therefore, the authors adopted this classifier to assess the proposed model's classification performance in terms of accuracy and F1-measure. The accuracy is the fraction of correct predictions, and the F1-measure combines precision and recall into a single metric by taking their harmonic mean. Accordingly, the proposed model achieved 95% accuracy and a 94% F1-measure. Subsequently, the suggested model's classification performance was compared to that of three previous well-known models. Model 1 was proposed by Stvilia et al. in 2005 and is regarded as the pioneering research work in this knowledge area [17]. Model 2 was presented in 2013 by Warncke-Wang et al. [16], and Model 3 is a deep learning-based model introduced in 2017 by Shen et al. [42]. Shen et al. formulated quality assessment as a classification problem and proposed an LSTM-based model using a set of hand-engineered features; this model achieved 6.5% higher accuracy than state-of-the-art approaches. According to the results, the accuracies of Model 1, Model 2 and Model 3 were 73%, 68% and 83%, respectively, along with F1-measures of 72%, 59% and 80%, respectively (refer to Table IV). The results show that the suggested model notably improves classification performance. Hence, the proposed model provides a finer IQ assessment for Wikipedia than the existing approaches.
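The paper's exact features and dataset are not reproduced here; the following scikit-learn sketch, on synthetic data with an assumed class separation, shows how the Decision Tree's accuracy and F1-measure would be computed:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
n = 1000
# Hypothetical construct scores; FA-like articles score higher on all three.
y = rng.integers(0, 2, size=n)                   # 1 = FA, 0 = non-FA
X = rng.normal(size=(n, 3)) + y[:, None] * 1.5   # Expertise, Authority, Trustworthiness

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc = accuracy_score(y_te, pred)
f1 = f1_score(y_te, pred)
```

The fitted tree can also be inspected (e.g., via `sklearn.tree.export_text`) to see which construct dominates the splits, which is the interpretability benefit the paper cites for this classifier.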

B. CLUSTERING PERFORMANCE
The model's clustering performance was evaluated as the second evaluation technique. K-means clustering was used by Stvilia et al. [17] to assess the power of their IQ measures in terms of how successfully they classified the articles into the correct classes. Therefore, the authors decided to adopt the same clustering algorithm. Based on the derived values for the 1st and 2nd order IQ metrics, a single numeric IQ value per article was derived using equation (1). This IQ value was then clustered using the K-means algorithm, and the class-to-cluster comparisons were observed. Accordingly, 93% of the articles were properly clustered by the proposed model. The clustering was replicated for Models 1, 2 and 3, and the comparisons with the proposed model are given in Table V. Models 1, 2 and 3 appropriately clustered 66%, 62% and 73% of the articles, respectively. This indicates that the proposed model gives far more insight into Wikipedia's IQ evaluation than earlier models.
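The class-to-cluster evaluation above can be sketched as follows: cluster the per-article IQ scores with K-means, map each cluster to its majority class, and count the fraction of correctly placed articles. The IQ score distributions below are hypothetical, chosen only to mimic a 1000/1000 FA vs. non-FA split:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical per-article IQ scores: non-FA around 0.3, FA around 0.8.
iq = np.concatenate([rng.normal(0.3, 0.07, 1000), rng.normal(0.8, 0.07, 1000)])
labels = np.array([0] * 1000 + [1] * 1000)       # 0 = non-FA, 1 = FA

km = KMeans(n_clusters=2, n_init=10, random_state=1).fit(iq.reshape(-1, 1))

# Class-to-cluster evaluation: map each cluster to its majority class,
# then count the fraction of articles landing in the matching cluster.
correct = 0
for c in range(2):
    members = labels[km.labels_ == c]
    correct += np.bincount(members).max()
accuracy = correct / len(labels)
```

With well-separated IQ scores the mapping is unambiguous; in practice the 93% figure reflects how much the two quality classes overlap on the derived IQ value.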

C. INTER-RATER RELIABILITY TEST
The authors adopted the Fleiss-Davies Kappa [71] as the third evaluation technique. It measures the level of agreement (inter-rater reliability) between a set of raters. Three raters with prior knowledge of IQ and Wikipedia were chosen for the process. Firstly, each rater was given the 45 IQ attributes and the set of IQ dimensions and asked to group each attribute into one of the IQ dimensions, resulting in a Kappa value of 80%. Secondly, they were asked to group the set of IQ dimensions into the three constructs, Expertise, Authority and Trustworthiness, resulting in a Kappa value of 87%. According to [74], the degree of agreement between raters is represented by the Kappa value: values greater than 0.75 indicate strong agreement above and beyond chance, while values less than 0.40 indicate poor agreement; values of 0.40 to 0.75 suggest a fair to good level of agreement. Accordingly, both grouping tasks achieved excellent agreement among raters, with an average Kappa value of 84%, supporting the validity of the proposed model.
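The Davies-and-Fleiss variant [71] is not reproduced here; for illustration, the standard Fleiss' kappa can be computed from an items-by-categories count matrix. The ratings matrix below is a hypothetical miniature of the first grouping task (3 raters assigning 6 attributes to 3 dimensions):

```python
import numpy as np

def fleiss_kappa(M: np.ndarray) -> float:
    """Fleiss' kappa for an items-by-categories count matrix M, where
    M[i, j] = number of raters assigning item i to category j.
    Every row must sum to the same number of raters n."""
    N, _ = M.shape
    n = M[0].sum()
    p_j = M.sum(axis=0) / (N * n)                        # category proportions
    P_i = (np.sum(M ** 2, axis=1) - n) / (n * (n - 1))   # per-item agreement
    P_bar, P_e = P_i.mean(), np.sum(p_j ** 2)
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical task: 3 raters assign 6 IQ attributes to one of 3 dimensions.
ratings = np.array([
    [3, 0, 0],   # all raters agree: dimension 1
    [3, 0, 0],
    [0, 3, 0],
    [0, 3, 0],
    [0, 2, 1],   # one rater disagrees
    [0, 0, 3],
])
kappa = fleiss_kappa(ratings)
```

For this toy matrix kappa works out to 43/52 ≈ 0.83, which under the [74] thresholds would count as strong agreement beyond chance.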

VI. DISCUSSION
A. FINDINGS OF THE STUDY
This study yields several important findings. Firstly, the study introduces a novel framework based on the E-A-T model. The model was evaluated in three ways, scoring a classification accuracy of 95%, a clustering accuracy of 93%, and a Fleiss-Davies Kappa value of 84%. Accordingly, the study's findings reveal that the proposed model delivers a better Wikipedia IQ assessment than earlier models, implying that the E-A-T model can be adopted to assess Wikipedia's IQ. Through this model, the authors propose Expertise, Authority and Trustworthiness as novel and precise constructs of IQ for Wikipedia. Secondly, a set of IQ dimensions that influence the above three IQ constructs was presented. Accordingly, Expertise can be expressed in terms of information richness (informativeness), how well the reader can understand the presented information (understandability), and how much effort the reader must exert to read a particular piece of information (readability). Authority also improves IQ: when other experts cite the content, it implies that the relevant Wikipedia articles are of high quality. Higher Trustworthiness likewise leads to higher IQ. Trustworthiness can be expressed in terms of the ability to rely on a particular article and how well that article maintains its quality over time (reliability), whether its content can be confirmed (verifiability), and the maturity of the content.
Thirdly, a comprehensive set of 45 IQ attributes was presented. These 45 attributes were used to measure each of the above IQ dimensions. Fourthly, to overcome the limitations in previous studies' manual feature extraction approaches [42], an efficient web scraping mechanism was employed to automatically extract the IQ attributes from Wikipedia content and meta-data.
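The paper's Selenium 3.14 script is not reproduced here, but the kind of attribute extraction it performs can be sketched with Python's standard-library HTML parser: counting reference-list entries and external links in a saved article page. The class names (`references`, `external`) match rendered Wikipedia HTML, though the snippet below is a hypothetical fragment, not a live page, and these two counts stand in for the full set of 45 attributes.

```python
from html.parser import HTMLParser

class AttributeCounter(HTMLParser):
    """Extract two illustrative IQ attributes from article HTML:
    the number of reference-list items and the number of external links."""
    def __init__(self):
        super().__init__()
        self.in_reflist = False
        self.n_refs = 0
        self.n_external_links = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class", "")
        if tag == "ol" and "references" in cls:
            self.in_reflist = True       # entering the reference list
        elif tag == "li" and self.in_reflist:
            self.n_refs += 1             # one citation per list item
        elif tag == "a" and "external" in cls:
            self.n_external_links += 1   # outbound link

    def handle_endtag(self, tag):
        if tag == "ol" and self.in_reflist:
            self.in_reflist = False

# Hypothetical saved fragment of a rendered article page.
html = """<ol class="references"><li>Ref 1</li><li>Ref 2</li></ol>
<a class="external" href="https://example.org">link</a>"""
parser = AttributeCounter()
parser.feed(html)
```

In the actual pipeline, a browser-automation script would fetch each article's page and revision history, run extractors like this for every attribute, and append one feature row per article to the dataset.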

B. IMPLICATIONS
There are several theoretical implications of this study. The study adds to the body of knowledge a novel IQ assessment model for Wikipedia based on Google's E-A-T model, which comprises (1) three precise IQ constructs: Expertise, Authority and Trustworthiness; (2) a set of IQ dimensions that influence the above constructs; and (3) 45 IQ attributes to measure the above dimensions.
In terms of practical implications, the findings of this study assist readers in identifying high-quality Wikipedia articles. Thus, rather than dismissing Wikipedia wholesale as an unreliable source of information, this approach can be used to extract high-quality, credible information from Wikipedia, the largest freely available information store on the web. On the other hand, based on the proposed model, the study suggests an automated approach for identifying the flaws of articles, providing immediate guidance for reviewers, authors and editors to implement quality improvements in the Wikipedia community, a task that otherwise consumes considerable time and human effort. Furthermore, it is important to note that this approach is not a replacement for the existing manual reviewing method; rather, it should be used as an initial screening tool and a supportive tool during the formal review process.

C. LIMITATIONS AND FURTHER WORK
Firstly, the authors tried to generalize the study results by collecting data from six different domains: Medicine, Politics, Sports, History, Science and Biographies. However, repeating the study for other domains and increasing the size of the dataset would further generalize the results. Secondly, this study focuses on English-language Wikipedia articles only, whereas Wikipedia articles are available in over 300 languages; similar studies can therefore be conducted in other languages. Furthermore, the presented E-A-T based model, with its three constructs and set of IQ dimensions, can also be generalized and adapted to assess the IQ of other online collaborative repositories and UGC by substituting the IQ attributes available in those repositories, following the same hybrid approach.

VII. CONCLUSION
In conclusion, this study provided a new perspective on adopting the E-A-T model to assess the IQ of Wikipedia articles. It was validated that this model provides an improved IQ assessment for Wikipedia compared with existing methods. Assessing the IQ of Wikipedia benefits both readers and collaborators: it directs readers towards high-quality articles and guides collaborators to identify the flaws of articles and plan further quality improvements. Furthermore, based on the empirical findings and current analysis, it is worth noting that the hybrid approach of feature extraction, a blend of content and meta-data statistics, aids in constructing and evaluating the IQ variance of any online collaborative resource in a cost-effective and scalable manner. The authors intend to extend this study by improving the dataset in size and domain coverage. Extending the model to other commonly used languages of Wikipedia articles is also listed as future work. On the whole, the findings of this study provide good insights towards assessing the IQ of Wikipedia articles and efficient knowledge collaboration in the Wikipedia community.