Quantifying Success in Science: An Overview

Quantifying success in science plays a key role in guiding funding allocations, recruitment decisions, and rewards. Recently, a significant amount of progresses have been made towards quantifying success in science. This lack of detailed analysis and summary continues a practical issue. The literature reports the factors influencing scholarly impact and evaluation methods and indices aimed at overcoming this crucial weakness. We focus on categorizing and reviewing the current development on evaluation indices of scholarly impact, including paper impact, scholar impact, and journal impact. Besides, we summarize the issues of existing evaluation methods and indices, investigate the open issues and challenges, and provide possible solutions, including the pattern of collaboration impact, unified evaluation standards, implicit success factor mining, dynamic academic network embedding, and scholarly impact inflation. This paper should help the researchers obtaining a broader understanding of quantifying success in science, and identifying some potential research directions.


I. INTRODUCTION
Success in science refers to scientists' achievements in their academic careers.Quantifying success in science has developed into a very important part of bibliometrics and scientometrics.An influential publication or scholar always brings much to the followers to carry out their research.Therefore, the ability of bibliography retrieval is very important for researchers, including mining, managing, and examining scholarly big data to identify the successful papers and scholars [1]- [5].In addition, quantifying scholar impact has special significance in funding allocation and recruitment decisions.Quantifying the impact of paper and journal can help scientists know the frontier of science development.Therefore, quantifying success in science provides useful guidance to the scientific community, such as offering candidates to university, recommending scientists for promotion, and distribution for research funds [6], [7].
Quantifying success in science mainly focuses on quantifying the current impact of academic entities, including paper, scholar, journal, scholarly team, and institution [8]- [11].Because the research on the impact of paper, author, and journal is very rich, this paper mainly introduces quantified success in science from these three aspects.Generally, the number of citations is used as an evaluation indicator, which is derived from its easy availability.Lots of factors influence a paper's success, such as paper's visibility [12], [13] and paper's age [14].A common method to judge the success of a scholarly paper is to use evaluation indicators, which may take into several important factors.The counting-based and network-based evaluation methods are frequently used to quantify success in science.The counting-based methods are the most direct representation of evaluating, such as citations, author's h-index [15], and Journal Impact Factor (JIF) [16].Different academic entities form different kinds of academic networks, such as citation network, co-author network, and co-citation network [17].Currently, the HITStype and PageRank-type algorithms can mine the complex scholarly relationship based on different scholarly networks and give reasonable evaluation.The features of scholarly networks are also critical to evaluate paper impact.Further, based on these features, many researchers have improved PageRank [18] or HITS [19] algorithms to make them more suitable to measure the impact of paper.
Same as quantifying the impact of paper, scholar impact is also influenced by many factors.Lots of methods and indices to measure scholar impact are proposed, such as h-index [20], g-index [21], and hg-index [22].These indices can be unfair for some young researchers because the quality and quantity of a scholar's publications are associated with their academic ages.The methods based on the network can avoid this situation to a certain extent.
Evaluating journal impact is an important part of quantifying success in science.Many network-based evaluation methods and indices are used to quantify the impact of paper and author, which can also be used to evaluate journal [23]- [26].These methods are based on PageRank, HITS, or consider the structural position of a journal in the journal citation network.In addition, Journal Citation Reports (JCR) is very popular for ranking journals.
Even though the existing research provides a tool to quantify success in science, it still has some limitations.Every indicator to quantify scientific impact has its shortages.In particular, in quantifying scientific success research, one of the most challenging problems stems from the heterogeneous attribute and the dynamic nature of big scholarly data.At present, in most of quantifying scientific success methods, implicit features and implicit relationships have attracted the attention of researchers [27].
This paper presents a review of recent developments in quantifying success in science and this review complements relevant work in the past: Wildgaard et al. [28] present a review on author impact evaluation.One limitation of this review is that it does not consider paper and journal impact evaluation research.Bai et al. [9] offer a review of the literature on paper impact evaluation.This overview covers key techniques and paper impact metrics.The limitation of this work is that authors have not consider author and journal impact evaluation.In addition, factors influencing scholarly impact have not analyzed.Therefore, in this paper, the progress of impact evaluation of the paper, author and, journal is described in detail.
FIGURE 1 shows the framework of quantifying success in science.Quantifying success in science includes the following parts: data collection, data pre-processing, relationship analysis, evaluation method and evaluation indices.Several public accessible data sets are used to quantify success in science, including American Physical Society (APS)1 , Digital Bibliography & Library Project (DBLP) 2 2 , and Microsoft Academic Graph (MAG) 3 .Data pre-processing in quantitative scientific success studies is very important because it relates to the accuracy.The homogenous and heterogeneous scholarly networks are used to research the scholarly relationships such as citation relationships, co-author relationships, and paper-journal relationships.Spearman's rank correlation coefficient, Discounted Cumulative Gain, and RI can be used as evaluation metrics for quantifying success in science [29], [30].Specially, the heterogeneous scholarly network structures have increased the challenges in scholarly network analysis.
To retrieve the papers of quantifying success in science, based on Google Scholar, we enter search terms such as the success of science, paper impact, scholar impact, journal impact, etc.We first search for the related papers recently published in top journals and top conferences, and then look for their references, and the papers cite these papers to obtain more related papers.Search for papers in a step-by-step manner, then filter and classify from three aspects: paper impact, scholar impact and journal impact, and retain the representative related papers.Based on the above work, we mark the publication years of these papers, read through these papers by year, analyze and summarize the following aspects: the features that influence scholarly impact, evaluation methods and indices.For example, in terms of these features of evaluation paper impact, we classify these features, including reference, references, selected features, statistical feature, network feature, explicit feature, implicit feature, and evaluating paper impact.By analyzing and summarizing these evaluation methods, we identify open issues and challenges, and provide possible solutions.
The rest of this survey is organized as follows.In Section II, we discuss the evaluation of paper impact.In Section III, we introduce the evaluation of author impact.The evaluation of journal impact is discussed in section IV.Open issues will be discussed in Section V. Finally, we conclude this survey in Section VI.

II. EVALUATION OF PAPER IMPACT
In this section, we will make a detailed introduction to the evaluation methods and indices of paper impact.Besides, we will discuss the evolution of the existent methods and indices, showing their advantages and shortcomings.At first, we begin with the evaluation of paper impact, because many assessment methods and indices of scholars and journals are based on the assessment of their papers.Therefore, it is of great significance whether the quality of papers can be quantified accurately.Although the value of a paper is mainly based on its content, the evaluation of its content is easily influenced by subjective factors, and the evaluation efficiency cannot meet the demand of scholarly bid data.This phenomenon drives researchers to give some accurate, efficient automatic evaluation methods.One possible solution is to construct a multi-dimensional metric in which the importance of citation, social relationships of authors, the relationship between the impact of early citers and scholarly paper impact, and citation inflation need to be explored.statistical feature, network feature, explicit feature, implicit feature, and evaluating paper impact.

A. FACTORS INFLUENCING THE IMPACT OF PAPERS
The number of citations has been used as a metric to evaluate paper impact for a long time [31].Since the number of citations is relatively easy to obtain, it is frequently manipulated such as self-citation, mutual citation, and friend's citation.Although some scholars can cite their papers, because their research subjects can have several stages output and the former results can be the foundation of the latter.But if a self-citation only means to increase the number of citations, it will mislead the scholarly evaluation and bring unfair factors to the evaluation system.For inappropriate citations, previous researchers proposed corresponding methods to waken the influence of self-citation by relying on the higher-order citation network [27].
Previous research shows that the impact of paper will decay over time, which confirms that the age of a paper is a factor influencing its impact.Generally, an old paper has more citations than a new one, but its work was already covered by new papers so that it could get fewer citations in the future.Parolo et al. [14] showed that the decay of the attention paid to a paper is a universal phenomenon, and the decay rate is close to a power law.In some cases, papers can be forgotten more quickly so the attention decay is faster, which fits an exponential curve.The time factor, the prestige of a paper, and the prestige of the author were used to evaluate scholarly paper impact [37].Based on the three factors, they evaluated scholarly paper impact by predicting the number of citations of scholarly papers in the future.Wang et al. [33] considered the aging factor to evaluate paper impact because it can capture the fact that new ideas are integrated in subsequent work.Wang et al. [38] first developed the three indices: the time-weighted citation count, the citation width, the citation depth.They then leveraged entropy to weight these indices to evaluate paper impact.Chan et al. [39] discussed that the impact of authors and affiliations can influence on the impact of their papers.In their research, they argued that the reputation of authors and the impact of their affiliations had the power to boost paper impact in the early stages of publication, but this influence could decay fast and in the following stages.Chen et al. [34] found the scientific gems using Google's PageRank algorithm in the citation network.Zhang et al. [35] evaluated the impact of authors and papers based on the heterogenous author-citation academic networks.
In addition to the factors mentioned above, some other factors were also used to evaluate paper impact, such as individual, institutional and international collaboration, reference impact, reference totals, keyword totals, and abstract readability [40].Preferential attachment, fitness, and aging factors were used to quantify the long-term scientific impact, and the three factors can drive the citation history of scholarly paper [33].In this research, the preferential attachment captures the fact that highly cited papers are more likely to be cited again than less-cited papers.Fitness captures the inherent differences between scholarly papers.The aging has been introduced before.It can be traced back to the journal impact factor that was once used as a criterion for assessing the impact of a paper [41].Altmetrics evaluated scholarly impact based on the activities in the social media platforms, such as citations, blogs, tweets, download statistics, and attributions in research articles [36].Altmetrics scores were used to complement the evaluation of scholarly paper with new insights [42].Since we have already known most factors that influence the impact of paper, the evaluation methods and the corresponding indices can be designed.

B. COUNTING-BASED EVALUATION METHODS AND INDICES
TABLE 2 shows the comparison of counting-based evaluation methods and indices from the following aspects: method and reference, selected factors, importance of each citation, advantage and disadvantage.
Garfield et al. [47] first proposed using the number of citations to assess the impact of scholarly papers.Citations are the simplest and most direct counting-based index of paper impact.However, citations as an evaluation metric have some drawbacks.For example, it relies heavily on the time of publication of the paper.The longer the time is, the more the citations are.Considering this drawback, previous   [14] citation rate of a paper, time yes no yes no the citation rate of a paper at a given time [27] a relative citation weight yes no no yes applying a relative citation weight to the higher-order quantum PageRank algorithm [30] collaboration times, the time span of collaboration, citing times and the time span of citing yes no yes no weakening the relationship of Conflict of Interest (COI) in the citation network [31] number of citations yes no yes no using the number of citations [32] citations, authors, journals/conferences and the publication time information no yes yes yes integrating the selected features into PageRank and HITS algorithms [33] preferential attachment, aging, fitness yes no no yes identifying the three fundamental mechanisms to evaluate long-term impact [34] importance of paper no yes yes no applying the Google PageRank algorithm to obtain the relative importance of all publications [35] citation relevance and author contribution no yes yes no using the selected features to weight citation network and authorship network to evaluate paper impact [36] Altmetrics yes no yes no monitoring citations, blogs, tweets, download statistics and attributions in research articles [37] prestige of a paper, prestige of author, time yes yes no yes using the citation network, the authorship network and the publication time of the article for predicting future citations [38] the time-weighted citation count, the citation width, the citation depth yes no no yes using entropy weight the three indices research used the journal impact factor to quantify the impact of paper [41].The reason is that to a certain extent, journal impact can characterize paper impact.However, Seglen et al. [41] summarized problems associated with the use of journal impact factors, and they found that the journal impact factor is not representative of individual paper.It has been recognized that not all citations are equal importance and hence the importance of citation needs be distinguished [45].
To distinguish the importance of citation, previous researchers have made many attempts.Wan et al. [44] divided the importance of citation into 5 levels, which was called citation strength.In their research, the importance of citation was determined by the following features: occurrence times, located section, time interval, the average length of citing sentences, average density of citation occurrences, and self-cited.Then a SVR model was used to calculate every citation's importance level with giving some artificially labeled data.The impact of a paper is calculated by summing up all the citation strengthes.Their experimental results showed that ranking papers using citation strength fitted the ground truth better.Zhu et al. [45] distinguished the importance of citation by identifying a set of four features that are useful to determine the impact of a scholarly paper, including citation location in paper, semantic similarities between titles of cited paper and the content of citing paper, cited frequency, number of citations in a literature.
Anfossi et al. [46] argued that it was more reasonable to rank papers by combining the information of several indicators than using only one.In their paper, an evaluation tool was proposed, which used paper's normalized distribution of Easy to be manipulated.citation and JIF and located a paper in the (citation, JIF) space intuitively as a scatter plot.Then this space was divided into regions by drawing thresholds as weighted linear combinations of the paper's citation and JIF, shown in Function (1), where const n is a constant that controls the segmentation of the region, and CIT indicates paper's citation.The different calibrations of the segmentation result in different classifications of articles.Before Anfossi's work, Ancaiani et al. [48] performed an analysis of a large amount of research outcomes submitted by Italian universities and other research bodies.
Nowadays more and more research results or papers are spreading on social media, which is helpful to promote a scholar's impact.The times of downloading, sharing, or commenting of papers on the online social networks have already been a group of metrics to evaluate the research outputs, which are known as Altmetrics [36].The social network-based Altmetrics are used more and more widely as a new emerging evaluation metrics of paper.Xia et al. [49] performed an analysis on how the Twitter and Facebook users impact the paper's influence published on Nature.They found that the users of Twitter are easier to spread the impact of papers published on Nature.Although Altmetrics are able to complement and improve the assessment of paper impact, Altmetrics are not authoritative as an evaluation indicator.Mainly because Altmetrics are easily manipulated as citations.The method of quantifying academic impact based on Altmetrics needs further exploration.

C. NETWORK-BASED EVALUATION METHODS AND INDICES
TABLE 3 shows the comparison on network-based evaluation methods and indices from the following aspects: method and reference, selected factors, scholarly network, algorithms, advantage and disadvantage.
A classical network-based evaluation method is PageRank algorithm [18].Another famous algorithm for evaluating the importance of nodes in heterogeneous networks is HITS.The two methods have been used to quantify the impact of papers.PageRank algorithm is used in a homogeneous scholarly network, and HITS is used a heterogeneous scholarly network.FIGURE 2 shows several typical scholarly networks for paper impact evaluation, such as citation network, co-author network, paper-author network, paper-journal network.The four scholarly networks are generated from randomly selected 10 authors for the computer science area in the MAG dataset.The different color nodes indicate different types of academic entities and the lines between them indicate their scholarly relationships.
Chen et al. [34] applied the Google PageRank algorithm on all publications in the Physical Review family of journals from 1893 to 2003 to find out some exceptional papers.PageRank can find the linear relation among papers in the citation network.Recently, London et al. [55] proposed a local form of PageRank to evaluate the impact of paper only on a small set of nodes extracted from the whole citation network.A paper that has more citations or has been cited by an important paper will be set a higher score through the algorithm.But the classical PageRank algorithm is non-timesensitive.This leads to an unreasonable result that an out-ofdate paper may still get a high impact because of its citations accumulating long before, but its true value has already been replaced by many new publications.To overcome this problem, Walker et al. [50] introduced CiteRank, to weight with time-based on PageRank to promote recent publications.The function of this method is as follows: T is a matrix of the final scores of all papers.W is the transferring probability matrix where W ij = 1/k out j if j cites i and 0 otherwise, where k out j is the out degree of the jth paper.ρ i is the initial probability of selecting the ith paper in the citation network, there given as ρ i = e −agei/τ dir , where age i indicates years the ith paper's after published.
Many efforts have been paid for updating the PageRank to make it fit the characteristics of the academic network.Yao et al. [51] introduced nonlinearity to the PageRank algorithm by aggregating the score from downstream neighboring nodes in a nonlinearity way.The iteration function changes into the following form correspondingly: By tuning the value of θ, this method can control the paper's score accumulation and make it more sensitive to the citer's impact.This nonlinear method considers that the value of a citation from high impact paper is more important than the one from low-level paper.Wang et al. [52] proposed a PageRank-type method that used several scholarly networks to rank papers, including a time-aware co-author network (M AA ), a time-aware paper citation network (M P P ), an author-paper network (M AP ) indicating the paper's authorship, a paper-text feature network (M P T ) indicating the paper's textual features and an author-text feature network (M AT ).The iteration equation is: where R = [A_P T , A_A T , A_F T ] T , and M =   Λ I and Λ E are both diagonal matrixes with the diagonal elements Λ ii = 1 and Λ ii = E i , respectively.Λ 0 is a zero matrix.Vectors A_P T , A_A T and A_F T are authority of paper, author and text features respectively.Jiang et al. [56] took this dynamic evolution of citation network into account and put forward a method with the same idea of PageRank.The method integrates three factors in scientific development, including knowledge accumulation by individual papers, knowledge diffusion through citation behavior, and knowledge decay with time elapse.Then it uses a random walk process on the citation network to describe these three factors.The dynamically evolving process is simulated by dividing all papers according to their publishing time and adding into the citation network partially with the time sequence.
Another type of method is based on HITS [19].Zhou et al. [53] performed the HITS algorithm on paper's citation network and co-author network, which were connected by authorship.In both citation network and co-author network, nodes' scores were first calculated by PageRank, and then a HITS was performed on the bipartite graph to get the final scores of papers and authors.So this method can evaluate the impact of authors and their papers at the same time.The iteration function is as follows: where matrix A and D are the transferring probability matrix of co-author network and citation network correspondingly.And A is the iteration matrix in the PageRank process on coauthor network, which is given by A = (1 − α)A + α n A I, where I is a matrix with all elements equaling 1.D is the same meaning.Vector a storages the scores of all authors and vector d storages the scores of all papers.A similar method is the Tri-Rank algorithm proposed by reference [54] in 2014, which took the paper's publication information into account and performed a HITS-type method on three linked networks, adding a venue citation network on the two networks used before.
In addition, some methods that combine PageRank and HITS to evaluate the impact of papers.A typical one is FutureRank, proposed by reference [37].Different from other methods, FutureRank ranks the impact of papers and authors by predicting their future PageRank scores.PageRank algorithm is first used to rank papers via the citation network, and then the HITS algorithm is used to calculate the authority VOLUME 4, 2016 score of papers and hub score of authors based on the hybrid network.After calculating the PageRank score of papers, the authority score of papers, and the hub score of authors, the final result of the evaluation is finally obtained by weighting to these scores, seeing function (6).

S(P
where n is the number of nodes in the network.Wang et al. [32] proposed a similar method that added a journal/conference network to show where the paper was published.The evaluation method's form is the same as Futur-eRank but it can rank journals/conferences together.Using the HITS algorithm can also evaluate paper and author's quality.Based on their work, Bai et al. [30] ranked scholarly papers by investigating the citation relationships to weaken the relationship of Conflict of Interest in the citation network.To a certain extent, this method weakens the impact of self-citation.Besides, Bai et al. [27] quantified the impact of scholarly papers based on the higher-order weighted citations.In this research, a higher-order weighted quantum PageRank algorithm is developed to reflect the multi-step citation behavior.One advantage of the method is that it can weaken the effect of manipulated citation activities.

III. EVALUATION OF SCHOLAR IMPACT
The evaluation of scholars always relates to their papers.Many methods can evaluate paper together with its authors, such as Co-rank [53], Tri-Rank [54], FutureRank [37], sindex [57].These network-based methods usually rank several academic entities together because using information provided by a single network is always not enough to give a reasonable evaluation.There are also some counting-based evaluation methods like the famous h-index for quantifying author impact.In this section, we compare different counting-based methods, including method and reference, selected factors, importance of each citation, advantage, and disadvantage.We also compare different network-based methods based on the following several aspects: method and reference, scholarly network, homogeneous network, heterogeneous network, algorithms, advantage, and disadvantage.Besides, we will discuss the evolution of the existent methods and indices and summarize the issues of these methods.One possible solution is to explore the higher-order academic network analysis, author impact inflation, and academic success gene.

A. FACTORS INFLUENCING THE IMPACT OF SCHOLARS
The author impact evaluation has undergone a transition from unstructured measure to structured measure [29].The factors used by researchers to assess author impact ranged from simple statistical factors to structural factors, from explicit factors to implicit factors.Currently, the commonly used factors influencing the impact of scholars can be divided into six categories including paper-related, authorrelated, venue-related, social-related, reference-related, and temporal-related factors.TABLE 4 shows an example of selected factors for evaluating author impact.
In the scientific community, scholars can continuously accumulate academic impact but to some extent, the inherent impact of scholars determines their final research results.Since the papers published by scholars can represent the impact of scholars, the paper-related factors are frequently used to measure the impact of scholars.These factors can be selected primarily to consider the quality and quantity of the papers.However, these factors can lead to bias.The academic output of scholars is generally related to their academic age.Scholars with an old academic age may have more output.In this way, simply evaluating scholar impact in terms of output has a big bias for newcomers.Such biases also exist when evaluating scholar impact across research fields.Scientists have made many attempts to eliminate the imbalance between disciplines in evaluating scholar impact.In addition, the allocation of contributions of co-authors of a scholarly paper may also lead to bias in scholar impact evaluation.Shen et al. [68] developed a credit allocation algorithm to capture the co-authors' contributions.
To a certain extent, author-related factors and venuerelated factors can reflect the scholar's impact.Dong et al. [66] found that two factors, the impact of scholars and venues, played a key role in improving the h-index of lead authors.Deville et al. [69] discussed the mobility patterns of scientists at an institutional level and success in science in their careers.They found that the consequence of scholars switching from high-impact institutions to low-impact institutions is a decline in both research quality and output, suggesting that the academic environment has an impact on academic outcomes.Scholars also use online platforms (Google Scholar, Microsoft Academic Search), and social media to enhance their academic impact.Mas-Bleda et al. [70] found that although most highly cited scholars working in European institutions had their institutional web pages, they rarely maintained them.Most of them used other social media, which also accelerated the development of Altmetrics.
In addition, reference-related factors and time-related factors have attracted scholars' attention.Dong et al. [66] researched scholar's impact considering two reference-related factors: the ratio of max-h-index citations of references to the total number of references of the paper and the average number of citations accumulated by references of the paper.Zhang et al. [67] considered academic innovations and assessed scholar impact by a Time-aware ranking algorithm, allocating more credits to the newly published papers according to the representative time functions.Based on the above factors, many evaluation indices have been proposed to quantify scholar impact.In the following two subsections we introduce the counting-based evaluation methods and indices, and network-based evaluation methods and indices respectively.

B. COUNTING-BASED EVALUATION METHODS AND INDICES
In 2005, Hirsch [20] proposed the famous h-index to evaluate scholar impact, which is the most famous metric widely used in the whole scientific community.A scholar's h-index means that he has at least h papers cited at least h times.The advantages of h-index include that it is easy to compute and the definition combines quantity and quality of a scholar's outputs.But there are still some scholars who argue that hindex has many shortcomings such as the unbalance between different disciplines, the allocation of co-authors' impact, and the impact of highly cited papers ignored.To keep the impact of highly cited papers from being ignored, Egghe et al. [21] proposed the g-index.If the citations of all papers published by an author are listed in descending order, the g-index is top g scholarly papers with g 2 citations.Similar to the g-index, Jin et al. [71] proposed R-index and AR-index to overcome the shortcomings of h-index.The R-index is defined as where h is the author's h-index and cit i indicates the author's papers that have been cited more than h times, also known as the h-core papers.The AR-index takes age of publications into account, which is calculated by where a i denotes the i-th paper's age.
For the same purpose, Zhang [72] divided the author's citation function into three parts: the h-squared representing the information of the h-index itself, the excess representing the information of papers having more citations than h-index and the h-tail representing the information of papers with fewer citations.Then, a triangle mapping technique was used to map these three parts to a regular triangle to make the analysis easier.An author's impact was mapped to three parts correspondingly the excess (e-index) representing the research quality, the h-tail (t-index) representing the research quantity and the h-square (h-index) representing the average.This method used three independent parts to quantify an author's impact.In this paper, the authors are divided into two types.The first type of authors have published several high-quality papers but these authors have lower H-index or higher e-index; the second type of authors have published a large number of low-quality papers, but these authors have relatively high h-index, t-index, and lower e-index.Dorogovtsev et al. [73] developed the o-index to improve the impact of most cited papers.An author's o-index is defined as o = √ hm, where h is the author's h-index, and m indicates the citations of his/her most cited paper(s).
Another disadvantage of h-index is that it considers all authors of a paper equally.Authors of a multi-authored paper always don't have equal contribution to the work, therefore, the h-index leads to bias.Many studies have tried to solve this problem.Wang et al. [74] presented A-index to quantify the relative contributions of co-authors.Based on A-index, Stallings et al. [60]developed a collaboration index, C-index, to quantify the author impact.C-index was defined by VOLUME 4, 2016 where A k was the author's A-index.The P-index was proposed to quantify researcher's impact by considering the quality of publications, which was given by where the JIF k was the impact factor of the journal where the kth paper was published.Besides, some researchers pointed out that even authors that had different citation patterns may get the same h-index.Farooq et al. [75] proposed the DS-index, which is an extension of g-index and intend to provide a distinctive ranking for authors with similar citation pattern.The DS-index is defined as where g is the number of g-core papers and cit k is the kth g-core paper's citation.Same as h-core papers, the g-core papers are papers that are used to calculate the g-index of the author.The indices introduced above are all extension and improvement of h-index.Using h-index can partly reflect the publication behavior and the citation distribution of an author.To more reasonably quantify scholar impact, Sinatra et al. [76] explored the citation distribution of physicists and found that the highest-impact of a scholar was randomly distributed in their academic careers.Based on this randomimpact rule they proposed a stochastic model, in which a unique parameter Q was assigned to predict scholar impact.The Q-value of an author i is calculated by where Q i is the Q value of an author i. logc iα indicates the average logarithmic citations of all papers published by author i. α is the α-th paper of author i. µ p is the average impact of luck in the success of papers.Citation-based author impact evaluation methods show differences among disciplines.Waltman et al. [77] found that using the fractional counting method can give a more suitable result for cross-field scholar evaluation.Radicchi et al. [78] proposed a universal variant h-index to solve this problem, named h f -index.In 2013, together with Radicchi, Kaur et al. [79] improved the h f -index and proposed a new method to compare scientific impact across disciplinary boundaries.The new h s -index was introduced in their work, which was a normalized h-index by the average h-index of all authors in the same disciplines.Lima et al. [80] considered that a paper can belong to several research areas and the author's impact in an area was calculated by the papers published in the area, which was used by the author's percentile rank.Finally, the impact of an author was quantified by summing up impact across all areas.By this method, although the bias among different disciplines can be reduced, the authors who are active in a rapidly developing area can also get a higher score than others in the basic disciplines.

C. NETWORK-BASED EVALUATION METHODS AND INDICES
Because counting-based evaluation methods are easily manipulated in evaluating scholar impact, scholars explore the structured methods to overcome the shortcomings.The network-based evaluation methods of scholars have evolved from homogeneous scholarly networks to heterogeneous scholarly networks [32], [37], [52]- [54], [57], [65], [81].The scholarly networks are made up of academic entities, including scholars, papers, journals or conferences, and institutions.Ding et al. [82] used the PageRank algorithm to quantify the impact of the author based on author co-citation network.Yan et al. [83] developed P-Rank, which used three different networks, including citation network, authorship network, and publish-relationship network, to evaluate the impact of authors, papers and journals.A HITS-type method was first performed to update the scores of papers, authors, and journals in the authorship network and publish-relationship network.Then these scores were used as nodes' initial values to run a PageRank in the citation network to get the final scores of papers.Because the HITS-type algorithm is more suitable for heterogeneous academic networks, mining the academic relationships of heterogeneous networks in depth can make the HITS-type algorithm work better.Amjad et al. [84] considered the topic distributions of scholarly entities that were generated by Latent Dirichlet Allocation (LDA) [85] and proposed a topic-based ranking method called Topic-based Heterogeneous Rank (TH Rank).Because of the network complexity and the cost of computing LDA, TH Rank is not an efficient algorithm.Li et al. [86] put forward a method named QRank for the purpose to rank authors effectively and efficiently.Nykl et al. [87] used the PageRank algorithm together with several individual evaluation indices, including h-index, publication count, citation count, and author count of a publication to rank scholars.
Although the existing network-based evaluation methods have achieved certain results, the existing evaluation methods still have the following problems: (1) most previous studies quantify author impact based on the first-order academic networks; (2) the citation inflation influences the real impact of the author; (3) the origin of the academic success genes is unknown.Therefore, the higher-order academic network analysis, author impact inflation, and academic success gene need to be explored.

IV. EVALUATION OF JOURNAL IMPACT
The impact of journal generates from papers published.Authors are more willing to publish papers on the journal with a high impact.The evaluation of journals is associated with the evaluation of papers and authors.There are several famous publishing groups around the world.They are Elsevier, Springer, Wiley, Wolters Kluwe, and Pearson.It is worth mentioning that the most famous journals, Lancet and Cell, are published by the Elsevier, and Nature is published by the Macmillan.From 1975, Journal Citation Reports (JCR) started to provide the last year's Impact Factor (IF) of journals, together with other evaluation indicators such as the journals' current rank, abbreviated journal title, International Standard Serial Number (ISSN), total cites, immediacy index, total article and cited half-life.Since JCR was taken as an important data resource for quantifying journals.The JCR metrics have become the most popular indices to evaluate journals, and several other metrics have been proposed by the Thomson-Reuters, such as EigenFactor Score (EF), the yearly JCR, and CiteScore4 .Nowadays, many other evaluation methods and metrics have been proposed except the JCR metrics.In the following subsection, we will discuss the evolution of the existent methods and indices, and summarize the issues of these journal evaluation methods.One possible solution is to explore journal impact inflation and the higherorder academic network analysis.

A. FACTORS INFLUENCING THE IMPACT OF JOURNALS
Some classical high-impact journals always have been lasting for many years, such as Nature and Science.Journal's quality is decided by the quality of papers published on it.Many metrics for evaluating journals are based on citations.The development of the Internet has promoted the paper's citation, as well as the impact of journals.Therefore, open access journals may have a higher impact than the private ones.
Journal impact is with strong discipline, that is, different disciplines have different authoritative journals.Besides, the journal's type may influence its impact factor.Some journals prefer to publish review papers, and some others publish long research papers and short papers.Generally, a review journal impact factor is higher than other journals in the same discipline.

B. THE JOURNAL CITATION REPORTS
The Journal Citation Report started in 1975.Now it provides more than 10,000 high-quality journals rank every year and is released on the Web of Science (WoS).The evaluation of journal impact contains several usually used metrics, such as journal's total cites, journal impact factor, impact factor without journal self-citations, 5-year impact factor, immediacy index, cited half-life, citing half-life, Eigenfactor score, article influence score and number of citable items of the journal and other metrics.This report is always seen as the most authoritative assessment of journals.
Journal impact factor, which always refers to the 2-year impact factor, was proposed by Garfield in 1955 [47].The JIF of a journal in year n was defined as follows: where P n−1 is the number of papers published on this journal in year n − 1 and C n−1 is the number of the journal's citations in year n − 1.The computation of the 5-year journal impact factor is the same as the 2-year impact factor, which is considering the number of papers and citations of the journal in recent 5 years.The impact factor without journal self-citations eliminates the influence of the journal's selfcitations, which gives a more objective evaluation of the journal's impact.The cited half-life is years that are taken to reach half of the total citations of the journal, which indicates the persistence of a journal's impact.The citing half-life is defined as the years for the number of references accumulating to half of all, which indicates the novelty of the references.
Other metrics, such as immediacy index, Eigenfactor score, and article influence score, are to cover shortages of the impact factor.The immediacy index is defined as the average citation of papers published on journals in the given year, which can reflect the impact of the journal in that year.Eigenfactor score is calculated by the journal's citation network without self-citation, using a PageRank-type method [88].

C. ANALYSIS AND IMPROVEMENT OF THE JCR
Although the JCR metrics are used widely, it leads to bias if only using a single metric to assess journals.Many efforts have been paid to overcome the shortages and many other metrics have been proposed, such as H-index for journals [15], SCImago Journal Rank (SJR) [89], Source Normalized Impact per Paper (SNIP) [90].In addition to using a single metric, it is found that the ranking result can be improved by combining these common metrics in some ways, like computing their harmonic means [91] or using the Neural Network to find a non-linear represent [92].Serenko et al. [93] found that scholars always preferred to the familiar journals and gave them a higher evaluation.It suggests that introducing personal opinions in the evaluation of journals may be helpful.Tsai et al. [94] studied the correlation between subjective evaluation (scholars' personal opinions) and objective evaluation (journal rank by JIF and h-index) and used the Borda counting method to combine the two ranking results.Beets et al. [95] ranked accounting journals referencing the departmental journal lists, which were used to evaluate faculty publications in several famous business schools.
There are also many scholars concern about the relationship among the different journal rank by these metrics [96]- [101].Setti [99] argued that it was impossible to capture the real impact of journals by any single indicator.Different evaluation methods quantify journals from different views, so which metrics are more useful is always based on application scenarios.Sometimes it is meaningful to rank journals only by the percentage of highly cited publications of a journal [102].Besides, the evaluation of journals in different disciplines or different fields of the same subject also needs discussion [102], [103].
Chatterjee et al. [104] studied the citation distribution and found that a few high-cited papers had hold most citations in both journals and institutions.Based on many research of the VOLUME 4, 2016 citation distribution of journals, Kao et al. [105] proposed a stochastic dominance analysis based method to evaluate journals.

D. NETWORK-BASED EVALUATION METHODS AND INDICES
The most used methods for evaluating nodes in network are PageRank and HITS.As discussed in the previous sections, HITS algorithm can be used to rank paper, author and journal together.There are some PageRank-type methods being designed for ranking journals, which have a basic form like where x i indicates the adaptive damping factor and satisfies N i=1 x i = 1.Generally, the value of x i is set as 1 N .r(J i ) represents the importance score of journal i [106].
Based on the PageRank algorithm, Chen [23] added the expert judgments on the method as a weight part, and optimized the function by Particle Swarm Optimization (PSO).In the same way, Lim et al. [107] used the relevance and importance of the citations between journals to design the weighted PageRank.Zhang proposed the HR-PageRank algorithm to evaluate journal impact via weighted PageRank according to the author's H-index, and relevance between citing and cited papers [108].Bohlin et al. [109] studied the different performances of zero-(the classical Markov model), firstand second-order Markov model while ranking journals and found that higher-order Markov models performed better and were more robust.Some evaluation methods consider the structural position of journals in the journal citation network.Zhang et al. [24] proposed an indicator named Quality-Structure Index (QSI), which ranked journals by the intrinsic popularity and structural position of journals.The intrinsic popularity was quantified by some frequently used metrics, such as JIF, Eigenfactor score, PageRank score.Similarly, Leydesdorff [25] introduced the betweenness centrality of journals in the journal citation network to the assessment task.Su [26] gave a link-based representation to some frequently used metrics for journals, such as JIF, and proposed a link-based fusing method to fuse several metrics together according to the links in and among paper citation network, authorship network and paper publishing network.This method has found a new way to consider many metrics together to evaluate academic entities.
Based on the above analysis, the existing journal evaluation methods still have the following problems: (1) most previous studies quantify journal impact based on the firstorder academic networks; (2) the citation inflation influences the real impact of a journal.Therefore, researchers need to explore the higher-order academic network analysis and journal impact inflation to resolve the challenging issues of journal evaluation.

V. OPEN ISSUES AND CHALLENGES
In this section, several open issues and challenges are shown for further research in this area, including the pattern of collaboration impact, unified evaluation standards, implicit success factor mining, dynamic academic network embedding, and scholarly impact inflation.

A. PATTERN OF COLLABORATION IMPACT
A significant amount of work has been focused on quantifying the impact of scholarly papers, scholars, and journals [27], [76], [108].However, little is known about how the impact of scientific collaboration evolves over time.Previous researchers measure the impact of co-authors by citations, which are easy to be manipulated.The structured methods for measuring the impact of co-authors are urgently needed in the science community.With the available large-scale datasets on citations and collaborations, it becomes possible to explore the patterns of collaboration impact in scientific collaborative careers over time and their potential relationships with scientists' success.Since the structured methods are needed to quantify the impact of co-authors, how to construct the network to measure the collaborative impact and how to model remain the broader challenge.One possible solution is to construct a heterogeneous academic network in which the impact of the co-authors are quantified.Based on this, researchers explore the pattern of collaboration impact.

B. UNIFIED EVALUATION STANDARDS
We have mentioned many automatic evaluation methods that try to find high-quality papers from a mass of publications.But these methods can only give researchers suggestions that which paper may be useful, the contents of the papers recommended are not concerned by the algorithm.Therefore, there is still a strong demand for efforts to find the papers you need in the research process.Although there are many automatic evaluation methods, we can not find a unified evaluation standard to evaluate which method can outperform others.A widely accepted ground truth is of great need in the evaluation systems.To solve this problem, the data set must be unified first.

C. IMPLICIT SUCCESS FACTOR MINING
In the past, more attention has been given to explicit success factors.In author impact evaluation research, researchers have found some explicit success factors such as academic age, institution, research field, and country [110].However, little is known about the mechanisms of the temporal evolution of success in science.Uncovering the origin of the success factors in science is a challenging task.Success in science may depend on exogenous factors, such as mentorstudent relationship, learning habits, and education level, remains unknown.Actively exploring the relationship between exogenous factors and academic success may provide a method for implicit success factor mining.

D. DYNAMIC ACADEMIC NETWORK EMBEDDING
Many static network embedding methods have been proposed, however, academic networks evolve over time.For example, in citation networks, citing papers and cited papers always dynamically change over time, e.g., new citations are continuously added to the citation networks when authors cite previous research work.To learn the representations of nodes in dynamic scholarly networks, the existing academic network embedding methods need to run repeatedly and take time.Therefore, further study on dynamic scholarly network embedding algorithms remains an open challenge in this area.To obtain the efficient representation, a deep feature learning and the associated representation model supported by dynamic academic data may need to be established.

E. SCHOLARLY IMPACT INFLATION
Scholarly impact inflation, which arises from the exponential growth of scholarly papers, affects the real value of scholarly impact, therefore, impacting the comparative evaluation of papers, scholars, journals, institutions, and country output across different periods [111].Scholars can increase their citations by relying on their friends and co-authors, indicating that citations are easily manipulated.Many work has focused on unraveling the dynamics of inflation for citations [30], [112]- [114].Under the background of the inflation of citations, how to construct the evaluation network of scholarly impact and how to model are surprisingly difficult, highlighting the broader challenge of evaluating the scholarly impact in the science community.One possible solution is to weaken citation inflation through the higher-order academic networks.

VI. CONCLUSION
In this paper, we conduct a comprehensive review of the literature in quantifying success in science, focusing on evaluation indices of scholarly impact.Two changes have taken place in quantifying success in science research: (1) from unstructured evaluation indices to structured evaluation indices; (2) from single-disciplinary impact assessment to interdisciplinary impact assessment.However, the literaturebased analysis has led to the conclusion that despite a large number of evaluation indices have been used to resolve the problems in quantifying success in science, the solutions of some potential issues remain unknowns, such as the pattern of collaborative impact, implicit success factor mining, dynamic academic network embedding, and scholarly impact inflation.To solve these challenging issues, researchers can explore from the high-order scholarly network, heterogeneous network analysis and modeling, and academic relationship identifying.

FIGURE 1 :
FIGURE 1: Framework of quantifying success in science.

FIGURE 2 :
FIGURE 2: Several typical scholarly networks for paper impact evaluation.

TABLE 1
shows an example of selected features for evaluating paper impact, including references, selected features,

TABLE 1 :
An example of selected features for evaluating paper impact.

TABLE 2 :
An example of counting-based method comparison for evaluating paper impact.

TABLE 3 :
An example of network-based method comparison for evaluating paper impact.

TABLE 4 :
An example of selected factors for evaluating author impact.