Discovering Interdisciplinarily Spread Knowledge in the Academic Literature

With the increase in scientific publications and the diversification of research areas, science has become complex and interdisciplinary. Discovering important knowledge has become difficult even for researchers in specific domains. Previously proposed keyphrase extraction methods focus mainly on detecting intensively discussed topics in specific domains but do not distinguish concepts with interdisciplinary spread from those discussed in narrow areas. Here, we propose a diffusion meme score that evaluates the knowledge diffusion distance in a paper citation network. The distance between papers that contain specific terms is measured in the network embedding space of the citation network. Using 57 million publication records from 48 years of Scopus, we evaluated newly appearing terms in and after 1975 in biomedical science papers using the proposed indicator. Approximately half of the top 20 terms were related to Nobel Prize or Clarivate Citation Laureates, and the top terms of the indicator were more likely to appear in Wikipedia than terms extracted using existing methods. Moreover, the top terms were unlikely to include specific minor diseases, which are often extracted using existing methods. Therefore, the diffusion meme score identifies important, more interdisciplinary terms from the scientific literature and citation networks. Our method can help young researchers understand domains, support the study of the history of science, and inform the evaluation of researchers' contributions.


I. INTRODUCTION
As science becomes massive and fragmented into specific domains [1], interdisciplinary research activities play an important role in developing scientific discoveries [2], [3]. Complex issues such as climate change, food security, and healthy aging have recently become more important. In this context, funding agencies [4] and research organizations [5], [6] emphasize the interdisciplinarity of research. In specific domains, such as medical science, new therapies are developed by scientists in many disciplines, such as pharmacologists, molecular biologists, and neuroscientists, and involve scientists of public health, gerontology, zoology, education and engineering [7], [8]. For example, the hepatitis C virus was discovered to be the main cause of hepatitis by virologists [9]-[11], and treatment and prevention were established by scientists of internal medicine and preventive medicine [12]. Therefore, detecting such interdisciplinary knowledge is important for gaining a better understanding of scientific achievements.
The associate editor coordinating the review of this manuscript and approving it for publication was Hocine Cherifi.
In general, learning a domain is not the result of understanding the fragments of knowledge. However, existing keyphrase extraction methods focus on detecting intensively discussed topics in specific domains [13], [14]. To better understand science, interdisciplinarily important knowledge that bridges academic fields should be extracted. Comprehending the connections between domains helps researchers who tend to make conservative choices to delve deeper into their areas of expertise [15] and find potential applications [16]. Moreover, this knowledge also helps science historians unravel the complicated evolutionary process of science and is essential for institutions that distribute budgets and award prizes.
Literature-based discovery [17], [18] is an intensive research domain for supporting researchers in finding exceptional concepts from the scientific literature. However, these methods are highly domain-specific and are not applicable to large-scale datasets. Therefore, discovering exceptional interdisciplinary knowledge from a large data set is still a challenging issue [19]. In the context of natural language processing, discovering knowledge from the academic literature is approached as a keyphrase extraction task, which detects important terms from scientific publications [14], [20], [21]. Some unsupervised methods extract important terms by focusing on the appearance or relationship of terms [14]. For instance, TF-IDF [13] evaluated the appearance of terms in a document through arithmetic calculations of their frequencies in each document [22]. PositionRank [23] evaluated the importance of words by computing a PageRank-based score in a word co-occurrence network [24]. However, these indicators do not directly measure the interdisciplinary significance of scientific terms.
FIGURE 1. Differences in the forms of citation relationships that are highly valued by the meme score and diffusion meme score. A term that includes papers that cite each other and are published in distant domains is highly valued by the meme score and diffusion meme score (left figure). A term that includes papers that cite each other and are in similar domains is highly valued only by the meme score (central figure). A term used in some papers that do not cite each other is evaluated as zero in both indicators (right figure).
Chains of knowledge among scientists [25] should contain important information for evaluating interdisciplinary knowledge. Frequently, knowledge is represented as a term, and citation networks explicitly represent the chains of scientists' knowledge evaluations [26]. Therefore, for evaluating knowledge diffusion of a scientific term, it is straightforward to focus on a citation network among the papers that contain the term. Recently, Mao et al. detected important scientific terms in a domain that connect subfields by evaluating the amount of knowledge diffusion from the same domain [27]. However, these studies have focused only on a specific field comprising several thousand papers, and they use journal categories that do not effectively reflect the increasingly complex relationships among research disciplines. Closely related to our study, Kuhn et al. proposed the meme [28] score Mm, which evaluates the ''local'' importance of a term using a citation network [29]. The meme score evaluates the convergence of documents that include the term in the citation network and detects the knowledge that is discussed between scientists as a meme. Based on the idea of memes, the patterns of knowledge diffusion within and between disciplines have been examined [30], [31].
However, the meme score does not differentiate the terms used in a limited area (the middle of Fig. 1) or the terms that spread interdisciplinarily across academic fields (the left of Fig. 1). The former terms are likely to be highly specialized, such as the name of a minor surgery technique or minor proteins. Interdisciplinary concepts are assumed to be spread widely across academic fields because they are referenced in various fields. The distance between the endpoints of the citations indicates unanticipated knowledge transfer under the assumption that the distance between the fields of the two papers correlates with the similarity of the term occurrence of the papers. Therefore, an interdisciplinary concept can be evaluated by its spreading distance across academic fields. We use this idea to evaluate the unexpectedness of knowledge diffusion for detecting valuable interdisciplinary knowledge.
We propose a diffusion meme score D, which computes the sum of the distances of propagation across scientific fields and extracts exceptional scientific terms in terms of interdisciplinarity. Using this indicator, the terms that have propagated across diverse and distant communities obtain higher scores than those used by specific, narrow communities in a limited way. The distance indicates the difficulty or effort of the knowledge creation process. The accumulation of the distance describes each term's total impact on scientists in their research activity. Therefore, it makes sense that the top diffusion meme score terms are important terms (that scientists research frequently) in the process of academic evolution.
To measure the distances among academic fields, we obtain the latent representation of each node (paper) and measure the distance between the nodes. Conventionally, the reciprocal density of the citations between fields represents the distance between the fields. However, this indicator does not take into account the global structure of scientific fields. Recently, network representation learning has been widely used in machine learning tasks because the obtained vector representation retains both local and global structures [32]. Thus, we measure the interdisciplinary uniqueness of terms by computing the diffusion distance with network representation learning (DeepWalk [33]). The detailed procedures are described in the materials and methods section.
By conducting a case study, we demonstrate that the diffusion meme score successfully discovers exceptional concepts in terms of interdisciplinarity. We calculated the diffusion meme scores of newly appearing terms in the text (title and abstract) and citations of 21 million biomedical science papers in and after 1975. We calculated the diffusion distances using 57 million papers in whole domains. Approximately half of the top 20 scientific terms in terms of diffusion meme score are related to notably exceptional concepts, including those associated with the Nobel Prize or Clarivate Citation Laureates. However, the (original) meme score provides a higher score for concepts that are researched intensively in a narrow area. This tendency is confirmed in an evaluation using outside databases. The scientific terms that are evaluated highly in terms of the diffusion meme score are more likely to appear in Wikipedia than those evaluated highly in terms of the meme score; however, the opposite result was observed in an evaluation using a disease database (MalaCards [34]) that contains many minor diseases. The proposed method demonstrates that the spread of a scientific term across diverse academic fields is strongly related to social evaluation. This insight is significant for clarifying the process of academic development.

II. MATERIALS AND METHODS
A. DEFINITION OF THE DIFFUSION MEME SCORE
We define the diffusion meme score D as the natural logarithm of the sum of the distances d between the cited and citing references that contain the meme. One is added to the sum so that the logarithm can be calculated even if the sum of the distances is zero. We use network clustering to obtain small groups of articles that represent different research fields. Using the vector representations of the clusters obtained by embedding, the distance between the cluster of the source (cited) document and the cluster of the destination (citing) document is calculated as the cosine distance. The distance is large for meme propagation between weakly related clusters and small for propagation between closely related clusters. Clustering and embedding are described in later sections.
Each term is evaluated using a citation network of papers that contained the term until ten years after the first 20 appearances of the term. The reason for the ten-year limit is to evaluate recent and old terms equally, supposing that most innovations are made in the decade after the terms become recognized.
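$$D = \ln\Bigl(1 + \sum_{(s,t)} d(e_s, e_t)\Bigr) \qquad (1)$$
where the sum runs over the citations $(s, t)$ between papers that contain the meme, and
$d$: cosine distance between two nodes
$e_s$: position of the cluster to which the cited reference belongs
$e_t$: position of the cluster to which the citing reference belongs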
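The definition of D in this subsection can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function and argument names are hypothetical, the cosine distance is assumed to be one minus the cosine similarity, and the restriction to the ten-year window is assumed to have been applied to the citation list beforehand.

```python
import math
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def diffusion_meme_score(
    citations: Iterable[Tuple[Hashable, Hashable]],  # (cited, citing) pairs
    papers_with_term: Set[Hashable],
    paper_cluster: Dict[Hashable, Hashable],         # paper -> cluster id
    cluster_vector: Dict[Hashable, Sequence[float]], # cluster id -> embedding
) -> float:
    """ln(1 + sum of distances over citations whose two ends contain the term)."""
    total = 0.0
    for cited, citing in citations:
        if cited in papers_with_term and citing in papers_with_term:
            e_s = cluster_vector[paper_cluster[cited]]
            e_t = cluster_vector[paper_cluster[citing]]
            total += cosine_distance(e_s, e_t)
    return math.log(1.0 + total)
```

A term whose papers never cite each other receives D = ln(1) = 0, and a citation inside a single cluster adds nothing to the sum, so only propagation between different clusters raises the score.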

B. CITATION NETWORK CLUSTERING
An unweighted directed network is constructed by connecting each literature node with an edge that connects the citations in the references from the source to the destination. We use the Leiden method [35] to classify the literature network into clusters of research fields. Clustering the literature citation network provides clusters of closely related references. Each of these clusters is defined as a research field. This categorization is used to obtain a distributed representation of a term or to calculate a term's distance.
The results of the clustering reflect the research activity of the citation relationship. This is the judgment of the authors of the article, who are familiar with the content, regarding the relevance of the content, and it allows for a better classification of research areas than methods based on journals and keywords.
After clustering approximately 57 million papers that have citation relationships with other papers, the clusters with more than 1000 references were further clustered twice (for a total of three recursive clustering steps), resulting in 24,908 clusters.

C. CLUSTER EMBEDDING
Graph embedding is a technique of projecting nodes in a graph onto a vector space. The vector representation of clusters indicates whether the clusters are in relatively similar or completely different fields. Several graph embedding methods have been proposed in the literature. In these methods, a long distance in the embedding space indicates the rarity of citations between them. It is difficult to choose the most adequate method by considering their embedding mechanisms. Thus, we evaluate several embedding methods and confirm the small difference between them (in ''Results -Comparison with other embedding methods''). In this paper, we adopt a method called DeepWalk [33].
The citation network is rewritten in the form of an intercluster citation network using the clustering results. Next, we perform a fixed number of random walks, each starting from a randomly selected node, and obtain the sequences of visited nodes. Then, using Skip-Gram [36], we learn the distributed representation vector of each cluster by predicting which clusters appear around a given cluster in these sequences. By projecting the citation network onto a high-dimensional (128-dimensional) space, the global structure of the citation network is preserved in the vector of each node while maintaining the local structure.
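The first two steps of this procedure, building the intercluster network and sampling random walks, can be sketched as follows. The sketch rests on several assumptions that the text does not state: citations inside a single cluster are dropped, walks follow edges from the cited cluster to the citing cluster, the next step is chosen in proportion to the citation count, and a walk stops early at a cluster with no outgoing edge. All names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple

ClusterGraph = Dict[Hashable, Dict[Hashable, int]]


def build_cluster_graph(
    citations: Iterable[Tuple[Hashable, Hashable]],  # (cited, citing) papers
    paper_cluster: Dict[Hashable, Hashable],
) -> ClusterGraph:
    """Aggregate paper-level citations into weighted cluster-level edges."""
    weights = defaultdict(lambda: defaultdict(int))
    for cited, citing in citations:
        source, target = paper_cluster[cited], paper_cluster[citing]
        if source != target:  # assumption: ignore intra-cluster citations
            weights[source][target] += 1
    return {s: dict(ts) for s, ts in weights.items()}


def random_walks(
    graph: ClusterGraph,
    nodes: Sequence[Hashable],
    num_walks: int,
    walk_length: int,
    seed: int = 0,
) -> List[List[Hashable]]:
    """Sample num_walks walks per node; each walk is a list of cluster ids."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        order = list(nodes)
        rng.shuffle(order)  # visit start nodes in random order on each pass
        for start in order:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph.get(walk[-1])
                if not neighbors:
                    break  # dead end: no outgoing citations from this cluster
                targets = list(neighbors)
                counts = list(neighbors.values())
                walk.append(rng.choices(targets, weights=counts, k=1)[0])
            walks.append(walk)
    return walks
```

The resulting walks play the role of sentences for a Skip-Gram model, which would then be trained to produce one 128-dimensional vector per cluster; that training step is left to any Skip-Gram implementation and is not shown here.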

D. EXISTING METHODS
A variety of methods have been used to measure the information value of terms [18], [37]. Among them, the meme score $M_m$ defined by Kuhn et al. [29] is a method that solves a problem with the conventional methods, as it requires neither a threshold on the number of times a term appears in the literature nor domain expertise. The meme score $M_m$ is determined by the frequency of the occurrence of a term $m$ in the literature, $f_m$, and the heritability of the term $m$ in the citation network, $P_m$.
In (4), $d_{m \to m}$ indicates the number of publications such that the given meme term appears in the publication and in its cited publications. $d_{\to m}$ indicates the number of publications that cite publications that contain the meme term. $d_{m \to \not m}$ indicates the number of publications such that the given meme term appears in the publication and the publication does not cite publications that contain the meme term. $d_{\to \not m}$ indicates the number of publications that do not cite publications that contain the meme term. $\delta$ is a controlled noise term that prevents low-frequency terms from being rated too highly.
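The four counts defined above can be computed in one pass over the reference lists. The sketch below is illustrative only: the function name and data layout are hypothetical, $f_m$ is assumed to be the share of papers that contain the term, and the way the counts are combined (the score as $f_m P_m$, with $P_m$ the ratio of $d_{m \to m} / (d_{\to m} + \delta)$ to $(d_{m \to \not m} + \delta) / (d_{\to \not m} + \delta)$) is taken as an assumption based on the formulation of Kuhn et al. [29]. In particular, the placement of $\delta$ should be checked against that paper.

```python
from typing import Dict, Hashable, Set


def meme_score(
    term_papers: Set[Hashable],
    references: Dict[Hashable, Set[Hashable]],  # paper -> papers it cites
    delta: float,
) -> float:
    """Meme score M_m = f_m * P_m under the assumptions stated in the lead-in."""
    if not references:
        raise ValueError("references must contain at least one paper")
    if delta <= 0:
        raise ValueError("delta must be positive")

    d_m_to_m = d_to_m = d_m_to_not_m = d_to_not_m = 0
    for paper, cited in references.items():
        has_term = paper in term_papers
        cites_term = any(c in term_papers for c in cited)
        if cites_term:
            d_to_m += 1
            d_m_to_m += has_term
        else:
            # Papers with no references are counted here as well.
            d_to_not_m += 1
            d_m_to_not_m += has_term

    f_m = len(term_papers & set(references)) / len(references)
    sticking = d_m_to_m / (d_to_m + delta)              # term inherited via citation
    sparking = (d_m_to_not_m + delta) / (d_to_not_m + delta)  # term appears without it
    return f_m * sticking / sparking
```

Under this form, a term whose papers never cite one another has $d_{m \to m} = 0$ and therefore a score of zero, regardless of how far apart those papers are in the citation network.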
However, this method does not consider the distance of term propagation. It is suitable for evaluating terms that are closely discussed in a narrow community of knowledge and have been established in a research field, but it is not possible to evaluate terms that have interdisciplinary influences in many fields. In this paper, we propose a novel index that can evaluate influential terms that have an interdisciplinary spread in many fields.

E. RESULT EVALUATION
The evaluation of the proposed method was based on the two axes of the expertise and breadth of the extracted terms.
MalaCards [34], an exhaustive database of disease names, was used to assess whether the diffusion meme score can extract specialty terms. The number of diseases in this list among the terms extracted by the proposed method was treated as the extraction accuracy of the expertise terms. MalaCards integrates disease names and annotations for human diseases from 75 data sources. The diseases are assigned to 18 categories representing body regions such as blood, bone, immune system, and muscle and to six global categories such as cancer, genetics, and infections. As of January 20, 2020, MalaCards contained 22,371 diseases.
Wikipedia was used to verify whether the proposed method could extract a broad range of well-established terms in a global society. Wikipedia is an Internet encyclopedia operated by the Wikimedia Foundation and contains more than six million pages related to its content. Although the reliability of Wikipedia's content has been discussed frequently over many years [38]- [41], it was used as an index for evaluating the breadth of information because it is a useful tool for obtaining general information. Terms consisting of letters, numbers, and ''−'' from the list of titles were collected and downloaded from the Wikipedia database on January 1, 2020. The result was a total of 2,912,156 words. The percentage of the terms on Wikipedia among the terms extracted by the proposed method was treated as the overall accuracy of the terms' extraction.
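Both evaluations reduce to the same operation: rank the terms by a score, take the top n%, and count how many of them occur in a reference list (Wikipedia titles or MalaCards disease names). A minimal sketch of that operation is given below; the function name is hypothetical, and exact string matching without any normalization is an assumption.

```python
from typing import Dict, Set, Tuple


def top_percent_hits(
    scores: Dict[str, float],
    reference_terms: Set[str],
    top_percent: int,
) -> Tuple[int, float]:
    """Return (number, share) of the top-ranked terms found in reference_terms."""
    if not scores:
        raise ValueError("scores must not be empty")
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, len(ranked) * top_percent // 100)  # size of the top n% slice
    hits = sum(1 for term in ranked[:k] if term in reference_terms)
    return hits, hits / k
```

The share corresponds to the Wikipedia-based measure, and the raw count corresponds to the MalaCards-based measure.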

III. DATA
Our analysis relied on 57,757,843 publication records from Scopus published between January 1970 and December 2018. We extracted terms consisting of three words or fewer from the titles and abstracts of 21,242,007 publication records that belonged to the Scopus categories of Medicine and of Immunology and Microbiology. We did not analyze terms from the first five years of the data, and we focused on literature published in and after 1975. Due to the large number of terms, the analysis was limited to terms that appeared more than 50 times in the decade beginning with the year in which the term appeared more than 20 times in total. Additionally, in this paper, terms that did not contain alphabetical letters and terms that included the year of publication were omitted as meaningless. As a result, 276,901 terms were analyzed, and each term was evaluated using a citation network of papers that contained the term from the first appearance of the term to 10 years after the first 20 appearances of the term.
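The term filters described above can be sketched as two small predicates. The details are assumptions made for illustration: a term is treated as containing a year when it includes a four-digit token beginning with 19 or 20, the onset year is the first year in which the cumulative count reaches 20, and the decade covers the onset year and the nine following years. All names are hypothetical.

```python
import re
from typing import Dict, Optional

# Heuristic for "includes the year of publication": a standalone 19xx/20xx token.
YEAR_TOKEN = re.compile(r"\b(?:19|20)\d{2}\b")


def is_candidate_term(term: str) -> bool:
    """At most three words, at least one letter, and no year-like token."""
    words = term.split()
    return (
        1 <= len(words) <= 3
        and any(ch.isalpha() for ch in term)
        and YEAR_TOKEN.search(term) is None
    )


def onset_year(yearly_counts: Dict[int, int], onset_count: int = 20) -> Optional[int]:
    """First year in which the cumulative number of appearances reaches onset_count."""
    cumulative = 0
    for year in sorted(yearly_counts):
        cumulative += yearly_counts[year]
        if cumulative >= onset_count:
            return year
    return None


def passes_frequency_filter(
    yearly_counts: Dict[int, int],
    onset_count: int = 20,
    min_in_decade: int = 50,
) -> bool:
    """More than min_in_decade appearances in the decade starting at the onset year."""
    onset = onset_year(yearly_counts, onset_count)
    if onset is None:
        return False
    in_decade = sum(c for y, c in yearly_counts.items() if onset <= y < onset + 10)
    return in_decade > min_in_decade
```

The same onset year would also delimit the citation network used to score a term, since each term is evaluated only on papers published up to ten years after its first 20 appearances.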

IV. RESULTS
A. TERMS RATED HIGHLY BY THE DIFFUSION MEME SCORE
In preparation for the calculation of the diffusion meme score (1), we created a map of 57 million scientific papers. We applied network clustering recursively to 57 million publication records covering all academic fields in Scopus. The 15 largest clusters are listed in Table 1. We found that the academic literature is composed of diverse fields of research. The clusters with more than 1,000 references were further clustered twice (for a total of three times), resulting in 24,908 clusters. Hereafter, we call a third-level cluster a cluster. We constructed a directed and weighted network of clusters, where each edge in the network represents the number of citations between the papers belonging to both end clusters. We calculated each cluster's 128-dimensional embedding from the network with DeepWalk. These clusters were visualized in 2D space by using t-distributed stochastic neighbor embedding (t-SNE) for dimension reduction (Fig. 2). We found that the clusters that belonged to the same top-level clusters were gathered in a certain space. The distance from the source to the destination of the term propagation via a citation was calculated in the 128-dimensional space. Formally, the distance $d(e_s, e_t)$ is the cosine distance between the positions $e_s$ and $e_t$ of the clusters to which the source and target nodes of the citation belong.
We calculated the diffusion meme scores (1) for words that newly appeared in and after 1975 in 21 million biomedical papers. We calculated the meme score of each term using the citations published within ten years of the first 20 appearances of the term. The purpose of the ten-year limit is to not overemphasize old terms. Table 2 shows the 20 terms that were rated most highly in the diffusion meme score D, the existing meme score Mm, and frequency (the top 100 terms of each method are listed in Supplemental Tables S2-4). Five terms (marked with +) are related to drugs or other substances that are the main contributions of scientists who received the Nobel Prize. The other five terms, marked with *, are terms for which the discoverer was named a Clarivate Citation Laureate.¹ The results indicate that the diffusion meme score can determine terms that are likely to be included in Nobel Prize wins. For instance, the term ''hepatitis c virus'' was also included in research eligible for the 2020 Nobel Prize, while the analyzed data are from before 2019. Other examples are ''helicobacter'' and ''toll-like'', which were terms used in the research that won the Nobel Prize in Physiology or Medicine in 2005 and 2011. Although the paper citation network in 2018 is used for embedding clusters, the period of evaluated term-related citation networks is from 1992 to 2001 (former) and from 2001 to 2010 (latter). This indicates the possibility of predicting future Nobel Prizes related to these terms. Therefore, the proposed method can be used as an index to evaluate terms that are associated with discoveries that have a great impact on society. However, diffusion meme scores falsely detect ''biomed central ltd'' because many papers in diverse fields contain this term and eventually connect to each other. To evaluate the method as it is, this paper does not use a dictionary-based approach to remove such terms. However, the top 100 terms of each method in Supplemental Table S2 suggest that most misdetected words are related to publishers and are easy to remove using a dictionary-based approach.
¹ An award presented by Clarivate Analytics, which identifies researchers who are likely to win the Nobel Prize in the near future.
TABLE 2. Terms that are rated highly in each method. For the same term with different divisions or different names, the highest score is adopted (e.g., helicobacter, helicobacter pylori, and h. pylori). A term with + indicates that the person who discovered the drug or other substance related to the term received the Nobel Prize. Terms marked with * denote terms for which the discoverer was named a Clarivate Citation Laureate.
FIGURE 3. Diffusion network diagrams of a term that is highly rated in each method: The circles represent research field subclusters, and the color represents the main cluster to which each paper belongs. The size represents the number of times the term appears, and the edges indicate citations across disciplines for a term. The more citations a term has, the thicker the edges become. Terms that had high diffusion meme scores, such as (a) helicobacter pylori, have many research field clusters and edges between them. Terms that had high meme scores, such as (b) IgG4-related terms, appear in fewer research field clusters. Additionally, there is a hub cluster. Words that appear more frequently but are not highly rated in their respective meme scores, such as (c) Wolters Kluwer Health, appear in diverse clusters but do not have many edges, indicating that they appear independently within clusters.

FIGURE 4. The evaluation of the diffusion meme score, meme score, and term frequency using Wikipedia and MalaCards: (a) Percentage of terms in Wikipedia among the terms extracted using each method illustrates that the diffusion meme score can extract terms that are more socially prevalent than other methods. (b) The number of terms in the disease list among the terms extracted using each method illustrates that the meme score can extract more specialized terms than other methods.
In contrast, the existing meme scores are high for medical devices, rare diseases, and surgical methods used for a small percentage of procedures. For instance, ''sialendoscopy'', a device used for salivary gland diseases, was highly rated by the meme score even though it was mentioned in only 280 papers. ''Onychomatricoma'', a neoplastic nail lesion, was also highly rated, although it was mentioned in only 76 papers. This is consistent with the characteristics of the meme scores, which rate terms that are closely cited within a narrow field of study. The right side of Table 2, which lists the terms in order of their frequency of appearance, shows the publishers' names (''Biomed Central ltd'' and ''Elsevier Ireland ltd'') and general terms (''clinical trials'' and ''study design''). The diffusion meme score does not remove these words from the top lists completely but reduces their relative importance. The general terms are not evaluated highly in the diffusion and meme scores. Fig. 3 illustrates the diffusion network diagrams of highly rated terms in each method. The terms that are highly rated in the diffusion meme score (the left of Fig. 3) have a large number of clusters, and most of the clusters have edges between them. Compared to the terms that are highly rated by the diffusion meme score, the terms that are rated higher in the meme score (the middle of Fig. 3) appear in fewer clusters. As a representative example, IgG4-related² consists of one large node and a small connected node. This implies that the term propagation takes place mostly in one particular cluster. The right side of Fig. 3 illustrates the example of words (''wolters kluwer'') that are highly evaluated only by the word frequency. These words appear selectively in some specific clusters and can be evaluated highly in the TF-IDF of these clusters. This confirms that the original meme score can differentiate highly specialized terms from frequent terms.
² IgG4-related disease (IgG4-RD) is a chronic inflammatory condition characterized by tissue infiltration with lymphocytes and IgG4-secreting plasma cells, various degrees of fibrosis (scarring), and a usually prompt response to oral steroids. IgG4-RD has an incidence rate of 0.28-1.08 per 100,000 people.
FIGURE 5. Differences in score distribution between diffusion meme score and other methods: (a) is a comparison of diffusion meme scores and meme scores, and (b) is a comparison of diffusion meme scores and the logarithm of the frequency of occurrence. Each dot represents a term, with terms listed on Wikipedia in orange and unlisted terms in blue. The terms that scored highly on the meme scores or the logarithms of the frequency of occurrence are a mix of those listed on Wikipedia and those that are not. The distribution of listed and non-listed terms is split in the diffusion meme score.

B. STATISTICAL EVALUATION WITH WIKIPEDIA AND MalaCards
Wikipedia was used to assess whether the diffusion meme score could extract socially prevalent terms. The vertical axis in Fig. 4 illustrates the percentage of extracted terms that are listed on Wikipedia. In the case of random sampling, the vertical axis value hovers around 0.14, which is almost the same as the percentage of terms on Wikipedia in the total data used. The diffusion meme score demonstrates that around 60% of the top 10% of terms are listed on Wikipedia, and even when the number of extracted terms increases, the diffusion meme score contains the highest percentage of terms in Wikipedia compared with the other methods. The method using the number of appearances has the second-highest accuracy at first. This indicates that the diffusion meme score can evaluate terms related to generally important knowledge in society.
MalaCards [34], an exhaustive human disease database, was used to validate the extraction of expertise terms in the categories of Medicine and of Immunology and Microbiology. The right side of Fig. 4 indicates the number of terms in MalaCards that are in the top n% of each index. The vertical axis in the figure is the number of disease-name terms among the extracted terms. In addition to the proposed diffusion meme score D and the comparison methods (the number of occurrences and the existing meme score Mm), the case in which terms are extracted at random is shown in gray. This figure illustrates that the proposed method and the comparison methods always extract more disease-name terms than random extraction. Comparing the indices, the number of extracted disease-name terms is always highest for the meme score, followed by the diffusion meme score and then the number of appearances.
The meme score tends to be high when the propagation is tight and there is a community that actively discusses the term of interest. Therefore, in evaluating a list of disease names, including rare diseases, meme scores are useful for evaluating specialized terms discussed in minor communities. For example, the top 10% of terms in the existing meme score includes a genetic disease called Pallister-Killian [42], which appears in only 51 references, and an infectious disease called neuroschistosomiasis [43], a disease caused by a tropical worm, which appears in only 53 references. However, many of the terms that the meme score rates highly are confined to a narrow knowledge community; they are important within that community but are not widely known in society. In contrast, the diffusion meme score tends to be higher for terms that have spread to as many communities or distant communities as possible, which is why the Wikipedia-based evaluation demonstrates better accuracy with the diffusion meme score. These analyses indicate that the meme score is adequate for comprehensively detecting topics discussed in academic papers and that the diffusion meme score is useful for detecting important interdisciplinary knowledge.
FIGURE 6. The difference in terms extracted using multiple embedding methods: (a) Comparison of the percentage of terms in Wikipedia among the terms extracted using multiple embedding methods. (b) The number of terms in the disease list among the terms extracted using multiple embedding methods. Panel (a) illustrates that it does not make a significant difference which embedding method is used, and panel (b) illustrates that the basic method has the highest score for the extraction of highly specialized terms, showing a similar trend to existing meme methods.
We also confirmed the differences among the diffusion meme score, meme score, and term frequency by the scatter plot of each term in Fig. 5. Most of the top words in both metrics appeared in Wikipedia. However, the figure illustrates that the results of these methods are not highly correlated. The terms ''helicobacter'' and ''onychomatricoma'' discussed above are plotted in the figure. The diffusion meme score evaluates words that are not evaluated in the meme score, and the diffusion meme score is more tightly correlated with term frequency (Fig. 5(b)). It is assumed that knowledge diffusion via citation networks is an essential process for a term (excluding publisher names) gaining high popularity. However, the difference in the top-evaluated words in both methods was significant. Typical examples include ''LASIK'' and ''nova science publishers'': the former is interdisciplinary applied surgery used to improve visual acuity, and the latter is the name of an American journal publisher.

C. COMPARISON WITH OTHER EMBEDDING METHODS
In this paper, DeepWalk was chosen as the embedding method to calculate the distance between clusters in (1). Here, we examine the differences in the results that appear when this embedding is replaced by other methods (LINE [44], Laplacian eigenmaps (Lap) [45], HOPE [46], SDNE [47], GraRep [48], and graph factorization (GF) [49]). In addition, as a baseline that does not use cluster embedding based on the citation relation, we define a basic method that sets every diffusion distance to 1. Fig. 6 shows how the accuracy changes when other embedding methods are used. In terms of broad and global importance, there was no significant difference between the methods. On the other hand, the basic method extracts the most words with high expertise, which is similar to the meme score. The basic method does not involve the distance between clusters and therefore cannot distinguish between interdisciplinary terms and terms that are narrowly discussed in closed clusters.

V. DISCUSSION
Our proposed method (the diffusion meme score) can extract terms from the corpus that are important globally. Most of the top 20 diffusion meme score terms are important in the medical science field, and some are Nobel prize-related terms. Most of the other top diffusion meme score terms are significantly important. For example, hepatitis c virus (HCV) has an interdisciplinary spread (71 million people are infected) and causes liver cancer [50]. Most of the top 200 terms are also important in medical science (such as Catenin, endothelin-1, and MCP-1). These terms provide a brief overview of recent progress in medical science. Our results confirm that the diffusion meme score, focusing on the spread of terms across disciplines, is effective, as it allows us to extract terms that impact global society.
The idea that the distance (surprise) of information spreading represents the importance of the information differs from other major term extraction methods such as TF-IDF [13] and the meme score [29]. The diffusion meme score applies to documents with relationship links, such as academic paper datasets of other domains and patent documents with reference data. Additionally, the diffusion meme score can explore important knowledge of human communications such as Twitter and other social networking sites data. Information diffusion on Twitter [51] has been intensively researched, especially for fake news [52]. Our approach may contribute to examining each path and estimating the global impact of information.
VOLUME 9, 2021

VI. CONCLUSION
We proposed the diffusion meme score D, which evaluates the knowledge diffusion distance in a paper citation network. The distance that is calculated in the network embedding space of the citation network indicates the difficulty or effort of the knowledge creation process. We confirmed that the sum of the distances indicates the importance of knowledge in science and society. Approximately half of the top 20 terms are related to Nobel Prize or Clarivate Citation Laureates, and the top terms of the indicators are more likely to appear in Wikipedia than terms extracted using existing methods. Our method improves the means by which young researchers can understand domains, the development of the history of science, and the evaluation of the contributions of researchers. The extracted important knowledge may provide a quick look at medical science for students, researchers and academic administrators. The diffusion meme score D is applicable to any other data composed of texts and their relationships.
The diffusion meme score does not evaluate the knowledge that is not represented as a term, and the correction of word polysemy and ambiguity leads to better results. However, this has a limited impact on the results because scientists tend to define concepts, objects, processes, and facts as terms and tend to use well-defined words. The ambiguity of a term between the cited and citing papers causes the decrease of the diffusion meme score. However, the effect is not significantly greater than that of other indicators, such as the term frequency and the original meme score. Another limitation is that documents without relationship data are not within the scope of the diffusion meme score. However, our method is applicable to these data using link prediction between documents, which is an important subject for data mining researchers. There is a possibility that implementing other distance calculation methods will improve the results of our method.

AUTHOR CONTRIBUTIONS STATEMENT
Kimitaka Asatani and Maiko Kamada designed the model and the computational framework. Maiko Kamada analyzed the data with the support of Kimitaka Asatani. Maiko Kamada and Kimitaka Asatani wrote the manuscript. Masaru Isonuma and Ichiro Sakata reviewed the study and assisted with the data analysis. Ichiro Sakata supervised the project.