Fault diagnosis of rotating machinery: a review and bibliometric analysis

Fault diagnosis of rotating machinery (FDRM) has attracted continuous attention because of its great importance to industrial engineering, promoting the healthy and prosperous development of the field. A large number of literature reviews on FDRM have been reported, including signal processing methods in FDRM; artificial intelligence techniques in FDRM; fault diagnosis of bearings, gearboxes and induction motors; and others with broader areas of focus. Using bibliometric techniques, this paper provides objective insight and presents a comprehensive review of FDRM. The review begins with rigorous bibliometric analyses of 2532 published studies. These analyses enable mapping the scope and structure of the discipline, discovering the established collaboration patterns among countries and institutions, and identifying authoritative papers and authors. In addition, a deep analysis of the co-citation network allows graphically classifying the key research topics, illustrating their evolution over time, and identifying the current research interests and potential future research directions. The findings in this paper provide an overall picture of the development of FDRM from 1998 to 2019 and a robust roadmap for future investigations in this field.


I. INTRODUCTION
Widely applied rotating machinery plays a crucial role in modern industrial applications. Because it operates under severe and complex conditions over a long period of time, such as heavy load, high temperature, and high speed, rotating machinery is inevitably subject to failure [1]. These faults of varying degrees lead to decreased machinery service performance and even cause emergency shutdowns, which may result in very large economic losses and catastrophic safety accidents [2]. Therefore, the condition of machinery must be monitored in a timely manner, and faults should be diagnosed as early as possible. As one of the key techniques, fault diagnosis of rotating machinery (FDRM) consists of four basic tasks: determining whether an abnormal condition occurs among the machines or key components, finding the incipient failure and its original cause, assessing its level of severity, and predicting the trend in fault development, namely, fault detection, fault isolation, fault assessment, and fault prediction. It is important to implement effective FDRM that can help avoid abnormal event progression, reduce offline time, forecast residual life, reduce productivity loss, and finally, avoid major system breakdowns and catastrophes [3]. Meanwhile, with the development of modern industrial manufacturing, machinery equipment and systems have become considerably large scale, complex, and automated. FDRM is becoming not only increasingly important but also difficult due to the greater demands of higher performance, safety, and reliability. Therefore, FDRM has received increasing attention and has experienced considerable development in recent decades.
In 1969, bibliometrics was defined by Pritchard as "the application of mathematics and statistical methods to books and other media of communication" [36]. Different from traditional narrative reviews relying on the experience and knowledge of researchers, bibliometrics examines science as a knowledge-generating system [37] and provides a perspective that can easily be scaled from the micro to macro level [38]. Specifically, bibliometrics can help researchers not only understand the state of the art of a certain field from a macro perspective by analysing leading journals, regions, institutions, authors and papers but also track the development trends and research hotspots from a micro perspective by analysing keywords and their distribution over time [39]. In recent years, bibliometric techniques have been applied in various fields, such as energy [40], academic research in innovation [41], water footprint [42], open innovation [43], green supply chain management [44], and analytic network process [45]. This paper presents a comprehensive and objective review of FDRM by means of bibliometric techniques, including bibliometric indicators and citation and co-citation analyses. On the basis of these analyses, we review the state of research on FDRM, classify the leading and influential research works, identify gaps in the existing research, and discover potential directions and interests for future research.
The remainder of this paper is organized as follows. Section II briefly overviews FDRM. Section III describes the framework and methodology of this paper. Section IV presents the results of the study and a discussion of these results. Finally, Section V presents conclusions and limitations.

II. FAULT DIAGNOSIS OF ROTATING MACHINERY
FDRM is essentially a pattern recognition problem consisting of two key steps, namely, feature extraction and fault recognition, which can be resolved by signal processing technologies and AI technologies [32]. Signal processing technologies, an important topic in FDRM, have long been widely applied in various industrial applications. In 2006, Jardine et al. [46] summarized three types of signal processing methods for waveform data: time domain analysis, frequency domain analysis and time-frequency domain analysis. Additionally, according to the different development periods, Rai and Upadhyay [16] reviewed various signal processing methods for rolling element bearings and their diagnostic capabilities and divided them into three stages. After continuous study and development, signal processing technologies in FDRM have developed into a series of methods and tools, including wavelet-based methods [47][48][49], empirical mode decomposition (EMD) methods [50][51][52][53], autoregressive methods [54,55], cyclostationary methods [56,57], spectral kurtosis (SK) and kurtogram methods [58][59][60], morphological signal processing methods [61,62], entropy-based methods [63][64][65], and data reduction tools [66,67]. Furthermore, AI technologies are used for fault recognition to map information in the feature space to machine faults in the fault space. Due to increased attention, numerous AI technologies have been used or developed for FDRM. Generally, AI technologies in FDRM can be divided into three categories: supervised methods [68], semi-supervised methods [69] and unsupervised methods [70]. Among them, the most widely used classifiers include the k-nearest neighbour methods [71], Bayesian methods [72,73], support vector machine methods [74,75], random forest methods [76,77], and artificial neural network methods [78,79]. 
In recent years, with the continuous development of AI and computers, deep learning techniques have been introduced into FDRM to avoid manual feature extraction using multiple hidden layers of the deep learning architecture [80]. Due to its ability to automatically extract features and process massive amounts of data, deep learning has become a research hotspot in FDRM. A large number of deep learning methods, such as convolutional neural networks (CNNs), stacked autoencoders (SAEs), restricted Boltzmann machines, deep belief networks (DBNs) and deep neural networks (DNNs), have been integrated into FDRM [81].
Comprehensive reviews of FDRM have been reported on various topics. However, a thorough bibliometric and network analysis, which would be valuable for mapping the scope and structure of the discipline, identifying the most authoritative papers, and discovering key research topics precisely and objectively, is lacking [44]. Therefore, by collecting a large number of publications on FDRM and using rigorous analytic tools, this paper provides a bibliometric perspective to track the evolution of the field over time, investigates the areas of current research interest, and determines potential future research directions.

III. METHOD
As shown in Fig. 1, the framework of this paper consists of three components: data collection, analytic methods, and results and visualization.

A. DATA COLLECTION
The Web of Science (WoS) database is widely regarded as the standard and most authoritative database for scientific research [82,83]. In particular, the WoS Core Collection, including the Science Citation Index Expanded (SCIE), the Social Sciences Citation Index (SSCI), the Arts & Humanities Citation Index (A&HCI), and the Conference Proceedings Citation Index-Science (CPCI-S), contains more than 12000 world-leading academic journals, books, and conference proceedings, the topics of which include natural sciences, social sciences, engineering, biomedicine, arts and humanities. Therefore, based on the theme of this paper, the SCIE and SSCI are selected as the databases. In addition, the document types are restricted to articles and reviews, and the publication years of the retrieved documents are limited to the period from 1998 to 2019.
The selection of search keywords is very important for document collection and subsequent analysis. On the one hand, FDRM is so rich and diverse that it involves too many research aspects to cover every keyword. On the other hand, since a document contains much information, including the title, abstract, keywords, and references, effective quantitative analysis becomes difficult if the scope of the collection is too wide and the number of documents is too large. Therefore, when collecting document data, this paper applies the search expression "object" + "technology" to document titles to find the documents that best match the theme of this work. For the object, bearings and gears are the basic and representative elements of rotating machinery and constitute many machinery systems, such as turbines, pumps, fans and engines. Thus, they are selected as search keywords, together with variants of rotating machinery: "rotating machinery", "rotating machine", "rotary machines", "rotary machinery", "machinery", "mechanical fault", "bearings", and "gears". For the technology, the keywords related to fault diagnosis include "diagnosis", "fault identification", "fault detection", "condition monitoring", and "prognostics and health management". The search query joins these keywords and their wildcard (*) variants with OR operators. After excluding unrelated topics, such as radiology, orthopaedics, and biochemistry, 2532 documents were collected, and their essential information, such as titles, author names and affiliations, abstracts, keywords, and references, was stored.
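As a rough illustration of the "object" + "technology" search expression, the sketch below assembles a title query from the two keyword lists given above. It is not the authors' actual query: the use of the `TI=` title field tag, the AND between the two groups, and the helper names `or_group` and `build_title_query` are all assumptions made for this example, and no wildcard variants are included.

```python
# Hypothetical reconstruction for illustration only; not the authors' query.
OBJECT_TERMS = [
    "rotating machinery", "rotating machine", "rotary machines",
    "rotary machinery", "machinery", "mechanical fault", "bearings", "gears",
]
TECHNOLOGY_TERMS = [
    "diagnosis", "fault identification", "fault detection",
    "condition monitoring", "prognostics and health management",
]


def or_group(terms):
    """Join terms into a parenthesised OR group, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_title_query(objects, technologies):
    """Require a title to match one object term AND one technology term.

    The TI= field tag and the AND operator are assumptions of this sketch.
    """
    return f"TI={or_group(objects)} AND TI={or_group(technologies)}"


query = build_title_query(OBJECT_TERMS, TECHNOLOGY_TERMS)
print(query)
```

Keeping the two keyword groups in separate lists makes it easy to widen or narrow either side of the expression and to record exactly which variant produced a given document set.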

B. ANALYTIC METHOD
As shown in Fig. 1, the analytic method consists of three components: analytic tools, general data analysis and deep data analysis.
In this paper, three analytic tools are used to quantitatively and visually analyse the collected data: EndNote, InCites and CiteSpace. EndNote is a widely used software tool for publishing and managing bibliographies, citations and references; all the information related to a paper can be stored in EndNote [45]. InCites is an authoritative tool for research performance evaluation that is built on the selective, structured and complete data provided by the WoS. By means of InCites, the academic performance and impact of researchers, research institutes, and research regions and areas can be analysed comprehensively and visually.
Although new science mapping systems and generic tools continue to be developed, few tools are specifically designed to generate a systematic review of a fast-moving and complex field [84]. CiteSpace, a bibliometric software package, can visualize the relationships between documents in the form of scientific knowledge maps, which not only help researchers identify past research trajectories but also provide a clear understanding of future research prospects. In addition, CiteSpace supports structural and temporal analyses of a variety of networks derived from scientific publications, such as citation and co-citation networks [85].
By means of these tools, the characteristics and trends of FDRM can be quantitatively and qualitatively analysed via general data analysis and deep data analysis. On the one hand, traditional bibliometric indicators, including publication volumes, journals, countries, institutions, authors and papers, are used for general data analysis. On the other hand, to better illustrate the trends of present and future research, deep data are also considered by means of citation and co-citation networks.

IV. RESULTS AND DISCUSSION

A. PUBLICATION VOLUMES
As mentioned above, 2532 documents are obtained, including 2486 articles and 46 reviews, and 70 countries or regions are found to have contributed to the FDRM research field. The publishing trend in the field of FDRM over time is shown in Fig. 2, from which geometric growth is observed in terms of the cumulative number of published papers. Only 13 relevant papers were published in 1998, whereas that number was almost 500 in 2019, which is a substantial increase. Specifically, when analysing the proportion of papers published each year, four years of continuous explosive growth can be observed from 2016 to 2019, with high proportions of 11.7%, 11.4%, 14.3% and 19.5%. In addition, almost 85% of the retrieved literature was published in the past decade. Therefore, FDRM has attracted increasing attention worldwide and has become a hot scientific research field and frontier.

B. LEADING JOURNALS
It is important to identify the leading journals in a specific field. On the one hand, by means of rigorous review processes, leading journals publish meaningful and high-quality research work and disseminate the latest progress in a research field. On the other hand, leading journals cover the history, process and trends of a research field, which are indispensable for the inheritance of knowledge and scientific research. In addition, leading journals provide researchers with an effective platform for peer research and communication. Therefore, those leading journals that can provide extensive, advanced and authoritative research progress and insights should be identified. The 2532 collected papers were published in 311 journals. To identify the leading journals, several evaluation indexes are considered, including the total number of documents (TD), average number of citations per document (ACD), impact factor (IF), and total number of citations (TC). The top 20 productive journals, ranked according to the TD, are listed in Table I. Mechanical Systems and Signal Processing (MSSP) has published the most documents and obtained the most citations, 348 and 23762 in total, respectively, accounting for 13.74% and 38.82% of the overall totals. In addition, its ACD and IF scores rank second. The high publication volume of this journal indicates its wide influence and popularity. Moreover, the substantial number of citations shows the high quality and absolute impact of the published documents. Clearly, MSSP is the dominant journal in this field. In addition, Measurement, Shock and Vibration, Journal of Sound and Vibration (JSV), Sensors, and IEEE Access are 5 other influential journals that have each published more than 100 documents. To demonstrate the comprehensive influence of the 20 selected journals, a 3-D evaluation space is constructed, as shown in Fig. 3.
An evaluation of the ACD, TC and IF indexes identifies two leading journals in addition to MSSP: IEEE Transactions on Industrial Electronics (IEEE TIE) and Expert Systems with Applications. Notably, the ACD and IF values of these two journals rank in the top three. Moreover, rather than pursuing high publication volumes, these two journals favour high-quality papers, with 2713 and 2553 citations in total, respectively, ranking fourth and fifth among the top 20 journals. Additionally, a bubble and column chart of annual publications on FDRM of the top 20 journals is constructed to assess the distribution of the published documents (Fig. 4). When analysing the overall publication distributions, we find that few documents were published during the period from 1998 to 2008. Notably, JSV is the only journal other than MSSP that continuously published research in this field from 1998 to 2019, which confirms its integrity and importance in the FDRM field. Furthermore, the publication volume of the top 20 journals increases steadily overall, but three explosive years of growth are observed for MSSP in 2013, Shock and Vibration in 2016 and IEEE Access in 2019, which indicates the popularity of these journals and their accumulated influence in the field of FDRM. Furthermore, an opposite development trend is observed for the Journal of Vibration and Control and Expert Systems with Applications. In 2015 and 2016, the Journal of Vibration and Control published 10 and 11 relevant documents, respectively, but only 2 documents were published in the following year. Similarly, Expert Systems with Applications published 8 relevant documents in 2011, after which no more than 3 documents on FDRM were published annually.
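The journal indexes used above are simple ratios, and a short sketch may make them concrete. The example below assumes that ACD is TC divided by TD over the collected documents, which the text implies but does not state as a formula; the class and function names are hypothetical. Only the MSSP figures quoted in the text are used.

```python
from dataclasses import dataclass


@dataclass
class JournalStats:
    """Bibliometric counts for one journal (names are illustrative)."""
    name: str
    td: int  # total number of documents (TD)
    tc: int  # total number of citations (TC)

    @property
    def acd(self) -> float:
        """Average citations per document, assumed here to be TC / TD."""
        return self.tc / self.td


def share_percent(part: int, whole: int) -> float:
    """Percentage of `whole` accounted for by `part`."""
    return 100.0 * part / whole


# Figures for MSSP as quoted in the text; 2532 is the collection size.
mssp = JournalStats("Mechanical Systems and Signal Processing", td=348, tc=23762)
mssp_acd = round(mssp.acd, 2)
mssp_td_share = round(share_percent(mssp.td, 2532), 2)
print(mssp_acd, mssp_td_share)
```

Under this assumption the document share reproduces the 13.74% quoted for MSSP, which is a useful sanity check when rebuilding such tables from raw counts.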

C. LEADING COUNTRIES AND INSTITUTIONS
An analysis of countries and institutions can identify advanced research communities around the world and help researchers track study directions. In addition to the traditional evaluation indicators, an algorithm named PageRank, which was developed by Google to assess the importance of web pages via their link structure, is employed to evaluate the importance of the collaboration networks among countries and institutions [86].
Assume that node $T_k$ is associated with nodes $T_1, \dots, T_n$, and define parameter $d$ as a damping factor whose value is set between 0 and 1 to indicate the fraction of random walks that continue to propagate along relationships. In addition, $C(T)$ is defined as the number of relationships proceeding out of node $T$. Therefore, the PageRank of node $T_k$, $PR(T_k)$, in a network of $N$ nodes can be calculated as follows:

$$PR(T_k) = \frac{1-d}{N} + d \sum_{i=1}^{n} \frac{PR(T_i)}{C(T_i)}$$

Compared to the analysis of countries, the analysis of institutions can provide detailed and specific information. Table III shows the top 20 most productive institutions in the field of FDRM, in which the total number of published articles, TC, and h-index are analysed to evaluate the authority and influence of an institution. Moreover, PageRank is employed to estimate the popularity of institutions in terms of collaboration. Additionally, the University of Science and Technology Beijing has a wide and effective collaboration network with many institutions, including the University of Alberta (13 co-articles), the University of Ottawa (9 co-articles) and Tsinghua University (8 co-articles), as shown in Fig. 5(b). These institutions play an important role in the development and promotion of the FDRM research field.
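The PageRank definition given earlier in this subsection can be sketched as a fixed-point iteration. The code below is a minimal illustration, not the tool actually used in this study: it assumes the common form PR(T_k) = (1 - d)/N + d * sum_i PR(T_i)/C(T_i), a damping factor of 0.85, and a network in which every node has at least one outgoing relationship. The toy collaboration network (`hub`, `a`, `b`, `c`) is hypothetical.

```python
def pagerank(out_links, d=0.85, tol=1e-10, max_iter=200):
    """Iterate PR(T_k) = (1 - d)/N + d * sum_i PR(T_i) / C(T_i).

    out_links maps each node to the set of nodes it links to. This sketch
    assumes every node has at least one outgoing link (no dangling nodes).
    """
    nodes = sorted(out_links)
    n = len(nodes)
    pr = {t: 1.0 / n for t in nodes}  # uniform starting values
    for _ in range(max_iter):
        new_pr = {}
        for t in nodes:
            # Contributions from every node s that links to t.
            incoming = sum(
                pr[s] / len(out_links[s]) for s in nodes if t in out_links[s]
            )
            new_pr[t] = (1.0 - d) / n + d * incoming
        change = sum(abs(new_pr[t] - pr[t]) for t in nodes)
        pr = new_pr
        if change < tol:
            break
    return pr


# Hypothetical collaboration network: one hub institution with three partners.
# Undirected collaboration is modelled as a link in both directions.
toy_network = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
ranks = pagerank(toy_network)
print(ranks)
```

In this toy network the hub receives the highest score because all three partners pass their full weight to it, which matches the intuition that institutions with many collaboration links rank highly.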

D. LEADING AUTHORS AND INFLUENTIAL PAPERS
The great breakthroughs in the development of the FDRM field can be understood by analysing the leading authors and influential papers. In this paper, two analyses, namely, citation analysis and co-citation network analysis, are conducted to achieve this goal and to discover the interactions among authors and research themes.

1) CITATION ANALYSIS
Citation and PageRank analysis can measure both the popularity and prestige of an author or a paper. According to the statistics of the collected data, 4002 scholars worldwide have authored or co-authored articles in the FDRM field. According to the TD, the top 20 most prolific authors are listed in Table IV. Among the 2532 collected articles, 108 were cited more than 100 times; Table V shows the top 20 most cited documents. Notably, almost all of these papers were published at least 5 years ago, and 14 are at least 10 years old. This observation is not surprising because highly cited papers need sufficient time to accumulate citations. Additionally, researchers from the University of New South Wales published 4 of the 20 most cited papers, occupying the top of the list. Moreover, Randall RB has the most articles (5 articles) on the list, with 6824 total citations. The most cited paper, entitled "A review on machinery diagnostics and prognostics implementing condition-based maintenance", published by Jardine, Lin and Banjevic [46], has 1859 citations, followed by the papers of Randall and Antoni [18] and Lei et al. [7], with 869 and 697 citations, respectively. In terms of PageRank, the paper authored by Jia et al. [87] tops the list at 3.09.

2) CO-CITATION ANALYSIS
Co-citation analysis was proposed by Henry Small [88] to evaluate the similarity of literature and the importance of an article. Articles are co-cited when they appear together in the reference lists of other publications. The higher the co-citation frequency of two publications, the tighter their association [89]. A publication with a high co-citation frequency is also assumed to be key literature in a certain research field. In this paper, we used CiteSpace [85] to visualize and analyse the co-citation network of the collected 2532 publications.
A co-citation network in CiteSpace consists of nodes, links and colours, where the nodes represent the co-cited articles, the links represent the co-cited relationship between articles, and the colours represent the years of publication of citations. The larger the radius of a node is, the more co-citations the publication has. In addition, a node is composed of rings of different colours, indicating the co-citations of an article for different years. The radius of a ring indicates the number of co-citations received in the corresponding year. In terms of the link, its thickness represents the co-citation strength between two co-cited publications, and its colour indicates the first year of the co-citation.
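The notion of co-citation described above reduces to counting how often two references appear together in the same reference list. The following sketch shows that counting step only; it is not how CiteSpace is implemented, and the reference labels are hypothetical placeholders rather than entries from the collected data.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of references, the number of citing documents
    whose reference list contains both of them."""
    pair_counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(refs)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical reference lists of three citing documents.
citing_documents = [
    ["a", "b", "c"],
    ["a", "b"],
    ["c", "d", "a"],
]
counts = cocitation_counts(citing_documents)
strongest_pairs = counts.most_common(2)
print(strongest_pairs)
```

In the resulting network each pair count would correspond to a link, with the count acting as the link strength that determines its thickness.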
The co-citation network is generated by analysing the co-citation relationships of the 2532 publications collected from 1998 to 2019, as shown in Fig. 6. The network is composed of 1206 nodes and 2863 links. The contrasting colours of the co-citation network make it clear that the development process begins at the white bottom and proceeds to the red top. The majority of the most highly co-cited documents were published between 2006 and 2016. The top 20 most co-cited articles and their co-citation relationships are shown in the figure. Ref. [18] tops the list with 309 co-citations and has the most co-citations (57) with Ref. [90]. Meanwhile, 5 articles, Refs. [91][92][93][94][95], authored by Antoni J are co-cited with Ref. [18], indicating that Antoni J and Randall RB cooperate closely and have similar research directions. Second, Ref. [7] has attracted 225 co-citations, with newer citing documents, as indicated by their colours. In addition, Ref. [5], Ref. [6] and Ref. [96] are the three documents most co-cited with Ref. [7], receiving 66, 44 and 25 co-citations, respectively. Moreover, among the top 10 co-cited documents, 4 papers are authored by Lei YG, which again indicates that his research work is widely recognized. Several highly co-cited articles are not included in the collected 2532 publications but have received large numbers of co-citations in the FDRM field. A common characteristic of these articles is that they all propose novel basic mathematical methods that were later integrated into diagnostic methods and greatly improved FDRM. In [96], a signal processing method named ensemble empirical mode decomposition (EEMD) is proposed to decompose noisy signals into a series of components to solve the problem of mode mixing in the original EMD.
Similarly, the empirical wavelet transform (EWT) [97] is a decomposition method that can be used to build an adaptive wavelet basis to decompose a vibration signal into sub-bands, which improves the matching ability of fault features. In addition, variational mode decomposition (VMD) is proposed to decompose a non-stationary signal into coupled intrinsic mode functions adaptively and non-recursively [98]. On the basis of these signal processing methods, many studies have improved the effectiveness of feature extraction to enhance the diagnostic performance [101]. In 2015, as an important breakthrough in AI, deep learning was reviewed and summarized in [99], which claimed that deep learning can dramatically improve pattern recognition in many domains and avoid the subjectivity of manual feature extraction. With the implementation of the deep learning methodology, FDRM has entered the era of deep AI.

E. HOTSPOT AND DEVELOPMENT TREND ANALYSIS
A crucial function of the bibliometric review and co-citation network analysis is to identify research hotspots and development trends by mining past literature. By means of data clustering, research topics can be extracted from keywords, and research hotspots can be refined by analysing their evolution over time. Keywords are phrases of 3 to 8 words selected from the title, the abstract and the main body of a document that reflect the theme and the main idea of the document. In this paper, we used CiteSpace to cluster keywords and visualize the distribution of the classified topics. The default clustering algorithm in CiteSpace is the log-likelihood ratio (LLR) [102]. A total of 117 topics were obtained by applying this method to the generated co-citation network. According to the number of documents in which they appear, the top 15 topics were selected, and their distributions, as well as detailed information on the top 6 topics, are shown in Fig. 7.
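To give a sense of what a log-likelihood ratio measures when a term is scored against a cluster, the sketch below computes the standard G-squared statistic on a 2x2 contingency table. How CiteSpace applies LLR internally is not described here, so this is an assumption-laden illustration: the table layout, the function name, and the example counts are all hypothetical.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 table of document counts.

    k11: documents in the cluster that contain the term
    k12: documents outside the cluster that contain the term
    k21: documents in the cluster without the term
    k22: documents outside the cluster without the term
    """
    observed = ((k11, k12), (k21, k22))
    row_totals = (k11 + k12, k21 + k22)
    col_totals = (k11 + k21, k12 + k22)
    total = sum(row_totals)
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# A term spread evenly across clusters carries no information.
llr_independent = log_likelihood_ratio(10, 10, 10, 10)
# A term found only inside the cluster is strongly associated with it.
llr_exclusive = log_likelihood_ratio(10, 0, 0, 10)
print(llr_independent, llr_exclusive)
```

The statistic is symmetric, so a term that is unusually rare in a cluster also scores highly; a labelling procedure would therefore also need to check the direction of the association.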

FIGURE 7. Research topic classification based on keyword clustering
As shown in Fig. 7, the top 15 research topics can be classified into 3 categories: research object, research goal and key technology. As the basic and universal types of rotating machinery, planetary gearboxes, bearings, and induction motors are the three most important research objects. In terms of the research goal, weak signal detection is an important research area, with 57 related documents. In addition, the key technology in FDRM is the most interesting and attractive research field and occupies 11 of the top 15 clusters. Moreover, the top 6 research topics, namely, artificial neural network, signal processing, SK, permutation entropy, deep learning and acoustic emission, all belong to the key technology category. The three most co-cited documents of these 6 topics are also shown in Fig. 7, along with their distributions. In addition, detailed information on the top 10 research clusters is presented in Table VI, including the top 3 topics according to the LLR algorithm, the number and the mean year of the publications, and the 5 most co-cited papers in each cluster.
By analysing the distributions of the 6 clusters in Fig. 7 and the top 10 pieces of clustering information in Table VI, we identified two key characteristics of the establishment and development of a research topic: core literature and time. On the one hand, each topic is formed around several leading documents, which can be review papers, such as Ref. [18], innovative basic papers, such as Ref. [99], or papers presenting new methods for solving old problems, such as Ref. [87]. On the other hand, most of the publications and citations under the same research topic occur in the same period, which is confirmed by the fact that the colours of the nodes and links, representing the publication and citation years of the documents, in each cluster are usually similar or even identical.
Therefore, the current research hotspots and development trends can be determined by observing the growth law of each research topic over time and identifying and studying the leading literature in each topic.
To better understand the development of the important topics in the FDRM field, a timeline map of the top 10 clustered topics is constructed and shown in Fig. 8. As in Fig. 6, the size and colour of the nodes in Fig. 8 represent the number and year of the documents, respectively. Two clusters first appeared far earlier than the other clusters. In addition, the mean publication years of these two clusters are 2007 and 2000, which indicates that they are no longer hotspots. However, as two key technologies in the field of FDRM, they are developing in a specific and detailed direction rather than being abandoned. As research deepens, keywords are no longer generalized to signal processing and artificial neural networks, which leads to these results. For instance, clusters 2, 3 and 5 are subcategories of cluster 1, and cluster 4 is a subcategory of cluster 0. This illustrates that the current hotspots tend to be specific and targeted. Another phenomenon is that several clusters, such as clusters 2, 3, 4, 6, 7, 8 and 9, are still developing with strong influence, indicating their bright research prospects. However, unlike research objects and goals, the innovation of technology is the foundation of research field development. Therefore, this paper focuses on the key technologies, and clusters 2, 3, 4, and 9 are selected to analyse the current hotspots and future directions of FDRM research.
Four hotspots, namely, SK, permutation entropy, deep learning, and sparse representation, have been identified; they are described in detail in the following.

1) SPECTRAL KURTOSIS
As one of the powerful techniques for envelope analysis, SK is an efficient method for the detection of rotating machinery faults based on vibration signals. It was first introduced by Dwyer in 1983 and was defined as a statistical tool that can indicate non-Gaussian components in a signal and their locations in the frequency domain [141]. Because SK was not widely recognized at first, it was typically used as a supplement to the classical power spectral density and developed slowly. In 2006, Antoni J analysed SK thoroughly and proposed a formalization of SK by means of the Wold-Cramér decomposition of conditionally non-stationary processes [95]. Simultaneously, Antoni J and Randall RB showed that SK can not only provide a robust way to detect incipient faults even in the presence of strong masking noise but also offer an almost unique means of designing optimal filters for filtering out the mechanical signature of faults [91]. Since then, improvements in SK have attracted considerable attention, and many research works emerged in the following decade.
In 2016, Wang et al. [8] reviewed the development of SK and its applications in fault detection. The main developments of SK are the short-time Fourier-transform-based estimator of SK [95], the kurtogram [91], the fast kurtogram [94], the adaptive SK [142], and the protrugram [90]. These SK techniques are very powerful for detecting impulsive signatures from vibration signals even when accompanied by substantial noise. In addition, compared with other time-frequency analysis methods, such as the wavelet transform and EMD, SK techniques can automatically indicate in which frequency bands these signals occur. Therefore, considerable research has been conducted to improve SK techniques for the fault diagnosis of rolling element bearings [59,143-149] and gearboxes [144,150-153], and good diagnostic results have been achieved.
However, SK techniques are not suitable for signals obtained from runup or rundown experiments of a machine due to the limitations of the fundamental assumptions of the theory. An effective solution and an important development trend are the combination of time-frequency decomposition methods and SK techniques, in which SK is used as a tool to select frequency bands for demodulation. Another popular research direction is to combine SK with AI techniques to perform intelligent diagnosis, in which SK is usually employed to preprocess the signals.
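The short-time Fourier-transform-based estimator mentioned above can be illustrated with a small, dependency-free sketch. It assumes the commonly used form SK(f) = <|X(t,f)|^4> / <|X(t,f)|^2>^2 - 2, a rectangular window, and non-overlapping frames by default; these choices, the function names, and the synthetic test signals are assumptions of this example, not details taken from the cited works.

```python
import cmath
import math


def frame_power_spectra(x, frame_len, hop):
    """Return |DFT|^2 of each frame for bins 0..frame_len//2.

    A rectangular window and a direct DFT are used to keep the sketch
    dependency-free; this is slow and only suitable for short signals.
    """
    spectra = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        spectra.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len))) ** 2
            for k in range(frame_len // 2 + 1)
        ])
    return spectra


def spectral_kurtosis(x, frame_len=16, hop=16, floor=1e-12):
    """Estimate SK per frequency bin as <|X|^4> / <|X|^2>^2 - 2.

    Bins whose mean power is below `floor` are reported as 0.0. The -2
    offset applies to bins other than 0 and the Nyquist bin.
    """
    spectra = frame_power_spectra(x, frame_len, hop)
    m = len(spectra)
    sk = []
    for k in range(frame_len // 2 + 1):
        s2 = sum(frame[k] for frame in spectra) / m
        s4 = sum(frame[k] ** 2 for frame in spectra) / m
        sk.append(s4 / s2 ** 2 - 2.0 if s2 > floor else 0.0)
    return sk


# Synthetic signals: a steady tone at bin 2, and the same tone with one impulse.
tone = [math.cos(2 * math.pi * 2 * n / 16) for n in range(128)]
burst = list(tone)
burst[3 * 16 + 5] += 5.0

sk_tone = spectral_kurtosis(tone)
sk_burst = spectral_kurtosis(burst)
print(round(sk_tone[2], 6), round(sk_burst[6], 6))
```

A steady harmonic gives a negative value at its own frequency, whereas a transient that appears in only a few frames produces a large positive value in the bins it excites, which is why SK can point to the frequency band carrying an impulsive fault signature.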

2) PERMUTATION ENTROPY
Among the data-driven methods of FDRM, feature extraction is one of the two key techniques. The concept of entropy is introduced in feature extraction to express the nonlinear and non-stationary dynamic characteristics of vibration signals. Many methods, such as Shannon entropy [154], approximate entropy [155], sample entropy [156], fuzzy entropy [157], and multi-scale entropy [158], have been integrated. Permutation entropy was proposed by Bandt and Pompe [159] to measure the complexity of time series. Due to its robustness under the nonlinear distortion of time series signals and its high calculation efficiency, permutation entropy was also introduced into FDRM [160]. Through comparison with approximate entropy and L-Z complexity, permutation entropy was shown to effectively represent the working features of the vibration signals of rolling bearings under different conditions. Since then, many permutation-entropy-based methods have been proposed for FDRM [161][162][163][164], most of which focus on the development of permutation entropy [165][166][167][168][169] and combinations with decomposition algorithms for feature characterization, such as EMD [111], wavelet packet decomposition [170], local mean decomposition [171] and VMD [172].
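As a concrete illustration of permutation entropy, the sketch below counts ordinal patterns in a time series and takes the Shannon entropy of their relative frequencies. The parameter names `order` and `delay`, the tie-breaking rule (equal values ordered by position), and the normalization by log(order!) are choices made for this example and are not taken from the cited papers.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns of length `order`.

    Each window series[i], series[i + delay], ... is mapped to the
    permutation that sorts it; ties are broken by position.
    """
    if order < 2:
        raise ValueError("order must be at least 2")
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series is too short for this order and delay")
    patterns = Counter()
    for i in range(n_windows):
        window = [series[i + j * delay] for j in range(order)]
        # sorted() is stable, so equal values keep their original order.
        pattern = tuple(sorted(range(order), key=lambda j: window[j]))
        patterns[pattern] += 1
    entropy = -sum(
        (c / n_windows) * math.log(c / n_windows) for c in patterns.values()
    )
    if normalize:
        entropy /= math.log(math.factorial(order))  # scale to [0, 1]
    return entropy


# A monotonic ramp has a single ordinal pattern, hence zero entropy.
pe_ramp = permutation_entropy(list(range(20)), order=3)
# A strict alternation uses both order-2 patterns equally often.
pe_alternating = permutation_entropy([1, 2, 1, 2, 1], order=2)
print(pe_ramp, pe_alternating)
```

The two parameters are the "parameter settings" that make the method sensitive to user experience: a larger order distinguishes more patterns but needs much longer signals to estimate their frequencies reliably.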
In 2018, Li et al. [11] reviewed several entropy algorithms and their variants in FDRM, in which permutation entropy and other entropy-based methods achieved many successful applications. However, three issues limit the development of permutation entropy. First, the entropy algorithm is usually combined with decomposition methods to extract features, which is time-consuming. Although permutation entropy was proposed to improve the computational efficiency, it is still insufficient for online monitoring. Second, complex parameter settings make it difficult to apply permutation entropy without sufficient experience, which weakens the intelligence of the diagnosis. Last, and most importantly, as traditional data processing techniques, permutation entropy and other entropies are manual feature extraction methods, which are now being replaced by deep-learning-based methods. Therefore, the main development trends of permutation entropy are to improve its computational efficiency, automate its parameter selection and integrate it into deep learning.

3) DEEP LEARNING
Although data-driven methods have achieved good performance in FDRM, two crucial issues hinder further development. On the one hand, feature extraction depends on prior knowledge of the signal processing technique and diagnostic expertise. On the other hand, the classifiers used for fault recognition are shallow learning models, such as extreme learning machines [173], artificial neural networks [174], support vector machines [175] and random forests [76]. The shallow structure limits their capacity to learn complex nonlinear relationships and handle big data in FDRM. As a breakthrough in AI, deep learning unifies feature extraction and fault recognition and can mine useful deep information in big data by means of a multiple-level structure. With increasing computer power and data size, deep learning methods can dramatically improve pattern recognition and have been successfully applied in many fields, including speech recognition [176], image recognition [177], robotics [178], and medicine [179], with better performance than other machine learning techniques. Due to its automated feature learning process and powerful classification ability, deep learning can effectively solve the two aforementioned deficiencies and is considered a promising tool for fault characteristic mining and the intelligent diagnosis of rotating machinery [87]. In recent years, a large number of papers have reported the implementation of deep-learning-based methods in machinery fault diagnosis [34], of which the typical models are the CNN [180], SAE [181], DBN [182], DNN [183], recurrent neural network (RNN) [184], and generative adversarial network (GAN) [185].
Although deep-learning-based methods have been proven to outperform shallow learning methods, their performance depends substantially on the quality and quantity of the data. However, in practical industrial applications, FDRM lacks a large amount of available and high-quality data, presenting data complexity with characteristics of data insufficiency, data incompleteness and data imbalance. To solve these problems, knowledge transfer is believed to be a promising technique, in which transfer-learning-based methods aim to build models that can perform well on target tasks by leveraging knowledge from semantically related but different source domains [35]. In addition, due to its similarity to human and animal learning, unsupervised learning is an efficient way to address data complexity [99]. Therefore, a popular research hotspot in deep learning for FDRM is to develop transfer learning and unsupervised learning to decrease the diagnostic difficulty caused by data complexity.
In addition, the commonly used deep learning models are developed based on the artificial neural network, which suffers from two key problems: the large number of required calculations in the training process and the large number of hyperparameter settings [186]. In 2019, Zhou and Feng [187] conjectured three crucial characteristics of deep learning, i.e., layer-by-layer processing, in-model feature transformation and sufficient model complexity, and proposed a non-NN-style deep model based on decision trees, named deep forest. In their paper, because of the small number of hyperparameters and the automatically determined model complexity, the deep forest model effectively solves the two problems of DNNs and outperforms the CNN across various domains. Therefore, a research hotspot is to develop non-NN-style deep models based on the features of FDRM.

4) SPARSE REPRESENTATION
Sparse representation, a signal analysis and feature representation technique, can describe arbitrary complex signals compactly, reveal the signal features thoroughly from more perspectives, and extract rich detailed information. Sparse representation was initiated by the matching pursuit algorithm [188] and originates from atomic decomposition [189]. Based on a dictionary, atomic decomposition decomposes a complex signal into a superposition of optimal elementary waveforms that best match the major structure of the signal. In this way, the signal is sparsely represented, and the sparsity is measured by the number of atoms used in the signal representation. Due to its excellent adaptability and high flexibility in fault feature identification, sparse representation has also been implemented in FDRM [190]-[192].
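The atomic decomposition idea can be sketched with a basic matching pursuit loop: at each step the atom most correlated with the residual is selected, its contribution is subtracted, and the number of distinct atoms used measures the sparsity. The dictionary below (unit spikes plus cosine atoms) is a hypothetical toy example chosen for illustration and is far simpler than the dictionaries designed for fault signals.

```python
import numpy as np


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy atomic decomposition over unit-norm atoms (dictionary columns)."""
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        correlations = dictionary.T @ residual
        best = int(np.argmax(np.abs(correlations)))  # best-matching atom
        coeffs[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
    return coeffs, residual


# Toy over-complete dictionary (an assumption): 64 spikes and 64 cosine atoms.
n = 64
samples = np.arange(n)
cosines = np.cos(np.pi * (samples[:, None] + 0.5) * samples[None, :] / n)
cosines /= np.linalg.norm(cosines, axis=0)  # normalize each atom
dictionary = np.hstack([np.eye(n), cosines])  # 128 atoms for 64 samples

# Test signal built from one spike atom and one cosine atom.
signal = 3.0 * dictionary[:, 10] + 2.0 * dictionary[:, n + 5]
coeffs, residual = matching_pursuit(signal, dictionary, n_iter=10)
support = np.flatnonzero(coeffs)  # indices of the atoms actually used
```

Because the atoms are not mutually orthogonal, plain matching pursuit revisits the same atoms over several iterations before the residual becomes negligible, which illustrates why both the decomposition algorithm and the dictionary design affect the computational cost.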
In 2017, Feng et al. [10] thoroughly reviewed sparse representation for complex signal analysis in FDRM and considered atomic decomposition algorithms and dictionary design methods as the two key issues. Increasing interest in these two aspects has emerged. For instance, Cai et al. [193] proposed a new fault feature extraction technique for gearboxes using the sparsity-enabled signal decomposition method. Focusing on dictionary design, He et al. [192] proposed a new sparse representation for fault feature extraction by constructing an over-complete dictionary through the unit impulse response function of a damped second-order system. Cui et al. [136] proposed a new impulse dictionary model for bearing fault diagnosis that considers the fault size, rotational frequency, bearing dimensions and other parameters. Although many papers have been reported, the development of these two aspects remains a research hotspot, for example, reducing the computational complexity of dictionary design and combining and improving the existing decomposition algorithms. Another research trend is to develop decomposition algorithms based on the properties of the dictionary employed [10].

V. CONCLUSIONS AND LIMITATIONS
Tens of thousands of papers on FDRM have been published, and many literature reviews have been reported on relevant subjects. However, a thorough bibliometric and network analysis to analytically and objectively identify influential journals, regions, works, authors, and research hotspots and their development has not been completed. This paper presented a structured review of the FDRM literature using the bibliometric technique to analyse 2532 articles and reviews from the WoS database published from 1998 to 2019.
In the first 15 years (1998-2012), the publication growth rate was relatively stable; it then increased substantially in 2013 by 72.22%. This result is not surprising since FDRM evolved with the development of signal processing techniques and integration with machine learning techniques. In subsequent years, many papers have reported new theories, methods and applications using the typical mode, which consists mainly of feature extraction and fault classification. With the continuous development of AI techniques, deep-learning-based research has become a hotspot in the field of FDRM and has led to a sharp increase in publications, which is supported by the fact that over 50% of the works were published in the past 4 years (2016-2019).
In terms of the journals, MSSP dominates FDRM research in both quantity and quality, contributing a large number of publications with high influence. This observation is confirmed by the fact that 15 of the 20 most cited papers are from MSSP. In addition, the JSV, Measurement, and IEEE TIE are leading journals in terms of publication volume and number of citations. Recently, open access journals have attracted increasing interest, among which IEEE Access published 70 papers in 2019, ranking first for that year.
China is the most active and productive region, and Chinese institutions have published over 1600 documents, followed by the United States and Canada. Moreover, Australia and France are also among the most influential countries publishing high-quality papers. In addition, influential authors, who have long been devoted to research on FDRM and have produced fruitful work, are also discussed in Section IV. However, it should be pointed out that based on the scope of the specified title keywords, this article evaluates studies only from the perspectives of publications in the WoS database. There are still many studies dedicated to various aspects of FDRM that have not been included and discussed in this article. Moreover, collaborations among nations, institutions and researchers must be deepened to influence the field in innovative and interesting directions.
As of 2019, Ref. [46] has received 1859 citations, making it the most highly cited paper. In addition, Ref. [18] and Ref. [7] are two very highly cited papers and the top two papers in the co-citation network, indicating their large impact both in terms of popularity and connectivity of the FDRM. Meanwhile, an interesting finding is that FDRM research is also influenced by other disciplines and research outside the field, such as deep learning [109].
Moreover, the top 10 research topics were identified based on clustering keywords of the co-citation network. A map of the 10 research clusters shows their development trends, and 4 topics were summarized and reviewed: SK, permutation entropy, deep learning, and sparse representation. Except for deep learning, these topics are signal processing techniques. In our opinion, with the continuous development of AI, combining deep learning with these signal processing techniques based on the characteristics of vibration signals will be a hot research direction in the FDRM field. Additionally, studying solutions for FDRM under data complexity, such as data insufficiency, data incompleteness and data imbalance, based on deep learning, transfer learning, and ensemble learning is both a great challenge and a practical requirement of industrial engineering. In addition, a key point for future research is to implement fault diagnosis at the system level and product level of engineering applications based on actual industrial data.
This paper provides an overview of FDRM research and helps identify several hotspots and their development trends through bibliometric and network analysis. However, some limitations remain. Expanding the search query from the title expression to the keywords and themes could result in a more exhaustive review of this field. Furthermore, databases other than the WoS, such as Scopus and Google Scholar, were not included, leading to the absence of valuable publications. However, the inclusion of additional issues and databases would result in an enormous increase in the number of papers. Most of the existing tools have difficulty working with such large datasets. Thus, another future research direction is to develop effective bibliometric and network tools for more comprehensive data.