A Supervised Learning-Based Approach to Anticipating Potential Technology Convergence

Technology convergence can trigger technological innovation and change. Therefore, an approach is required to predict convergence between technology fields that has not occurred before. Such an approach allows a firm to preemptively secure a completely new competitive advantage that differs from those of its competitors. The timely anticipation of converging technology fields allows innovating firms to recognize the changing business developments associated with technology convergence. A variety of researchers have presented supervised learning-based approaches that use patents to predict the technology fields where technology convergence is taking place. They have developed machine learning models which capture the associations between the past and future connections between technology classes. Although their contributions are significant, they have a limitation in that they do not consider in depth the technological properties that are outputs of the technological activities performed in each technology field. To ensure that the predicted future connections between technology fields are reasonable, technological properties that can specifically imply technology convergence should be clearly reflected in the process of supervised learning. To remedy this problem, this study proposes a supervised learning-based approach to anticipating potential technology convergence by using the link prediction results, the technological influence relationships, and the technological relevance between technology classes. Using these as input features, several classification models that predict new technology convergence are trained and a voting classifier is developed to ensemble all the models. This study is expected to contribute to identifying new technology opportunities that can be realized through technology convergence. 
Furthermore, this study will assist firms to reflect the identified opportunities on their technology roadmap and make business decisions to penetrate the relevant market in a timely manner.


I. INTRODUCTION
Technology convergence refers to a phenomenon in which connections between technology domains that were disparate in the past are newly generated to create novel technologies [1]-[3]. It blurs the boundaries between existing domains, thereby forming a technological overlap in various technology fields and consequently allowing new products and services to be developed [4], [5]. Therefore, it is recognized as a decisive factor in driving innovation between technology fields and in overcoming the current obstacle that outstanding innovations no longer emerge within a single technology field [6]. In this context, it is important to have a systematic way of helping firms quickly discover potential technology opportunities that can be realized through technology convergence [7], [8]. Transforming the discovered opportunities into new products or services enables firms to achieve sustainable growth by securing competitive advantages [9]. Technology convergence can trigger technological innovation and change that introduces new value to the market in a timely manner [10], so it is essential to identify new opportunities based on technology convergence. To do so, an approach is required to predict convergence between technology fields that has not occurred before, because it allows a firm to preemptively secure a completely new competitive advantage that differs from those of its competitors. The timely anticipation of converging technology fields allows innovating firms to recognize the changing business developments associated with technology convergence [11].
The associate editor coordinating the review of this manuscript and approving it for publication was Claudio Zunino.
A variety of researchers have presented approaches that use various types of data sources to predict the technology fields where technology convergence is taking place [4], [10]-[13]. Among the data sources, patents have been used predominantly [14] because they serve as a reliable source of knowledge that reflects rapid technological advances in a well-structured format [6], [15]. The patent-based approaches have mainly used link prediction analysis, which estimates the possibility of future links between nodes based on the existing links, to predict future convergence between disparate technologies [4]. Utilizing multiple proximity indexes together, they usually compute the index values of pairs of unconnected technology classes and designate only the pairs whose index values exceed a predefined threshold as potential technology convergence [4]. It is certainly possible to predict future convergence by identifying potential future links from the existing connections [10]. However, these approaches determine the threshold value arbitrarily and ignore the weights of the proximity indexes when combining them. This inevitably lowers the reliability of the predicted technology convergence results. Patent bibliometric information has also been widely used for the purpose of exploring technology convergence [10], [11]. Such studies generate a knowledge flow network using the International Patent Classification (IPC), which is representative patent bibliometric information, and anticipate converging technology fields based on the network. However, using only bibliometric information is not sufficient for examining future technology convergence fields since the knowledge it provides is quite limited [16]. To remedy this problem, supervised learning-based approaches have been introduced for the purpose of predicting future connections between technology classes [12], [17]. 
They develop machine learning models which capture the associations between the past and future connections between technology classes. Although the performance of these models was relatively high, they still have a limitation in that they do not consider in depth the technological properties that are outputs of the technological activities performed in each technology field. To ensure that the predicted future connections between technology fields are reasonable, technological properties that can specifically imply technology convergence should be clearly reflected in the process of training the models. There have been attempts to incorporate semantic properties into the prediction process using the text corpus of patents [10], but this is not sufficient to fully encompass the technological properties.
Anticipating future technology convergence allows firms to recognize prospective technology opportunities in advance and thereby to lead the market for the associated products. In the past, much of the anticipation was based on qualitative approaches, such as in-depth discussions with relevant experts. However, such approaches are ill-suited to responding actively to the rapidly changing technology environment. As a result, numerous studies employing quantitative approaches have been conducted. Because quantitative approaches frequently incorporate multiple indicators, combining them effectively becomes an issue. Supervised learning is a suitable answer to this problem because it effectively captures the complicated relationships between multiple inputs and a single output. Therefore, this study proposes a supervised learning-based approach to anticipating potential technology convergence by using the link prediction results, the technological influence relationships, and the technological relevance between technology classes. Using these as input features, several classification models that predict new technology convergence are trained through various machine learning and deep learning algorithms. We perform a comparative analysis of the trained models and develop a voting classifier to ensemble all the models. To explore the feasibility of the proposed approach, we conduct a case study on specific technology fields where convergence frequently occurs. Moreover, we measure the performance of the voting classifier and discuss the details of future technology convergence based on its prediction results. This study is expected to contribute to identifying new technology opportunities that can be realized through technology convergence. Furthermore, this study will help firms reflect the identified opportunities in their technology roadmaps and make business decisions to penetrate the relevant market in a timely manner.

A. TECHNOLOGY CONVERGENCE
Technology convergence has become a common feature of innovation, which leads to the securing of competitive advantages of firms and the evolution of industrial structures [8]. Discovering the emerging trajectory of technology convergence can not only show how to create new inventions by converging outstanding technologies across industrial boundaries, but also increase the opportunities for innovation [6]. As a noteworthy feature of current innovation trends, technology convergence has created new opportunities for firms to gain competitive advantages and core competencies [8], [18], [19]. Therefore, capturing current trends and anticipating future aspects of technology convergence will benefit firms in that they can seize innovation opportunities for sustainable growth.
Existing studies on technology convergence have mainly focused on identifying its patterns and measuring how actively it occurs from a static perspective. For example, Han and Sohn [20] discovered key technology fields that played an important role in technology convergence. To examine technology convergence more precisely, several multidimensional indicators have also been designed. Using those indicators, meaningful insights about furthering industrial convergence were derived and the processes of technology convergence were established [21], [22]. These previous studies from a static perspective contributed to enabling firms to properly understand current technology convergence trends, explore convergence innovation, and draw significant implications. However, in order to derive proper action plans that respond actively to the rapidly changing technology environment, it is necessary to anticipate potential future technology convergence from a dynamic perspective [8]. Thousands of patents, the key data of the proposed approach, are issued every day. Thus, anticipation models should not be used in a static form for a lengthy period of time. They need to be updated continuously and frequently, which necessitates the periodic re-establishment of the intricate relationships between various inputs and an output. Supervised learning is well suited to performing such frequent model updates automatically. It can quickly and effectively capture the complicated relationships between inputs and outputs with the help of improved hardware performance and the emergence of powerful algorithms for data analytics. For this reason, supervised learning-based studies on anticipating or forecasting new technology convergence have been extensively carried out [12], [17]. Accurate forecasting of technology convergence can help firms and governments improve innovation efficiency [8]. 
However, the previous studies do not consider in depth the technological properties that are outputs of the technological activities performed in each technology field. Therefore, this study develops classification models that predict new technology convergence by extracting various features that can comprehensively capture the relationships between technology fields.
VOLUME 10, 2022

B. EMPIRICAL STUDIES OF TECHNOLOGY CONVERGENCE USING PATENTS
Several data sources have been used to investigate technology convergence, including Wikipedia [13], research papers [21], and relevant outputs of government-supported R&D programs [23]. Among them, patents have been used most often because they are up-to-date and reliable sources that capture technological advances and innovative practices [8], [24]. A patent is often classified into several IPCs at the same time, indicating that the unique solution inherent in the patent is applicable in several technology classes and that an exchange of technological knowledge occurs between them [25]. The IPC is represented by a set of alphanumeric codes and is organized into a hierarchy of section, class, subclass, and group [26]. Many studies have noted that using only IPC subclasses is sufficient to generate an appropriate number of technology classes with clear technological boundaries [25], [27], [28]. Thus, this study also uses IPC subclasses to define the necessary features.
Patent co-classification analysis aims at capturing the aspects of knowledge exchange and sharing between technology classes by extracting their relational information in which technological knowledge is implicitly embedded [29]. Using the patent co-classification analysis leads to building a technological knowledge flow network [30]. Investigating the network will serve as the basis for depicting the convergence between technology classes. Song et al. [11] proposed a novel approach to anticipate converging technology areas by analyzing the knowledge flow in the patent co-classification network. Lee et al. [31] identified convergence patterns of various technology fields and predicted future patterns performing the link prediction analysis on the IPC co-classification network. Gauch and Blind [32] presented a patent-based method of identifying trends in technology convergence by exploring the structures of convergence in technological development and standardization. They proved that it is possible to properly illustrate the aspects of convergence between technology fields through pairs between technology classes represented by IPC. Therefore, this study also uses patent co-classification analysis at the IPC subclass level to extract useful features representing complex relationships between technology classes and to predict future technology convergence.

III. APPROACH TO ANTICIPATING POTENTIAL TECHNOLOGICAL CONVERGENCE
Previous studies have anticipated technology convergence by linearly combining multiple proximity indexes. However, they ignore the weights of the indexes when combining them. In addition, they do not deeply consider the technological properties that result from the technological activities carried out in each technology field. Technological properties can specifically imply technology convergence, so they should be clearly reflected in the process of training the models. This ensures that the predicted future connections between technology fields are reasonable. This study proposes a supervised learning-based approach to anticipating potential technology convergence by developing classification models that predict new technology convergence. The proposed approach consists of three steps as shown in Fig. 1: 1) extracting multiple features suitable for the prediction of technology convergence, 2) training classification models using various machine learning and deep learning algorithms, and developing a voting classifier that ensembles all the classification models, and 3) measuring the performance of the classifier and identifying potential technology convergence based on the prediction results.

A. EXTRACTING FEATURES FOR TECHNOLOGY CONVERGENCE CLASSIFICATION MODELS
We collect patent bibliographic information such as IPCs and textual data including titles, abstracts, and claims because this study is based on patent co-classification analysis and explores technology convergence by extracting technological features from patent text data. The features obtained from these data can be classified into three types. First, we assess the possibility that new links between technology classes will be formed in the future using link prediction measures. These measures compute proximity values for all pairs of unconnected technology classes, which indicate the likelihood that new links between them will be created in the future. Because a proximity value expresses the possibility of convergence between two technology classes, it is a direct signal of potential technology convergence. Second, we explore the cause and effect relatedness between technology classes. Convergence is influenced by the relationships between the interacting classes. In general, technology classes that were initially separated from each other begin to exchange and share knowledge over time, and eventually convergence occurs between them. Therefore, exploring their influential relationships is appropriate for investigating the possibility of convergence. We evaluate comprehensive influential spillovers by extracting direct influential relationships between technology classes through patent co-classification analysis, deriving indirect relationships from the direct ones, and then aggregating them all. This quantifies how ready convergence between different technology classes is to emerge. Finally, we examine the technological relevance between technology classes. The two types of features mentioned above only explore the external closeness of technology classes, but do not consider the technological properties that should be involved in the process of anticipating technology convergence. 
This third type of feature complements the previous two types by incorporating technological properties into the proposed approach. Technology classes with technological relevance tend to have their own technological elements easily converged with each other. It can explain the environment in which convergence between different technology classes can occur more actively.
This study proposes a supervised learning-based approach to deal with the effects of these features on technology convergence from a technology-centric perspective. Information about the class or label used when training classification models with these input features is required. If patents classified into different technology classes are granted, it can be considered that new inventions are derived by convergence of knowledge of relevant classes. Therefore, we quantify technology convergence using those patents and determine label information. There should be some time gap between the input features and the emergence of converging technologies. For the time gap, we collect patents by dividing them into three periods. We extract the input features and labels from the patent data of the first and second periods, respectively. Several classification models are trained through supervised learning using them. In addition, we evaluate the performance of the classification models using the input features and labels from the patent data of the second and third periods, respectively. Finally, after extracting the input features from the patent data of the third period, we put them into the classification models to predict future technology convergence. Based on the prediction results, we will discuss the details of future technology convergence.
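The three-period labeling scheme described above can be illustrated with a minimal sketch. It assumes each patent is reduced to a list of IPC subclasses and that a pair is labeled 1 when it is unlinked in one period and co-classified in the next. The function names and the toy patents are hypothetical and are not taken from the study's data.

```python
from itertools import combinations

def class_pairs(patents):
    """Return the set of unordered IPC-subclass pairs co-assigned to any patent."""
    pairs = set()
    for ipcs in patents:
        pairs.update(combinations(sorted(set(ipcs)), 2))
    return pairs

def label_new_convergence(period_a, period_b, classes):
    """Label each pair unlinked in period_a: 1 if it appears in period_b, else 0."""
    linked_a = class_pairs(period_a)
    linked_b = class_pairs(period_b)
    labels = {}
    for pair in combinations(sorted(classes), 2):
        if pair not in linked_a:
            labels[pair] = int(pair in linked_b)
    return labels

# Toy data (hypothetical): each patent is a list of IPC subclasses.
period1 = [["A61B", "G06F"], ["H04W", "H04L"]]
period2 = [["A61B", "H04W"], ["G06F", "H04L"]]
period3 = [["A61B", "H04L"]]
classes = {"A61B", "G06F", "H04W", "H04L"}

# Training labels pair period-1 features with period-2 outcomes;
# test labels pair period-2 features with period-3 outcomes.
train_labels = label_new_convergence(period1, period2, classes)
test_labels = label_new_convergence(period2, period3, classes)
```

Restricting the labeled pairs to those that are unlinked in the earlier period is an assumption of this sketch; it follows the idea that convergence is signaled by the first appearance of a pair.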

1) NEW LINK POSSIBILITY
We build a co-classification network using the IPCs of the collected patents and apply the link prediction measures to the network to calculate proximity values for potential technology connections. In the co-classification network, a technology class is depicted as a node and a relation between two classes as a link. The relation shows how actively the knowledge within the relevant classes is being applied to inventions in a convergent way. Technology classes that are linked in a given period indicate that converging technologies are already emerging between them. Conversely, unlinked classes need to be investigated to see if technology convergence can occur in the next period. Link prediction identifies pairs of technology classes that are likely to be linked in the next period among the unlinked classes in the current period [33]. Multiple link prediction measures have been proposed to calculate the proximity between different nodes in a network [34]. A node pair with higher proximity is more likely to be linked. Link prediction measures are grouped into three categories: local, global, and quasi-local [35]. Local measures only focus on the neighborhoods of a given pair of nodes [36]. Most local measures are variants of common neighbors (cn), which embodies the intuition that two researchers are more likely to work together if they have worked with the same group of people in the past. Jaccard (jaccard), preferential attachment (pa), Adamic-Adar (aa), resource allocation (ra), hub depressed index (hdi), and Leicht-Holme-Newman (lhn_local) are representative common neighbors-based local measures. Global measures consider the properties of the entire network as a whole [36]. Katz (katz), matrix forest (mf), average commute time (act), and random walk with restart (rwr) belong to the global measures. Quasi-local measures are somewhere between the local and global measures [36]. 
Local path (lp) counts the number of paths of length two and three, and computes the weighted sum of them [37]. Studies using link prediction usually tend to utilize only a few measures [38], [39]. However, in this case, significant information loss is inevitable [10], [25]. Therefore, we use all the commonly used measures for link prediction together. The proximity scores for the measures are summarized in Table 1.
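To make the local measures concrete, the following minimal sketch builds a small co-classification network and computes five of them (cn, jaccard, pa, aa, ra) for one unlinked pair using their standard textbook definitions. The class labels and patents are hypothetical, and the exact formulas used in the study are those listed in Table 1, which may differ in detail.

```python
import math
from itertools import combinations

def build_network(patents):
    """Adjacency sets: two classes are linked if any patent carries both."""
    adj = {}
    for ipcs in patents:
        for c in ipcs:
            adj.setdefault(c, set())
        for a, b in combinations(set(ipcs), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def local_measures(adj, x, y):
    """Standard neighborhood-based proximity scores for the node pair (x, y)."""
    nbrs_x, nbrs_y = adj[x], adj[y]
    common = nbrs_x & nbrs_y
    union = nbrs_x | nbrs_y
    return {
        "cn": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "pa": len(nbrs_x) * len(nbrs_y),
        # A common neighbor of two distinct nodes has degree >= 2, so log > 0.
        "aa": sum(1 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1),
        "ra": sum(1 / len(adj[z]) for z in common),
    }

# Toy patents (hypothetical): classes A and B are not yet linked,
# but both are linked to C and D.
patents = [["A", "C"], ["B", "C"], ["A", "D"], ["B", "D"], ["C", "E"]]
adj = build_network(patents)
scores = local_measures(adj, "A", "B")
```

In this toy network the pair (A, B) shares both of its neighbors, so the common-neighbor count is 2 and the Jaccard score reaches its maximum of 1.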

2) CAUSE AND EFFECT RELATEDNESS
The predicted links represent only undirected associations between technology classes. Using only these undirected relationships is not sufficient to properly extract features that are likely to influence technology convergence. Therefore, we apply association rule mining to generate directional connection rules between them. Association rule mining, one of the representative unsupervised learning techniques, is the process of revealing important hidden relationships among sets of items in a huge database [51]. Let $I = \{i_1, i_2, i_3, \ldots, i_n\}$ be a set of items, and let $X$ and $Y$ be two subsets of $I$. Association rule mining generates rules of the form $X \rightarrow Y$, where $X \cap Y = \emptyset$ [52]. The rule $X \rightarrow Y$ means that $X$ implies $Y$, where $X$ and $Y$ are called the antecedent and consequent itemsets, respectively. Three measures, support, confidence, and lift, are examined to investigate the generated rules [53]. The support measure indicates how often the antecedent and consequent itemsets appear together across all transactions. The confidence measure expresses how closely the antecedent itemset is related to the consequent itemset. The lift measure explains the correlation between the antecedent and consequent itemsets. The confidence values of the rules generated from the patent co-classification information can show how much influence the antecedent technology classes have on the consequent classes because they illustrate how strong the implication relationships between these classes are.
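As a rough illustration of the three rule measures, the sketch below treats each patent as a transaction of IPC subclasses and uses the standard definitions: support is the share of transactions containing both itemsets, confidence is that share divided by the share containing the antecedent, and lift is confidence divided by the share containing the consequent. The function name and the toy transactions are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    n = len(transactions)
    x, y = set(antecedent), set(consequent)
    n_x = sum(1 for t in transactions if x <= set(t))
    n_y = sum(1 for t in transactions if y <= set(t))
    n_xy = sum(1 for t in transactions if (x | y) <= set(t))
    support = n_xy / n
    confidence = n_xy / n_x if n_x else 0.0
    lift = confidence / (n_y / n) if n_y else 0.0
    return support, confidence, lift

# Toy transactions (hypothetical): the IPC subclasses assigned to five patents.
transactions = [
    ["A61B", "G06F"],
    ["A61B", "G06F", "H04W"],
    ["A61B"],
    ["G06F"],
    ["H04W"],
]
support, confidence, lift = rule_measures(transactions, ["A61B"], ["G06F"])
```

Confidence is directional: swapping antecedent and consequent generally changes its value, which is what allows it to express the influence of one class on another.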
We produce comprehensive influential spillovers among technology classes using the Decision Making Trial and Evaluation Laboratory (DEMATEL). As one of the network-based decision making techniques, it encompasses both the direct and indirect influential effects of each class on other classes [54]. First, we create an average matrix representing the degree of direct influence between technology classes from the patent co-classification information. Next, the average matrix is normalized by dividing all elements by the maximum of the column and row sums. Let $D$ be the normalized average matrix. The total relation matrix is then computed as $D(I - D)^{-1}$, where $I$ here denotes the identity matrix. From the total relation matrix, we can compute the cause and effect relationships of technology classes by calculating the sums of its rows and columns, respectively [55]. Cause refers to the extent to which each class affects all other classes, and effect represents the extent of influence that each class receives from all others. Converging technologies are highly likely to emerge between technology classes that have strong cause and effect relationships. We, therefore, use the cause and effect for pairs of technology classes as input features to investigate technology convergence. Fig. 2 illustrates the process of exploring the cause and effect relatedness between technology classes.
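The DEMATEL computation can be sketched in a few lines: normalize the direct influence matrix, compute $D(I - D)^{-1}$, and take row and column sums. The helper names and the 3 x 3 direct influence matrix below are hypothetical (its entries stand in for rule confidence values), and the plain Gauss-Jordan inverse is only meant for small illustrative matrices.

```python
def normalize(direct):
    """Divide every element by the largest row or column sum."""
    n = len(direct)
    row_sums = [sum(row) for row in direct]
    col_sums = [sum(direct[i][j] for i in range(n)) for j in range(n)]
    scale = max(row_sums + col_sums)
    return [[v / scale for v in row] for row in direct]

def mat_mul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_inv(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def total_relation(d):
    """Total relation matrix D (I - D)^-1."""
    n = len(d)
    i_minus_d = [[float(i == j) - d[i][j] for j in range(n)] for i in range(n)]
    return mat_mul(d, mat_inv(i_minus_d))

def cause_effect(t):
    """Cause = row sums (influence given); effect = column sums (influence received)."""
    n = len(t)
    cause = [sum(row) for row in t]
    effect = [sum(t[i][j] for i in range(n)) for j in range(n)]
    return cause, effect

# Hypothetical direct influence matrix among three technology classes.
direct = [[0.0, 0.5, 0.2],
          [0.1, 0.0, 0.4],
          [0.3, 0.2, 0.0]]
d = normalize(direct)
t = total_relation(d)
cause, effect = cause_effect(t)
```

The total relation matrix equals the sum of the direct matrix and all of its higher powers, which is why it captures indirect as well as direct influence.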

3) TECHNOLOGICAL RELEVANCE
Technological relevance can explain the environment in which convergence between different technology classes can occur more actively since technologically similar classes can be more easily converged. To explore the technological relevance between them, we define technological features by extracting technology topics and calculating the topic similarities. Latent Dirichlet Allocation (LDA) [56], as a generative topic model, retrieves latent topics hidden in massive textual documents [57], [58]. It assumes that a document is made up of multiple topics and determines which topics are related to the words contained in the documents [56]. Topics are represented as a set of words based on the probability that the words will be included in a particular topic. To extract technology topics from patent documents, we need textual data of patents such as titles, abstracts, and claims. A corpus is configured with the collected textual data and is cleaned through the pre-processing techniques such as stemming and eliminating stop words. A document-term matrix is constructed with the cleaned corpus. Before applying LDA to the matrix, we have to determine in advance the number of topics to be extracted, which affects the quality of the topic model. A commonly used criterion is perplexity [59]- [61]. In general, the lower the perplexity value, the better the quality of the model [60]. We compute perplexity values while varying the number of topics and adopt the number at which the rate of change in perplexity values becomes low.
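One simple way to operationalize "the number at which the rate of change in perplexity values becomes low" is sketched below. It assumes the perplexity values have already been computed for each candidate number of topics. The relative-improvement tolerance `tol`, the function name, and the example values are all hypothetical, since the study does not state a specific cutoff.

```python
def pick_num_topics(candidates, perplexities, tol=0.05):
    """Return the first candidate after which perplexity improves by less than tol."""
    for k in range(1, len(candidates)):
        prev, curr = perplexities[k - 1], perplexities[k]
        # Relative improvement gained by moving to the next candidate.
        if (prev - curr) / prev < tol:
            return candidates[k - 1]
    return candidates[-1]

# Hypothetical perplexity values for five candidate topic numbers.
candidates = [5, 10, 15, 20, 25]
perplexities = [900.0, 700.0, 600.0, 585.0, 580.0]
chosen = pick_num_topics(candidates, perplexities)
```

With these illustrative values, moving from 15 to 20 topics improves perplexity by only 2.5 percent, so 15 is selected.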
Applying LDA to the document-term matrix leads to the creation of a document-topic matrix, which expresses probabilistic relationships between patent documents and technology topics. We convert them into the relationships between technology classes and topics by grouping the documents according to the IPC subclasses into which each patent is classified. Technological relevance is finally computed by vectorizing the relationships between technology classes and topics, and calculating the similarity between technology classes. Many different methods have been used to calculate vector similarity. This study extracts input features representing technological relevance using two similarity calculation methods: Bray-Curtis distance and cosine similarity. The former computes the ratio of the sum of the absolute differences of individual elements to the sum of the absolute values of their sums in the two vectors [62], [63]. The Bray-Curtis distance is limited to values between 0 and 1 [64], so it can be easily combined with other methods. The latter is commonly used to calculate the distance between documents when they are embedded into a vector space [65]. It determines whether the two vectors point in approximately the same direction by measuring the cosine of the angle between them [66]. It can theoretically range from −1 to 1, but usually takes values from 0 to 1 between two documents because term frequencies in documents cannot be negative. The cosine similarity is often preferred over other methods for measuring document similarity in text analysis since it does not depend on the vector magnitudes [67]. We use these similarities representing technological relevance for pairs of technology classes as input features to investigate technology convergence. Fig. 3 illustrates the process of examining the technological relevance between technology classes.
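The two relevance features can be sketched as follows, starting from a document-topic matrix. Averaging the topic distributions of the patents in each class is an assumption of this sketch (the aggregation rule is not specified above), and the document-topic values and class assignments are hypothetical.

```python
import math

def class_topic_vectors(doc_topics, doc_classes):
    """Average the topic distributions of the patents assigned to each class."""
    sums, counts = {}, {}
    for topics, classes in zip(doc_topics, doc_classes):
        for c in classes:
            acc = sums.setdefault(c, [0.0] * len(topics))
            for k, p in enumerate(topics):
                acc[k] += p
            counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def bray_curtis(u, v):
    """Sum of absolute differences divided by sum of absolute element sums."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(abs(a + b) for a, b in zip(u, v))
    return num / den if den else 0.0

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical document-topic rows (three topics) and IPC subclasses per patent.
doc_topics = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.2, 0.3, 0.5]]
doc_classes = [["A61B"], ["A61B", "H04W"], ["H04W"]]

vectors = class_topic_vectors(doc_topics, doc_classes)
bc = bray_curtis(vectors["A61B"], vectors["H04W"])
cos = cosine_similarity(vectors["A61B"], vectors["H04W"])
```

Note that the two features point in opposite directions: a small Bray-Curtis distance and a large cosine similarity both indicate closely related classes.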
This study uses 16 input features in three different types to achieve supervised learning. The label indicates whether technology convergence occurs in the pair of two technology classes. It can be expressed by the presence of patents classified into both of those classes. To create the labeled data for training and testing, we divide the patent dataset into three periods. For each pair of technology classes, we extract the input features from the patent data in the first period and determine whether technology convergence occurs in the second period, and several classification models are trained using them. The input features in the second period and whether convergence occurs in the third period are used to measure the performance of the models. The twelve input features naturally have a very wide range of values. Therefore, they should be normalized before being used to train the models.
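The normalization step can be sketched with min-max scaling, which is an assumption here because the text does not name the normalization method. The sketch fits the scaling range on first-period values and reuses it for second-period values so that the evaluation data does not influence the fitted range. All values and function names are hypothetical.

```python
def fit_min_max(values):
    """Learn the scaling range from the training-period feature values."""
    return min(values), max(values)

def apply_min_max(values, lo, hi):
    """Rescale values with a previously fitted range."""
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical preferential-attachment scores for pairs in two periods.
pa_first_period = [4, 10, 28, 52]
pa_second_period = [16, 64]

lo, hi = fit_min_max(pa_first_period)
train_scaled = apply_min_max(pa_first_period, lo, hi)
test_scaled = apply_min_max(pa_second_period, lo, hi)
```

Values from a later period can fall outside the interval from 0 to 1 when they exceed the range seen during fitting, as the second test value does here.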
Various machine learning and deep learning techniques have been used for classification or prediction problems. Because each of them has its own strengths and weaknesses, this study intends to anticipate the convergence between technology classes by using multiple techniques together. Decision Tree (DT) is represented by nodes and branches, where nodes are composed of features for classification and branches show the values of those features [68]. It is one of the most widely used techniques for classification because of its simplicity and interpretability [69]. Logistic Regression (LR) is a popular statistical generalized linear model. It is often chosen as a reference baseline for machine learning because its implementation is simple and straightforward [70]. Support Vector Machine (SVM) aims to define the best separating hyperplane in the input feature space to maximize the margin between positive and negative samples in the training set [71]. Compared to a single model, ensemble methods can effectively improve prediction performance by averaging the results of different models [72], so we also utilize Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). The ensemble methods suppress the dispersion of the prediction results and improve the generalization and robustness of the multiple classification models by aggregating the results of the different models [73]. We additionally train a Deep Neural Network (DNN), which learns a set of hierarchical nonlinear transformations [74]. To control the learning process, we have to tune hyperparameters because they cannot be inferred while training classification models [75]. Grid search with cross-validation has been frequently used to obtain optimal hyperparameter combinations. It can find the best combinations by iteratively applying different values of various parameters for a given model to the validation dataset [76]. 
The hyperparameters from the grid search are evaluated by the cross-validation determining which ones are superior [77].
When different classification models make decisions based on input data instances, there are bound to be different answers. Therefore, we develop a voting classifier which employs multiple models when making predictions. Voting is an ensemble method for making predictions that integrates the results of numerous models. Voting will not be hindered by significant errors or misclassifications from a single model because it is based on the performance of multiple models. A model's poor performance can be compensated for by the strong performance of other models. It is quite applicable in situations where there is some confusion as to which classification techniques are appropriate for a given problem [78]. There are two types of techniques for a voting classifier: hard and soft [79]. In hard voting, the class with the most predictions is selected as the final voting result, whereas in soft voting, the class with the highest average of the probabilities of the classes generated by each classifier is chosen as the final voting result [80]. Since all the techniques used in this study generate prediction probabilities, we develop a soft voting classifier.
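The difference between hard and soft voting can be seen in a minimal binary example. The sketch assumes each model outputs a probability for the positive class (convergence occurs) and that 0.5 is the decision threshold. The function names and the three model probabilities are hypothetical.

```python
def hard_vote(probas, threshold=0.5):
    """Each model casts one vote; the majority class wins."""
    votes = sum(1 for p in probas if p >= threshold)
    return int(votes > len(probas) / 2)

def soft_vote(probas, threshold=0.5):
    """Average the positive-class probabilities, then apply the threshold."""
    mean = sum(probas) / len(probas)
    return int(mean >= threshold), mean

# Hypothetical positive-class probabilities from three trained models
# for a single pair of technology classes.
probas = [0.45, 0.48, 0.95]

hard_label = hard_vote(probas)
soft_label, soft_score = soft_vote(probas)
```

Here two models lean weakly negative and one is strongly positive, so hard voting predicts no convergence while soft voting predicts convergence. Soft voting uses how confident each model is, not just which side of the threshold it lands on.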

C. ANTICIPATING POTENTIAL TECHNOLOGY CONVERGENCE
The performance of the voting classifier is measured using the input features from the second period and whether convergence has occurred in the third period. It performs binary classification, so we examine the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve in addition to basic metrics such as accuracy and F1 score. This study ultimately aims to predict the potential future technology convergence beyond the third period. We will discuss the details of future technology convergence based on the prediction results.
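The three evaluation metrics can be computed from scratch as in the sketch below. It uses the pairwise-ranking form of AUC, in which the AUC is the share of positive-negative pairs where the positive example receives the higher score and ties count as one half. The labels and scores are hypothetical, and 0.5 is assumed as the decision threshold.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC via pairwise comparison; requires both classes to be present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical third-period labels and voting-classifier scores for six pairs.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]
y_pred = [int(s >= 0.5) for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Unlike accuracy and F1 score, the AUC does not depend on the choice of threshold, which makes it a useful complement when new convergence is rare among candidate pairs.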
In this study, we use a large number of features to train various classification models and then develop a voting classifier. Organizing all processes so that they can be identified at a glance will help to increase the applicability of this study in various research areas. Therefore, we provide pseudocode for extracting input features, training supervised learning algorithms with the extracted features, and assessing performance in Algorithms 1 and 2.

A. FEATURE EXTRACTION
This study develops multiple classification models that predict new technology convergence. The birth of a converging technology can be expressed by the first appearance of a pair of two IPC subclasses [10]. To investigate technology convergence, we collect patents related to wearables granted by the United States Patent and Trademark Office (USPTO) from 2011 to 2019. Wearables are portable electronic systems that have recently attracted attention from the consumer goods industry because they are light and small enough to be worn on the human body [81]. Wearables are typically equipped with many sensors that can recognize a wearer's emotional patterns [11], so significant innovative products are being derived from the convergence of various technologies with them. Wearables are therefore suitable for our case study. Song et al. [11] have argued that the two technology fields of signal transmission and telecommunications, and medical equipment are very closely related to wearables. In addition, they have defined 21 and 18 IPC subclasses that correspond to these two technology fields, respectively. Therefore, we can discuss wearables-related technology convergence using pairs between these two groups of technology classes. Table 3 shows the number of patents collected by year. As interest in wearables grows, the number of patents is also steadily increasing.

Algorithm 2 (fragment as extracted):
    Train classification models using machine learning or deep learning technique i
    for j = 1 to M do
        P_ij = Evaluate the performance of the trained classification model i using the measurement indicator j
    end for
end for
return P_ij
Several classification models are trained to predict whether technology convergence occurs in the second period using the input features extracted from the patent data in the first period. In this case study, the observations are all pairs of technology classes that can be formed by selecting one class from each of the two groups closely related to wearables. As shown in Table 4, the total number of possible pairs is 378 (21 × 18), and the number of pairs for which technology convergence has not occurred in the first period is 307. We therefore extract the input features for these pairs from the patent data in the first period, label them according to whether technology convergence has occurred in the second period, and train multiple classification models using them as a training dataset. In the second period, technology convergence occurred for 99 of these pairs and did not occur for 208. The test dataset for measuring the performance of the trained classification models is presented in Table 5. The number of pairs where technology convergence has not occurred in the second period is 213, and among them, the number of pairs where technology convergence has actually occurred in the third period is 45. It is worth noting that the number of pairs with convergence increases with each period, from 71 in period 1 to 165 in period 2. This increase is natural given the growing interest in wearables. Many businesses recognize the potential of wearable technology and make efforts to develop innovative products to preempt the relevant market. These efforts result in numerous patents, which leads to the birth of new converging technologies that are distinct from existing ones and differentiate a firm from its competitors. Therefore, it is critical to anticipate future opportunities for new technology convergence in order to gain a competitive advantage. This is the impetus for conducting this study.
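The construction of the training observations described above can be sketched as follows. The class identifiers are placeholders, since the actual 21 and 18 IPC subclasses are not reproduced here, and the sets of converged pairs are toy stand-ins chosen only so that the counts mirror those reported for Table 4; `build_dataset` is a hypothetical helper.

```python
from itertools import product

# Placeholder identifiers for the two groups of IPC subclasses.
group_a = [f"A{i:02d}" for i in range(21)]  # signal transmission and telecommunications
group_b = [f"B{i:02d}" for i in range(18)]  # medical equipment

all_pairs = list(product(group_a, group_b))

def build_dataset(pairs, converged_before, converged_next, features):
    """Keep pairs without prior convergence; label them by convergence in the next period."""
    rows = []
    for pair in pairs:
        if pair in converged_before:
            continue  # a pair that has already converged cannot show new convergence
        rows.append((pair, features.get(pair), int(pair in converged_next)))
    return rows

# Toy stand-ins: which pairs converged is invented, only the counts follow the text.
converged_p1 = set(all_pairs[:71])
converged_p2 = set(all_pairs[71:71 + 99])

train = build_dataset(all_pairs, converged_p1, converged_p2, features={})
n_positive = sum(label for _, _, label in train)
n_negative = len(train) - n_positive
```

With these stand-ins, the sketch yields 378 candidate pairs, 307 training observations, and 99 positive and 208 negative labels, matching the figures stated for the first and second periods.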

1) POSSIBILITY OF NEW LINKS BETWEEN TECHNOLOGY CLASSES
We build a patent co-classification network from 660 patents in the first period. A total of 109 technology classes, including the ones related to wearables, constitute the network. Link prediction has a limitation in that it cannot produce prediction results unless all nodes in the network are connected to each other without any isolated subnetworks. It is therefore necessary to check whether there are such isolated subnetworks in the co-classification network. In our case data, there are no isolated nodes. Therefore, for all theoretically possible links between the technology classes in the patent co-classification network, we can quantify the likelihood that links between them will be created in the next period.
Proximity values indicating the possibility that technology class pairs that are unconnected in the first period will be linked in the second period are computed by applying various link prediction measures. Table 6 shows the descriptive statistics of the measured proximity values. These values serve as positive indicators of the degree of technology convergence in the next period. As can be seen in Table 6, the values vary greatly depending on the measure, so it is necessary to normalize them before using them as input features for supervised learning.
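As an illustration of this step, the sketch below checks that a small network is connected and computes four neighborhood-based proximity measures for an unconnected pair. The choice of measures (common neighbors, Jaccard coefficient, Adamic-Adar index, preferential attachment), the min-max rescaling, and the toy edges are assumptions made for the example; they do not reproduce the measures of Table 6 or the case data.

```python
import math
from collections import deque

def is_connected(adj):
    """Breadth-first search: True if every node is reachable from an arbitrary start node."""
    start = next(iter(adj))
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(adj)

def proximity(adj, u, v):
    """Neighborhood-based proximity scores for the pair (u, v)."""
    common = adj[u] & adj[v]
    union = adj[u] | adj[v]
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # A common neighbor has degree >= 2, so the logarithm is never zero.
        "adamic_adar": sum(1 / math.log(len(adj[w])) for w in common),
        "preferential_attachment": len(adj[u]) * len(adj[v]),
    }

def min_max(values):
    """Rescale values to [0, 1]; one possible normalization, assumed for illustration."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in values]

# Toy co-classification network (illustrative edges only).
edges = [("H04L", "G06F"), ("H04L", "A61B"), ("G06F", "H04W"),
         ("A61B", "H04W"), ("A61B", "A61N")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

connected = is_connected(adj)
scores = proximity(adj, "H04L", "H04W")
```

The measures live on very different scales (a ratio bounded by 1 next to a product of degrees), which is the reason the values are normalized before they are used as input features.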

2) RELATEDNESS OF CAUSE AND EFFECT BETWEEN TECHNOLOGY CLASSES
We apply the apriori algorithm, one of the most representative algorithms for association rule mining, to the co-classification network of technology classes to generate direct relationships between technology classes. The threshold values of several measures must be set in advance. We determine the minimum threshold values so that a sufficient number of rules are obtained while technology classes with extremely low frequency of occurrence in the collected patent dataset are excluded from the final rule list. Applying the apriori algorithm yields 1,252 rules between technology classes. Among them, the top 100 rules sorted by confidence are shown in Fig. 4. Next, we capture both the direct and indirect influence of each class on other classes by applying DEMATEL. The generated rules are treated as the average matrix because the confidence of a rule can represent the degree of direct influence between the antecedent and consequent technology classes. After normalizing the average matrix by dividing all of its elements by the maximum of the row and column sums, we compute the total relation matrix. In the total relation matrix, the row sums and column sums indicate the cause and effect relationships of technology classes, respectively. It is natural to be concerned that the influential relationships between technology classes may be exaggerated because DEMATEL reveals not only direct relationships but also indirect ones. The upper left and lower right triangles of Fig. 5 show the cause and effect relationships, respectively. The heatmap suggests that this concern is unfounded: in most cases only very slight relationships were created, and only a few pairs exhibit relatively strong influential spillovers. Therefore, examining cause and effect between technology classes provides a rational view of the influential relationships between them.
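The computation can be sketched under simplifying assumptions. Instead of running the apriori algorithm with thresholds, the example directly computes the confidence of every single-antecedent rule i -> j as the share of patents containing class i that also contain class j; the patents are toy data and the function names are hypothetical. The total relation matrix then follows the standard DEMATEL formula T = N(I - N)^(-1), where N is the normalized direct-influence matrix.

```python
import numpy as np

def confidence_matrix(patents, classes):
    """conf[i, j] = share of patents containing class i that also contain class j."""
    index = {c: k for k, c in enumerate(classes)}
    counts = np.zeros((len(classes), len(classes)))
    for patent_classes in patents:
        present = [index[c] for c in patent_classes if c in index]
        for i in present:
            for j in present:
                counts[i, j] += 1  # the diagonal accumulates the support count of class i
    support = np.diag(counts).copy()[:, None]
    conf = np.divide(counts, support, out=np.zeros_like(counts), where=support > 0)
    np.fill_diagonal(conf, 0.0)  # a class does not influence itself
    return conf

def dematel_total_relation(direct):
    """Normalize by the largest row or column sum, then compute T = N (I - N)^-1."""
    scale = max(direct.sum(axis=1).max(), direct.sum(axis=0).max())
    normalized = direct / scale
    identity = np.eye(direct.shape[0])
    return normalized @ np.linalg.inv(identity - normalized)

# Toy data: each patent is the set of technology classes assigned to it.
patents = [{"H04L", "A61B"}, {"H04L", "A61B", "G06F"}, {"H04L"}, {"G06F", "A61B"}]
classes = ["H04L", "A61B", "G06F"]

conf = confidence_matrix(patents, classes)
total = dematel_total_relation(conf)
cause = total.sum(axis=1)   # row sums: influence exerted on other classes
effect = total.sum(axis=0)  # column sums: influence received from other classes
```

Because T equals N + N^2 + N^3 + ..., every entry of the total relation matrix is at least as large as the corresponding normalized direct influence, which is the source of the concern about exaggeration discussed above.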

3) TECHNOLOGICAL RELEVANCE BETWEEN TECHNOLOGY CLASSES
We generate technology topics by applying LDA to the patents' textual data, including titles, abstracts, and claims. After configuring and cleaning a corpus, a document-term matrix is created. Before applying LDA to the matrix, we have to decide how many topics to create. To do that, perplexity is examined. Fig. 6(a) depicts the perplexity values measured while increasing the number of topics to 200, and Fig. 6(b) shows the change in perplexity each time the number of topics increases by one. The perplexity value naturally tends to decrease as the number of topics increases. A lower perplexity value usually indicates better performance, but creating too many topics can cause problems such as topic redundancy. Thus, in this process of extracting input features for the training dataset, we decide to create 70 topics, the point at which the curves in Fig. 6(a) and Fig. 6(b) start to flatten. After applying LDA to the patent dataset, we obtain a document-topic matrix which denotes the weighted associations between 660 patents and 70 topics. Specifically, the document-topic matrix gives topic-based representations of the patent documents, where each row contains the degrees of association between one document and the topics. After converting these into the associations between technology classes and topics, we measure technological relevance between technology classes by computing the similarity between them. We utilize two vector similarity measures. If the results of these two measures are completely different from each other, it may be difficult to use them together as input features. We calculate Pearson's correlation coefficient between the results of the two measures and make a scatter plot, as shown in Fig. 7. It demonstrates that these two measures do not conflict with each other and consequently can examine technology convergence in a consistent way. Therefore, it is reasonable to use these two similarities representing technological relevance for pairs of technology classes as input features to investigate technology convergence.
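The conversion from a document-topic matrix to class-level relevance can be sketched as follows. Several choices here are assumptions made for illustration: a class vector is taken as the mean topic distribution of the patents assigned to the class, and cosine similarity and a Euclidean-distance-based similarity stand in for the two vector similarity measures. The document-topic values are toy numbers rather than LDA output.

```python
import numpy as np
from itertools import combinations

def class_topic_matrix(doc_topic, doc_classes, classes):
    """Average the topic weights of the patents assigned to each class."""
    rows = []
    for c in classes:
        members = [d for d, assigned in enumerate(doc_classes) if c in assigned]
        rows.append(doc_topic[members].mean(axis=0))
    return np.vstack(rows)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_similarity(a, b):
    """Map Euclidean distance into (0, 1]; identical vectors give 1."""
    return float(1.0 / (1.0 + np.linalg.norm(a - b)))

# Toy document-topic matrix: 4 patents x 3 topics, each row sums to 1.
doc_topic = np.array([[0.8, 0.1, 0.1],
                      [0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.1, 0.1, 0.8]])
doc_classes = [{"H04L", "A61B"}, {"H04L"}, {"A61B", "G06F"}, {"G06F"}]
classes = ["H04L", "A61B", "G06F"]

class_topic = class_topic_matrix(doc_topic, doc_classes, classes)
pairs = list(combinations(range(len(classes)), 2))
cos = [cosine_similarity(class_topic[i], class_topic[j]) for i, j in pairs]
euc = [euclidean_similarity(class_topic[i], class_topic[j]) for i, j in pairs]

# Pearson's correlation between the two sets of pairwise similarities.
pearson = float(np.corrcoef(cos, euc)[0, 1])
```

A correlation close to 1 between the two sets of similarities indicates that the measures rank the pairs of technology classes consistently, which is the property checked with Fig. 7.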

B. CLASSIFICATION MODELS FOR TECHNOLOGY CONVERGENCE
We train classification models to predict new technology convergence using the features defined in the previous step. The training data is first normalized using a standard scaler since the ranges of the input features differ considerably. Some techniques, such as decision trees, do not require normalization of the training data, but we apply it to ensure consistency across all models. We train multiple classification models using various techniques and ensemble them to develop a voting classifier. Training the models requires setting various hyperparameters. Grid search with cross-validation is used so that models with optimal parameters can be trained. Table 7 shows the hyperparameter range of each technique utilized in this study.
We obtain eight classification models with different characteristics to predict new technology convergence by applying the selected optimal combination of hyperparameters to each technique. These models are trained to predict whether technology convergence occurs in the second period using the input features extracted from the patent data in the first period. Finally, we develop a voting classifier that employs the multiple models to make predictions. The number of pairs of technology classes to be trained on is limited. Therefore, even on a typical x86 processor, the computation time of model training is only a few tens of seconds. Since we use grid search with cross-validation to identify the optimal parameters for each model, this process takes up to 5 minutes. Even so, as this is not a long period of time, the proposed approach can be said to be time-efficient.
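A minimal sketch of this training pipeline with scikit-learn is given below. It is not the configuration used in this study: the data are synthetic, only three of the eight techniques are included to keep the example short, and the hyperparameter grids are hypothetical and far smaller than the ranges of Table 7.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 307 training pairs (not the patent-based features).
X, y = make_classification(n_samples=307, n_features=8, weights=[0.68], random_state=0)

# Standardize every feature to zero mean and unit variance.
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Hypothetical, deliberately small hyperparameter grids.
candidates = {
    "lr": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "dt": (DecisionTreeClassifier(random_state=0), {"max_depth": [3, 5]}),
    "rf": (RandomForestClassifier(random_state=0),
           {"n_estimators": [50], "max_depth": [3, 5]}),
}

# Grid search with 5-fold cross-validation selects one configuration per technique.
tuned = []
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=5, scoring="roc_auc")
    search.fit(X_scaled, y)
    tuned.append((name, search.best_estimator_))

# Soft voting averages the class probabilities of the tuned models.
voter = VotingClassifier(estimators=tuned, voting="soft").fit(X_scaled, y)
convergence_prob = voter.predict_proba(X_scaled)[:, 1]
```

In practice the scaler fitted on the training period would also be applied to the features of the test period, so that both are expressed on the same scale before prediction.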

C. POTENTIAL TECHNOLOGY CONVERGENCE
The performance of the voting classifier is measured using the input features from the patent data in the second period and whether convergence has actually occurred in the third period. Table 8 shows the performance measurement results for each classification model and the voting classifier. The AUC based on the ROC curve is illustrated in Fig. 8. All of them show generally similar performance. If a single model can outperform a group of models, we simply need to use it without voting. For example, in a regression problem, if there is a strong relationship between the predictive features and the target variable, a single linear regression model can perform well, and a voting estimator built with other models would dilute the linear regression model's accurate predictions. Table 8 shows that there is no such outstanding model in our case study. Thus, we develop a voting classifier. The voting classifier does perform slightly worse than some individual models on some performance metrics. However, since the performance difference is so small, it is not reasonable to choose only SVM or DT based on Table 8. It is preferable to retain voting's advantages of improved generalization and robustness. Moreover, it is not essential to predict the potential technology convergence in the next period for all pairs of technology classes. It is more important to help firms seize a small number of major technology convergence opportunities by identifying them in advance. Therefore, it is more reasonable to assess qualitatively whether the future convergence opportunities predicted with high probability by the classifier are feasible. We will discuss the feasibility of the proposed approach in the next section.
We trained eight individual classification models and finally developed a voting classifier that ensembles them. They must have similar characteristics in order to be fused into a single model and represent a single decision. Exploring the correlation between the prediction probabilities that the models produce is one good way to examine the similarity in their decisions. As shown in Fig. 9, there are fairly strong positive correlations between all of them. According to Table 9, they are all statistically significant at the 0.01 level. So, by grouping the decisions they generate, we can arrive at a single aggregated final decision. However, if the decisions of individual models are meaningless because they are generated at random, the decisions derived from their aggregation will also be meaningless. Thus, we perform runs tests on the probabilities produced by the eight models to determine whether the probabilities are randomly distributed, as shown in Table 10. The null hypothesis is that the sequence of elements is random, so rejecting the null hypothesis means that the sequence is not random. As the p-values show, the null hypothesis is rejected at the 0.01 level in all cases. As a result, we can conclude that the decisions of the models were not made at random.
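For reference, a runs test of this kind can be sketched as follows. The example uses the Wald-Wolfowitz form with the median as the cut point and a normal approximation for the statistic Z; the cut point, the ordering of the observations, and the input values are assumptions made for illustration rather than the exact procedure behind Table 10.

```python
import math
from statistics import median

def runs_test(values):
    """Wald-Wolfowitz runs test around the median; returns (Z, two-sided p-value)."""
    cut = median(values)
    signs = [v > cut for v in values if v != cut]  # values equal to the median are dropped
    n1 = sum(signs)                # observations above the median
    n2 = len(signs) - n1           # observations below the median
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under normality
    return z, p_value

# Made-up sequence that rises steadily, so it contains only two runs.
trend = [0.05 + 0.1 * i for i in range(10)]
z_trend, p_trend = runs_test(trend)
```

A sequence with far fewer (or far more) runs than expected under randomness yields a large absolute Z and a small p-value, which leads to rejecting the null hypothesis of randomness.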
In fact, a supervised link prediction approach to anticipating technology convergence, similar to ours, has already been presented [4]. We compare the performance of the proposed approach with that of this similar work to establish our model's reasonableness. The similar work generated input features using a total of ten link prediction measures and predicted technology convergence by training seven classification techniques. Table 11 shows the performance measurement results of the seven classification models obtained by applying them to our case data. They have similar AUC scores to the proposed voting classifier, but their accuracy and F1 scores are much lower. Therefore, we can conclude that the proposed approach reasonably predicts technology convergence.

FIGURE 9. Scatter plots between the probabilities produced by individual classification models.

TABLE 10. Results of runs tests, where the null hypothesis is that the sequence of elements is random. The statistic Z follows a normal distribution and *** indicates the significance level at 0.01.

V. DISCUSSION
We predict potential future technology convergence with the patent data in the third period using the voting classifier. For 42 pairs of technology classes, the soft voting probability is 0.5 or higher. Among them, some pairs with high probability are summarized in Table 12. The pairs of H01Q and A61L, and H02J and B01L show the highest probability. H01Q and H02J relate to data reception and systems for distributing electric power, respectively. A61L and B01L indicate chemical or physical objects. Wearable electronic skin is an example of new technology convergence that can be derived from these pairs. An ultra-thin and lightweight e-skin is a wearable sensor that can capture signals such as electrical impulses from muscle movement. A small wireless transmitter sends the biometric data to a cloud, allowing users to monitor it remotely. The amount of biometric data is generally very large, so it is important to exchange data effectively and provide adequate electric power to allow users to access their biometric information in real time. The artificial skin sensor, a next generation of wearable and stretchable electronics, is another good example from these pairs. Placed on the skin, it helps to quickly understand multiple body parameters by capturing all types of signals generated in the body and analyzing them in real time. H04M and A61C relate to telephonic communication and methods for dental hygiene, respectively. Sensors that attach to teeth can be thought of as a technological implication between these two technology classes. These sensors can reveal which nutrients a person is deficient in by extracting nutrient information from the food he or she eats. Therefore, they can contribute to the development of a personalized nutritional recommendation system. In addition, sensing the physical reaction and inflammation of teeth caused by chemicals and bacteria contained in food can enable the predictive treatment of potential dental diseases.
H04L and A62B relate to transmission of digital information and methods for life-saving, respectively. Wearable lifesaving devices can be seen as convergent products that can be derived from these two technology classes. Such devices, including wearable robots, will be useful for rescuing people in fires or building collapses. Other types of wearable devices can also be used to improve safety at construction sites and in chemical industries where physical and chemical accidents frequently occur. Jackets, vests, glasses, and helmets with a wide range of sensors can track the location of workers in real time, so that workers who enter a designated area can be detected easily, preventing accidents that occur because they are in unexpected places. Measuring environmental conditions, including air quality and airborne pollutants, will also protect workers by notifying them of any hazardous gas leak. H03G and A61H are for control of amplification and physical therapy devices, respectively. We can think of smart hearing aids based on artificial intelligence as a convergent product between these two classes. They will help people with hearing problems hear and communicate with greater clarity by amplifying sound intelligently. The amplification technology can also be applied to image information. Smart vision aids that help people with visual impairments by amplifying image data can also be considered convergent products in this pair of technology classes. H04J and A61F relate to multiplex communication and filters implantable into blood vessels, respectively. A medical device that is inserted into a living body can be considered here. For example, a smart biosensor that is swallowed rather than attached to the skin stays in the human body for a certain period of time and monitors changes in biometric information. It decomposes naturally so that it does not have any other effect on the human body. It can therefore lead to the development of smart pills that release medications only where needed in the human body. Some of the technological implications derived using the supervised learning-based approach proposed in this study have already been attempted from various perspectives. Thus, we believe that our approach is reasonable and feasible.

VI. CONCLUSION
Technology convergence can trigger technological innovation and change. Therefore, an approach is required to predict convergence between technology fields that did not exist in the past. It will allow a firm to secure a completely new competitive advantage that is different from that of its competitors. The timely anticipation of the technology fields to be converged enables innovating firms to be aware of the changing business developments associated with the technology convergence. A variety of researchers have presented supervised learning-based approaches that use patents to predict potential technology fields where technology convergence is taking place. They have developed machine learning models which capture the associations between the past and future connections between technology classes. Although their contributions are significant, they have a limitation in that they do not consider the technological properties in depth. Because these properties specifically imply technology convergence, they should be clearly reflected in the process of supervised learning. Motivated to remedy this problem, this study proposed a supervised learning-based approach to anticipating potential technology convergence by using the link prediction results, the technological influence relationships, and the technological relevance between technology classes. Using these as input features, several classification models that predict new technology convergence were trained through various machine learning and deep learning techniques. Finally, we developed a voting classifier to ensemble all the models and compared its performance with that of the previous classification models. The voting classifier achieved AUC scores similar to those of the previous models but considerably higher accuracy and F1 scores. We believe that the most fundamental reason that the proposed approach works well in anticipating technology convergence is that it incorporates technological relevance in the anticipation process. Technology classes with technological relevance tend to have technological elements that converge easily with each other. Technological relevance can thus describe the conditions under which more active convergence across different technology classes is possible. This study is expected to contribute to identifying new technology opportunities that can be realized through technology convergence. Furthermore, this study will help firms reflect the identified opportunities in their technology roadmaps and make business decisions to penetrate the relevant market in a timely manner.
Despite the contribution, further research should be carried out. We investigated the technology convergence only in pairs between two technology classes. Technology convergence can occur across multiple classes, so we have to explore how to anticipate clusters, not pairs of technology classes. In addition, we performed supervised learning using only data related to patents. Efforts should be made to use other kinds of data that can account for technological advancements, such as journal papers. Finally, the verification of the prediction results of the proposed approach was only performed from a technological point of view. It has to be done with other perspectives. For example, examining the impact of financial factors will help to deepen the verification.

ACKNOWLEDGMENT
The author Wonchul Seo would like to express his deepest gratitude to the experts who have conducted an in-depth discussion of the technological implications derived from the proposed approach. He is currently an Associate Professor with the Department of Industrial and Data Engineering, Pukyong National University, and the Director of TEAMLAB. After his Ph.D., he worked with the CTO Office at Samsung Advanced Technology, as a Technology Strategy Manager, and the Department of Industrial and Management Engineering, Gachon University, as an Associate Professor. He has researched patent analysis, technology roadmapping, and planning using text mining approaches. Recently, he has been working on applying machine learning and deep learning approaches to scholarly big data and has been conducting various studies related to NLP and artificial intelligence.
MOKHAMMAD AFIFUDDIN received the B.S. degree in industrial engineering from the Sepuluh Nopember Institute of Technology, Indonesia, in 2014.
Since 2015, he has been working as a Data Analyzer at the Training and Education Center, The Ministry of Industry, Indonesia. He has been an Assistant Professor at Textile Engineering of Vocational Education in Indonesia. He is currently a Graduate Student with the Department of Industrial and Data Engineering, Pukyong National University, South Korea. His research interests include technology convergence anticipation, technology cluster identification, and technology opportunity discovery.
WONCHUL SEO received the B.S. and Ph.D. degrees in industrial and management engineering from the Pohang University of Science and Technology, South Korea, in 2003 and 2010, respectively.
He was a Senior Engineer with Samsung Electronics, from 2010 to 2012, and was an Associate Research Fellow with the Korea Institute of Intellectual Property, a Korean Public Research Institute, from 2012 to 2013. Since 2013, he has been an Associate Professor with the Department of Industrial and Data Engineering, Pukyong National University, South Korea. His research interests include patent analysis-based technology intelligence, technology and product opportunity discovery, technology trend identification by analyzing knowledge flow networks, and quantification of technological spillover effects.