Multilayer Network Representation Learning Method Based on Random Walk of Multiple Information

Network representation learning aims to map the nodes of a network into low-dimensional dense vectors, which can be widely used to solve network analysis tasks. Existing methods mainly focus on single-layer homogeneous networks. However, many real-world networks consist of multiple types of nodes and edges; these are called multilayer networks. How to capture node information and exploit multi-type relational information is a major challenge of multilayer network representation learning. To address this problem, we propose a random-walk method based on multiple sources of information, called IFMNE, which efficiently preserves node information and multi-type relational information and learns them into a unified space. The method combines node structure information with network topology information to obtain node random-walk sequences, and trains these sequences on a neural network model. Experiments are performed on five real multilayer networks, and the embedding vectors are evaluated on the link prediction task. Compared with the baseline methods, the accuracy is significantly improved while keeping a low time complexity.


I. INTRODUCTION
Network data can naturally express the relations between objects, which are ubiquitous in our daily life and work, such as infrastructure networks, social networks and so on. In recent years, many researchers have attempted to perform different machine learning tasks, such as node classification [1], community detection [2] and link prediction [3], by obtaining node features from network data. Initially, most algorithms required manually designed features to solve downstream learning tasks. The fundamental problem with these manually designed features is that they rely on specific tasks and are difficult to generalize to other learning tasks. In order to solve this problem, network representation learning algorithms [4] were proposed, and experimental results show that they can greatly improve the accuracy of various learning tasks. The basic idea is to learn low-dimensional vector representations of nodes from the close neighbors of each node.
At present, the majority of network representation learning research focuses on single-type networks. However, in practical applications, many networks do not exist in isolation, but interact with each other. If a single type of edge is used to represent the different types of relationships in complex networks, there are serious limitations and it leads to incorrect descriptions of real networks. Therefore, researchers proposed the concept of the multilayer network [5], in which each layer describes the topology of a set of nodes corresponding to a specific relationship. In a social network such as Facebook, users often have different kinds of interactions, such as friendship relations, forwarding articles to each other, conducting conversations, and transferring money. Each type of interaction creates a layer of the network among all users, which makes the network structure more accurate and detailed without sacrificing its unique features. By interconnecting these layers, we obtain a multilayer network.
Considering the complexity of multilayer networks, there is little research on multilayer network representation learning, and most of it is adapted from single-layer representation learning models. Existing multilayer network representation learning methods can be divided into two categories according to whether or not they adopt a random walk strategy. The first category applies neural network techniques directly to the entire network. Ma et al. [6] proposed a multi-dimensional graph convolutional network (MGCN) method, which mainly extracts rich information from intra-dimensional and inter-dimensional interactions. It is worth noting that a multi-dimensional network is equivalent to a multilayer network. The MGCN model may be applicable, but its efficiency depends on the number of convolution layers, so it is difficult to make an optimal choice. The second category is based on the random walk strategy. Methods in this category use intra-layer random walks to search each layer of the network and learn individual embeddings for the nodes of different layers, such as OhmNet [7] and MNE [8]. However, such methods are time-consuming and do not learn a unified embedding for each entity of the multilayer network. Therefore, Liu et al. [9] proposed a representation learning method for heterogeneous networks (PMNE) based on interaction paths, which learns low-dimensional representations of interaction-path information by taking advantage of the interaction information between multiple sampling objects. From another perspective, the multiple objects sampled from a heterogeneous network can be viewed as multiple sub-networks, and the interaction-path sampling strategy captures the interaction information between different sub-networks.
Although the PMNE method performs well, it only uses network topology information during node sequence sampling, without considering the node structure information of the network itself, so the node vectors cannot accurately reflect the original structural information of the network. In order to enable the embedding model to learn high-quality node vectors, we add node structure information to guide the sampling of node sequences. With such a guided sampling method, the generated node sequences better retain the original structural information of the network. Node structure information refers to the local characteristics of the network to which a node belongs. For example, in Facebook, most users follow the same users (common neighbors), indicating that such users have similar interests and hobbies; such users therefore like to share information. Some users follow many users and are followed by many users: they can receive a lot of information but are unlikely to forward it, which means their information transmission ability is very weak. Other users follow fewer users and are followed by fewer users, but are keen to forward their own information, which means their information transmission ability is strong. Therefore, we use the number of common neighbors to describe the similarity among users, and the clustering coefficient to describe the transmission of information among users.
To solve the above problems, this paper proposes a multilayer network representation learning method, IFMNE (based on information fusion in multilayer network embedding), that integrates node structure information and network topology information. The IFMNE method uses a random walk strategy similar to the node-sequence collection strategy of PMNE [9]. However, we propose a new walking strategy based on the number of common neighbors and the clustering coefficient of the nodes. Specifically, the transition probability of the random walkers we design in this paper is similar to that of PMNE in multilayer networks. By introducing node structure information and combining it with the network topology information of the PMNE method, our model can learn more appropriate representations from multilayer networks.
The major contributions of our paper are summarized as follows:
• The IFMNE model we propose is a fast and extensible embedding model, which can retain the cross-layer neighborhoods of nodes and learn various types of relational information into a unified space.
• We demonstrate that, by fusing node structure information and network topology information, the guided random-walk node sequences applied to the Skip-Gram model yield embedding vectors that more accurately reflect the original structure of the network and improve the accuracy of downstream machine learning tasks.
• We compare the proposed algorithm IFMNE with state-of-the-art baselines on the link prediction task, demonstrating its effectiveness on five real network datasets.

The rest of this paper is structured as follows. In Section II, we briefly review related work on network representation learning. We formally present relevant definitions and problem formulations in Section III. Section IV introduces the proposed algorithm IFMNE in detail. In Section V, we present the experimental results. Finally, we conclude our work in Section VI.

II. RELATED WORK
In this Section, we first review some related works on single-layer network representation learning and then discuss methods for multilayer network representation learning.

A. SINGLE-LAYER NETWORK REPRESENTATION LEARNING
Traditionally in feature learning, matrix decomposition [10], [11] is an early method for network representation learning. When the relational matrix is stored in memory, this method has high time and space complexity; therefore, it is not competent for the analysis of large-scale complex networks. Network representation learning based on neural networks [12] can overcome the limitations of matrix decomposition methods. An example is DeepWalk [13], which is inspired by the neural language model Word2Vec [14], in which central words can be represented by their contextual words. DeepWalk performs random walks on the network to generate sequences of nodes and then runs the Skip-Gram algorithm of the Word2Vec model to maximize the likelihood of the random walk sequences. Node2Vec [15] is an extension of DeepWalk which introduces two parameters p and q to control the random walk in order to explore the local and global characteristics of the network structure. Another popular technique for network representation learning is graph neural networks (GNNs). GCN [16] uses convolution operations to incorporate neighbors' feature representations into the node feature representation. GraphSAGE [17] is an inductive method that combines structural information with node features; its purpose is to learn functional representations instead of direct embeddings for each node.
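The corpus-generation step shared by DeepWalk-style methods can be sketched as follows. This is a minimal illustration in Python: the toy graph, walk count, and walk length are illustrative assumptions, and in practice the resulting walks would be fed to a Skip-Gram trainer such as Word2Vec.

```python
import random

def deepwalk_corpus(adj, num_walks, walk_length, seed=0):
    """Generate truncated uniform random walks (DeepWalk-style).

    adj: dict mapping node -> list of neighbor nodes.
    Each walk is a list of nodes, analogous to a sentence of words.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop the walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Toy undirected graph as an adjacency dict (illustrative).
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
corpus = deepwalk_corpus(adj, num_walks=2, walk_length=5)
```

Each walk then plays the role of a sentence, and each node the role of a word, when maximizing the Skip-Gram likelihood.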

B. MULTILAYER NETWORK REPRESENTATION LEARNING
Although the methods discussed above have achieved good results in some network analysis tasks, due to the complexity of real-world networks, the study of single-type networks has its limitations. Therefore, in order to effectively handle multiple types of relationships between entities, there has been growing research interest in multilayer network representation learning recently. OhmNet [7] is an algorithm for hierarchy-aware unsupervised feature learning in multilayer networks, which uses node2vec with a regularization penalty to combine node representations across different layers. The PMNE [9] method proposed by Liu et al. projects a multilayer network into a unified vector space and includes three embedding methods, namely ''network aggregation'', ''result aggregation'' and ''layer co-analysis''. ''Network aggregation'' and ''result aggregation'' apply a standard network representation learning method to the aggregated network or to each layer, and find a vector space for the multilayer network, but they lose a lot of network topology information. In order to consider the influence of interlayer interaction, ''layer co-analysis'' extends single-layer embedding methods to multilayer networks. This method not only traverses each layer with first-order and second-order random walk strategies, but also traverses across layers by introducing a link transfer probability based on information distance. Zhang et al. proposed the MNE [8] method, an extensible multilayer network embedding model. For each node, it learns a high-dimensional common vector and a lower-dimensional additional vector for each relationship type, and the vectors are eventually combined. Therefore, this method can effectively preserve and learn multiple types of relationship information into a unified space.
The Mvn2Vec [18] method mainly preserves the semantic information of edges in different networks. Therefore, this method also gets good embedding results. Cen et al. proposed the GATNE [19] method, which is mainly characterized by capturing rich structure information and utilizing multiplexing topology structures from different node types, namely common structure multiplexing heterogeneous network embedding. Ma et al. [6] proposed a multi-dimensional graph convolutional network (MGCN) method, which mainly extracted rich information from intra-dimensional and interdimensional interactions.
All the multilayer network representation learning models above do not consider the structures of the nodes themselves in the network. In this paper, we propose a randomwalk multilayer network representation learning method that fuses node structure information and network topology information. The method can obtain the node sequence in multilayer network and use this node sequence to learn its low-dimensional features.

III. PROBLEM DEFINITION AND FORMULATION

A. DEFINITIONS AND NOTATIONS
Here, we first give the definition of a multilayer network.
Definition 1 (Multilayer Network): A single-layer network can be represented as G = (V, E), where V denotes the set of nodes and E denotes the set of edges. A multilayer network with L relation types can be modeled as MG = (V, L, E), where V is the set of N distinct entities and E = ∪_{i∈L} E^[i], where E^[i] denotes all edges of the i-th relation type. We separate the network by relation type, so that a multilayer network is a collection of layers MG = {G^[1], G^[2], · · · , G^[L]}, one per relation type.

Definition 2 (Single-Layer Network Random Walk): Considering a single-layer network G, the random walk is inspired by the idea of learning heuristics from local sub-graphs and has become the simplest dynamical process for learning network features. A single-layer random walk is a node sequence walk = (v_0, v_1, · · · , v_t), where each node v_i ∈ V and t is the length of the walk.

Definition 3 (Multilayer Network Random Walk):
A multilayer network walk is a general extension of the single-layer random walk which is capable of traversing layers. Given a multilayer network MG = {G^[1], G^[2], · · · , G^[L]}, a multilayer walk is a node sequence such as walk = (v_i^[α], v_j^[β], v_k^[γ], · · · ), where v_i^[β] denotes the node v_i at layer β, and α, β, γ, δ can be the same layer or different layers. It is worth noting that v_i^[β] and v_i^[γ] correspond to the same entity v_i, because these nodes in different layers are replicas of the same entity.

Definition 4 (Transition Probability & Cross-Layer Probability):
The transition probability is used to guide a random walker's intra-layer movement. For multilayer networks, we adopt the transition probability combining network topology information and node structure information. Apart from the transition probability, we also define a cross-layer transition probability r, which denotes the switching probability from layer α to β. According to the transition probability and cross-layer probability, a walker can move from one node to another node across multiple layers.
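The definitions above can be sketched as a small container: one shared entity set V, one adjacency structure per relation type (layer), with each entity implicitly replicated in every layer. The class name, toy data, and method names below are illustrative assumptions, not the paper's own notation.

```python
class MultilayerNetwork:
    """Minimal container matching Definition 1: shared nodes V and one
    undirected adjacency structure per relation type (layer)."""

    def __init__(self, nodes, layers):
        # nodes: iterable of entity ids; layers: dict layer_name -> edge list
        self.nodes = set(nodes)
        self.adj = {}  # layer -> node -> set of neighbors
        for name, edges in layers.items():
            layer_adj = {v: set() for v in self.nodes}
            for u, v in edges:
                layer_adj[u].add(v)
                layer_adj[v].add(u)  # undirected edges
            self.adj[name] = layer_adj

    def neighbors(self, v, layer):
        return self.adj[layer][v]

    def layers_of(self, v):
        # Layers in which the replica of entity v has at least one edge;
        # the count of these layers corresponds to the quantity v_{i|L|}
        # used later in the paper.
        return [l for l, a in self.adj.items() if a[v]]

# Toy two-layer network over four shared entities (illustrative).
mg = MultilayerNetwork(
    nodes=[1, 2, 3, 4],
    layers={"friendship": [(1, 2), (2, 3)], "advice": [(1, 3), (3, 4)]},
)
```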

B. MULTILAYER NETWORK REPRESENTATION LEARNING STRATEGY
We adopt a random walk strategy with guidance information in the multilayer network. The advantage of this strategy is that it can obtain rich interaction information between different entities during the sampling process. Figure 1 shows the architecture that projects a multilayer network onto a vector space. The random walk on the multilayer network is shown in Figure 1(a). Taking the walk sequence (v_1, v_2, v_4, v_5) described by the red line as an example, node v_4^[β] in layer β and node v_5^[γ] in layer γ are selected by the random walk. The corpus in Figure 1(b) is formed from the random walk sequences in Figure 1(a). Next, the corpus is fed into the model in Figure 1(c) for training. Finally, the embedding vectors in Figure 1(d) are obtained. The random walk strategy combines the network topology information and the node structure information of the nodes, which enables better network representation learning.

C. NODE STRUCTURE INDICATOR THAT GUIDES THE RANDOM WALK
In the network representation learning task, if the Skip-Gram model is to learn rich network information, the node sequences input to the Skip-Gram model must have high similarity and high information transitivity. Therefore, during the sampling process of the network representation learning algorithm, we should give priority to nodes with high similarity and strong transitivity relative to the current node as the nodes to be sampled. The node sequences generated by such a guided sampling method can fully reflect the original network structure information.
In this paper, the number of common neighbors is used to describe the similarity between nodes, which keeps the similarity measure simple and effective. The number of common neighbors is computed by

C_nbr(i, j) = |A_i ∩ B_j|, (1)

where C_nbr(i, j) is the number of common neighbors of nodes v_i and v_j, A_i is the neighbor set of node v_i, and B_j is the neighbor set of node v_j. How can the transitivity of sampled nodes be described? In this paper, the clustering coefficient is used to describe the degree of aggregation among the vertices of a graph. The clustering coefficient C_j of node v_j is computed by

C_j = 2 |{e_pq : v_p, v_q ∈ Γ_j}| / (k_j (k_j − 1)), (2)

where Γ_j is the neighbor set of node v_j, k_j is the degree of node v_j, and e_pq is an edge between nodes v_p and v_q. Specifically, the clustering coefficient measures the degree of interconnection among the neighbors of a node. As proposed in ClusterRank [20], the local clustering coefficient of a node is generally negatively correlated with its transmission capacity, which means that the smaller the local clustering coefficient of a node, the stronger its transmission capacity. We combine the number of common neighbors and the clustering coefficient to give the guidance indicator Cr, calculated in Eq. (3), where C_j is the clustering coefficient in Eq. (2), f is a linear negative correlation function of the clustering coefficient C_j, generally f(C_j) = 10^{−C_j}, and C_nbr(i, j) is the number of common neighbors in Eq. (1).
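The two structural indicators can be computed directly from an adjacency dict, as sketched below following Eq. (1) and Eq. (2). Since the exact combination in Eq. (3) is not reproduced here, the `guidance_indicator` function only illustrates one plausible way of combining f(C_j) = 10^{−C_j} with the common-neighbor count and should be read as an assumption.

```python
def common_neighbors(adj, i, j):
    """Eq. (1): size of the intersection of the two neighbor sets."""
    return len(set(adj[i]) & set(adj[j]))

def clustering_coefficient(adj, j):
    """Eq. (2): 2 * (#edges among neighbors of j) / (k_j * (k_j - 1))."""
    nbrs = list(adj[j])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if nbrs[b] in adj[nbrs[a]]
    )
    return 2.0 * links / (k * (k - 1))

def guidance_indicator(adj, i, j):
    # Illustrative combination of the two signals (the paper's Eq. (3) is
    # not reproduced verbatim): f(C_j) = 10 ** (-C_j) rewards low
    # clustering (strong transmission); C_nbr rewards similarity.
    return common_neighbors(adj, i, j) * 10 ** (-clustering_coefficient(adj, j))

# Toy graph: a triangle {1, 2, 3} plus edges 2-4 and 3-4 (illustrative).
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
```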

IV. THE PROPOSED MODEL
This Section mainly introduces the proposed multilayer network representation learning method IFMNE based on information fusion.

A. MULTILAYER FEATURE LEARNING FRAMEWORK
The embedding model based on Skip-Gram can learn continuous feature representations of nodes by optimizing, through stochastic gradient descent (SGD) [21], the probability of the occurrence of subsequent nodes in the random walk. In this Section, we introduce the proposed multilayer feature learning framework. Our goal is to learn a low-dimensional embedding vector that preserves structural information for multiple types of relationships. First, we formulate the multilayer network feature learning problem as the Skip-Gram objective; the node representations are then learned by maximizing the Skip-Gram objective with negative sampling. The Skip-Gram model is one of the models in Word2Vec [14], whose purpose is to capture the relationships between words. Simply put, it predicts the surrounding context from an input word. As shown in Figure 2, the Skip-Gram model is a simple three-layer neural network: we input a specific word into the model and output the top few words ranked by Softmax probability [22]. The walk sequences of nodes in this paper are similar to sentences in the field of natural language processing, and nodes are similar to words, so we can directly train the Skip-Gram model to obtain the vector representations of nodes. Given a sequence of words W = (w_0, w_1, · · · , w_n), the probability P_r(w_0, w_1, · · · , w_{n−1}|w_n) should be maximized during training. Similarly, given a random walk sequence walk = (v_0, v_1, · · · , v_i), training maximizes the probability P_r(v_0, v_1, · · · , v_{i−1}|v_i).
As shown in Figure 2, assume the sliding window w is 2 and each training sample is W_(t−2), W_(t−1), W_(t+1), W_(t+2). The input layer of the Skip-Gram model is the one-hot encoding vector of W_(t), and the output layer is the probability of W_(t−2), W_(t−1), W_(t+1), W_(t+2) occurring before and after the current position in the node sequence, given W_(t). The objective of this paper is to learn the vector representation ζ of nodes, and the optimization function J(ζ) is shown in Eq. (4):

J(ζ) = Σ_{v_i ∈ V} log P_r({v_{i−n}, · · · , v_{i+n}} \ v_i | ζ(v_i)), (4)

where ζ(v_i) ∈ R^d is the embedding vector of node v_i across all layers of the multiplex network, and n is the context size, i.e., the n neighbors before and after the current node in each training step. By making a conditional independence assumption, Eq. (5) can be calculated as

P_r({v_{i−n}, · · · , v_{i+n}} \ v_i | ζ(v_i)) = Π_{j=i−n, j≠i}^{i+n} P_r(v_j | ζ(v_i)). (5)

Above, the conditional probability P_r(v_j | ζ(v_i)) is computed as

P_r(v_j | ζ(v_i)) = exp(ζ′(v_j) · ζ(v_i)) / Σ_{v ∈ V} exp(ζ′(v) · ζ(v_i)), (6)

where ζ′(v) ∈ R^d is the context vector of node v.
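The softmax of Eq. (6) can be sketched numerically in pure Python. The toy embeddings below are illustrative assumptions; real implementations avoid the full softmax over V by approximating it with negative sampling, as noted above.

```python
import math

def skipgram_prob(center_vec, context_vecs):
    """Softmax of Eq. (6): P(v_j | zeta(v_i)) for every candidate context
    vector zeta'(v_j), given the center node's embedding zeta(v_i)."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    scores = [math.exp(dot(c, center_vec)) for c in context_vecs]
    total = sum(scores)
    return [s / total for s in scores]

# Toy 3-d embeddings: one center node and three candidate context nodes.
center = [0.5, -0.2, 0.1]
contexts = [[0.4, 0.0, 0.3], [-0.1, 0.2, 0.0], [0.9, -0.5, 0.2]]
probs = skipgram_prob(center, contexts)
```

The probabilities sum to one, and the candidate whose context vector best aligns with the center embedding receives the largest probability.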

B. MULTILAYER NETWORK WALK GENERATOR
By fusing the local and global structure information introduced in the Node2Vec algorithm with the node structure information described in Section III(C), we carry out random walks on the multilayer network. Meanwhile, in order to allow the random walk to move between the layers of the multilayer network, the parameter r is introduced to control this behavior. Let v_{i|L|} represent the number of distinct layers connected to node v_i (∀v_i ∈ V) of the multilayer network MG; that is, v_{i|L|} is given by Eq. (7):

v_{i|L|} = Σ_{l ∈ L} I(Σ_j a^l_{ij} ≥ 1), (7)

where I denotes the indicator function and a^l_{ij} = 1 indicates that there is an edge between nodes v_i and v_j in layer l. We then introduce a 2nd-order random walk with the parameters p, q, r and the indicator Cr from Section III(C). If the random walk previously traversed the edge (v_z, v_x, l) and the current node v_x only has edges in layer l (i.e., v_{x|L|} = 1), then we step according to the node2vec strategy, which introduces local and global structure information, i.e., P(v_x(t + 1) | v_y(t), v_z(t − 1)) = α(v_z, v_x, l) / Z,
where P is the transition probability of the random walk in the multilayer network, t is the time slice, Z is the normalization factor, and α(v_z, v_x, l) is the transition probability after the introduction of the node structure indicator Cr, as shown in Eq. (8):

α(v_z, v_x, l) = Cr / p, if d^l_{v_z,v_x} = 0; Cr, if d^l_{v_z,v_x} = 1; Cr / q, if d^l_{v_z,v_x} = 2, (8)

where Cr denotes the node structure indicator, v_z denotes the last visited node, v_y denotes the current node, v_x denotes the next node to visit, and d^l_{v_z,v_x} denotes the shortest distance between nodes v_z and v_x in layer l (v_z and v_x may be the same node). As shown in Figure 3, the parameter p controls the probability of revisiting a node that has already been visited. It only takes effect when d^l_{v_z,v_x} = 0, which means that the next node to visit is the node just visited. Therefore, if p is large, the probability of revisiting that node is lower; otherwise it is higher. d^l_{v_z,v_x} = 1 means that the next node to visit is connected to the previously visited node. The parameter q controls whether the direction of the random walk is outward or inward; it only takes effect when d^l_{v_z,v_x} = 2, which means that the next node to visit is neither the previously visited node nor connected to it. If q is large, the random walk tends to visit nodes close to v_z and the walk path is biased toward breadth-first search. Conversely, if q is small, the walk tends to visit nodes far from v_z and the path is biased toward depth-first search. Otherwise, if v_{x|L|} > 1, the random walk stays on the current layer l with probability r, and moves along an edge of another layer l′ with probability 1 − r. That is, the random walk traversal probability for v_{x|L|} > 1 is given by Eq. (9):

P(v_x^[l′] | v_y, v_z) = r · α(v_z, v_x, l) / Z, if l′ = l; (1 − r) · α(v_z, v_x, l′) / ((v_{x|L|} − 1) · Z), if l′ ≠ l. (9)
Note that in this algorithm, the variable r represents the ratio of interactions between nodes of the same layer, which we regard as the importance of the relationship within each layer. When r → 0, the random walk traverses the different layers of the multilayer network, and when r → 1, the random walk stays in the initial layer.
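The intra-layer bias of Eq. (8) and the stay/switch rule governed by r can be sketched as follows. This is a sketch under stated assumptions: `cr` stands for the guidance indicator Cr, the distance test mirrors node2vec's d ∈ {0, 1, 2}, and the uniform choice among the other layers is an illustrative reading of the 1 − r rule.

```python
import random

def alpha(prev, cand, adj_l, p, q, cr):
    """Eq. (8)-style bias for candidate `cand`, given the previous node
    `prev` and the current layer's adjacency `adj_l` (node -> neighbor set).
    The shortest distance d in {0, 1, 2} is decided as in node2vec."""
    if cand == prev:          # d = 0: returning to the previous node
        return cr / p
    if cand in adj_l[prev]:   # d = 1: candidate also neighbors prev
        return cr
    return cr / q             # d = 2: moving outward

def next_layer(current_layer, active_layers, r, rng):
    """Stay on the current layer with probability r; otherwise jump
    uniformly to one of the other layers where the node has edges
    (an illustrative reading of the 1 - r rule)."""
    others = [l for l in active_layers if l != current_layer]
    if not others or rng.random() < r:
        return current_layer
    return rng.choice(others)

# Toy layer adjacency and walker state: prev node 1, current node 2.
adj_l = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
rng = random.Random(0)
biases = {c: alpha(1, c, adj_l, p=2.0, q=0.5, cr=1.0) for c in adj_l[2]}
```

The candidate set {1, 3, 4} from node 2 covers all three cases: node 1 is the previous node (bias Cr/p), node 3 also neighbors node 1 (bias Cr), and node 4 lies outward (bias Cr/q).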

C. COMPLEXITY ANALYSIS
The procedure of the proposed IFMNE framework is summarized in Algorithm 1, and Algorithm 2 outlines the basic steps of the MULTIWALK sampling used by IFMNE. The IFMNE algorithm fuses network topology information and node structure information to carry out random walks on the multilayer network. After the node sequences obtained in this way are fed into Skip-Gram model training, the learned vectors can accurately reflect the original structure of the network. More details are given in Algorithm 1 and Algorithm 2.

We provide a time complexity analysis of our algorithm as follows. Assume that there are M network layers and N nodes, the number of walks sampled at each node is W, the walk length is L, and the feature vector size is D. Because the IFMNE method is based on random walks, and the walk probabilities and cross-layer probabilities can be computed in advance and remain fixed, the total time complexity of the IFMNE method is O(N · W · L).

V. EXPERIMENTS
In this Section, we compare the proposed algorithm with the latest techniques on link prediction tasks. We also analyze the parameter sensitivity and the running time of our algorithm.

A. DATASETS
As shown in Table 1, we select five multilayer networks of different network types. The details of these datasets are as follows:
Algorithm 2 MULTIWALK
Input: multilayer network MG(V, L, E); length of walk l; layer transition probability r
Output: node sequence
1: Initialize WalksList to ∅
2: for iter = 1 to l do
3:   CurrentNode = WalksList[−1]
4:   select layer from MG = {G^[1], G^[2], · · · , G^[L]} with probability r
5:   CurrentNodeNeighbors(CurrentNode, G^[L])
6:   calculate the transition probability P from CurrentNode to each of CurrentNodeNeighbors according to Eq. (7) to Eq. (9)
7:   select node v by guided random walk according to probability P
8:   add node v to WalksList
9: end for

VICKERS [23]: The data was collected by Vickers from 29 seventh-grade students at a school in Victoria. Students were asked to nominate their classmates on a number of relations, including the following three layers: ''Who do you get on with in class?'', ''Who are your best friends in class?'', ''Who would you prefer to work with?''.
LAZEGA [24]: This multilayer social network is composed of three kinds of relationships among the partners of the company, namely cooperation, friendship and advice, each of which forms one layer.
KAPFERER [25]: The data was collected from interactions in a tailor shop in Zambia (then Northern Rhodesia) over a period of ten months. Layers represent two different types of interaction, recorded at two different times (seven months apart) over a period of one month. TI1 and TI2 record the ''instrumental'' (work- and assistance-related) interactions at the two times; TS1 and TS2 record the ''sociational'' (friendship, socioemotional) interactions.
FF-TW-YT [28]: The data was collected from a social media aggregator (Friendfeed). In this system, users can directly post messages and comment on other messages, much like in Facebook and other similar OSNs, but they can also register their accounts on other systems. From these, multilayer networks were retrieved: one with users who registered exactly one Twitter account associated with exactly one Friendfeed account (FF-TW), and a smaller dataset with an additional YouTube layer (FF-TW-YT). Since most of the nodes in the FF-TW-YT network appear in two networks, we select the two layers FF and TW of this data as our experimental dataset.

B. BASELINE METHODS
We first compare our model with the following state-of-the-art embedding-based baseline methods.
DeepWalk [13]: DeepWalk first applies random walks on the network, treats each path as a sentence, and then uses the Skip-Gram algorithm to train the embeddings.
Node2Vec [15]: Node2Vec adds a pair of parameters p and q to control the random walk, so that its neighborhood can be explored by depth traversal and breadth traversal.
MNE [8]: MNE can effectively store and learn multiple types of relationship information into a unified embedded space.
Besides the above network representation learning methods, we will also compare our network representation learning method with the network structure-based methods in link prediction task.
Common Neighbor(CN) [29]: Due to its simplicity, the CN metric is one of the most widely used in link prediction tasks. For each pair of nodes, the more common neighbors it has, the more likely it is to have an edge.
Jaccard Coefficient(JC) [30]: For a pair of nodes, JC uses the total number of two node sets to normalize the number of their common neighbors.
Adamic/Adar(AA) [31]: AA is similar to JC, but unlike JC, AA gives more weight to nodes with fewer neighbors, and compared with other structure-based methods, AA shows excellent performance on many networks.
NIFMNE: This method is the basic version of IFMNE proposed in this paper. It only considers the node structure information to guide the random walk without adding network topology information.

C. EXPERIMENTAL SETTING

1) EVALUATION METRICS
For academics, link prediction is widely used to evaluate the quality of node embeddings in a network. It attempts to predict which links are most likely to appear in a given network. Therefore, this paper uses the link prediction task for validation. Following the evaluation criteria commonly used for this task, this paper adopts the AUC [32] (area under the receiver operating characteristic (ROC) curve) as the evaluation metric of the experiments. The AUC indicator compares the similarity values of edges in the test set against the similarity values of non-existent edges, taking the non-existent edges as the benchmark, as shown in Eq. (10).

TABLE 2. Link prediction based on similarities between two nodes. All the numbers are the averaged AUC score based on five-fold cross validation.
AUC = (n′ + 0.5 n″) / n, (10)

where n′ is the number of times the similarity value of an edge in the test set is greater than that of a non-existent edge, n″ is the number of times the two similarity values are equal, n‴ is the number of times it is smaller, and n = n′ + n″ + n‴ is the total number of comparisons. The further the AUC value is above 50%, the better the performance of the algorithm. In network representation learning, the cosine similarity or the inner product of the embedding vectors of a pair of nodes is generally used to calculate the similarity value. As shown in Eq. (11), cosine similarity is used to calculate the score: the higher the score, the higher the similarity and the more likely there is a link between the nodes:

SM(u_i, u_j) = (u_i · u_j) / (‖u_i‖ ‖u_j‖), (11)

where SM represents the cosine similarity, and u_i and u_j are the embedding vectors of nodes v_i and v_j. When using a multilayer network representation learning method, the embedding vectors of the nodes in the multilayer network are first learned and used to calculate the AUC of each layer; finally, the AUC values of all layers are summed and their average is used as the final result. When a single-layer network representation learning method is applied to a multilayer network, the embedding vectors of the nodes in each layer are learned separately and used to calculate the per-layer AUC; the average over layers is again used as the final result, but these embedding vectors do not retain information from the other layers of the multilayer network.
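The evaluation pipeline of Eq. (10) and Eq. (11) can be sketched as follows. The score lists are hypothetical placeholders for similarity values of held-out test edges versus sampled non-edges.

```python
import math

def cosine_similarity(u, v):
    """Eq. (11): SM(u_i, u_j) = (u_i . u_j) / (||u_i|| ||u_j||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def auc_score(pos_scores, neg_scores):
    """Eq. (10): AUC = (n' + 0.5 n'') / n over all pairwise comparisons of
    a test-set edge score against a non-existent-edge score."""
    greater = equal = 0
    for s in pos_scores:
        for t in neg_scores:
            if s > t:
                greater += 1
            elif s == t:
                equal += 1
    return (greater + 0.5 * equal) / (len(pos_scores) * len(neg_scores))

# Hypothetical similarity scores: held-out edges vs. sampled non-edges.
pos = [0.9, 0.8, 0.6]
neg = [0.7, 0.5, 0.5]
```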

2) MODEL PARAMETERS
For all the baseline network representation learning methods, we set the embedding dimension, as well as the generic embedding dimension in our model, to 64. For all methods based on random walk, we set the window width to 10, the walk length to 20, and the number of iterations to 10, and select 5 negative samples. For Node2Vec, we train with the empirically best parameters, namely p = 2 and q = 0.5. For the three PMNE models, we use the parameters given in the original paper. For the proposed IFMNE method, we only need to keep the same parameters as Node2Vec; the value of r is obtained through the experimental analysis in Section V(D2), where the optimal value is 0.5. As shown in Table 2, we used five-fold cross validation to evaluate the AUC values of the different models on all the datasets. In cross validation, the data are used repeatedly: the sample data are split and combined into different training and test sets, the training sets are used to train the model, and the test sets are used to evaluate its predictions. From the experimental results, we make the following observations:
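The five-fold protocol described above can be illustrated with a minimal splitting routine; the function name and toy edge list are hypothetical, as the paper does not specify its splitting code.

```python
import random

def five_fold_splits(edges, seed=0):
    """Shuffle the edge list into five disjoint folds; each fold serves
    once as the test set while the other four folds form the training set."""
    rng = random.Random(seed)
    shuffled = edges[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::5] for i in range(5)]
    for k in range(5):
        test = folds[k]
        train = [e for i, f in enumerate(folds) if i != k for e in f]
        yield train, test

# Toy edge list: a complete graph on 10 nodes.
edges = [(i, j) for i in range(10) for j in range(i + 1, 10)]
splits = list(five_fold_splits(edges))
```

Each of the five train/test pairs would then be used to train the embedding model and score the held-out edges; the reported AUC is the average over the five folds.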

(1) For multilayer networks, considering the different relationship types of the network at the same time is conducive to node embedding. The results show that most multilayer network representation learning models are superior to the single-layer models. For example, on the VICKERS dataset, the AUC value of our method is improved by 4.7% and 3.3% over DeepWalk and Node2Vec, respectively. This is also consistent with the hypothesis of this paper: the information from a single layer is not enough to describe the structural information of the network, but the information from different relationship types can complement each other, so that most of the structural information of the network is retained.
(4) As shown in Figure 4, the IFMNE method achieves relatively good results under five-fold cross validation across the different cross-validation splits. Because the selected networks are relatively compact, IFMNE can explore the network structure well through the node sequences generated by the guided random walk.
(5) As shown in Figure 5, on the different datasets, the AUC value of IFMNE under the different cross-validation splits is higher than that of NIFMNE. This shows that fusing network topology information with node structure information in the embedding method clearly improves the quality of the node representations. At the same time, IFMNE can also retain the network structure accurately and thus obtain deeper node information.

2) CROSS-LAYER PARAMETER SENSITIVITY
The optimal cross-layer parameter r is selected by finding the value that maximizes the AUC of the IFMNE method on the link prediction task. As shown in Figure 6, the analysis on the VICKERS, KAPFERER and LAZEGA datasets shows that the AUC of all three datasets is largest when the cross-layer parameter r is 0.5, indicating that the optimal value of r is 0.5. This suggests that the optimal strategy is to select a layer with equal probability.
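The actual IFMNE walk is defined in Section IV; purely as an illustrative sketch of how a cross-layer parameter r could govern inter-layer jumps during a walk, one might write the following (all names and the adjacency format are hypothetical).

```python
import random

def cross_layer_walk(adj, node, layer, length, r=0.5, seed=0):
    """One walk over a multilayer network.
    adj[layer][node] -> list of intra-layer neighbors of node in that layer.
    At each step, with probability r the walker switches to a uniformly
    chosen other layer (keeping the current node); otherwise it moves to
    a random intra-layer neighbor."""
    rng = random.Random(seed)
    walk = [(layer, node)]
    for _ in range(length - 1):
        if rng.random() < r and len(adj) > 1:
            layer = rng.choice([l for l in range(len(adj)) if l != layer])
        else:
            nbrs = adj[layer].get(node, [])
            if not nbrs:
                break  # node has no neighbors in this layer
            node = rng.choice(nbrs)
        walk.append((layer, node))
    return walk

# Toy two-layer network on nodes {0, 1, 2}.
adj = [{0: [1], 1: [0, 2], 2: [1]},   # layer 0
       {0: [2], 2: [0]}]              # layer 1
walk = cross_layer_walk(adj, node=0, layer=0, length=20)
```

Under this sketch, r = 0.5 makes switching layers and walking within a layer equally likely, which matches the equal-probability layer selection found to be optimal in the experiments.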

3) SCALABILITY OF OUR MODEL
We compare the actual running time of the baseline methods on the VICKERS, KAPFERER and LAZEGA datasets with that of the IFMNE method, in order to verify the time-complexity analysis in Section IV(C). The analysis of Figure 7 shows that the running time of IFMNE is significantly shorter than that of the MNE and MGCN methods, and is close to that of the common network embedding methods. From the analysis of Figure 8, we conclude that, in terms of running time, IFMNE is better than most network embedding methods. In conclusion, the experimental results are consistent with the time-complexity analysis in Section IV(C).

VI. CONCLUSION
In this paper, we study the problem of network representation learning in multilayer networks. We comprehensively consider the influence of node structure and network topology information in multilayer networks on representation learning, and propose a guided random walk within each layer. We then introduce the parameter r to control the random walk between layers. Combining these two random-walk mechanisms, we propose the multilayer network representation learning method IFMNE, so that the learned node representation vectors accurately reflect the original structural information of the network. The link prediction results on five multilayer network datasets show that IFMNE outperforms the baseline methods and learns high-quality node embedding vectors with low running time, while preserving the structural information of the network to the greatest extent; the validity of the method is thus verified. Since a multilayer network contains different types of information that together describe the characteristics of a real network, our future research will consider adding more types of information for network representation and optimizing it in combination with specific downstream tasks.
GUANGHUI YAN (Senior Member, IEEE) received the Ph.D. degree from Northwestern Polytechnical University, Xi'an, in 2009. He is currently a Professor with the School of Electronics and Information Engineering, Lanzhou Jiaotong University. He has published more than 50 articles in journals and conferences. His current research interests include database theory and systems, the Internet-of-Things engineering and application, data mining, and complex network analysis. He is a member of the China Computer Federation (CCF).
ZHE LI received the bachelor's degree from the School of Electronics and Information Engineering, Lanzhou Jiaotong University, in June 2019. He is currently pursuing the Graduate degree with Lanzhou Jiaotong University. His current research interests include data mining and network representation learning.
HAO LUO received the master's degree from the School of Electronics and Information Engineering, Lanzhou Jiaotong University, in June 2020. He is currently pursuing the Ph.D. degree with Lanzhou Jiaotong University. His current research interests include data mining and multi-relational network analysis.
YISHU WANG received the bachelor's degree from the School of Software, Taiyuan University of Technology, in June 2019. She is currently pursuing the Graduate degree with Lanzhou Jiaotong University. Her current research interests include data mining and complex network analysis.
WENWEN CHANG received the Ph.D. degree in mechatronic engineering from Northeastern University, Shenyang, China, in 2019. He is currently an Assistant Professor with the School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou, China. His current research interests include brain-computer interaction, functional brain networks, and pattern recognition, and their applications in cognitive science and engineering.
MINGJIE YANG is currently a Senior Engineer with State Grid Gansu Information and Telecommunication Company. He is also engaged in power information management.
RUI SU is currently an Engineer with State Grid Gansu Information and Telecommunication Company. She is also engaged in power information management.
NING LIU is currently a Senior Engineer with State Grid Gansu Information and Telecommunication Company. She is also engaged in power information management.

VOLUME 9, 2021