LncRNA-Disease Associations Prediction Based on Neural Network-Based Matrix Factorization

Numerous experiments have demonstrated that long non-coding RNAs (lncRNAs) play an important role in various systems of the human body, and lncRNA deletions or mutations can cause human disease. Predicting lncRNA-disease associations is conducive to the diagnosis and prevention of complex diseases. Predicting lncRNA-disease associations via biological experiments is a time-consuming and expensive process, whereas computational methods can discover such associations effectively with fewer human and material resources. In this paper, we propose a neural network-based matrix factorization model to predict lncRNA-disease associations, called NeuMFLDA. NeuMFLDA first converts the one-hot encoding of a disease or lncRNA into a word vector via an embedding layer. Then, by combining the memorization of conventional matrix factorization with the generalization of a multi-layer perceptron, lncRNA-disease associations can be predicted more accurately. In addition, as opposed to the conventional pointwise loss function, a new pairwise loss function is proposed to update the model parameters. The new loss function optimizes the model from the perspective of ranking priority, which is better aligned with the lncRNA-disease association prediction task. Experiments show that NeuMFLDA reaches average AUCs of 0.904 ± 0.003 and 0.918 ± 0.002 under 5-fold cross validation and leave-one-out cross validation, respectively, which is superior to three state-of-the-art methods. In case studies, 9, 9 and 8 of the top-10 candidate lncRNAs are verified by recently published literature for hepatocellular carcinoma, kidney cancer and ovarian cancer, respectively. In short, NeuMFLDA is an effective tool for predicting lncRNA-disease associations.


I. INTRODUCTION
There are many non-coding RNAs (ncRNAs) in the human genome that are not translated into protein and were long regarded as transcriptional noise. However, there is growing evidence that ncRNAs play a crucial role in many biological activities [1]-[3]. NcRNAs can be divided into long ncRNAs, transfer RNAs, small ncRNAs and piwi-interacting RNAs [4]. The length of a lncRNA is generally greater than 200 nucleotides. LncRNAs play an irreplaceable role in many important biological processes, such as cell proliferation, cell differentiation, epigenetic regulation, genomic imprinting and chromosome modification [5]. Studies have shown that lncRNAs have lower expression levels than protein-coding genes, and many lncRNAs are highly conserved and highly tissue-specific. With the improvement of sequencing technology and computational methods, more and more lncRNAs have been identified.
Studies have shown that lncRNA is involved in the regulation of various diseases. For example, in breast cancer, the expression level of lncRNA 'HOTAIR' is thousands of times higher than the normal expression level [6]. Similarly, abnormalities and disorders of lncRNA can lead to other human diseases such as leukemia [7], cardiovascular disease [8], and various cancers [9]. Therefore, effective identification of disease-related lncRNA can not only help us understand the pathogenesis of the disease, but also promote the detection of biomarkers, which is conducive to disease prevention and drug development. With the development of high-throughput technology, lots of databases about lncRNA are available for free, which makes it convenient for researchers. For example, LncRNADisease and Lnc2Cancer store a lot of useful information about lncRNA [10], [11]. In addition, a growing number of researchers have begun to focus on lncRNA, which greatly expands our understanding of the regulatory mechanisms of lncRNA.
With the increasing impact of lncRNA on diseases, more and more researchers are beginning to pay attention to association prediction [12], [13]. Clinical trials exploring lncRNA-disease associations cost a great deal of human and material resources; in contrast, it is convenient and fast to predict such associations with computational methods. Recently, the application of computational methods in bioinformatics has attracted wide attention, including identifying lncRNA and mRNA co-expression modules [14], calculating miRNA functional similarity [15], discovering miRNA-mRNA regulatory modules [16], [17], predicting essential proteins [18], [19], and predicting miRNA-target interactions [20]. Before 2013, there was no computational method for predicting lncRNA-disease associations. Chen proposed a semi-supervised learning method (LRLSLDA) in the Laplacian regularized least squares framework to identify potential disease-associated lncRNAs [21]. This was the first lncRNA-disease prediction model, enabling researchers to further understand the relationship between lncRNA and disease. In 2015, a novel model of HyperGeometric distribution for LncRNA-Disease Association inference (HGLDA) was developed to predict lncRNA-disease associations by integrating miRNA-disease associations and lncRNA-miRNA interactions [22]. The important difference from previous computational research on lncRNA-disease inference is that HGLDA does not rely on any known lncRNA-disease associations. Considering the limitations of the traditional Random Walk with Restart (RWR), the model of Improved Random Walk with Restart for LncRNA-Disease Association prediction (IRWRLDA) was developed to predict novel lncRNA-disease associations [23]. The novelty of IRWRLDA lies in incorporating lncRNA expression similarity and disease semantic similarity to set the initial probability vector of the RWR.
Lu proposed a method (SIMCLDA) for predicting potential lncRNA-disease associations based on inductive matrix completion [24]. They computed the Gaussian interaction profile kernel of lncRNAs from known lncRNA-disease interactions, and the functional similarity of diseases based on disease-gene and gene-gene ontology associations. Based on the assumption that functionally similar lncRNAs tend to be associated with similar diseases, Chen developed a novel lncRNA functional similarity calculation model (LNCSIM) [25]. LNCSIM has potential value for predicting lncRNA-related interactions and detecting lncRNA biomarkers for human disease diagnosis, treatment, prognosis and prevention. Wang established a novel prediction model based on internal inclined random walk with restart (IIRWR) to infer potential lncRNA-disease associations, in contrast to traditional prediction models based on random walk with restart [26]. One major novelty of the IIRWR-based prediction model is the introduction of the concept of a disease clique, which gives the random walk process an internal tendency. Li presented a novel network consistency projection for lncRNA-disease association prediction (NCPLDA) model by integrating the lncRNA-disease association probability matrix with the integrated disease similarity and lncRNA similarity [27]. The lncRNA-disease association probability matrix is calculated from known lncRNA-disease associations and disease semantic similarity.
These methods can be roughly divided into four categories. (1) Network-based methods, such as NCPLDA and IRWRLDA; many prediction methods are based on the construction of bi-layer heterogeneous networks, including microbe-disease association prediction and miRNA-disease association prediction [28], [29]. (2) Machine learning-based methods, such as LRLSLDA; machine learning has many successful applications in bioinformatics, including cancer progression identification and identifying mutated driver pathways [30], [31], and given its power it serves as an effective computational tool in various fields. (3) Matrix factorization-based methods, such as SIMCLDA; improved matrix factorization is widely used in data mining for bioinformatics [32]. (4) Models based on multiple sources of biological information, such as HGLDA. In addition, advances in association prediction in other areas of computational biology provide valuable insights for the development of lncRNA-disease association prediction, such as miRNA-disease association prediction [33], [34], drug-target interaction prediction [35], and synergistic drug combination prediction [36].
In this work, we propose a neural network-based matrix factorization method (NeuMFLDA) to predict lncRNA-disease associations. NeuMFLDA maps each disease (or lncRNA) to a word vector representing its latent features through a neural network with an embedding layer. Then the linear modeling ability of matrix factorization and the nonlinear modeling ability of the multilayer perceptron are combined to fit the implicit relationship between the word vectors. In addition, we propose a new pairwise loss function that optimizes the model from the perspective of ranking priority, which is better suited to the problem addressed in this paper. In the 5-fold cross validation framework, NeuMFLDA obtains an AUC of 0.904 ± 0.003, superior to 0.891 (IIRWR), 0.884 (NCPLDA) and 0.856 (SIMCLDA). Furthermore, NeuMFLDA also outperforms the other methods in the leave-one-out cross validation framework, with an AUC of 0.918 ± 0.002 against 0.897 (IIRWR), 0.902 (NCPLDA) and 0.864 (SIMCLDA). Case studies of three human diseases also show the excellent performance of the method. In summary, NeuMFLDA is an effective model for identifying disease-associated lncRNAs.

II. MATERIAL AND METHODS
First, the database used in this article is introduced. Next, we explain how embedding works, which helps to better understand our method. Then, we explain how NeuMFLDA combines the memorization of matrix factorization (MF) with the generalization of the multi-layer perceptron (MLP), and elaborate each processing step. Finally, we propose a new pairwise loss function to optimize the model instead of the conventional pointwise loss function.

A. DATABASE
We retrieve known lncRNA-disease associations from the LncRNADisease database [10]. This dataset contains 685 lncRNA-disease associations. After correcting the names of diseases (according to UMLS, MeSH and NCBI) and lncRNAs (according to HGNC, NCBI), we delete all duplicate records with the same lncRNA and disease, as well as all erroneous entries that do not belong to human beings. The final dataset includes 226 diseases and 285 lncRNAs, with a total of 621 known associations. From the LncRNADisease database we obtain the disease-lncRNA interaction matrix Y ∈ R^(M×N), where M and N are the numbers of diseases and lncRNAs, respectively. The elements of the matrix are defined as:

y_dl = 1, if disease d is known to be associated with lncRNA l,
y_dl = 0, otherwise.

Note that y_dl = 0 does not mean that disease d is unrelated to lncRNA l; it is simply an unknown entry. Although the observed entries reflect at least some relationship between diseases and lncRNAs, unobserved entries may simply be missing data. The lncRNA-disease prediction task can therefore be expressed as the problem of estimating scores for the unobserved entries in matrix Y (these scores are used to rank candidate lncRNAs). Assuming the real data are generated from an underlying model, the task can be abstracted as learning a function ŷ_dl = f(d, l | θ), where ŷ_dl is the predicted score, θ denotes the model parameters, and f is the function that maps the model parameters to the predicted score.
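As an illustration, the construction of the interaction matrix Y from an association list can be sketched as follows (the disease and lncRNA names here are toy placeholders, not entries of the actual 226 × 285 dataset):

```python
import numpy as np

# Toy association list; the real data come from LncRNADisease
# (226 diseases x 285 lncRNAs, 621 known pairs).
diseases = ["breast cancer", "leukemia", "ovarian cancer"]
lncrnas = ["HOTAIR", "MEG3", "UCA1", "GAS5"]
known_pairs = [("breast cancer", "HOTAIR"),
               ("ovarian cancer", "MEG3"),
               ("ovarian cancer", "GAS5")]

d_index = {d: i for i, d in enumerate(diseases)}
l_index = {l: j for j, l in enumerate(lncrnas)}

# Y[d, l] = 1 for a known association; 0 means "unknown", not "no association".
Y = np.zeros((len(diseases), len(lncrnas)), dtype=np.int8)
for d, l in known_pairs:
    Y[d_index[d], l_index[l]] = 1
```

The zeros of Y are treated as candidates to be scored, not as confirmed negatives, which motivates the pairwise training scheme described later.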

B. EMBEDDING
The concept of embedding comes from Google's word2vec algorithm [37]. The conventional approach assigns an index to each word and represents a word as a one-hot encoding, which naturally makes word vectors sparse and storage-intensive, greatly reducing processing efficiency. Instead, we create an embedding matrix and decide how many ''potential factors'' are assigned to each index, that is, how long the vector needs to be, which is usually much shorter than a one-hot encoding. The word vector is trained from the context in the corpus and can represent high-dimensional sparse vectors with semantics. In summary, the embedding layer is a fully connected layer that maps high-dimensional sparse vectors to low-dimensional dense vectors for model computation. Since the embedding vectors are updated during deep neural network training, it is possible to explore which categories are similar in the high-dimensional space. Fig 1 shows how embedding works. For readability, we set the dimensionality of the embedding vector to 5 in this section.
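The equivalence between a one-hot input to a fully connected layer and a plain row lookup can be sketched as follows (the matrix values are random stand-ins for trained weights, and the dimension 5 matches the example in the text):

```python
import numpy as np

rng = np.random.default_rng(0)
num_diseases, k = 226, 5   # embedding size 5, as in the text's example

# The embedding matrix is a trainable lookup table: one row per disease.
E = rng.normal(scale=0.1, size=(num_diseases, k))

def embed(one_hot):
    # Multiplying a one-hot vector by E selects exactly one row, so the
    # embedding layer reduces to an index lookup.
    return one_hot @ E

one_hot = np.zeros(num_diseases)
one_hot[42] = 1.0
dense = embed(one_hot)   # low-dimensional dense vector for disease 42
```

In practice the lookup form (`E[42]`) is used directly, which is why embedding layers are cheap despite being formally fully connected.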

C. NeuMFLDA
As shown in Fig 1, we first map the disease (or lncRNA) one-hot encoding into a word vector through the embedding layer, that is, we convert a high-dimensional sparse vector into a low-dimensional dense vector. Then, as shown in Fig 2, MF and MLP are implemented using a neural network with an embedding layer. Finally, combining the linear modeling ability of MF with the nonlinear modeling ability of MLP, a new neural network is designed, called neural network-based matrix factorization (NeuMF). In particular, the application of NeuMF to predicting lncRNA-disease associations is called NeuMFLDA.
Since MF is the most commonly used representation learning method, implementing it shows that a neural network with an embedding layer can simulate most matrix factorization models [38]. As shown in Fig 2a, the input of the network is a one-hot encoding of the disease (or lncRNA) ID. Above the input layer is the embedding layer, a fully connected layer that maps the sparse representation of a disease (or lncRNA) into a dense vector representing its latent features. The mapping function of the MF layer is then defined as:

φ(p_d, q_l) = p_d ⊙ q_l,

where ⊙ denotes the element-wise product of vectors, and p_d and q_l represent the disease embedding and lncRNA embedding, respectively. The resulting vector is then fed to the output layer:

ŷ_dl = a_out(h^T (p_d ⊙ q_l)),

where a_out is the activation function of the output layer and h is the connection weight vector of the output layer. If a_out is the identity function and every element of h equals 1, this is exactly the conventional MF model. Within the NeuMF framework, MF can be easily generalized and customized: choosing a different h yields a variant of MF, and a nonlinear activation function a_out generalizes MF to a nonlinear model. Similar to MF, the MLP also takes two inputs to model the disease and the lncRNA, and naturally their feature vectors must be combined, as shown in Fig 2b. This design is widely used in multimodal deep learning. However, simply concatenating the latent vectors is not sufficient to capture the underlying interactions between disease and lncRNA features. We therefore add hidden layers and learn the interaction of the disease and lncRNA latent feature vectors with a standard multilayer perceptron, defined as:

z_1 = [p_d; q_l],
φ_x(z_(x-1)) = a_x(W_x^T z_(x-1) + b_x), x = 2, ..., L,

where [·;·] denotes vector concatenation, and W_x, b_x and a_x represent the weight matrix, the bias vector and the activation function of the x-th layer, respectively.
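The MF mapping above can be sketched numerically; the embeddings here are random stand-ins for trained vectors, and the example checks that with h = 1 and an identity output the layer collapses to the ordinary inner product of conventional MF:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
k = 8
p_d = rng.normal(size=k)   # disease embedding (untrained stand-in)
q_l = rng.normal(size=k)   # lncRNA embedding (untrained stand-in)

# phi(p_d, q_l) = p_d ⊙ q_l (element-wise product)
phi = p_d * q_l

# With a_out = identity and h = 1 this is the inner product, i.e.
# conventional MF; a sigmoid a_out gives a nonlinear generalization.
h = np.ones(k)
score_linear = h @ phi
score_nonlinear = sigmoid(h @ phi)
```

Learning h instead of fixing it to 1 is exactly the "variant of MF" mentioned in the text: the model can weight latent dimensions unequally.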
Sigmoid, Tanh, and ReLU can all be used as activation functions. When its output is close to 1 or 0, the Sigmoid function suffers from saturation (the gradient vanishes), and the neuron falls into a state where learning stagnates. Tanh is another commonly used activation function, but it only mitigates the problems of Sigmoid to some extent, because it can be seen as a rescaled version of Sigmoid (tanh(x/2) = 2σ(x) − 1). We therefore chose the ReLU function, which has been shown not to saturate and supports sparse activation, giving the model sufficient generalization capability [39]. Experiments show that ReLU performs slightly better than the Tanh and Sigmoid functions. The network structure is designed as a tower, with the widest layer at the bottom; using fewer hidden units in higher layers lets the network learn a more abstract representation of the data [40]. We use neural networks with embedding layers to implement MF and MLP: MF uses a linear kernel to model the interaction of latent features, while MLP uses a nonlinear kernel to learn the interaction function. Next, MF and MLP are combined in the same network, so that the model has both linear and nonlinear learning ability.
If MF and MLP share the same embedding layer, in the manner of the well-known Neural Tensor Network, the performance of the fused model may be limited [41], because a shared embedding layer forces MF and MLP to use embeddings of the same size, whereas the optimal embedding sizes of the two models differ considerably on the lncRNA-disease association data, making this a poor choice. We therefore let MF and MLP learn separate embeddings and concatenate the final outputs of the two sub-modules as the input of the last layer, as shown in Fig 2c. Specifically, NeuMF can be formulated as:

φ^MF = p_d^MF ⊙ q_l^MF,
φ^MLP = a_L(W_L^T (... a_2(W_2^T [p_d^MLP; q_l^MLP] + b_2) ...) + b_L),
ŷ_dl = σ(h^T [φ^MF; φ^MLP]),

where p_d^MF denotes the disease embedding of the MF sub-module, p_d^MLP the disease embedding of the MLP sub-module, and similarly q_l^MF and q_l^MLP denote the lncRNA embeddings of the MF part and the MLP part. As mentioned above, the activation function of the MLP is the ReLU function. The activation function of the final output layer is the Sigmoid function, which limits the output of the network to the range 0 to 1, representing the probability that a disease is related to a lncRNA. All model parameters can be learned with standard error backpropagation, for example via stochastic gradient descent.
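A forward pass through the fused architecture can be sketched as follows. The weights are random stand-ins for trained parameters; the embedding sizes (MF: 8, MLP: 16) and the 32 → 16 → 8 tower follow the parameter-setting section later in the paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(2)
k_mf, k_mlp = 8, 16   # separate embedding sizes for the two sub-modules

p_mf, q_mf = rng.normal(size=k_mf), rng.normal(size=k_mf)
p_mlp, q_mlp = rng.normal(size=k_mlp), rng.normal(size=k_mlp)

# MF branch: element-wise product of its own embeddings.
phi_mf = p_mf * q_mf

# MLP branch: concatenation (size 32) through a ReLU tower 32 -> 16 -> 8.
z = np.concatenate([p_mlp, q_mlp])
for size_in, size_out in [(32, 16), (16, 8)]:
    W = rng.normal(scale=0.1, size=(size_in, size_out))
    b = np.zeros(size_out)
    z = relu(z @ W + b)

# Fusion: concatenate the branch outputs, weight by h, squash with sigmoid.
h = rng.normal(scale=0.1, size=k_mf + 8)
y_hat = sigmoid(h @ np.concatenate([phi_mf, z]))
```

Because each branch owns its embeddings, the MF and MLP latent spaces can have different sizes, which is the point of avoiding a shared embedding layer.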

D. LOSS FUNCTION
In order to estimate the parameters θ, existing methods generally follow the machine learning paradigm and optimize an objective function. Two loss functions are most commonly used in the literature: pointwise loss and pairwise loss [42], [43]. As a natural extension of the abundant work on explicit feedback, pointwise loss usually follows a regression formulation and minimizes the squared error between the predicted score ŷ and its target value y. To deal with unobserved data, such methods either treat all unobserved entries as negative feedback or ignore them when optimizing the model [42]. Pairwise learning, in contrast, assumes that observed entries should rank higher than unobserved ones [43]: it maximizes the gap between the predicted score of an observed item and that of an unobserved item, rather than reducing the error between the predicted score ŷ and the original score y. Since we approach lncRNA-disease association prediction from a ranking perspective, we choose a pairwise loss function to learn the parameters [44]:

L = − Σ_((i,j,s)∈O) ln σ(ŷ_ij − ŷ_is),

where O is the training set. Each instance is a triplet (i, j, s), meaning that disease d_i is associated with lncRNA l_j but not with lncRNA l_s. Here ŷ_ij denotes the predicted score for disease d_i and lncRNA l_j, and ŷ_is the predicted score for disease d_i and lncRNA l_s. Minimizing the loss maximizes ŷ_ij − ŷ_is, so that l_j ranks higher than l_s.
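The ranking behaviour of a pairwise objective can be sketched as follows. This is a minimal illustration assuming a log-sigmoid pairwise form, L = −Σ ln σ(ŷ_ij − ŷ_is); the toy scores are invented for the example:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_loss(y_pos, y_neg):
    # -sum ln sigma(y_ij - y_is): increasing the margin y_ij - y_is
    # pushes observed associations above unobserved ones in the ranking.
    return float(-np.sum(np.log(sigmoid(y_pos - y_neg))))

# Toy triplets: scores for known pairs (d_i, l_j) vs unknown pairs (d_i, l_s).
y_pos = np.array([0.9, 0.8])
y_neg = np.array([0.2, 0.4])

well_ranked = pairwise_loss(y_pos, y_neg)
badly_ranked = pairwise_loss(y_neg, y_pos)  # swapped: positives rank lower
```

The loss is lower when the positives already outrank the sampled negatives, which is exactly the ordering the model is trained to produce; only the relative margin matters, not the absolute score values.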

III. RESULTS

A. PERFORMANCE COMPARISON
Leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV) were used to evaluate the performance of NeuMFLDA. LOOCV is based on the known lncRNA-disease associations: each known association pair is taken as the test sample in turn, and the remaining known associations are used as training samples. In the 5-fold CV framework, the known lncRNA-disease association items were randomly divided into five groups, ensuring that the associations of each disease are distributed across the groups; four groups are used for training and one for testing. The test data also contain all unobserved items. We set a threshold and compare each predicted score with the threshold to decide whether the prediction is correct. To reduce the bias caused by randomness, we performed LOOCV and 5-fold CV 100 times. To evaluate the performance of the model intuitively, the receiver operating characteristic (ROC) curve was introduced, a common method for evaluating binary classifiers. The true positive rate (TPR, sensitivity) and the false positive rate (FPR, 1 − specificity) are its two indicators: the ordinate of the ROC curve represents sensitivity and the abscissa represents 1 − specificity. Sensitivity represents the percentage of positive test samples that rank higher than the given threshold, while specificity represents the opposite. They are calculated as follows:

TPR = TP / (TP + FN),
FPR = FP / (FP + TN),

where TP denotes true positives, FP false positives, TN true negatives, and FN false negatives. By gradually changing the threshold and recalculating sensitivity and specificity, we plotted the ROC curve. The area under the ROC curve (AUC) is also used to measure performance: in general, AUC = 0.5 indicates random performance, and AUC = 1 indicates optimal performance. In our study, to further validate the effectiveness of NeuMFLDA, we compared its performance with three recent methods.
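The TPR/FPR computation at one threshold can be sketched as follows (the scores and labels here are toy values, not experimental results):

```python
def tpr_fpr(scores, labels, threshold):
    # Count the four confusion-matrix cells at the given threshold,
    # then apply TPR = TP/(TP+FN) and FPR = FP/(FP+TN).
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)

scores = [0.9, 0.7, 0.6, 0.4, 0.2]   # toy predicted scores
labels = [1, 1, 0, 1, 0]             # toy ground truth
tpr, fpr = tpr_fpr(scores, labels, 0.5)
```

Sweeping the threshold over all score values and plotting the resulting (FPR, TPR) pairs traces out the ROC curve whose area gives the AUC.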
The input data of the three comparison methods are the same as for our method. The experimental results of NeuMFLDA on LncRNADisease are compared with those of IIRWR [26], NCPLDA [27] and SIMCLDA [24]. As shown in Fig 3, the ROC curves of the four methods illustrate their performance. NeuMFLDA achieves an AUC of 0.904 ± 0.003 in the 5-fold cross validation framework, which is better than the other methods (IIRWR: 0.891, NCPLDA: 0.884 and SIMCLDA: 0.856). The performance of NeuMFLDA is also the best in the LOOCV framework, where it achieves an AUC of 0.918 ± 0.002 (IIRWR: 0.897, NCPLDA: 0.902 and SIMCLDA: 0.864).
Next, another evaluation indicator, Leave-One-Disease-Out Cross-Validation (LODOCV), was used to verify the predictive performance. For each disease d, all known lncRNAs associated with d are removed, and the prediction task is performed using the lncRNA-related information of the other diseases. We calculated the AUC value of each disease under the LODOCV framework, obtaining 226 AUC values for each method. The AUC values of each method are then shown as density plots for comparison (Fig 4a). The AUC values obtained by our method are mainly concentrated in the [0.9, 1] interval, and its performance is better than that of the other methods. In addition, we calculated the number of known lncRNA-disease associations that each method could retrieve at a specified threshold. As shown in Fig 4b, NeuMFLDA is superior to the other methods, predicting more true positives.
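The LODOCV split for one disease can be sketched as follows: every known association of the held-out disease is masked from the training matrix and becomes the test set (the matrix here is a toy example):

```python
import numpy as np

# Toy disease x lncRNA matrix; the real one is 226 x 285.
Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 1]], dtype=np.int8)

def lodocv_split(Y, d):
    # Mask every known association of disease d; the masked columns are
    # the test targets, and the rest of the matrix is the training signal.
    train = Y.copy()
    test_cols = np.flatnonzero(Y[d])
    train[d] = 0
    return train, test_cols

train, test_cols = lodocv_split(Y, 0)
```

Because the held-out disease keeps no positive entries during training, its test AUC measures how well the model generalizes from other diseases' association patterns alone.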

B. PARAMETER SETTING
The performance of the model depends on its parameters, and the model achieves its best performance when the parameters are tuned appropriately. The parameters of NeuMFLDA are learned by optimizing formula (6). The size of the embedding layer represents the length of the latent vector of a disease or lncRNA. After repeated experiments, the size of the MLP embedding is set to 16 (K_MLP = 16), and the size of the MF embedding is set to 8 (K_MF = 8). The number of MLP layers is 3, with layer sizes of 32, 16 and 8. It is worth noting that the larger the size of each layer, the easier it is to over-fit. Stochastic gradient descent (SGD) is used to update the model parameters. The parameter neg denotes the number of unknown association items randomly sampled for each known association item. A larger neg requires each positive sample to rank higher than more unknown samples; if neg is too large, the model over-fits and disease-associated lncRNAs cannot be predicted accurately. Here, each known positive item corresponds to 8 unknown items, so that each known positive item is ranked higher than the 8 sampled unknown items (neg = 8). Epoch indicates the number of times the whole dataset is trained; one epoch is equivalent to training on all samples in the training set once. Similarly, if the epoch value is too large, the model becomes ''overtrained'' and fits the training data too closely. After repeated tuning, the most appropriate value for epoch is 20. Among the parameters, neg and epoch have the greatest impact on performance, and Table 1 shows the impact of these two parameters on the performance of the model.
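The negative sampling controlled by neg can be sketched as follows: for each observed pair, neg unobserved lncRNAs are drawn to build the training triplets for the pairwise loss (the set of known pairs here is a toy example):

```python
import random

random.seed(0)
num_lncrnas = 285   # size of the lncRNA index space in the dataset
neg = 8             # unknown items sampled per known positive, as in the text

# Toy known positives as (disease, lncRNA index) pairs.
known = {("disease_a", 3), ("disease_a", 17), ("disease_b", 5)}

def sample_triplets(known_pairs, neg):
    # For each observed (d, j), draw `neg` lncRNAs s with (d, s) unobserved,
    # producing training triplets (d, j, s) for the pairwise loss.
    triplets = []
    for d, j in known_pairs:
        drawn = 0
        while drawn < neg:
            s = random.randrange(num_lncrnas)
            if (d, s) not in known_pairs:
                triplets.append((d, j, s))
                drawn += 1
    return triplets

triplets = sample_triplets(known, neg)
```

Raising neg multiplies the number of triplets per positive, which strengthens the ranking constraint but, as the text notes, eventually over-fits.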
MF and its variants are often used for relational prediction problems and have shown good performance, while MLP is a commonly used nonlinear predictor. NeuMFLDA not only implements MF but also leverages a multi-layer perceptron to endow the model with nonlinearity, so it is worth studying whether NeuMFLDA outperforms a single MF or MLP. As shown in Fig 5a, we implemented MF and MLP separately; the performance of NeuMFLDA is significantly better than that of either single model. Further, it is worth exploring how the nonlinear layers of the MLP affect model performance. To answer this, we studied MLPs with different numbers of hidden layers; the results are summarized in Fig 5b. MLP-4 represents an MLP with four layers (besides the embedding layer), and MLP-0 means there is no MLP layer, equivalent to only a single MF model at work. MLP-0, with no hidden layer, performs very weakly. This suggests that simply concatenating the latent vectors of the lncRNA and the disease is not enough to model the interaction of their features, so the hidden layers are helpful for modeling feature interactions. To optimize the performance of our model, the number of hidden layers is set to 3.

C. LOSS EFFECT
The goal of training a deep learning architecture is to optimize the weight parameters of each layer so that the most suitable hierarchical representations are learned from the data, with simple features gradually combined into complex features. A single cycle of the optimization process proceeds as follows [45]. First, given a training dataset, the forward pass computes the output of each layer in sequence, propagating function signals forward through the network. In the final output layer, an objective loss function measures the error between the inferred output score and the target label. To minimize the training error, the chain rule is used to back-propagate the error signal, computing the gradient of every weight in the network. Finally, the weight parameters are updated by an SGD-based optimization algorithm.
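One such cycle can be sketched end to end on a deliberately simplified model: a single pairwise SGD step on the embeddings alone (assumed simplification: MF branch only, identity output, loss −ln σ(ŷ_pos − ŷ_neg); the gradient below is derived by hand for this toy case):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(3)
k, lr = 8, 0.1
p = rng.normal(size=k)       # disease embedding
q_pos = rng.normal(size=k)   # associated lncRNA embedding
q_neg = rng.normal(size=k)   # unobserved lncRNA embedding

def loss(p, q_pos, q_neg):
    # Forward pass: pairwise loss on the score margin.
    return float(-np.log(sigmoid(p @ q_pos - p @ q_neg)))

before = loss(p, q_pos, q_neg)

# Backward pass: dL/dp = -(1 - sigma(margin)) * (q_pos - q_neg).
margin = p @ q_pos - p @ q_neg
grad_p = -(1.0 - sigmoid(margin)) * (q_pos - q_neg)

# SGD update on the disease embedding.
p = p - lr * grad_p
after = loss(p, q_pos, q_neg)
```

The update increases the margin between the positive and negative scores, so the loss strictly decreases for any positive learning rate in this one-parameter-block case.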
Obviously, the objective loss function plays a very important role in updating the parameters, so choosing the most appropriate objective loss function is a key challenge. Loss functions can be divided into pointwise and pairwise losses, and squared error is the most commonly used pointwise loss. In past prediction methods for similar tasks, almost all used the mean squared error (MSE). For example, conventional matrix factorization only updates the positive items in the dataset when optimizing the model and does not update the unknown items, so the unknown data are under-utilized. The highly ranked candidate lncRNAs in the results are judged to be more relevant to the disease. Since our prediction task is addressed from a ranking perspective, we propose a new pairwise loss function to update the parameters of NeuMFLDA, so that positive items rank higher than unknown items. Table 2 shows the performance comparison of the two loss functions for different values of epoch on NeuMFLDA.

D. CASE STUDIES
Three complex human cancers (hepatocellular carcinoma, kidney cancer and ovarian cancer) were investigated as case studies. We take all known lncRNA-disease associations as the complete training set. The lncRNAs predicted for each disease were ranked, and the top 10 lncRNAs were analyzed. It should be noted that the top-ten rankings refer to the rankings of the experimental results after removing the known associations in LncRNADisease, which guarantees absolute independence between the verified candidates and the known associations used for model training; that is, the top ten lncRNAs were unknown before the prediction. We then check the predicted associations against the associations available in Lnc2Cancer. For predicted associations not supported by Lnc2Cancer, we further check them manually on PubMed and list the supporting literature. According to recent literature and datasets, the top 9, 9 and 8 lncRNAs of the three diseases, respectively, were verified.

1) HEPATOCELLULAR CARCINOMA
Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer in adults and the most common cause of death in people with cirrhosis. Enforced expression of MEG3 in HCC cells significantly decreased both anchorage-dependent and -independent cell growth, and induced apoptosis [46]. Kamel et al. found that lncRNA-UCA1 was significantly higher in sera of HCC patients than in those with chronic HCV infection or healthy volunteers [47]. Their data suggested that increased expression of UCA1 was associated with advanced clinical parameters in HCC. In 2014, Tu et al. showed that the expression level of GAS5 was reduced in HCC compared with matched normal tissues [48]. GAS5 expression was an independent prognostic marker of overall HCC patient survival in a multivariate analysis. PVT1 [49], MINA [50], CCAT1 [51], etc. have also been confirmed to be related to hepatocellular carcinoma. The top 10 candidate lncRNAs for HCC obtained from NeuMFLDA are listed in Table 3.

2) RENAL CANCER
Renal cancer, also known as kidney cancer, is a type of cancer that starts in the cells of the kidney. The two most common types of kidney cancer are renal cell carcinoma (RCC) and transitional cell carcinoma (TCC). HOTAIR is a lncRNA that interacts with the polycomb repressive complex and suppresses its target genes, and it has been demonstrated to promote renal cancer [52]. In Li's research, the expression of PVT1 in ccRCC was analyzed using reverse transcription-quantitative polymerase chain reaction, revealing that PVT1 expression was upregulated in ccRCC tissues compared with normal adjacent tissues. The results suggested that PVT1 serves oncogenic functions and may be a biomarker and therapeutic target in ccRCC [53]. Ectopic expression and gene silencing of UCA1 in RCC cell lines exerted opposite effects on cellular proliferation, migration and apoptosis, indicating that UCA1 may be a promising biomarker for diagnosis and a therapeutic target in RCC [54]. The top 10 candidate lncRNAs for renal cancer obtained from NeuMFLDA are listed in Table 4.

3) OVARIAN CANCER
Ovarian cancer is a cancer that develops in or on the ovaries. Ovarian cancer causes cell abnormalities, and these abnormal cells can invade or spread to other parts of the body. Meg3 protected ATG3 mRNA from degradation following treatment with actinomycin D. Xiu's results suggest that the lncRNA Meg3 acts as a tumor suppressor in EOC by regulating ATG3 activity and inducing autophagy [55]. The present study investigated the underlying role of growth arrest-specific transcript 5 (GAS5) in epithelial ovarian cancer (EOC), which is the main cause of death in women with malignant tumor of the genital system [56]. The expression of NEAT1 was collaboratively controlled by HuR and miR-124-3p, could regulate ovarian carcinogenesis and may serve as a potential target for antineoplastic therapies [57]. Similarly, the relationships of ovarian and other lncRNAs listed in Table 5 have been confirmed by recent studies.

IV. CONCLUSION
Exploring the relationship between lncRNA and disease is helpful to understand the pathogenesis, diagnosis and treatment of diseases. We propose an effective method for predicting lncRNA-disease associations, called NeuMFLDA. NeuMFLDA first expresses disease and lncRNA as word vector via embedding layer, and then combines the linear modeling advantages of MF with the nonlinear modeling advantages of MLP to predict the associations between diseases and lncRNAs. Finally, a pairwise loss function that is more suitable for solving the relationship prediction problem is introduced to optimize our model. Different from traditional prediction methods, our model is easier to operate and does not need to calculate disease similarity and lncRNA similarity. Experiments show that NeuMFLDA outperforms three existing advanced methods. The case study further illustrates the superiority of our model.
The reason why NeuMFLDA achieves satisfactory results is mainly the following. First, the introduction of embedding allows diseases and lncRNAs to be represented as low-dimensional dense vectors, which describe the characteristics of each disease and lncRNA more accurately. Second, the generalization of the nonlinear MLP extends the memorization of conventional MF, which makes the predictions more accurate. Memorization ability refers to the ability of the model to fit the original samples; generalization ability refers to the ability to reason about and adapt to new samples. Finally, since the association prediction problem is essentially a ranking problem, a pairwise loss function is more appropriate than a pointwise one; we propose a concise and effective pairwise loss function to optimize the model parameters, which helps to improve prediction performance. Most importantly, NeuMFLDA can be used not only to predict lncRNA-disease associations, but also to solve similar prediction problems, such as predicting miRNA-disease associations and drug-disease associations, by adjusting parameters including the number of layers of the neural network and the number of nodes in each layer. Nevertheless, our approach faces some challenges: determining the optimal parameters for different biological datasets is important work, and how to integrate different biological information more reasonably to improve prediction performance deserves further study.