miRReg: A Framework for Studying miRNA Regulation in Tetralogy of Fallot

It is known that microRNAs (miRNAs) can inhibit message RNAs (mRNAs) translation, and thus directly influence the expression levels of them. On the other hand, the competing endogenous RNA (ceRNA) hypothesis indicates that different mRNAs compete with each other for binding with miRNAs, and further indirectly affect miRNA activity. Therefore, the miRNA regulation can be generally divided into two types: direct and indirect. Recently, the miRNA regulation mechanism in Tetralogy of Fallot (TOF) is largely unknown. To understand the direct and indirect miRNA regulation in TOF, in this work, we propose a framework called miRReg to study the miRNA regulation mechanism. For the direct miRNA regulation, we identify miRNA-mRNA regulatory network and further infer functional miRNA regulatory modules in TOF. Meanwhile, we infer mRNA related miRNA sponge interaction network and modules for the indirect miRNA regulation in TOF. The functional analysis shows that the identified direct or indirect miRNA regulatory networks and modules are closely associated with TOF disease. Moreover, several miRNA-mRNA regulatory relationships and miRNA sponge interactions are experimentally confirmed. miRReg is released under the GPL-3.0 License, and is available at https://github.com/wgb2098/miRReg.


I. INTRODUCTION
MicroRNA (miRNA) is a kind of short non-coding singlestranded RNA with a length of ∼22 nucleotides (nts). Through binding with the target messenger RNAs (mRNAs), miRNAs can affect the levels of mRNAs at the posttranscriptional level. More and more evidences demonstrate that miRNAs as critical regulators are involved in many biological processes and signal pathways [1].
TOF disease is one of the most common cardiac defects in children [2]- [5]. Previous studies have shown that TOF disease shows a diverse range of pathogenesis, including right ventricular hypertrophy, aortic overriding, ventricular septal The associate editor coordinating the review of this manuscript and approving it for publication was Zhan Bu . defect and right ventricular outflow tract stenosis, long-term hypoxia in non-surgical TOF patients, and decreased pulmonary blood flow. For each type of pathogenesis, the treatment for patients is significantly different. Therefore, it is necessary to have a comprehensive understanding of the pathophysiology of TOF [6], [7].
Until now, more and more miRNAs have been found to be closely related to TOF disease. O'Brien et al. [8] discovered that the expression of 61 miRNAs is significantly changed between myocardium from children with TOF and normally developing comparison subjects. By analyzing 75 dysregulated miRNAs in human heart tissue, Liang et al. [9] found that miR-940 is highly expressed in the normal right ventricle and is associated with the development TOF disease. For the study of myocardial tissue in the right ventricular outflow tract of infants, Bittel et al. [10] also revealed that miR-421 dysregulation is associated with TOF disease. In the analysis of circulating miRNAs in patients with repaired TOF with and without heart failure, three miRNAs (miR-421, miR-1233-3p and miR-625-5p) have been found to be involved in TOF disease [11]. In TOF, Wang et al. [12] discovered that miR-1/miR-133 behave significant sexual difference and are essential for cardiac development. Recently, by using 432 sporadic patients and 450 controls in China, Yang et al. [13] also found that the primary miR-124 (rs531564) polymorphism is closely associated with TOF.
However, the existing methods mostly focus on the expression patterns of miRNAs between TOF patients and healthy controls to study the associations between miRNAs and TOF disease. To further understand the gene regulation mechanism in TOF disease, it is necessary to identify TOF-related gene regulatory networks and modules from the perspective of systems biology. Based on this, by integrating expression data and miRNA-target binding information, we propose a framework called miRReg to study miRNA regulation in TOF disease. At the network level, two networks (miRNA-mRNA regulatory network and miRNA sponge interaction network) are constructed in TOF. At the module level, we further identify two types of modules (miRNA-mRNA regulatory modules and miRNA sponge modules). Functional analysis results have shown that the identified networks and modules are closely associated with TOF disease. In this work, miRReg has the following two contributions. Firstly, as far as we know, miRReg is the first framework to study TOFrelated miRNA regulation at the both network and module level. Secondly, miRReg can help to understand both direct and indirect miRNA regulation in TOF.
The remainder of this work is organized in the following. Section II presents related datasets used and methods selected. Section III shows the analysis results in TOF. Section IV discusses the limitations and parameter settings of miRReg. Finally, we make a conclusion in Section V.

A. MATERIALS
The matched miRNA and mRNA expression profiles in TOF and normal samples are from Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) database (GSE35490 for miRNA, GSE35776 for mRNA). After gene differential expression (DE) analysis using wellcited limma [14] R package, 149 DE miRNAs and 2505 DE mRNAs are identified at a significant level (p-value < 0.05) (details in S1 Data). As for putative miRNA-mRNA interactions, we obtain 35582 experimentally validated miRNA-mRNA interactions between DE miRNAs and DE mRNAs from miRTarBase v8.0 [15] (details in S2 Data).

B. OVERVIEW OF miRReg FRAMEWORK
As shown in Figure 1, the miRReg framework consists of three steps: identifying miRNA-mRNA regulatory network and module, identifying miRNA sponge interaction network and module, and functional analysis. In the following, we will describe the three steps in detail.

1) IDENTIFYING miRNA-mRNA REGULATORY NETWORK AND MODULE
Based on the matched differential miRNA and mRNA expression profiles, we firstly use the LASSO regression method to construct miRNA-mRNA regression matrix. The objective function of the LASSO regression method is denoted as follows: Here, y is a response vector, X is a covariate matrix, w is a coefficient matrix, m and n are the number of miRNAs and mRNAs, respectively. To obtain the coefficient matrix w, the glmnet R package [16] with α = 1 for choosing the best β using 5-fold cross-validation is used in this work. Normally, since miRNAs negatively regulate mRNAs, the positive correlation values between miRNAs and mRNAs in w are transformed into zeros. Then, based on the constructed miRNA-mRNA regression matrix, and the matched miRNA and mRNA expression data, we infer miRNA-mRNA biclique modules using BiModule [17]. We use the default parameter settings of the BiModule method to infer miRNA-mRNA biclique modules in this work.
For each miRNA-mRNA biclique module, all miRNAs are regarded as regulators of all mRNAs. By merging all of possible miRNA-mRNA interactions in each module, we remove duplicate miRNA-mRNA interactions to generate an integrated miRNA-mRNA regulatory network.

2) IDENTIFYING miRNA SPONGE INTERACTION NETWORK AND MODULE
To identify mRNA related miRNA sponge interaction network, we use the positive correlation (PC) method [18], [19] implemented in the miRspongeR [20] R package. In the miRNA sponge interaction network, each mRNA-mRNA pair should have significant sharing of miRNAs (e.g. p-value < 0.05) and significant positive correlation (e.g. p-value < 0.05). To further understand the modularity of miRNA sponges in the network, we use the well-cited Markov Cluster Algorithm (MCL) [21] implemented in the miRspongeR [20] R package to infer miRNA sponge modules. The number of miRNA sponges in each module is at least 3.

3) FUNCTION ANALYSIS
To understand whether the identified miRNA-related networks and modules are functional or not, it is necessary to perform functional analysis of them. For the (Gene Ontology) GO [22] and (Kyoto Encyclopedia of Genes and Genomes) KEGG enrichment analysis [23], we use the clusterProfiler [24] R package in this work. The workflow of miRReg. Here, step 1 is for studying direct miRNA regulation, and step 2 is for investigating indirect miRNA regulation. In step1, after differential expression analysis, we obtain a list of DE miRNAs and DE mRNAs. The LASSO regression method is further used to generate the miRNA-mRNA regression matrix between DE miRNAs and DE mRNAs. Based on the generated miRNA-mRNA regression matrix and expression profiles of DE miRNAs and DE mRNAs, the BiModule method is used to obtain miRNA-mRNA biclique modules. In each biclique module, all of miRNAs completely connect with all of mRNAs, and no connection exists between miRNAs or mRNAs. To obtain miRNA-mRNA regulatory network, we merge all of miRNA-mRNA pairs in each module. In step2, based on the expression profiles of DE miRNAs and DE mRNAs, and experimentally validated miRNA-mRNA binding information, we identify miRNA sponge interaction network. In addition, we further identify miRNA sponge modules from the identified miRNA sponge interaction network. Finally, we perform functional analysis of the identified miRNA-related networks and modules in step 3.

III. RESULTS
A. THE IDENTIFIED miRNA-mRNA REGULATORY NETWORK AND MODULE Following the Step 1 of Figure 1, we have generated 12 miRNA-mRNA biclique modules (details can be seen in S3 Data). After integrating the miRNA-mRNA interactions of each miRNA-mRNA biclique module, we obtain a list of 1565 miRNA-mRNA interactions between 38 miR-NAs and 168 mRNAs (see Figure 2 visualized by Cytoscape [26]).

B. THE IDENTIFIED miRNA SPONGE INTERACTION NETWORK AND MODULES
Following the Step 2 of Figure 1, we have constructed an mRNA-related miRNA sponge interaction network consisting of 266 miRNA sponge interaction pairs (see Figure 3 visualized by Cytoscape [26]). Based on the identified miRNAs sponge interaction network, we identify 6 miRNA sponge modules in Table 1.   [27], ''response to insulin'' (GO: 0032868) [28], ''cellular response to peptide hormone stimulus'' (GO: 0071375) [29] are closely related to TOF. In terms of KEGG pathways, several pathways  including Adipocytokine signaling pathway (hsa04920), Insulin signaling pathway (hsa04910), Glycosphingolipid biosynthesis-lacto and neolacto series (hsa00601), and Fatty acid biosynthesis and metabolism (hsa00061 and hsa01212) are involved in TOF. This result indicates that the identified miRNA-mRNA biclique modules are functionally associated with TOF.  muscle contraction (GO: 0060452) are highly implicated in TOF. Moreover, several KEGG pathways, including Cardiac muscle contraction (hsa04260), Oxidative phosphorylation (hsa00190), and Taurine and hypotaurine metabolism (hsa00430) also involve in TOF diseases. Specifically, the identified miRNA sponge module (Module 4) is significantly enriched in the KEGG pathway: Endocrine and other factor-regulated calcium reabsorption (hsa04961). This result is consistent with previous studies, which have demonstrated that calcium plays important roles in cardiac contraction and vascular tone.

3) SEVERAL miRNA-RELATED INTERACTIONS ARE EXPERIMENTALLY VERIFIED
In this section, we use the experimentally validated miRNA-mRNA interactions from miRTarBase and the experimentally validated miRNA sponge interactions from miRSponge as ground-truth to validate the identified miRNA-related interactions. As a result, a total of 266 miRNA-mRNA interactions and 6 miRNA sponge interactions have been verified by the above databases.

IV. DISCUSSION
The computing complexity and computing time of miRReg is proportional to the number of genes (miRNAs and mRNAs). To efficiently implement the framework, we select popular or well-cited methods to study direct and indirect miRNA regulation. miRReg is a flexible method, and users can choose any existing methods or new methods to replace a step or more steps in our framework.
Moreover, our miRReg method is a parametric framework. Most of parameter settings are based on common-used principle. Specifically, in the step of identifying miRNA-mRNA biclique modules, there are two important parameters (lambda.tol, merge.tol) to be empirically set. The parameter lambda.tol is the cutoff of significance for each miRNA-mRNA pair in 1000 permutation test. Larger value of it denotes a more significant result. For identifying appropriate number of miRNA-mRNA bicliques, we empirically set the values of lambda.tol to 0.75. The parameter merge.tol is used to merge the identified small bicliques into large bicliques. The larger the value of it is, the more stringent to merge different bicliques. Of course, users can set the parameters according to specific requirement (refer to [17] for detail information).
Furthermore, in the step of identifying miRNA sponge interactions, two common principles (significant sharing of miRNAs and significant positive correlation) are used. The significance p-value (e.g. 0.05) can be set more stringent or more loose.
In addtion, we use the well-cited method Markov Cluster Algorithm (MCL) [21] for inferring miRNA sponge modules from the identified miRNA sponge interaction network. As is known, there are several other network clustering methods (i.e. MCODE) [30]- [34] that could be used to extract miRNA sponge modules.
Finally, since the TOF-related dataset is still limited, we only use one TOF-related dataset in this work. However, the functional analysis results show that miRReg can still uncover direct and indirect miRNA regulation associated with TOF disease. In future, we will apply more TOFrelated datasets to further validate the effectiveness of miRReg.

V. CONCLUSION
More and more evidences have demonstrated that miRNAs play an important role in the pathology of TOF disease. However, the study of miRNA regulation related to TOF is still inadequate. In this work, we present a novel VOLUME 8, 2020 framework called miRReg to investigate the miRNA regulation at both the network and module levels. At the network level, we construct two types of networks: miRNA-mRNA regulatory network and miRNA sponge interaction network. At the module level, we also identify two different kinds of modules: miRNA-mRNA biclique modules and miRNA sponge modules. Functional analysis results show that the identified miRNA-related networks and modules are closely associated with TOF disease. We believe that miRReg can be a potential method to study miRNA regulation of other types of human complex diseases in future. APPENDIX S1 Data. Differentially expressed miRNAs and mRNAs. S2 Data. Putative miRNA-mRNA interactions. S3 Data. The identified miRNA-mRNA regulatory modules. GUANGBIN  KE LIU received the bachelor's degree in clinical medicine from the Chengdu Medical College, in 2010, and the master's degree in medicine from the University of Electronic Science and Technology of China, in 2017. After graduation, he worked as a Cardiac Surgeon at the Sichuan Provincial People's Hospital. He has been engaged in clinical, teaching, and scientific research of cardiac surgery, have accumulated rich clinical experience in the diagnosis and treatment of heart surgery related diseases (heart valve disease, congenital heart disease, coronary atherosclerotic heart disease, macrovascular disease, atrial fibrillation, and other diseases). He has published several articles in domestic core journals, participated in the research of gene expression related to atrial fibrillation, and participated in the compilation of the eighth edition of Surgery of People's Medical Publishing House. VOLUME 8, 2020