MicroRNAs (miRNAs) are single-stranded, endogenous small non-coding RNAs (sncRNAs) of ~22 nt that can play important regulatory roles in animals and plants by targeting mRNAs for cleavage or translational repression. miRNAs that are expressed at very low levels or only at specific stages are difficult to detect experimentally, and biological experiments alone can identify only a small number of miRNAs. Computational approaches, especially machine learning approaches, have therefore become another important route to miRNA prediction. Machine learning-based miRNA prediction requires large numbers of positive and negative samples. However, experimentally validated miRNA precursors are scarce, whereas sequence fragments across the whole genome that resemble real miRNA precursors number in the millions. Selecting reasonable samples is therefore important for constructing a high-performance classifier. This review summarizes the training set samples used in machine learning approaches to miRNA precursor prediction.