Abstract:
RNA-seq, a next-generation sequencing platform, provides quantitative expression data that exhibit distinctive sequence patterns at the level of short-read segments, and these patterns have proved useful for clustering those segments. However, such sequence-based clustering does not reflect the functional chemistry of non-coding RNAs (ncRNAs), whose functions are closely tied to their secondary structures. Clustering read block segments by their structural profiles rather than by their sequence patterns is therefore essential. We propose QLZCClust (Quaternary Lempel-Ziv Complexity based Clustering), a method that extends the popular Lempel-Ziv algorithm to compute pairwise secondary-structure distances. We applied QLZCClust to short-read segments obtained from an RNA-seq experiment and found that it separates most miRNAs from tRNAs. Moreover, it can be used to detect structural similarities among different classes of ncRNAs. We compared our algorithm with clustering based on two other structural distance measures, the SimTree edit distance and an RNAz-based distance, and found that our method performs better.
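To make the idea of a Lempel-Ziv complexity based structural distance concrete, here is a minimal Python sketch. It is not the QLZCClust implementation: the abstract does not give the quaternary encoding or the exact distance formula, so the sketch assumes plain dot-bracket strings as input and borrows a commonly used normalized LZ distance (the larger of the two complexity gains on concatenation, divided by the larger individual complexity). The function names `lz_complexity` and `lz_distance` and the two example structures are hypothetical.

```python
def lz_complexity(s: str) -> int:
    """Number of components in the LZ76 (exhaustive history) parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the current component while it can be copied from the
        # text seen so far (overlap with the component itself is allowed).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_distance(s: str, t: str) -> float:
    """Assumed normalized LZ distance; not necessarily the paper's formula."""
    cs, ct = lz_complexity(s), lz_complexity(t)
    if max(cs, ct) == 0:
        return 0.0  # both strings empty
    gain_st = lz_complexity(s + t) - cs  # new components t adds after s
    gain_ts = lz_complexity(t + s) - ct  # new components s adds after t
    return max(gain_st, gain_ts) / max(cs, ct)


# Hypothetical dot-bracket structures: a simple hairpin and a branched fold.
hairpin = "((((....))))"
branched = "((..((..))..((..))..))"
print(lz_distance(hairpin, branched))
```

A pairwise matrix of such distances over all read block segments could then be passed to any standard clustering routine; which clustering algorithm the authors used is not stated in the abstract.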
Date of Conference: 10-13 November 2013
Date Added to IEEE Xplore: 09 January 2014
Electronic ISBN: 978-1-4799-3163-7