Hidden Markov models (HMMs) have been highly successful at modeling noisy sequential data in speech recognition and protein sequence profiling. Results from an association test showed significant Markov dependency in time-series gene expression data, which suggests that HMMs are especially appropriate for modeling gene expression. In this project, we developed a gene function prediction tool based on profile HMMs. Each function class is associated with a distinct HMM whose parameters are trained on yeast time-series gene expression data. The function annotations of the HMM training set were obtained from the Munich Information Centre for Protein Sequences (MIPS) database. We designed several structural variants of HMMs (single, double-split) and tested each of them on forty function classes, each of which includes more than one hundred instances. The highest prediction sensitivity we achieved was 51%, using the double-split HMM with 3-fold cross-validation.
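The one-HMM-per-function-class idea can be illustrated with a minimal sketch. It is not the tool described above: it assumes that expression changes are discretized into three symbols (down, flat, up), that a gene is assigned to the class whose HMM gives its profile the highest likelihood, and that the models are already trained. The names `log_likelihood`, `predict_class`, `class_a`, and `class_b`, along with all parameter values, are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete-emission HMM."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def predict_class(obs, models):
    """Return the class whose HMM scores the sequence highest (assumed decision rule)."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical, hand-set models. Symbols: 0 = down, 1 = flat, 2 = up.
models = {
    # Two sticky states: state 0 mostly emits "up", state 1 mostly emits "down".
    "class_a": (
        [0.5, 0.5],
        [[0.9, 0.1], [0.1, 0.9]],
        [[0.1, 0.2, 0.7], [0.7, 0.2, 0.1]],
    ),
    # One state with uniform emissions, acting as a background model.
    "class_b": ([1.0], [[1.0]], [[1 / 3, 1 / 3, 1 / 3]]),
}

rise_then_fall = [2, 2, 2, 0, 0, 0]
flat_profile = [1, 1, 1, 1, 1, 1]
print(predict_class(rise_then_fall, models))
print(predict_class(flat_profile, models))
```

In a real setting the parameters would be estimated from the annotated training genes of each class (for example with Baum-Welch), and the states would follow the single or double-split structures mentioned above rather than these toy topologies.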