In this paper, using functional elements derived from a non-redundant CDs catalogue, we show that the configuration of functional groups in metagenome samples can be inferred by probabilistic topic modeling. Probabilistic topic modeling is a Bayesian method that extracts useful topical information from unlabeled data. When it is used to study microbial samples (assuming that the relative abundance of functional elements has already been obtained by a homology-based approach), each sample can be considered a 'document' that contains a mixture of functional groups, while each functional group (also known as a 'latent topic') is a weighted mixture of functional elements (including taxonomic levels and indicators of gene orthologous groups and KEGG pathway mappings). The functional elements are analogous to 'words'. Estimating the probabilistic topic model uncovers the configuration of functional groups (the latent topics) in each sample. The experimental results demonstrate the effectiveness of the proposed method.
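The document/topic/word analogy can be made concrete with a minimal sketch. The abstract does not say which topic model or inference procedure is used, so the code below assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling, and it should be read as an illustration rather than as the method of the paper. It further assumes that relative abundances have been converted to integer pseudo-counts. The function name `fit_lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy count matrix are all hypothetical.

```python
import random

def fit_lda_gibbs(counts, n_groups, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampler for LDA (illustrative assumption, not the paper's code).

    counts[s][e] is the integer abundance of functional element e in sample s.
    Returns (theta, phi): theta[s][k] is the weight of functional group k in
    sample s; phi[k][e] is the weight of functional element e in group k.
    """
    rng = random.Random(seed)
    n_samples, n_elements = len(counts), len(counts[0])
    # Expand each sample ('document') into a list of element tokens ('words').
    tokens = [[e for e, c in enumerate(row) for _ in range(c)] for row in counts]

    sample_group = [[0] * n_groups for _ in range(n_samples)]
    group_element = [[0] * n_elements for _ in range(n_groups)]
    group_total = [0] * n_groups
    assign = []
    # Random initial assignment of every token to a functional group.
    for s, toks in enumerate(tokens):
        z_s = []
        for e in toks:
            k = rng.randrange(n_groups)
            z_s.append(k)
            sample_group[s][k] += 1
            group_element[k][e] += 1
            group_total[k] += 1
        assign.append(z_s)

    for _ in range(n_iter):
        for s, toks in enumerate(tokens):
            for i, e in enumerate(toks):
                # Remove the token's current assignment from the count tables.
                k = assign[s][i]
                sample_group[s][k] -= 1
                group_element[k][e] -= 1
                group_total[k] -= 1
                # Conditional probability of each group, up to a constant.
                weights = [
                    (sample_group[s][j] + alpha)
                    * (group_element[j][e] + beta)
                    / (group_total[j] + beta * n_elements)
                    for j in range(n_groups)
                ]
                k = rng.choices(range(n_groups), weights=weights)[0]
                assign[s][i] = k
                sample_group[s][k] += 1
                group_element[k][e] += 1
                group_total[k] += 1

    # Smoothed point estimates of the two sets of mixture weights.
    theta = [
        [(sample_group[s][k] + alpha) / (len(tokens[s]) + alpha * n_groups)
         for k in range(n_groups)]
        for s in range(n_samples)
    ]
    phi = [
        [(group_element[k][e] + beta) / (group_total[k] + beta * n_elements)
         for e in range(n_elements)]
        for k in range(n_groups)
    ]
    return theta, phi

# Hypothetical toy data: 4 samples (rows) x 6 functional elements (columns).
toy_counts = [
    [9, 8, 7, 0, 1, 0],
    [8, 9, 6, 1, 0, 0],
    [0, 1, 0, 9, 8, 7],
    [1, 0, 0, 7, 9, 8],
]
theta, phi = fit_lda_gibbs(toy_counts, n_groups=2)
```

Here each row of `theta` is the estimated configuration of functional groups in one sample, and each row of `phi` describes one functional group as a weighted mixture of functional elements. A real analysis would more likely rely on an established topic modeling library and a far larger element vocabulary; the pure-Python sampler is chosen only to keep the sketch self-contained.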