In this paper, we introduce an exploratory framework for learning patterns of conditional coexpression in gene expression data. The main idea behind the proposed approach is to estimate how the information content shared by a set of M nodes in a network (where each node is associated with an expression profile) varies upon conditioning on a set of L conditioning variables (in the simplest case, a separate set of expression profiles). The method is nonparametric and is based on the concept of statistical coinformation, which, unlike conventional correlation-based techniques, is not restricted to linear conditional dependency patterns. Moreover, such conditional coexpression relationships can potentially indicate regulatory interactions that do not manifest themselves when only pairwise relationships are considered. We derive a moment-based approximation of the coinformation measure that efficiently circumvents the problem of estimating high-dimensional multivariate probability density functions from the data, a task that is usually not viable because of the intrinsic sample size limitations that characterize expression-level measurements. Applying the proposed method to a whole-genome microarray assay of the eukaryote Saccharomyces cerevisiae, we were able to learn statistically significant patterns of conditional coexpression. A selection of these interactions that carry a meaningful biological interpretation is discussed.
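
To make the simplest case concrete (M = 2 nodes, L = 1 conditioning variable), the following sketch estimates how the mutual information between two expression profiles changes upon conditioning on a third. For three variables this difference, I(X;Y) - I(X;Y|Z), is the coinformation. The sketch rests on a strong assumption: entropies are approximated from second moments only (a Gaussian closure), which captures linear dependencies alone and is not necessarily the moment-based approximation derived in the paper. The function names and the synthetic data are hypothetical and serve only as illustration.

```python
import numpy as np


def gaussian_entropy(data, cols):
    """Differential entropy (nats) of a Gaussian fitted to the given columns.

    ASSUMPTION: a second-moment (Gaussian) stand-in, not the paper's estimator.
    """
    cols = list(cols)
    if not cols:
        return 0.0  # entropy of the empty set of variables
    # np.cov returns a 0-d array for a single column, so force a 2-D matrix.
    cov = np.atleast_2d(np.cov(data[:, cols], rowvar=False))
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (len(cols) * np.log(2.0 * np.pi * np.e) + logdet)


def conditional_mutual_info(data, x, y, z=()):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z); Z may be empty."""
    x, y, z = list(x), list(y), list(z)
    return (gaussian_entropy(data, x + z)
            + gaussian_entropy(data, y + z)
            - gaussian_entropy(data, x + y + z)
            - gaussian_entropy(data, z))


def conditioning_shift(data, x, y, z):
    """I(X;Y) - I(X;Y|Z): how shared information varies upon conditioning."""
    return conditional_mutual_info(data, x, y) - conditional_mutual_info(data, x, y, z)


# Synthetic illustration (hypothetical data): two profiles driven by a common one.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
x = z + 0.5 * rng.normal(size=500)
y = z + 0.5 * rng.normal(size=500)
profiles = np.column_stack([x, y, z])  # rows are samples, columns are profiles

print("I(X;Y)      =", conditional_mutual_info(profiles, [0], [1]))
print("I(X;Y|Z)    =", conditional_mutual_info(profiles, [0], [1], [2]))
print("shift (nats) =", conditioning_shift(profiles, [0], [1], [2]))
```

In this toy setting the shared information between the two profiles largely disappears once the common driver is conditioned on, so the shift is positive. A negative shift would instead indicate a dependency that appears only under conditioning, which is the kind of pattern that pairwise analysis alone cannot reveal.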