We report the results of fitting mixture models to the distribution of expression values for individual genes over a broad range of normal tissues, which we call the marginal distribution of the gene. The base distributions considered were normal, lognormal, and gamma. The expectation-maximization algorithm was used to learn the model parameters. Experiments with artificial data were performed to ascertain the robustness of learning. Applying the procedure to data from two publicly available microarray datasets, we conclude that the lognormal distribution performed best for modeling the marginal distributions of gene expression. Our results should provide guidance for the development of informed priors or gene-specific normalization for use with gene network inference algorithms.
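
As an illustration of the kind of procedure summarized above, the following is a minimal sketch of expectation-maximization for a lognormal mixture. It is not the authors' implementation. The function name `fit_lognormal_mixture`, the number of components, the quantile-based initialization, the variance floor, and the stopping rule are all assumptions made for this example, and the data are synthetic. The sketch relies on the fact that a lognormal mixture in x is a normal mixture in log(x).

```python
import numpy as np

def fit_lognormal_mixture(x, k=2, n_iter=200, tol=1e-8):
    """EM for a k-component lognormal mixture (hypothetical helper).

    Works on y = log(x), where each lognormal component becomes normal.
    Returns weights, log-scale means, log-scale standard deviations,
    and the log-likelihood of x under the parameters of the last E-step.
    """
    y = np.log(np.asarray(x, dtype=float))
    n = y.size
    # Initialization (an assumption): means at evenly spaced quantiles,
    # a shared variance, and equal weights.
    mu = np.quantile(y, (np.arange(k) + 0.5) / k)
    var = np.full(k, y.var())
    w = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log of w_j * N(y_i | mu_j, var_j), shape (n, k).
        log_p = (np.log(w) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (y[:, None] - mu) ** 2 / var)
        # Log-sum-exp over components for numerical stability.
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
        resp = np.exp(log_p - log_norm)  # responsibilities
        ll = log_norm.sum()              # log-likelihood of y
        # M-step: responsibility-weighted updates.
        nk = resp.sum(axis=0)
        w = nk / n
        mu = (resp * y[:, None]).sum(axis=0) / nk
        var = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)      # floor to avoid collapse
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    # Change of variables: log f_X(x) = log f_Y(log x) - log x.
    ll_x = ll - y.sum()
    return w, mu, np.sqrt(var), ll_x

# Synthetic stand-in for one gene's expression values across tissues.
rng = np.random.default_rng(0)
values = np.concatenate([rng.lognormal(1.0, 0.3, 600),
                         rng.lognormal(4.0, 0.5, 400)])
weights, means, sds, loglik = fit_lognormal_mixture(values, k=2)
print(weights, means, sds, loglik)
```

Returning the log-likelihood on the original scale of x, rather than on the log scale, is a deliberate choice here: it makes the value comparable with fits that use normal or gamma base distributions on the same data.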