Causal structure discovery is an important problem for both protein sequences and gene expression data: it can reveal the elementary structure of a protein sequence and, from the expression levels of genes within the cell, gene-gene interactions. In this paper, we investigate feature-based causal structure learning methods for protein sequence data and gene expression data, respectively. Three feature extraction methods are proposed to model causal structure in protein sequence data using Bayesian networks with Dirichlet distributions, and a factor-analysis-based feature extraction method is discussed for Bayesian network learning on gene expression data. The truncated hemoglobin superfamily from the SCOP protein database and the Princeton colon gene expression data are used to demonstrate the causal structure of the Bayesian networks determined by the different feature extraction methods.
Published in:
Natural Computation, 2009. ICNC '09. Fifth International Conference on
(Volume: 2)
Date of Conference: 14-16 Aug. 2009