DNA microarray technology enables rapid, large-scale screening for patterns of gene expression. Detecting useful phenotypes in gene expression data, together with the informative genes that manifest those phenotypes, is therefore of practical value. However, most existing methods for discriminating phenotypes are supervised: they are trained on samples using informative genes that are already known. In this paper, we propose UPID, an unsupervised phenotype and informative gene detection model with outlier consideration, which simultaneously mines phenotypes and informative genes from gene expression data. By adopting incremental computation as an optimization strategy, the computational cost of UPID is greatly reduced. Furthermore, UPID decreases the impact of outliers by taking the sample proportion of each group into consideration, which makes the model more robust. A comparison with HS, a previous pattern detection method for gene expression data, shows that the proposed UPID algorithm is more efficient. Moreover, experiments conducted on several real microarray datasets demonstrate the effectiveness of the UPID algorithm.
Published in: Biomedical Engineering and Informatics (BMEI), 2010 3rd International Conference on (Volume: 6)
Date of Conference: 16-18 Oct. 2010