Discovering Connected Patterns in Gene Expression Arrays

3 Author(s)
Noha A. Yousri (Computers and System Engineering, University of Alexandria, Alexandria, Egypt; Electrical and Computer Engineering, University of Waterloo, Ontario, Canada); Mohamed A. Ismail; Mohamed S. Kamel

Clustering methods have been used extensively in gene expression data analysis to detect groups of related genes. The clusters provide useful information for analyzing gene function, gene regulation and cellular patterns. Most existing clustering algorithms, though, discover only coherent gene expression patterns and do not handle connected patterns. In low-dimensional spaces, coherent and connected patterns correspond to globular and arbitrarily shaped clusters, respectively. For high-dimensional gene expression data, two connected patterns can be two similar patterns with time lags in time series data or, in general, two different patterns that are connected by an intermediate pattern related to both of them. Discovering such connected patterns has important biological implications not revealed by groups of coherent patterns. In this paper, a novel algorithm that finds connected patterns in gene expression data is proposed. Using a novel merge criterion, it can distinguish clusters based on distances between patterns, thus avoiding the effect of noise and outliers. Moreover, the algorithm uses a metric based on Pearson correlation to find neighbours, which gives it a lower complexity than related algorithms. Both time series and non-temporal gene expression data sets are used to illustrate the efficiency of the proposed algorithm. Results on the serum and the leukaemia data sets reveal interesting, biologically significant information.
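
The idea of two patterns being connected through an intermediate pattern can be illustrated with a small sketch. This is not the algorithm proposed in the paper: it omits the merge criterion entirely and simply links expression profiles whose Pearson-based distance falls below a cut-off, then groups profiles that are joined by a chain of such links. The distance 1 - r, the `threshold` parameter, the function names and the toy profiles are all assumptions made for illustration.

```python
import math

def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of two profiles.

    Using 1 - r as the distance is an assumption; the abstract only says
    the metric is "based on Pearson correlation".
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 1.0  # flat profile: treated as uncorrelated (an assumption)
    return 1.0 - cov / math.sqrt(var_x * var_y)

def connected_groups(profiles, threshold):
    """Label profiles so that chains of close neighbours share a label."""
    n = len(profiles)
    labels = [-1] * n
    group = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = group
        stack = [start]
        # Depth-first expansion: follow every link shorter than threshold.
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and pearson_distance(profiles[i], profiles[j]) < threshold:
                    labels[j] = group
                    stack.append(j)
        group += 1
    return labels

# Hypothetical toy profiles (not data from the paper).
profiles = [
    [1, 2, 3, 4],  # pattern A
    [1, 3, 2, 4],  # intermediate pattern, close to both A and B
    [2, 4, 1, 3],  # pattern B, uncorrelated with A
    [4, 3, 2, 1],  # anti-correlated with A
]
labels = connected_groups(profiles, threshold=0.5)
print(labels)  # [0, 0, 0, 1]
```

Here patterns A and B are uncorrelated with each other (distance 1.0), yet they end up in the same group because the intermediate profile is within the cut-off of both. A purely coherence-based grouping would keep them apart. Chaining of this kind is sensitive to noise and outliers, which is the problem the abstract says the paper's merge criterion is designed to avoid.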

Published in:

Computational Intelligence and Bioinformatics and Computational Biology, 2007. CIBCB '07. IEEE Symposium on

Date of Conference:

1-5 April 2007