Memetic Algorithms for De Novo Motif Discovery

Authors: Tak-Ming Chan (Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, Hong Kong); Kwong-Sak Leung; Kin-Hong Lee

Identifying unknown transcription factor binding sites (TFBSs) is fundamental to understanding gene regulation and, more broadly, life mechanisms. The corresponding de novo motif discovery problem in bioinformatics is formulated as pattern discovery from strings, where the challenges lie in both modeling and optimization, because the short TFBSs are weak signals in massive and noisy experimental data. While genetic algorithms have been widely applied to the problem, recent memetic algorithms (MAs) employing local operators have demonstrated superior effectiveness and efficiency. In this paper, we propose and study various MA components, including local operators and models for motif discovery, through a newly established MA framework. The demonstrated optimization and modeling capabilities are analyzed in depth on real datasets and their noisy versions. Selected optimal MAs show significantly improved performance over state-of-the-art methods in extensive tests, including the blind test on the eukaryotic benchmark. This paper serves as the first systematic study of MAs for de novo motif discovery, and the analyses highlight important issues in MA design. The comprehensive component categorization and the MA framework provide a useful platform for future MA developments, especially for newly emerging chromatin immunoprecipitation followed by sequencing (ChIP-seq) data.
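
To make the idea of a memetic algorithm concrete, the following is a minimal illustrative sketch and not the framework proposed in the paper. It assumes a simplified model in which every sequence contains exactly one motif occurrence of known width `w`, scores an alignment by a plain consensus count, and uses a greedy single-sequence shift as the local operator. The function names (`fitness`, `local_search`, `memetic_search`) and all parameters are hypothetical choices made for this example.

```python
import random
from collections import Counter


def fitness(seqs, starts, w):
    """Consensus score: for each motif column, count the most frequent base."""
    score = 0
    for j in range(w):
        column = Counter(s[p + j] for s, p in zip(seqs, starts))
        score += column.most_common(1)[0][1]
    return score


def local_search(seqs, starts, w):
    """Local operator (assumed): shift one start at a time to its best position.

    Repeats until a full pass yields no strict improvement. It terminates
    because the integer score strictly increases and is bounded by w * len(seqs).
    """
    starts = list(starts)
    improved = True
    while improved:
        improved = False
        for i, s in enumerate(seqs):
            best_p, best_f = starts[i], fitness(seqs, starts, w)
            for p in range(len(s) - w + 1):
                cand = starts[:i] + [p] + starts[i + 1:]
                f = fitness(seqs, cand, w)
                if f > best_f:
                    best_p, best_f = p, f
            if best_p != starts[i]:
                starts[i] = best_p
                improved = True
    return starts


def memetic_search(seqs, w, pop_size=10, generations=20, seed=0):
    """Steady-state genetic search in which every individual is locally refined."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.randrange(len(s) - w + 1) for s in seqs]

    # Initial population: random start positions, each refined by local search.
    pop = [local_search(seqs, random_individual(), w) for _ in range(pop_size)]
    for _ in range(generations):
        a, b = rng.sample(pop, 2)
        # Uniform crossover of the two parents' start positions.
        child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
        # Mutation: re-draw the start position in one random sequence.
        i = rng.randrange(len(seqs))
        child[i] = rng.randrange(len(seqs[i]) - w + 1)
        # Memetic step: refine the child with the local operator.
        child = local_search(seqs, child, w)
        # Replace the worst individual if the child scores strictly higher.
        worst = min(range(pop_size), key=lambda k: fitness(seqs, pop[k], w))
        if fitness(seqs, child, w) > fitness(seqs, pop[worst], w):
            pop[worst] = child
    return max(pop, key=lambda ind: fitness(seqs, ind, w))
```

Calling `memetic_search(seqs, w)` on a list of DNA strings returns one start position per sequence. The only difference from a plain genetic algorithm in this sketch is the `local_search` call applied to each new individual; practical motif finders typically use richer scoring models (for example, position weight matrices with background correction) and must handle sequences with zero or multiple occurrences.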

Published in:

IEEE Transactions on Evolutionary Computation (Volume: 16, Issue: 5)