The Challenge Problem posed by Pevzner et al. showed that special algorithms are needed to detect weak motifs in bio-sequences, where classical approaches such as MEME and the Gibbs Sampler fail. Although several algorithms have since been developed to solve the weak motif recognition problem, they focus on exact datasets and tolerate noisy datasets poorly, i.e., datasets containing sequences without any motif instance. We propose a novel approach to finding weak motifs that is robust to noise in the datasets. Experiments with synthetic datasets show that our algorithm runs faster and detects weak motifs more accurately than existing approaches, and is more robust to the presence of noise. Applied to some promoter datasets from yeast genomes, the algorithm found previously proven binding sites.