We present a computational study of the prediction of cis-regulatory elements. We model the problem as follows. Each set of conserved binding motifs evolved from one common ancestor, and every motif in the set lies within a short (Hamming) distance of this ancestor. The problem is to identify, from a given set of promoter sequences, a set of l-mers that differ from the to-be-identified ancestor in at most k positions. A number of previously published papers attempt to solve this challenging problem. Although the putative ancestor is unknown, and may not even appear anywhere in the background database, we may assume that an instance of it is at hand, since we can guess it. Our main contribution in this paper is an algorithm, named PROMOCO (PROfile Motif Collection), that finds a profile containing all the motifs and a relatively small number of random l-mers, so that the consensus of the profile is the putative ancestor. The key idea of the PROMOCO algorithm lies in a new distance measure.
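To make the problem model concrete, the following is a minimal Python sketch of the formulation only, not of PROMOCO itself. It assumes plain Hamming distance (the new distance measure is not described here), and the function names `hamming`, `lmers`, `collect_profile`, and `consensus`, as well as the toy sequences, are hypothetical and chosen for illustration.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def lmers(seq: str, l: int) -> list[str]:
    """All length-l substrings of seq, in order of occurrence."""
    return [seq[i:i + l] for i in range(len(seq) - l + 1)]


def collect_profile(promoters: list[str], guess: str, k: int) -> list[str]:
    """Collect every l-mer within Hamming distance k of a guessed ancestor.

    Assumption: `guess` stands in for the putative ancestor; how a good
    guess is obtained is outside the scope of this sketch.
    """
    l = len(guess)
    return [w for seq in promoters for w in lmers(seq, l)
            if hamming(w, guess) <= k]


def consensus(profile: list[str]) -> str:
    """Column-wise majority string of a profile of equal-length l-mers.

    Ties are broken alphabetically so the result is deterministic.
    """
    if not profile:
        raise ValueError("profile is empty")
    result = []
    for column in zip(*profile):
        counts = Counter(column)
        result.append(max(sorted(counts), key=counts.get))
    return "".join(result)


if __name__ == "__main__":
    # Toy data: each promoter holds one variant of "ACGTAC" with a single
    # substitution; the ancestor itself occurs in none of them.
    promoters = ["TTACGTAAGG", "CCACGAACTT", "GGGACCTACA"]
    profile = collect_profile(promoters, guess="ACGTAC", k=1)
    print(profile)             # ['ACGTAA', 'ACGAAC', 'ACCTAC']
    print(consensus(profile))  # ACGTAC
```

In this toy example the guessed ancestor never occurs in the promoter sequences, yet the consensus of the collected profile recovers it, which is the situation the model describes.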