In this paper, we propose a supervised classification approach based on Euclidean distance for identifying CpG islands in DNA sequences. We first extract features from a training data set drawn from annotated DNA sequences. CpG island locations in a test sequence are then identified by calculating Euclidean distances in the feature space, with a moving window used to scan the input sequence. The performance of the proposed method is verified experimentally on the EMBL human DNA database. The proposed approach outperforms most of the available CpG island detectors and has potential application in annotating CpG islands in large human sequences.
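
As an illustration only, the moving-window, distance-based classification outlined above might be sketched as follows. The abstract does not state which features are extracted, how the training data are summarized, or what window size is used, so everything specific here is an assumption: the two features (GC content and the observed/expected CpG ratio), the use of per-class centroids, the nearest-centroid decision rule, and the hypothetical names `features`, `centroid`, and `scan` with their `window` and `step` parameters.

```python
import math


def features(window):
    """Assumed feature vector: (GC content, observed/expected CpG ratio).

    The abstract does not name its features; these two are common
    CpG island statistics, used here purely for illustration.
    """
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return (gc_content, obs_exp)


def centroid(vectors):
    """Mean feature vector of one training class (an assumed summary)."""
    k = len(vectors)
    return tuple(sum(v[i] for v in vectors) / k for i in range(len(vectors[0])))


def scan(sequence, island_centroid, background_centroid, window=200, step=50):
    """Slide a window over the sequence and report the windows whose
    features lie closer (in Euclidean distance) to the island centroid
    than to the background centroid."""
    hits = []
    for start in range(0, len(sequence) - window + 1, step):
        f = features(sequence[start:start + window])
        if math.dist(f, island_centroid) < math.dist(f, background_centroid):
            hits.append((start, start + window))
    return hits


# Toy training data (synthetic, not taken from EMBL).
island_c = centroid([features("CG" * 100)])
background_c = centroid([features("AT" * 100)])

# Toy test sequence: an artificial CpG-rich stretch flanked by AT repeats.
test_seq = "AT" * 100 + "CG" * 100 + "AT" * 100
print(scan(test_seq, island_c, background_c, window=200, step=200))
```

In a realistic setting the centroids would be computed from annotated island and non-island regions, and overlapping hit windows would be merged into contiguous island calls; both steps are omitted here for brevity.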