Gene Ontology Automatic Annotation Using a Domain Based Gene Product Similarity Measure

Authors:
Popescu, M. (Dept. of Health Manage. & Informatics, Missouri Univ., Columbia, MO); Keller, J.M.; Mitchell, J.A.

Recent years have seen explosive growth in the amount of biological data available for analysis. The large volume of data collected makes it necessary to classify and sort such data automatically on a very large scale. Typically, investigators use computational sequence analysis tools to assign functions to newly found gene products. The problem is to find the functions of an unknown gene product given its amino acid sequence. In this work we search for functional similarity between gene products by matching the functional domains that they contain. The domain-based approach addresses the main problem of sequence-based similarity, which arises when the region of a gene product matched by a query sequence is not related to the function of that gene product. We use the hidden Markov representation of a gene product domain as described in the PFAM database, and then infer annotations that come from the Gene Ontology. To compute domain similarity between two gene products we introduce a fuzzy Jaccard similarity measure. We tested our domain-based similarity on the functional annotation of a set of 194 gene products extracted from the ENSEMBL Web site. We compared the domain similarity approach to the traditional way of performing functional annotation using sequence-based similarity (BLAST and Smith-Waterman). In all cases the annotation was performed using a fuzzy K-nearest neighbor algorithm. We found that our domain-based annotation was better than the most common approach, BLAST, but not as good as the more complex Smith-Waterman technique. The domain-based annotation has about a 70% correct annotation rate at a 17% false annotation rate.
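
The abstract does not give the exact definition of the fuzzy Jaccard measure or of the fuzzy K-nearest neighbor rule, so the following Python sketch is only an illustration under stated assumptions. It assumes each gene product is described by a mapping from domain identifier to a membership degree in [0, 1], takes the fuzzy Jaccard similarity to be the common ratio of summed minima to summed maxima, and uses a simple similarity-weighted vote over the k most similar annotated gene products. The function names, the domain identifiers, and the GO term labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def fuzzy_jaccard(a, b):
    """Assumed fuzzy Jaccard: sum of min memberships / sum of max memberships.

    a, b: dict mapping a domain identifier to a membership degree in [0, 1].
    """
    domains = set(a) | set(b)
    num = sum(min(a.get(d, 0.0), b.get(d, 0.0)) for d in domains)
    den = sum(max(a.get(d, 0.0), b.get(d, 0.0)) for d in domains)
    return num / den if den > 0 else 0.0


def fuzzy_knn_annotate(query, labeled, k=3):
    """Similarity-weighted vote over the k most similar annotated products.

    labeled: list of (domain_memberships, set_of_GO_terms) pairs.
    Returns a dict mapping each candidate GO term to a support in [0, 1].
    This is a simplified stand-in, not the paper's exact algorithm.
    """
    scored = sorted(
        ((fuzzy_jaccard(query, doms), terms) for doms, terms in labeled),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(score for score, _ in scored)
    if total == 0:
        return {}  # no neighbor shares any domain with the query
    support = defaultdict(float)
    for score, terms in scored:
        for term in terms:
            support[term] += score / total
    return dict(support)


# Illustrative data only: made-up domain and GO labels.
annotated = [
    ({"domain_A": 1.0, "domain_B": 0.5}, {"GO:term_1"}),
    ({"domain_A": 0.5}, {"GO:term_1", "GO:term_2"}),
    ({"domain_C": 1.0}, {"GO:term_3"}),
]
query = {"domain_A": 1.0, "domain_B": 0.5}
memberships = fuzzy_knn_annotate(query, annotated, k=2)
```

In this sketch a GO term would be assigned to the query when its support exceeds a chosen threshold; the threshold controls the trade-off between correct and false annotations of the kind reported in the abstract.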

Published in:

Fuzzy Systems, 2005. FUZZ '05. The 14th IEEE International Conference on

Date of Conference:

25 May 2005