Mapping Gene/Protein Names in Free Text to Biomedical Databases

4 Author(s)
Hongfang Liu (Georgetown Univ. Med. Center, Washington); Manabu Torii; Zhang-zhi Hu; Cathy Wu

Because many biomedical databases have been developed and maintained independently, their records referring to the same entity may carry different sets of synonyms. Integrating the names that pertain to the same entity would provide a more comprehensive list of synonyms than any individual database. We have assembled BioThesaurus, a thesaurus of proteins and their corresponding genes compiled from multiple databases for all UniProtKB records. In this study, the coverage of BioThesaurus and the contribution of each individual database were assessed for several organisms. The results indicate that the coverage of BioThesaurus is over 80% for most of the organisms, with an average of 85.4%. When restricted to individual databases or resources, the percentages dropped, ranging from 3% to 30%. The study demonstrates that each individual database or resource has some synonyms not covered by the others, and that a list of names compiled from multiple databases is desirable for systems requiring high recall.
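
The idea of pooling synonyms across databases and then comparing coverage can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the source names (`db_a`, `db_b`), the entity identifiers, the names, and the mentions are toy values, and coverage is taken here to mean the fraction of (entity, name) mentions whose name appears in the synonym list for that entity after lowercasing. The abstract does not specify how coverage was actually measured, so this is not the authors' procedure.

```python
from collections import defaultdict


def merge_synonyms(sources):
    """Pool names per entity across sources.

    sources: dict mapping source name -> dict of entity id -> list of names.
    Returns a dict of entity id -> set of lowercased names.
    """
    merged = defaultdict(set)
    for records in sources.values():
        for entity_id, names in records.items():
            merged[entity_id].update(name.lower() for name in names)
    return dict(merged)


def coverage(mentions, names_by_entity):
    """Fraction of (entity id, name) mentions found in the synonym lists."""
    if not mentions:
        return 0.0
    hits = sum(
        1
        for entity_id, name in mentions
        if name.lower() in names_by_entity.get(entity_id, set())
    )
    return hits / len(mentions)


# Hypothetical toy data, not taken from any real database.
sources = {
    "db_a": {"E1": ["alpha kinase", "AK1"], "E2": ["beta factor"]},
    "db_b": {"E1": ["AK-1"], "E2": ["BF", "beta factor"]},
}
mentions = [("E1", "AK1"), ("E1", "AK-1"), ("E2", "BF"), ("E2", "gamma")]

merged_coverage = coverage(mentions, merge_synonyms(sources))
per_source_coverage = {
    name: coverage(mentions, merge_synonyms({name: records}))
    for name, records in sources.items()
}

print(merged_coverage)      # 0.75
print(per_source_coverage)  # {'db_a': 0.25, 'db_b': 0.5}
```

In this toy case the merged list covers more mentions than either source alone, which mirrors the qualitative pattern the abstract reports; the numbers themselves carry no meaning.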

Published in:

Seventh IEEE International Conference on Data Mining Workshops (ICDMW 2007)

Date of Conference:

28-31 Oct. 2007