Combining Semantics, Context, and Statistical Evidence in Genomics Literature Search

Authors:
Urbain, J. (Illinois Inst. of Technol., Chicago); Goharian, N.; Frieder, O.

We present an information retrieval model that combines evidence from concept-based semantics, term statistics, and context to improve the search precision of genomics literature by accurately identifying concise, variable-length passages of text that answer a user query. The system combines a dimensional data model, which indexes scientific literature at multiple levels of document structure and context, with a rule-based query processing algorithm. The query processing algorithm uses an iterative information extraction technique to identify query concepts, and a retrieval function to systematically combine those concepts with term statistics at multiple levels of context. We define context by variable-length passages of text and by different levels of document lexical structure: terms, sentences, paragraphs, and entire documents. Our results demonstrate improved search results in the presence of varying levels of semantic evidence, and higher performance for retrieval functions that combine document-level information with sentence- and passage-level information than for functions that use document-, sentence-, or passage-level information alone. Initial results are promising: when documents are ranked by their most relevant extracted passages, the results exceed the state of the art by 13.89%, as assessed on the TREC 2005 Genomics track collection of 4.5 million MEDLINE citations.
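
The abstract does not give the retrieval function itself, so the following is only a minimal sketch of the general idea of combining evidence from several levels of context. Everything in it is an assumption made for illustration: the names `tokens`, `coverage`, `best_passage`, and `rank_documents`, the use of simple query-term coverage in place of the paper's concept and term statistics, the fixed weights, and the representation of a document as a list of paragraphs, each a list of sentences.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric terms in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def coverage(query_terms, text):
    """Fraction of distinct query terms that occur in the text (0.0 to 1.0)."""
    if not query_terms:
        return 0.0
    return len(query_terms & tokens(text)) / len(query_terms)


def best_passage(query, document, weights=(0.2, 0.3, 0.5)):
    """Score every sentence by mixing document, paragraph, and sentence evidence.

    `document` is a list of paragraphs; each paragraph is a list of sentences.
    The weights are hypothetical and are not taken from the paper.
    Returns (score, sentence) for the highest-scoring sentence,
    or (0.0, None) if no sentence scores above zero.
    """
    w_doc, w_par, w_sent = weights
    q = tokens(query)
    doc_text = " ".join(s for paragraph in document for s in paragraph)
    doc_score = coverage(q, doc_text)

    best = (0.0, None)
    for paragraph in document:
        par_score = coverage(q, " ".join(paragraph))
        for sentence in paragraph:
            score = (w_doc * doc_score
                     + w_par * par_score
                     + w_sent * coverage(q, sentence))
            if score > best[0]:
                best = (score, sentence)
    return best


def rank_documents(query, documents):
    """Rank document indices by the score of each document's best passage."""
    scored = [(best_passage(query, doc)[0], i) for i, doc in enumerate(documents)]
    # sorted() is stable, so documents with equal scores keep their input order.
    return [i for _, i in sorted(scored, key=lambda pair: -pair[0])]
```

In this sketch a sentence that matches only part of the query can still rank well when its surrounding paragraph and document match the rest, which is the intuition behind mixing context levels rather than scoring any one level alone.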

Published in:

Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on

Date of Conference:

14-17 Oct. 2007