A topic based indexing approach for searching in documents

3 Author(s)
Osuna-Ontiveros, D. ; Inf. Technol. Lab., CINVESTAV-IPN, Tamaulipas, Mexico ; Lopez-Arevalo, I. ; Sosa-Sosa, V.

Computer users now store large numbers of text documents, which calls for fast and precise search over them. The goal of Information Retrieval (IR) models is to provide users with the documents that satisfy their information needs. The core of such models is the document representation used for indexing. Traditional IR models rely on the frequency of query terms; their disadvantage is that they consider only the terms in the query and ignore similar terms. This paper proposes a topic-based indexing approach that represents the topics associated with documents. Documents are modeled using clustering algorithms based on natural language processing. The result is a document-topic matrix denoting the importance of each topic within each document. Similarly, each query is converted into a vector of topics, so a similarity measure can be applied between this vector and the document matrix to retrieve the most relevant documents.
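
The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which clustering algorithm or similarity measure is used, so the sketch assumes a hand-written topic-to-terms mapping (`TOPICS`, hypothetical) in place of clustering, and cosine similarity as the measure. The names `topic_vector`, `cosine`, and `rank` are likewise illustrative.

```python
import math
from collections import Counter

# ASSUMPTION: hand-written topics standing in for the clusters the paper
# derives with NLP-based clustering. Each topic is a set of related terms.
TOPICS = {
    "vehicles": {"car", "automobile", "engine", "truck"},
    "cooking": {"recipe", "oven", "bake", "flour"},
}

DOCS = [
    "the truck engine failed",
    "bake the flour in the oven",
    "a recipe for the car trip",
]


def topic_vector(text):
    """Map text to one weight per topic: the share of its topic terms in each topic."""
    counts = Counter(text.lower().split())
    raw = [sum(counts[term] for term in terms) for terms in TOPICS.values()]
    total = sum(raw)
    return [r / total if total else 0.0 for r in raw]


def cosine(u, v):
    """Cosine similarity (assumed measure); returns 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query, docs):
    """Return document indices ordered from most to least similar to the query."""
    matrix = [topic_vector(d) for d in docs]  # document-topic matrix
    q = topic_vector(query)                   # query as a vector of topics
    scores = [cosine(q, row) for row in matrix]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


print(rank("automobile", DOCS))
```

In this toy setup the query term "automobile" occurs in none of the documents, yet the documents about vehicles still rank first because the comparison happens in topic space rather than term space.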

Published in:

2011 8th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE)

Date of Conference:

26-28 Oct. 2011