Enhancing information retrieval efficiency using semantic-based-combined-similarity-measure

3 Author(s)
Mayank Saini ; School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India ; Dharmendar Sharma ; P. K. Gupta

Most knowledge-intensive organizations keep their information in large text document repositories, and most of these repositories and databases are either unstructured or semi-structured. Recently, various soft computing techniques have been used to improve information retrieval efficiency. In particular, genetic algorithms have been applied to several information retrieval components, such as matching function learning, document clustering, information extraction, and query optimization [1-6]. In most cases, the matching function in information retrieval is based on term frequency. The problem with this approach is that the syntactic information of the text document is lost and phrases are not considered, which results in poor accuracy. In this paper we propose a new semantic-based similarity measure in which each term can be a phrase or a single word, and the weight assigned to each term is based on its semantic importance within each sentence. We combine this semantic similarity measure with other standard similarity measures, such as Jaccard and cosine, to form the semantic-based-combined-similarity-measure. A standard genetic algorithm is used to optimize the weight given to each similarity measure.
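The abstract does not give formulas, so the following is only a minimal Python sketch of the general idea: a weighted combination of a semantic score with Jaccard and cosine similarity, with the weights tuned by a simple genetic algorithm. Everything here is an assumption made for illustration: the function names (`jaccard`, `cosine`, `combined_similarity`, `evolve_weights`), the normalized weighted sum, the GA operators (tournament selection, one-point crossover, Gaussian mutation, elitism), and the fitness function are hypothetical. The paper's semantic measure is not reproduced; it is passed in as a precomputed `semantic_score`.

```python
import math
import random
from collections import Counter


def jaccard(a_terms, b_terms):
    """Jaccard similarity between the sets of terms of two texts."""
    a, b = set(a_terms), set(b_terms)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(a_terms, b_terms):
    """Cosine similarity between raw term-frequency vectors."""
    ca, cb = Counter(a_terms), Counter(b_terms)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(query_terms, doc_terms, semantic_score, weights):
    """Normalized weighted sum of (semantic, Jaccard, cosine) scores.

    semantic_score stands in for the paper's semantic measure, which is
    not specified in the abstract; weights = (w_semantic, w_jaccard, w_cosine).
    """
    scores = (semantic_score,
              jaccard(query_terms, doc_terms),
              cosine(query_terms, doc_terms))
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * s for w, s in zip(weights, scores)) / total


def evolve_weights(fitness, n_weights=3, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    """Generic GA over weight vectors in [0, 1]; not the paper's exact setup."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_weights)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = [pop[0][:], pop[1][:]]  # elitism: keep the two best
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, n_weights)
            child = p1[:cut] + p2[cut:]
            # Gaussian mutation, clipped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)
```

In practice the fitness function would score a weight vector by some retrieval quality metric over a set of queries with known relevant documents; which metric the authors used is not stated in the abstract.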

Published in:

Image Information Processing (ICIIP), 2011 International Conference on

Date of Conference:

3-5 Nov. 2011