Evaluation of stop word lists in text retrieval using Latent Semantic Indexing

3 Author(s)
A. N. K. Zaman (School of Computer Science, University of Guelph, Guelph, ON, Canada); Pascal Matsakis; Charles Brown

The goal of this research is to evaluate the use of English stop word lists in Latent Semantic Indexing (LSI)-based Information Retrieval (IR) systems with large text datasets. The literature claims that using such lists improves retrieval performance. Here, three different lists are compared: two were compiled by IR groups at the University of Glasgow and the University of Tennessee, and one is our own list, developed at the University of Northern British Columbia. We also examine the case where stop words are not removed from the input dataset. Our research finds that using tailored stop word lists improves retrieval performance. In contrast, using arbitrary (non-tailored) lists, or no list at all, reduces the retrieval performance of LSI-based IR systems with large text datasets.
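As a rough illustration of the preprocessing step the abstract refers to, the sketch below filters stop words before building the term-document count matrix that an LSI system would subsequently factor with a truncated SVD (the SVD step is not shown). The stop word list, the tokenizer, and all function names here are hypothetical and chosen only for this example; they are not the lists or the pipeline evaluated in the paper.

```python
from collections import Counter

# Hypothetical, tiny stop word list for illustration only;
# it is NOT one of the three lists compared in the paper.
STOP_WORDS = frozenset({"the", "of", "in", "a", "is", "and", "to"})


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?()\"'") for word in text.lower().split())
    return [t for t in tokens if t]


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop word list."""
    return [t for t in tokens if t not in stop_words]


def term_document_matrix(docs, stop_words=STOP_WORDS):
    """Return (vocabulary, matrix), where matrix[i][j] is the count of
    vocabulary term i in document j after stop word removal."""
    counts = [Counter(remove_stop_words(tokenize(d), stop_words)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


docs = [
    "The evaluation of stop word lists",
    "Stop word removal in text retrieval",
]

# With the illustrative list applied.
vocab, matrix = term_document_matrix(docs)

# The "no list" case: passing an empty set keeps every token.
vocab_all, matrix_all = term_document_matrix(docs, stop_words=frozenset())
```

Passing a different `stop_words` set is how the compared conditions (a tailored list, an arbitrary list, or no list) would be swapped in under this sketch; the matrix rows that disappear are exactly the terms the list filters out.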

Published in:

Digital Information Management (ICDIM), 2011 Sixth International Conference on

Date of Conference:

26-28 Sept. 2011