N-gram and Local Context Analysis for Persian text retrieval

Authors:

Abolfazl Aleahmad (Electrical and Computer Engineering Department, University of Tehran, Iran); Parsia Hakimian; Farzad Mahdikhani; Farhad Oroumchian

Persian is one of the languages of the Middle East, and a significant number of Persian documents are available on the Web, yet relatively few studies of Persian document retrieval appear in the literature. In this experimental study, we assessed term-based and N-gram-based vector space models and a query expansion method, local context analysis, using different weighting schemes on a realistic corpus of more than 160,000 news articles. We then compared our results with previously reported work on Persian. Our experimental results show that, among the assessed methods, the 4-gram-based vector space model with the Lnu.ltu weighting scheme performs acceptably, and local context analysis achieves the best performance reported for Persian text retrieval so far.
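To illustrate the N-gram-based vector space idea named in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the function names (`char_ngrams`, `cosine`, `rank`) are hypothetical, the choice to take character 4-grams within each whitespace-separated word is an assumption, and raw term frequency with cosine similarity stands in for the Lnu.ltu weighting, which is not reproduced here.

```python
import math
from collections import Counter


def char_ngrams(text, n=4):
    """Split text on whitespace and return overlapping character n-grams per word.

    Assumption: words of length n or less are kept whole as a single token.
    """
    grams = []
    for word in text.split():
        if len(word) <= n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors (Counters)."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, n=4):
    """Return (score, doc_index) pairs sorted from best to worst match."""
    q_vec = Counter(char_ngrams(query, n))
    scored = [
        (cosine(q_vec, Counter(char_ngrams(doc, n))), idx)
        for idx, doc in enumerate(docs)
    ]
    return sorted(scored, reverse=True)
```

Because the query "retrieving text" shares 4-grams such as "retr" and "text" with the document "persian text retrieval", `rank` places that document first even though the word forms differ. That tolerance of morphological variation is the usual motivation for N-gram indexing. Python strings are sequences of Unicode code points, so the same functions apply to Persian text without modification.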

Published in:

Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on

Date of Conference:

12-15 Feb. 2007