A clustering retrieval system of Chinese information

4 Author(s)
Xin-Guang Sha (Intelligent Technology and Natural Language Processing Lab, Harbin Institute of Technology, No. 92, West Dazhi Street, NanGang, 150001, China); Yuan-Chao Liu; Ming Liu; Xiao-Long Wang

With the tremendous and ever-growing volume of electronic documents on the World Wide Web and in digital libraries, it is becoming increasingly difficult for people to find the information they actually want. To simplify the search process, clustering is used to help users browse search results. However, traditional Chinese information clustering techniques are inadequate because they do not generate clusters with highly readable themes. This paper reformulates the clustering problem as a salient phrase ranking problem. Given a query and the ranked list of documents (typically a list of titles and snippets) returned for it by a Web search engine, the method first extracts salient phrases and ranks them as candidate cluster themes, using a support vector regression (SVR) model learned from human-labeled training data. The documents are then assigned to the relevant salient phrases to form candidate clusters, and the final clusters are generated by merging these candidate clusters. The paper also looks for a reasonable format for displaying the final cluster themes, so that users can easily find the documents that interest them. Experimental results show that the method is feasible and effective.
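
The pipeline described in the abstract (extract candidate phrases, rank them, assign documents, merge candidate clusters) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not list the features or the trained model, so the `score` function below is a hand-set stand-in for the learned SVR score. The function names, the thresholds (`top_k`, `min_docs`, `max_frac`, `threshold`), and the overlap-based merge rule are all assumptions made for illustration. Snippets are assumed to be already segmented into tokens, and English tokens are used only for readability.

```python
from collections import defaultdict


def candidate_phrases(docs, max_len=2):
    """Map each n-gram (up to max_len tokens) to the ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                postings[tuple(tokens[i:i + n])].add(doc_id)
    return postings


def score(phrase, doc_ids, n_docs):
    """Hand-set stand-in for the learned regression score (NOT a trained SVR).

    Assumption: frequent and longer phrases make better cluster themes.
    """
    return (len(doc_ids) / n_docs) * len(phrase)


def merge(candidates, threshold=0.5):
    """Greedily merge candidate clusters whose document overlap exceeds threshold."""
    merged = []
    for phrase, ids in candidates:
        for themes, members in merged:
            overlap = len(ids & members) / min(len(ids), len(members))
            if overlap > threshold:
                themes.append(phrase)
                members.update(ids)
                break
        else:
            # No sufficiently similar cluster exists yet, so start a new one.
            merged.append(([phrase], set(ids)))
    return merged


def cluster_results(docs, top_k=5, min_docs=2, max_frac=0.8):
    """Rank candidate phrases, keep the top_k, and merge their document sets."""
    n_docs = len(docs)
    postings = candidate_phrases(docs)
    # Drop phrases that are too rare, or so common (e.g. the query itself)
    # that they cannot separate the results into groups.
    usable = [
        (phrase, ids)
        for phrase, ids in postings.items()
        if min_docs <= len(ids) <= max_frac * n_docs
    ]
    # Sort by descending score; break ties by phrase for a deterministic order.
    ranked = sorted(usable, key=lambda item: (-score(item[0], item[1], n_docs), item[0]))
    return merge(ranked[:top_k])
```

Each returned cluster is a pair of theme phrases and member document ids. In a system following the abstract, `score` would be replaced by a regression model trained on human-labeled phrase rankings.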

Published in:

Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on

Date of Conference:

19-22 Oct. 2008