A Novel Approach To Automatically Extracting Main Content of Web News

5 Author(s)
Xuan Wang (Bus. Intell. Lab., Univ. of Sci. & Technol. of China, Hefei); Weiping Wang; Bowen Liu; Zhen Wang

Recently, the Web has become a data repository, and much research has been done on obtaining relevant information from it. The typical function of Web news extraction is to locate the useful content text and filter out the noise; both issues make Web news extraction an open research problem. In this paper, we describe an approach that clusters pages sharing a common extraction path and automatically locates the main text passages. Our approach applies to structured Web pages. Moreover, we developed an extraction system using our algorithm. Experiments on several important online news sites show that the approach achieves higher extraction accuracy than the RTDM algorithm.
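The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea of grouping pages by a shared extraction path, not the authors' method. It assumes that the "extraction path" can be approximated by the tag path holding the most visible text; the names `TextPathParser`, `main_text_path`, and `cluster_by_path` are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
# Elements whose text is not visible page content.
HIDDEN_TAGS = {"script", "style"}


class TextPathParser(HTMLParser):
    """Sums the length of visible text found under each tag path."""

    def __init__(self):
        super().__init__()
        self.stack = []                    # currently open elements
        self.text_len = defaultdict(int)   # tag path -> characters of text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        css_class = dict(attrs).get("class") or ""
        self.stack.append(f"{tag}.{css_class}" if css_class else tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element (tolerates sloppy markup).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.stack:
            return
        if any(entry.split(".")[0] in HIDDEN_TAGS for entry in self.stack):
            return
        self.text_len["/".join(self.stack)] += len(text)


def main_text_path(html):
    """Return the tag path carrying the most text (assumed proxy for the main content)."""
    parser = TextPathParser()
    parser.feed(html)
    parser.close()
    if not parser.text_len:
        return ""
    return max(parser.text_len, key=parser.text_len.get)


def cluster_by_path(pages):
    """Group page identifiers by their main-text path; pages maps id -> HTML."""
    clusters = defaultdict(list)
    for page_id, html in pages.items():
        clusters[main_text_path(html)].append(page_id)
    return dict(clusters)
```

Under this assumption, pages generated from the same template fall into the same cluster, and the shared path can then be reused to pull the main text from every page in that cluster. Real news pages would need a more robust measure than raw text length, for example one that penalizes link-heavy blocks.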

Published in:

E-Business and Information System Security, 2009. EBISS '09. International Conference on

Date of Conference:

23-24 May 2009