Information Extraction from Semi-structured WEB Page Based on DOM Tree and its Application in Scientific Literature Statistical Analysis System

4 Author(s)
WeiDong Li (Sch. of Inf. Technol., Hebei Univ. of Econ. & Bus., Shijiazhuang, China); Yibing Dong; RuiJiang Wang; HongXia Tian

To extract information automatically from semi-structured Web pages, this paper proposes a method named IESS that discovers the record model using the DOM tree and maximal similar subtrees. The method identifies records automatically and correctly even when records of the same type differ somewhat in how they are expressed. To test the performance of the method, a scientific literature statistical analysis system was designed. In practice, the system helps users quickly understand the distribution of papers in their field of retrieval and grasp their importance.
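The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of finding records as runs of structurally similar sibling subtrees in a DOM-like tree. It is not the paper's IESS method. The names `Node`, `TreeBuilder`, `similarity`, and `find_record_groups`, the tag-sequence similarity measure, and the 0.7 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Elements that never have children (assumed subset, for illustration).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One element of a simplified DOM tree (tags only, no text)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a bare tag tree; text and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, if any.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    return builder.root


def tag_sequence(node):
    """Pre-order list of tag names in the subtree rooted at node."""
    seq = [node.tag]
    for child in node.children:
        seq.extend(tag_sequence(child))
    return seq


def similarity(a, b):
    """Structural similarity in [0, 1] based on the two tag sequences."""
    return SequenceMatcher(None, tag_sequence(a), tag_sequence(b)).ratio()


def find_record_groups(root, threshold=0.7, min_records=2):
    """Return runs of adjacent sibling subtrees that look alike."""
    groups = []
    stack = [root]
    while stack:
        node = stack.pop()
        run = []
        for child in node.children:
            if run and similarity(run[-1], child) >= threshold:
                run.append(child)
            else:
                if len(run) >= min_records:
                    groups.append(run)
                run = [child]
        if len(run) >= min_records:
            groups.append(run)
        stack.extend(node.children)
    return groups


# Usage: the second item lacks a <span>, yet all three items are grouped.
sample = ("<ul><li><a>x</a><span>1</span></li>"
          "<li><a>y</a></li>"
          "<li><a>z</a><span>3</span></li></ul>")
groups = find_record_groups(parse(sample))
print([[n.tag for n in g] for g in groups])
```

Comparing tag sequences with a tolerance threshold, rather than requiring identical subtrees, is one simple way to accommodate records of the same type that differ slightly in structure, which is the situation the abstract describes.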

Published in:

Services Science, Management and Engineering, 2009. SSME '09. IITA International Conference on

Date of Conference:

11-12 July 2009