A sample-guided approach to incremental structured web database crawling

3 Author(s)
Wei Liu (Key Lab. of Comput. Linguistics, Peking Univ., Beijing, China); Jianguo Xiao; Jianwu Yang

Web database crawling is a promising solution for Deep Web data integration. To the best of our knowledge, existing approaches have focused only on how to crawl all records in a web database. Because most web databases are highly dynamic, it is impractical to crawl the whole database just to harvest a small proportion of new records. This paper studies the problem of incremental web database crawling, which aims to crawl the new records from a web database efficiently. The proposed approach introduces a new graph model, the query related graph, to transform an incremental crawling task into a graph traversal process. Based on this graph model, appropriate crawling queries are generated under the guidance of samples of the web database. Extensive experimental evaluations over real Web databases validate the effectiveness of our techniques and provide insights for future efforts in this direction.

Published in:

Information and Automation (ICIA), 2010 IEEE International Conference on

Date of Conference:

20-23 June 2010