A framework of deep Web crawler

3 Author(s)
Xiang Peisu (College of Electrical Information Engineering, Southwest University for Nationalities, Chengdu, Sichuan 610041, China); Tian Ke; Huang Qinzhen

As an ever-increasing amount of information on the Web today is available through search interfaces, users have to key in a set of keywords in order to access pages from certain Web sites, which are often referred to as the hidden Web or the deep Web. Since there are no static links to hidden Web pages, search engines cannot discover and index them. However, according to recent studies, the content provided by many hidden Web sites is often of very high quality and can be extremely valuable to many users. This paper studies how to build an effective hidden Web crawler that can autonomously discover and download pages from the hidden Web. A framework for a deep Web crawler is provided, and novel techniques are proposed to handle the actual mechanics of crawling the deep Web. Experimental results show that these policies are effective.
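
The general idea of crawling through a search interface can be illustrated with a minimal sketch. It is not the framework or the policies proposed in the paper, which the abstract does not detail. It assumes a generic loop: issue a keyword query, download the result pages, mine them for new candidate keywords, and repeat. The in-memory `PAGES` corpus, the `search` stand-in for a site's search form, and the greedy "most frequent unseen term" selection rule are all hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical hidden site: pages are reachable only through keyword search.
PAGES = {
    "p1": "deep web crawler framework",
    "p2": "crawler query selection policy",
    "p3": "policy evaluation experiment",
    "p4": "unrelated cooking recipe",
}


def search(keyword):
    """Stand-in for a search form: ids of pages that contain the keyword."""
    return {pid for pid, text in PAGES.items() if keyword in text.split()}


def crawl(seed, max_queries=10):
    """Query the search interface repeatedly and collect the pages it returns."""
    downloaded = {}                    # page id -> page text
    issued = set()                     # keywords already submitted
    candidates = Counter({seed: 1})    # keyword -> times seen so far

    for _ in range(max_queries):
        remaining = [(kw, n) for kw, n in candidates.items() if kw not in issued]
        if not remaining:
            break
        # Assumed policy: most frequent unseen keyword, ties broken alphabetically.
        keyword, _count = min(remaining, key=lambda item: (-item[1], item[0]))
        issued.add(keyword)

        for pid in sorted(search(keyword)):
            if pid in downloaded:
                continue
            downloaded[pid] = PAGES[pid]
            # Terms in newly downloaded pages become candidate queries.
            candidates.update(PAGES[pid].split())

    return downloaded


if __name__ == "__main__":
    print(sorted(crawl("crawler")))
```

Starting from the seed `crawler`, the loop reaches `p1`, `p2`, and `p3` through keywords found in downloaded pages, while `p4` stays undiscovered because none of its terms appear in the pages that were reached. This shows why the choice of queries matters for coverage.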

Published in: 2008 27th Chinese Control Conference

Date of Conference: 16-18 July 2008