An Increment-Based Random Walk Approach to Sampling Hidden Databases

3 Author(s)
Na Zhao (Sch. of Comput. Sci. & Technol., Shandong Univ., Ji'nan); Qingzhong Li; Zhongmin Yan

A flood of information is hidden behind form-like interfaces, which makes it difficult to capture characteristics of the underlying databases such as their topic and update frequency. This poses a great challenge for hidden web data integration. HIDDEN-DB-SAMPLER is the first algorithm to address this problem, but it does not consider keyword attributes on the query interface. This paper presents increment-based random walk, a new technique applicable to any kind of attribute. The main idea is that, for keyword attributes, the approach incrementally obtains new values from the database: it selects a value from the current sample and submits it to the interface, using a selection scheme designed to ensure the quality of the sample. For other attributes, it works as RANDOM WALK does. An extensive set of experimental results demonstrates the accuracy and efficiency of our technique.
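The keyword-attribute step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the selection scheme, so the sketch assumes a uniform random choice among keyword values that appear in the current sample and have not yet been submitted. The names `query_interface`, `incremental_keyword_sample`, and `TOY_DB`, the top-k result limit, and the toy records are all hypothetical.

```python
import random


def query_interface(db, keyword, k=3):
    """Hypothetical form interface: return at most k records matching keyword."""
    return [rec for rec in db if keyword in rec["keywords"]][:k]


def incremental_keyword_sample(db, seed, steps, k=3, rng=None):
    """Grow a sample by repeatedly querying with a value taken from the sample."""
    rng = rng or random.Random(0)
    sample = {}      # record id -> record collected so far
    used = set()     # keyword values already submitted to the interface
    keyword = seed
    for _ in range(steps):
        used.add(keyword)
        for rec in query_interface(db, keyword, k):
            sample[rec["id"]] = rec
        # Candidate values: keywords seen in the current sample, not yet submitted.
        seen = {w for rec in sample.values() for w in rec["keywords"]}
        candidates = sorted(seen - used)
        if not candidates:
            break    # no new values can be obtained from the sample
        # Assumption: uniform choice; the paper's selection scheme may differ.
        keyword = rng.choice(candidates)
    return list(sample.values())


# Toy database (illustrative data only).
TOY_DB = [
    {"id": 1, "keywords": {"database", "sampling"}},
    {"id": 2, "keywords": {"sampling", "web"}},
    {"id": 3, "keywords": {"web", "crawler"}},
    {"id": 4, "keywords": {"crawler", "index"}},
    {"id": 5, "keywords": {"graph"}},
]

sample = incremental_keyword_sample(TOY_DB, seed="database", steps=10)
sample_ids = sorted(rec["id"] for rec in sample)
```

In this toy run the walk starts from the seed value "database" and reaches records 1 to 4 through shared keyword values. Record 5 shares no value with the others and is never sampled, which suggests why the choice of seed and of selection scheme would matter for sample quality.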

Published in:

Computer Science and Software Engineering, 2008 International Conference on (Volume: 4)

Date of Conference:

12-14 Dec. 2008