Top-down extraction of semi-structured data

3 Author(s)
B. Ribeiro-Neto (Dept. of Comput. Sci., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil); A. H. F. Laender; A. S. da Silva

We propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use this information to extract new objects from new pages or texts. We propose a top-down strategy that first extracts complex objects and then decomposes them into less complex objects, until atomic objects have been extracted. Through experimentation, we demonstrate that, given a small number of examples, our strategy is able to extract most of the objects present in a Web source given as input.
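
The top-down idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so everything below is an assumption made for illustration. In particular, the `Node` class, the `extract` function, the regular expressions, and the sample page are all hypothetical, and the patterns are written by hand, whereas the paper derives its extraction knowledge from user-supplied examples. The sketch shows only the decomposition step: locate a complex object, then recurse into its parts until atomic values remain.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of a hypothetical object schema (illustrative only)."""
    name: str
    pattern: str  # regex locating this object inside its parent's text
    children: list = field(default_factory=list)  # empty list means atomic


def extract(node, text):
    """Find every occurrence of `node` in `text`, then recurse into its
    children until atomic objects are reached (top-down decomposition)."""
    results = []
    for match in re.finditer(node.pattern, text):
        # Use the first capture group when the pattern has one,
        # otherwise the whole match.
        span = match.group(1) if match.groups() else match.group(0)
        if not node.children:
            results.append(span.strip())  # atomic object: keep the text
        else:
            # complex object: extract each component from the matched span
            results.append(
                {child.name: extract(child, span) for child in node.children}
            )
    return results


# Hand-written schema (an assumption): a book is a complex object made of
# a title, an author list (itself complex), and a year.
BOOK = Node("book", r"<li>(.*?)</li>", [
    Node("title", r"<b>(.*?)</b>"),
    Node("authors", r"by (.*?)(?:,\s*\d{4}|$)", [
        Node("author", r"[^;]+"),
    ]),
    Node("year", r"\b(\d{4})\b"),
])

# Made-up sample input, not taken from the paper.
PAGE = (
    "<ul>"
    "<li><b>First Title</b> by A. Author; B. Writer, 1998</li>"
    "<li><b>Second Title</b> by C. Editor, 1999</li>"
    "</ul>"
)

if __name__ == "__main__":
    for book in extract(BOOK, PAGE):
        print(book)
```

Each level of the schema only ever sees the text matched by its parent, which is what makes the strategy top-down: the `author` pattern is applied to the author list alone, never to the whole page.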

Published in:

String Processing and Information Retrieval Symposium, 1999 and International Workshop on Groupware

Date of Conference: