
Learning object models from semistructured Web documents


2 Author(s)
Shiren Ye ; Dept. of Comput. Sci., Nat. Univ. of Singapore, Singapore ; Chua, T.-S.

This paper presents an automated approach to learning object models from object data extracted from data-intensive semistructured Web documents such as product descriptions. Modeling data-intensive content on the Web involves three phases. First, we identify the object region that covers the descriptions of object data, excluding irrelevant content in the Web documents. Second, we partition the contents of the different object data appearing in the object region and construct object data as hierarchical XML output. Third, we induce the abstract object model from the analogous object data. This model would match the corresponding object data from a Web site more precisely and comprehensively than the existing handcrafted ontologies. The main contribution of this study is a fully automated approach to extracting object data and object models from semistructured Web documents using kernel-based matching and view syntax interpretation. Our system, OnModer, can automatically construct object data and induce object models from complicated Web documents, such as the technical descriptions of personal computers and digital cameras downloaded from manufacturers' and vendors' sites. A comparison with the available handcrafted ontologies and tests on an open corpus demonstrate that our framework is effective in extracting meaningful and comprehensive models.
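As a rough illustration of the third phase only (inducing an abstract model from analogous object data), the following minimal sketch assumes the object data are already available as hierarchical XML records. It is not OnModer's method: the kernel-based matching and view syntax interpretation named in the abstract are replaced here by a simple count of how often each element path occurs. The function names `element_paths` and `induce_model`, the `min_support` parameter, and the sample camera records are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_paths(xml_text):
    """Return the set of slash-joined tag paths found in one XML record."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def induce_model(records, min_support=0.5):
    """Keep every tag path that occurs in at least min_support of the records.

    Hypothetical stand-in for model induction: the result maps each retained
    path to the fraction of records that contain it.
    """
    counts = Counter()
    for record in records:
        counts.update(element_paths(record))
    total = len(records)
    return {
        path: count / total
        for path, count in sorted(counts.items())
        if count / total >= min_support
    }


# Invented sample object data (not taken from the paper).
records = [
    "<camera><sensor><resolution>10MP</resolution></sensor>"
    "<zoom>3x</zoom></camera>",
    "<camera><sensor><resolution>12MP</resolution></sensor>"
    "<weight>200g</weight></camera>",
]

# Paths shared by every record form the strict core of the model.
core_model = induce_model(records, min_support=1.0)
# A lower threshold also admits attributes present in only some records.
loose_model = induce_model(records, min_support=0.5)
```

Under these assumptions, `core_model` holds the structure common to all records, while `loose_model` adds optional attributes such as `camera/zoom` together with their support, which is one simple way to distinguish required from optional parts of an induced model.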

Published in:

Knowledge and Data Engineering, IEEE Transactions on (Volume: 18, Issue: 3)

Date of Publication:

March 2006
