Skip to Main Content
Many Web sites contain large sets of pages generated dynamically using a common template. The structured data extracted from these pages with semantic annotation are valuable for information system. We proposed a system, ADeaD, to automatically extract data values from these Web pages and annotate the data schema. Experimental evaluation on a lot of real Web page collections indicates our algorithm correctly extracted data and annotated the data schema.
Date of Conference: 28-31 March 2004