Abstract:
Every day, more specialized databases (car rental, hotels, airfares, etc.) become available on the Web, and they can only be queried through a Web Query Interface (WQI). As the number of domain-specific databases on the Web grows, it is becoming increasingly difficult for end users to explore the information stored in them. In this context, research efforts focus on building a single (unified) domain-specific WQI that allows users to query and integrate information available in different Web databases. Constructing such an integrated WQI for a given domain involves several complex tasks, especially the extraction, representation, understanding, and mapping of the semantic content of each individual WQI associated with a Web database. Previous approaches have used hierarchical models to build integrated WQIs, preserving the ancestor-descendant relationships in the individual WQIs. In this work, we propose a novel tree-based approach for automatically constructing a hierarchical model of the visual content of WQIs, representing their components in a clear and concise form. In the proposed approach, the Document Object Model (DOM) tree of each WQI considered in the integration process is processed by a specialized web resource to obtain the relevant visual information in the WQI, such as fields (UIs), groups of UIs, and super-groups, as well as their corresponding labels. This process is guided by a set of 8 design heuristic rules for correctly identifying labels and components. Experiments to evaluate the proposed strategy were conducted on the ICQ and Tel-8 datasets of the UIUC repository. Our results show that the proposed tree-based approach for representing the visual components of a WQI achieves more than 94% accuracy, improving on currently reported approaches and easing the integration process of domain-specifi
Date of Conference: 07-10 July 2014
Date Added to IEEE Xplore: 07 October 2014
Electronic ISBN: 978-8-4901-2355-3
Conference Location: Salamanca, Spain