Web search engines are powerful tools for satisfying specific information needs on the Web, and their purpose is to maximize user satisfaction in that task. Although sources of evidence other than text can characterize a document's relevance to a given need, especially for HTML documents, current search engines do not let users exploit these features when posing a query: queries are based almost exclusively on keywords. We believe that user satisfaction can be improved if HTML tags and document metadata are available to users at query time. In this paper we present Xearch, a meta-search system that wraps public search engines in a framework that improves both the expressiveness of the language in which users specify information needs and their control over the answer format. Xearch converts HTML pages to a specific XML schema that covers the text and the metadata derived from the HTML. User queries are then posed against this schema; they can be specified through keywords but can also exploit the documents' HTML tags and metadata. Results from our experimental evaluation confirm that this framework can improve answer quality.
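To make the idea concrete, the following is a minimal sketch of the kind of conversion and querying described above, not Xearch's actual implementation. The abstract does not give the XML schema, so the element names (`page`, `title`, `meta`, `text`) and the helper functions `html_to_xml` and `matches` are hypothetical, and the sketch works on a single HTML string rather than on results fetched from a public search engine.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class _PageToXml(HTMLParser):
    """Collects title, headings, named meta tags and plain text into an XML tree.

    The target element names are assumptions made for this sketch.
    """

    FIELD_TAGS = {"title", "h1", "h2", "h3"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.root = ET.Element("page")
        self._current = None   # element currently receiving text
        self._skip = False     # True while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIP_TAGS:
            self._skip = True
        elif tag == "meta" and attrs.get("name"):
            meta = ET.SubElement(self.root, "meta", name=attrs["name"].lower())
            meta.text = attrs.get("content") or ""
        elif tag in self.FIELD_TAGS:
            self._current = ET.SubElement(self.root, tag)
            self._current.text = ""

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS:
            self._skip = False
        elif self._current is not None and tag == self._current.tag:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        if self._current is not None:
            # Join fragments split by inline tags with a single space.
            self._current.text = (self._current.text + " " + text).strip()
        else:
            ET.SubElement(self.root, "text").text = text


def html_to_xml(html):
    """Convert an HTML string into the (hypothetical) XML representation."""
    parser = _PageToXml()
    parser.feed(html)
    parser.close()
    return parser.root


def matches(page, keyword, field=None):
    """Return True if keyword occurs in the given field.

    `field` is an ElementTree path such as "title" or "meta[@name='keywords']";
    when it is None, every element of the page is searched.
    """
    elements = page.iter() if field is None else page.findall(f".//{field}")
    needle = keyword.lower()
    return any(needle in (el.text or "").lower() for el in elements)
```

With this representation, a query can be restricted to a structural field, for example `matches(page, "retrieval", "title")`, or fall back to plain keyword matching over the whole page with `matches(page, "retrieval")`.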