Skip to Main Content
The crawler system in a vertical search engine should format a representative sample web page so at to make sure that the page could meet the W3C standards, which make it available that the processed page can be resolved by the visual XPath generator and then the desired XPath value will be found out. In batch-data-extraction, some exact data will be available when object web pages are parsed by the crawler system. A vertical search engine can extract the necessary data and segment Chinese words at first, and then the data will be presented on web pages. The data structuring process after the data extraction distinguishes a vertical search engine from a traditional search engine. The crawler system that can extract professional information on the Internet and process the information preliminarily is an indispensable part of a vertical search engine.
Date of Conference: 22-24 Oct. 2010