Web sites are normally designed for large-screen devices, so their pages are hard to browse on devices with limited user interfaces, such as palmtops and mobile phones. Web page segmentation is an important technology for both search engines and web browsers on mobile devices. It breaks the structure of a web page down into logical blocks, an important step in identifying informative blocks for efficient information extraction and convenient display on small-screen devices. Previous repetition-based segmentation methods are not suitable when a web page contains no reappearing tags. To improve segmentation accuracy, a new segmentation method (DWS) is introduced that segments a web page with one of two schemes, chosen according to the key pattern in the page. The reappearance-based scheme recognizes reappearing tag patterns in the DOM tree of the web page and, based on the detected patterns, generates implicit nodes so that nested blocks are segmented correctly. The layout-based scheme segments the page using web layout information such as <TABLE>, <DIV>, and <FRAME> tags. If the tag pattern contains a reappearing tag, the page is segmented with the reappearance-based scheme; otherwise it is segmented using web layout information. The hyperlinks from the segmented blocks are displayed on the mobile device first, and the user then selects hyperlinks according to their area of interest. Only the information of interest is displayed to the user.
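The choice between the two schemes described above can be illustrated with a minimal Python sketch. It is not the DWS implementation: the names `segment`, `find_repeated_children`, and `outermost_layout` are hypothetical, a "reappearing tag" is assumed here to mean a tag that occurs at least twice among the children of one DOM node, and the implicit-node generation for nested blocks is left out.

```python
from html.parser import HTMLParser

LAYOUT_TAGS = {"table", "div", "frame"}
# Elements that never have a closing tag (assumed subset, enough for the sketch).
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "frame"}


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a very small DOM-like tree; text content is ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def iter_nodes(node):
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def find_repeated_children(root, min_repeats=2):
    """Return (parent, tag) for the first node whose children repeat a tag."""
    for node in iter_nodes(root):
        counts = {}
        for child in node.children:
            counts[child.tag] = counts.get(child.tag, 0) + 1
        for tag, n in counts.items():
            if n >= min_repeats and tag not in VOID_TAGS:
                return node, tag
    return None


def outermost_layout(node):
    """Collect layout elements that are not nested inside another one."""
    found = []
    for child in node.children:
        if child.tag in LAYOUT_TAGS:
            found.append(child)
        else:
            found.extend(outermost_layout(child))
    return found


def links(node):
    return [n.attrs["href"] for n in iter_nodes(node)
            if n.tag == "a" and "href" in n.attrs]


def segment(html):
    """Return (scheme, hyperlinks per block) for an HTML string."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    hit = find_repeated_children(builder.root)
    if hit:
        parent, tag = hit
        blocks = [c for c in parent.children if c.tag == tag]
        scheme = "reappearance"
    else:
        blocks = outermost_layout(builder.root)
        scheme = "layout"
    return scheme, [links(b) for b in blocks]
```

Calling `segment` on a list of repeated `<li>` items selects the reappearance scheme, while a page made of a single `<div>` followed by a single `<table>` falls back to the layout scheme. In both cases the result is one list of hyperlinks per block, which corresponds to what would be shown to the mobile user first.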
Published in:
Recent Trends In Information Technology (ICRTIT), 2012 International Conference on
Date of Conference: 19-21 April 2012