Improving the Web text content by extracting significant pages into a Web site | IEEE Conference Publication | IEEE Xplore