Web page text recognition and extraction based on OCR | IEEE Conference Publication | IEEE Xplore