Web document duplicate removal algorithm based on keyword sequences | IEEE Conference Publication | IEEE Xplore