To create a confusion matrix in respect of threshold being fixed for effective detection of near duplicate web documents in Web Crawling | IEEE Conference Publication