Online forums contain a vast amount of information that can aid in the early detection of fraud cases and extremist activities, so accurate and efficient forum crawlers are important in digital forensics for acquiring useful information and knowledge. In this paper, we analyze an existing work, iRobot, an intelligent forum crawler. We identify drawbacks of iRobot in its vertex-based traversal path selection, edge-based traversal path selection, informativeness estimation, and detection of duplicate pages. We also propose algorithms that are better suited to these tasks and present proofs that they can achieve much higher accuracy and efficiency. Finally, we conduct an empirical evaluation on the “scam.com” and “fraudwatchers.org” forum sites to demonstrate the actual accuracy improvement of our proposed informativeness estimation.