Skip to Main Content
Topic-specific search engines that offer users relevant topics as search results have recently been developed. However, these topic-specific search engines require intensive human efforts to build and maintain. In addition, they visit many irrelevant pages. In our project, we propose a new approach for Web topics search. First, we do early detection for "candidate topics" while extracting words from the HTML text. Secondly, we perform data analysis on the appearance information such as appearance times and places for candidate topics. By these two techniques, we can reduce candidate topics' crawling times and computing cost. Analysis of the results and the comparisons with related research will be made to demonstrate the effectiveness of our approach.