Design of focused crawler for information retrieval of Indian origin Academicians | IEEE Conference Publication | IEEE Xplore


Abstract:

Search engines make the task of finding information on the Internet much easier. The Web crawler is the main component of any search engine; it follows URLs to gather information from the Web. For topic-specific crawling, a special type of crawler called a focused Web crawler is used. A focused crawler tries to find high-quality information on a specific topic while avoiding irrelevant links. In the contemporary world, national boundaries have largely vanished for researchers, scientists, and academicians. In this paper, we apply the concept of focused crawling to information retrieval about academicians of Indian origin working abroad. The aim is to develop a database of such academicians working in universities abroad, so that they can be found and contacted. Gathering all such individuals through manual search is an impossible task, and hence this paper presents a design of a focused crawler that can aggregate all such information. This continuously updated database will cater to students wishing to connect with their alumni or other professors for academic collaboration.
Date of Conference: 08-09 April 2016
Date Added to IEEE Xplore: 29 September 2016
Conference Location: Dehradun, India

I. Introduction

Today, popular search engines provide facilities to locate almost any information on the Internet. When users search for information, they usually focus on a specific topic or person. Search engines use Web crawlers to collect information available online. Web crawlers are tools that repeatedly follow hyperlinks to gather information. Rather than collecting all the available data on the Web, a focused crawler selectively downloads webpages that are relevant according to predefined criteria. The concept of focused crawling was introduced in [1]: a focused crawler can seek, acquire, and index webpages on a specific set of topics that represent a narrow segment of the Web. The focused crawling approach leads to significant savings in hardware and network resources, and helps to keep the data gathered by the crawler more up-to-date.
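The selective behaviour described above can be illustrated with a minimal sketch. It is not the crawler designed in this paper: it assumes a simple best-first strategy in which a page is expanded only if its relevance score reaches a threshold. The keyword-overlap score, the names `relevance`, `focused_crawl`, and `TOY_WEB`, and the in-memory toy web used in place of real HTTP fetching are all hypothetical and chosen only for illustration.

```python
import heapq


def relevance(text, keywords):
    """Toy criterion (an assumption): fraction of keywords found in the text."""
    words = set(text.lower().split())
    return sum(1 for k in keywords if k in words) / len(keywords)


def focused_crawl(fetch, seeds, keywords, threshold=0.5, max_pages=100):
    """Best-first crawl that follows links only from relevant pages.

    fetch(url) must return (text, links), or None if the page is unavailable.
    """
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        if page is None:
            continue
        fetched += 1
        text, links = page
        score = relevance(text, keywords)
        if score < threshold:
            continue  # irrelevant page: its links are never followed
        relevant.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page that pointed to it.
                heapq.heappush(frontier, (-score, link))
    return relevant


# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a": ("faculty directory professor university", ["b", "c"]),
    "b": ("professor of physics university research", ["d"]),
    "c": ("campus cafeteria menu", ["e"]),
    "d": ("university professor publications", []),
    "e": ("professor university profile", []),
}

result = focused_crawl(TOY_WEB.get, ["a"], ["professor", "university"])
visited_relevant = [url for url, _ in result]
```

In this toy run, page "c" is fetched but judged irrelevant, so page "e", which is reachable only through "c", is never downloaded. This is the source of the resource savings mentioned above, and it also shows the usual trade-off: relevant pages hidden behind irrelevant ones can be missed.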
