Abstract:
Centralized crawlers are not adequate to spider meaningful and relevant portions of the Web. A crawler with good scalability and load balancing can improve performance. As the Web grows, the crawling process must be distributed in order to download pages in less time and to increase crawler coverage. In this paper, we present a smart distributed crawler for the Web based on a client-server architecture. In this architecture, the server manages the load among the crawlers: whenever a crawler becomes heavily loaded, its load is redistributed to the other crawlers by dynamically reassigning URLs. Focused crawlers make efficient use of network bandwidth and storage capacity and, when distributed, can further enhance performance.
Published in: 2016 International Conference on Information Communication and Embedded Systems (ICICES)
Date of Conference: 25-26 February 2016
Date Added to IEEE Xplore: 25 July 2016