Abstract:
The Internet has become vast and dynamic, with an ever-increasing number of web pages. These web pages change as content is added to them. Change detection and notification systems have made it simpler to keep track of the changes occurring in web pages. However, most of these systems rely on predefined crawling schedules with static time intervals. Such schedules are inefficient when no relevant changes are being made to the web pages, wasting both time and computational resources. Conversely, if the web pages are not crawled frequently enough, important changes may be missed and notifications to subscribed users may be delayed. This paper proposes a methodology to detect the frequency of change in web pages in order to optimize the server-side scheduling of change detection and notification systems. The proposed method is based on a dynamic detection process in which the crawling schedule is adjusted according to the detected change frequency, resulting in a more efficient server-based scheduler for detecting changes in web pages.
Published in: 2017 Seventeenth International Conference on Advances in ICT for Emerging Regions (ICTer)
Date of Conference: 06-09 September 2017
Date Added to IEEE Xplore: 15 January 2018
ISBN Information:
Electronic ISSN: 2472-7598