Utilizing RSS Feeds for Crawling the Web

Authors:

George Adam (Technol. Inst. Rion, Res. Acad. Comput., Patras); Christos Bouras; Vassilis Poulopoulos

We present "advaRSS", a crawling mechanism created to support peRSSonal, a mechanism used to create personalized RSS feeds. In contrast to common crawling mechanisms, our system focuses on fetching the latest news from major and minor portals worldwide by utilizing their communication channels. The challenge that distinguishes "advaRSS" from a usual crawler is that news is produced in random order at any time of the day, so the freshness of the offline collection can be measured even in minutes. This means that the system has to be updated with news items as soon as they appear. To achieve this, we utilize the communication channels that exist in the modern architecture of the WWW, and more specifically in almost every modern news portal. Because RSS feeds are used by every major and minor portal, it is possible to keep our crawler up to date and retain high freshness of the "offline content" maintained in our system's database by applying algorithms that observe the temporal behaviour of each RSS feed.
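The abstract does not describe the algorithms used to observe each feed's temporal behaviour. As a rough illustration of the general idea only, the sketch below derives a polling interval for a feed from the gaps between the publication times of items already seen. The function name `estimate_poll_interval`, the use of the median gap, and the one-minute and six-hour bounds are all assumptions made for this example and are not taken from the paper.

```python
from datetime import datetime, timedelta
from statistics import median


def estimate_poll_interval(pub_times,
                           min_interval=timedelta(minutes=1),
                           max_interval=timedelta(hours=6)):
    """Suggest how long to wait before polling a feed again.

    pub_times: publication datetimes of items already seen in the feed.
    Hypothetical heuristic (not the paper's algorithm): use the median
    gap between consecutive items, clamped to [min_interval, max_interval].
    """
    times = sorted(pub_times)
    if len(times) < 2:
        # No gap can be measured yet, so fall back to the slowest rate.
        return max_interval
    # Gaps between consecutive items, in seconds.
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    typical_gap = timedelta(seconds=median(gaps))
    # Clamp so that bursts and long silences do not produce extreme values.
    return max(min_interval, min(typical_gap, max_interval))


if __name__ == "__main__":
    start = datetime(2009, 5, 24, 12, 0)
    # A feed that has published one item every ten minutes.
    busy_feed = [start + timedelta(minutes=10 * i) for i in range(5)]
    print(estimate_poll_interval(busy_feed))
```

Under these assumptions, a feed that publishes frequently is revisited more often than a quiet one, which is one simple way a crawler might trade request volume against the freshness of its offline collection.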

Published in:

Internet and Web Applications and Services, 2009. ICIW '09. Fourth International Conference on

Date of Conference:

24-28 May 2009