By Topic

Watershed: A High Performance Distributed Stream Processing System

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

5 Author(s)
de Souza Ramos, T.L.A. ; Dept. of Comput. Sci., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil ; Oliveira, R.S. ; de Carvalho, A.P. ; Ferreira, R.A.C.
more authors

The task of extracting information from datasets that become larger at a daily basis, such as those collected from the web, is an increasing challenge, but also provides more interesting insights and analysis. Current analyses went beyond content and now focus on tracking and understanding users' relationships and interactions. Such computation is intensive both in terms of the processing demand imposed by the algorithms and also the sheer amount of data that has to handled. In this paper we introduce Watershed, a distributed computing framework designed to support the analysis of very large data streams online and in real-time. Data are obtained from streams by the system's processing components, transformed, and directed to other streams, creating large flows of information. The processing components are decoupled from each other and their connections are strictly data-driven. They can be dynamically inserted and removed, providing an environment in which it is feasible that different applications share intermediate results or cooperate to a global purpose. Our experiments demonstrate the flexibility in creating a set of data analysis algorithms and their composition into a powerful stream analysis environment.

Published in:

Computer Architecture and High Performance Computing (SBAC-PAD), 2011 23rd International Symposium on

Date of Conference:

26-29 Oct. 2011