Data-WISE: Efficient management of data-intensive workflows in scheduled grid environments

Authors:
Gargi Dasgupta (IBM India Research Lab, India); Koustuv Dasgupta; Balaji Viswanathan

The execution of data-intensive workflow applications in scientific and enterprise grids has gained popularity in recent times. Such applications process large and dynamic data sets, and often present scope for optimized data handling that can be exploited for performance. Traditionally, the core grid middleware technologies of scheduling and orchestration have treated data management as a background activity, decoupled from job management and handled at the storage and/or network protocol level. We believe that an important requirement for building data-aware grid technologies lies in managing data flows at the application level, in conjunction with their computation counterparts. To this end, we present Data-WISE, an end-to-end framework that manages data-intensive workflows as first-class citizens and addresses data flow orchestration, co-scheduling, and runtime management. The optimizations focus on exploiting application structure to enable data parallelism, replication, and runtime adaptations. We implement Data-WISE on a real testbed and demonstrate significant improvements in application response time, resource utilization, and adaptability to varying resource conditions. The proposed framework is an important step towards making distributed execution of data-intensive workflows a reality.

Published in:

NOMS 2008 - 2008 IEEE Network Operations and Management Symposium

Date of Conference:

7-11 April 2008