The water domain grid platform, a grid platform based on cycle stealing technology, harnesses idle computing resources in one or several labs at one or several sites, offering low cost and high performance. Volatility is the key challenge for this kind of platform: a fault is generated whenever a computing node leaves the platform. How to make these volatile nodes work together without being affected by such faults is therefore a key issue. This paper presents a fault tolerance architecture aimed at minimizing the faults generated. Once a fault occurs, other idle computing nodes in the platform can immediately continue executing the unfinished task. Finally, experiments on this platform show that the framework performs well at fault tolerance in a water domain oriented platform that integrates computing resources.
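The recovery behaviour described in the abstract, where an idle node takes over the task left unfinished by a departing node, can be illustrated with a minimal sketch. It is not the paper's architecture: the `Platform` class, its method names, the node and task identifiers, and the choice to count a fault only when a busy node leaves are all assumptions made for illustration.

```python
from collections import deque


class Platform:
    """Toy model of a cycle-stealing grid (hypothetical, for illustration only)."""

    def __init__(self):
        self.idle = deque()     # ids of nodes with no task assigned
        self.running = {}       # node id -> task id currently executing
        self.pending = deque()  # tasks waiting for an idle node
        self.faults = 0         # number of departures that interrupted a task

    def join(self, node):
        """A machine becomes idle and donates its cycles."""
        self.idle.append(node)
        self._dispatch()

    def submit(self, task):
        """Queue a task and start it if a node is free."""
        self.pending.append(task)
        self._dispatch()

    def leave(self, node):
        """A node is reclaimed by its owner and leaves the platform."""
        if node in self.running:
            # Assumption: only a departure that interrupts a task is a fault.
            self.faults += 1
            # Requeue the unfinished task at the front so it resumes first.
            self.pending.appendleft(self.running.pop(node))
            self._dispatch()
        elif node in self.idle:
            self.idle.remove(node)

    def _dispatch(self):
        """Hand pending tasks to idle nodes until one of the queues is empty."""
        while self.idle and self.pending:
            self.running[self.idle.popleft()] = self.pending.popleft()


platform = Platform()
for name in ("lab1-pc1", "lab1-pc2", "lab2-pc1"):
    platform.join(name)
platform.submit("sim-A")
platform.submit("sim-B")
platform.leave("lab1-pc1")  # the node running sim-A disappears
print(platform.running)
```

In this sketch the interrupted task restarts on the next idle node. Resuming from saved progress would need a checkpointing mechanism, which the abstract does not describe.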
Date of Conference: 1-3 June 2009