Fault-tolerant technique in the cluster computation of the digital watershed model

Authors: Yizi Shang (State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084, China); Baosheng Wu; Tiejian Li; Shenguang Fang

This paper describes a parallel computing platform for the digital watershed model that is built on existing facilities. A distributed multi-layered structure is applied to the computer cluster system, and MPI-2 is adopted as a mature parallel programming standard. An agent is introduced to make multi-level fault tolerance possible in software development. A communication protocol based on a checkpointing and rollback recovery mechanism enables transaction reprocessing. Compared with the conventional platform, the new system makes better use of computing resources. Experimental results show that the speedup ratio of the platform is almost 4 times that of the conventional one, which demonstrates the high efficiency and good performance of the new approach.
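
The checkpointing and rollback recovery idea mentioned in the abstract can be illustrated with a minimal single-process sketch. This is not the paper's protocol and does not use MPI-2; the class `CheckpointedTask`, its methods, and the `fail_at` fault-injection parameter are hypothetical names introduced only for illustration. It shows a worker that saves its state periodically and, after a simulated failure, restores the last saved state and reprocesses the lost work.

```python
import copy


class CheckpointedTask:
    """Hypothetical worker that sums work units and can roll back on failure."""

    def __init__(self, units, checkpoint_every=2):
        self.units = list(units)
        self.checkpoint_every = checkpoint_every
        # State that must survive a failure: progress index and partial result.
        self.state = {"next_index": 0, "total": 0.0}
        self._checkpoint = copy.deepcopy(self.state)

    def save_checkpoint(self):
        # Keep an independent copy so later updates cannot corrupt it.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        # Discard everything done since the last checkpoint.
        self.state = copy.deepcopy(self._checkpoint)

    def run(self, fail_at=None):
        """Process all units; `fail_at` injects one simulated fault at that index."""
        failed_once = False
        while self.state["next_index"] < len(self.units):
            i = self.state["next_index"]
            try:
                if fail_at == i and not failed_once:
                    failed_once = True
                    raise RuntimeError("simulated node failure")
                self.state["total"] += self.units[i]
                self.state["next_index"] = i + 1
                if self.state["next_index"] % self.checkpoint_every == 0:
                    self.save_checkpoint()
            except RuntimeError:
                # Recovery: restore the checkpoint and reprocess from there.
                self.rollback()
        return self.state["total"]


if __name__ == "__main__":
    task = CheckpointedTask([1.0, 2.0, 3.0, 4.0, 5.0])
    print(task.run(fail_at=3))
```

In this sketch the failure at index 3 causes the unit at index 2 to be processed a second time, because the last checkpoint was taken after index 1. The final result matches a failure-free run, which is the property that rollback recovery is meant to preserve.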

Published in: Tsinghua Science and Technology (Volume: 12, Issue: S1)