Summary form only given. Grid computing is defined as "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations". The transport network infrastructure is one of the main resources to be shared. Emerging high-capacity intelligent grid transport network infrastructures, such as optical transport networks based on generalized multiprotocol label switching (GMPLS) and automatically switched optical networks/automatically switched transport networks (ASON/ASTN), are fostering the expansion of grid computing from local area networks (LANs) (i.e., cluster grids) to wide area networks (WANs) (i.e., global grids). Indeed, they can guarantee the required quality of service (QoS) to heterogeneous grid applications that share the same grid network infrastructure. The tutorial addresses one particular aspect of grid transport network QoS: resilience, i.e., the ability to overcome failures. In particular, it gives an overview of current efforts to guarantee grid application resilience against different types of failures, such as network infrastructure failures or computer crashes. Finally, it shows that tailoring the recovery scheme to the type of failure that occurred makes it possible to optimize the failure recovery process.
Date of Conference: 25-27 Aug. 2004