Grid monitoring, an important part of any grid system, is needed to query the state of grid resources and to match user requirements with available resources. To ensure the availability of grid monitoring, its reliability in the face of software and hardware failures, which occur with unpredictable probability, must be assessed. This paper studies a reliability analysis approach for grid monitoring in the context of the Grid Monitoring Architecture (GMA), which has become a de facto standard in many areas of grid computing. The failure types and contributing factors in GMA, which are likely to arise in its constituent components, channels, or process behaviors, are analyzed. Evaluation equations for each are then proposed using Markov processes, a queueing model, and probability theory. Furthermore, the reliability of hierarchical GMA is discussed on the basis of four basic architectural relations. Numerical examples are given to illustrate the proposed equations. The results show that our approach is feasible.