In a highly distributed computing environment, collecting data and diagnosing faults in a timely and scalable manner without disturbing the monitored system remains difficult. We propose an agent-based three-layer monitoring framework for distributed complex engineering simulation systems (e.g., the CTCS-3 distributed simulation system) to address this problem. In particular, three key points are considered: 1) the data collection strategy, 2) the dynamic load-balancing algorithm, and 3) the distributed fault diagnosis mechanism. To provide security, a coarse-grained role-based access control (RBAC) model is also built on the framework. The proposed framework has been successfully deployed to monitor the CTCS-3 distributed simulation system.
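As an illustration of what a coarse-grained RBAC check might look like, the following minimal Python sketch maps roles to a small set of system-wide permissions rather than to individual monitored objects. The role names, permission names, and the `is_allowed` helper are hypothetical and chosen only for illustration; the abstract does not specify the roles or permissions used in the actual framework.

```python
from dataclasses import dataclass, field

# Hypothetical roles and coarse-grained permissions (assumed for illustration,
# not taken from the paper). Each permission covers a whole class of
# operations instead of a single monitored node or data item.
ROLE_PERMISSIONS = {
    "viewer": {"view_status"},
    "operator": {"view_status", "control_agents"},
    "admin": {"view_status", "control_agents", "manage_users"},
}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def is_allowed(user, permission):
    """Return True if any of the user's roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user.roles
    )
```

Under these assumptions, a user holding only the `operator` role could control monitoring agents but could not manage user accounts, and a user with no roles or an unknown role would be denied everything.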