A cluster monitoring system observes the operation of the cluster, analyzes performance data, and displays the results. It is crucial for cluster management and performance measurement, because both end users and system administrators can use the monitoring data to diagnose problems and suggest remedies. Scalable resource monitoring is therefore essential to cluster management. This paper proposes a scalable cluster monitoring architecture that builds a structured data aggregation tree (DAT) of master monitoring nodes using the Chord P2P algorithm. The DAT leverages the Chord topology and routing mechanisms: it is constructed implicitly from native Chord routing paths, without prior configuration of monitoring node membership or topology. To balance the storage space used by monitoring data and the computing load across monitoring nodes, we propose a balanced routing algorithm that dynamically selects a node's parent from among its finger nodes based on its distance to the root. We evaluated the performance and scalability of our DAT-based monitoring system with up to 2500 nodes in a simulated environment. Our experimental results show that the balanced DAT monitoring scheme scales well to a large number of nodes. Because parent-child relationships need not be configured explicitly, the system adapts well to node arrival and departure and can be deployed easily.
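To make the idea of a tree built implicitly from Chord routing paths concrete, the following is a minimal sketch, not the paper's implementation. It assumes a global view of the ring for simulation purposes, standard Chord fingers (the successor of n + 2^i), and a greedy next hop that picks the farthest finger that does not overshoot the root. The function names (`successor`, `finger_table`, `parent_of`, `build_dat`) and the tiny 16-node ring are hypothetical. The balanced parent-selection rule is not specified in this text, so only the basic construction is shown.

```python
from bisect import bisect_left


def successor(nodes, key, m):
    """Return the first live node at or clockwise after `key` on a 2**m ring.

    `nodes` must be a sorted list of node identifiers.
    """
    i = bisect_left(nodes, key % (2 ** m))
    return nodes[i % len(nodes)]  # wrap around past the largest identifier


def finger_table(nodes, n, m):
    """Standard Chord fingers of node n: successor(n + 2**i) for i in [0, m)."""
    return [successor(nodes, n + 2 ** i, m) for i in range(m)]


def clockwise(a, b, m):
    """Clockwise distance from identifier a to identifier b."""
    return (b - a) % (2 ** m)


def parent_of(nodes, n, root, m):
    """Next hop of n toward root: the farthest finger that does not pass root.

    Assumption: this greedy rule stands in for native Chord routing.
    """
    to_root = clockwise(n, root, m)
    candidates = [f for f in finger_table(nodes, n, m)
                  if 0 < clockwise(n, f, m) <= to_root]
    return max(candidates, key=lambda f: clockwise(n, f, m))


def build_dat(nodes, root_key, m):
    """Return (root, parent map) of the tree implied by routing toward root_key."""
    nodes = sorted(nodes)
    root = successor(nodes, root_key, m)
    parent = {n: parent_of(nodes, n, root, m) for n in nodes if n != root}
    return root, parent


if __name__ == "__main__":
    # Hypothetical example: a fully populated 4-bit ring, rooted at key 0.
    root, parent = build_dat(range(16), root_key=0, m=4)
    for n in sorted(parent):
        print(f"node {n:2d} -> parent {parent[n]:2d}")
```

No node is told who its parent is: each one derives it from its own finger table and the root key, which is why no explicit parent-child configuration is needed. In this basic form the nodes closest to the root on the ring attract the most children, which is the kind of imbalance a balanced parent-selection rule would aim to reduce.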