This paper addresses the problem of building a failure detection service for large-scale distributed systems. We describe a failure detection service that combines several novel proposals and satisfies the properties of scalability, flexibility, and adaptability. We then present the architecture of the service, describe its components in detail, and report simulation results on its performance.
Published in:
Parallel, Distributed and Network-based Processing, 2009 17th Euromicro International Conference on
Date of Conference: 18-20 Feb. 2009