We propose a reliable, scalable platform that employs a reliability management framework to support service-oriented networking and computing systems. Although virtualization is widely used to make systems reconfigurable, it does not provide a fully integrated management framework in which the allocation of required services to adequate physical resources takes shared risk groups into account. We present a basic system architecture and a highly scalable hardware platform, as well as a reliability management framework that provides both a service availability management interface and automated risk assessment. Our evaluation of a prototype system shows that, with the proposed framework, a complete service breakdown occurs only in the case of a multiple-point failure. This is the same level of reliability as ordinary manual risk management, while resource utilization efficiency is twice as high.
Published in: Military Communications Conference, 2005. MILCOM 2005. IEEE
Date of Conference: 17-20 Oct. 2005