Adding self-healing capabilities to network management systems holds great promise for meeting important goals, such as QoS, while simultaneously lowering capital expenditure and operation and maintenance costs. In this paper, we present a model-based approach to adding self-healing capabilities to a fault management system for cellular networks. We propose a generic modeling framework to categorize software failures and specify their dispositions at the model level for the target system. This facilitates the deployment of a control loop that adds autonomic capabilities to the system architecture, including self-monitoring, self-healing, and self-adjusting functionality. While self-monitoring oversees the environmental conditions and system behavior, self-healing is accomplished by instrumenting the system with self-adjusting operations. We include a case study on a prototype intelligent network fault management system to illustrate this approach, showing how these autonomic capabilities can be added and deployed. Specifically, these autonomic capabilities are derived from self-model specifications and are used to mitigate the risk of specified failures and maintain the health of the system in response to the different types of faults encountered.
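As a minimal sketch of the control loop described above, and not the implementation used in the case study, the following Python fragment shows how a self-model might map failure categories to dispositions that a monitoring step then triggers. All names here (`SelfModel`, `monitor`, `shed_load`, `restart_collector`, the failure categories, and the state fields) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class SelfModel:
    """Hypothetical self-model: failure category -> self-adjusting operation."""
    dispositions: Dict[str, Callable[[State], None]] = field(default_factory=dict)

    def disposition_for(self, category: str):
        # Returns None when the model specifies no disposition for the category.
        return self.dispositions.get(category)


def monitor(state: State) -> List[str]:
    """Self-monitoring: report the failure categories observed in the state."""
    failures = []
    if state["queue_length"] > state["queue_limit"]:
        failures.append("overload")
    if not state["collector_alive"]:
        failures.append("component_crash")
    return failures


def shed_load(state: State) -> None:
    # Self-adjusting operation (assumed): trim the queue back to its limit.
    state["queue_length"] = state["queue_limit"]


def restart_collector(state: State) -> None:
    # Self-adjusting operation (assumed): bring the failed component back up.
    state["collector_alive"] = True


def build_model() -> SelfModel:
    return SelfModel({"overload": shed_load, "component_crash": restart_collector})


def control_loop_step(model: SelfModel, state: State) -> List[str]:
    """One pass of the loop: monitor, then heal via the specified dispositions."""
    handled = []
    for category in monitor(state):
        operation = model.disposition_for(category)
        if operation is not None:
            operation(state)
            handled.append(category)
    return handled


if __name__ == "__main__":
    state = {"queue_length": 120, "queue_limit": 100, "collector_alive": False}
    print(control_loop_step(build_model(), state))
    print(monitor(state))
```

Keeping the dispositions in a separate model object, rather than hard-coding them in the loop, mirrors the idea of specifying failure handling at the model level: under this sketch's assumptions, new failure categories can be handled by extending the model without changing the loop itself.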