Among the essential components of the IBM System z10™ platform are the hardware management console (HMC) and the IBM System z™ support element (SE). Both the SE and the HMC are closed, fixed-function computer systems that include an operating system, many open-source middleware packages, and millions of lines of C, C++, and Java™ application code developed by IBM. The code on the SE and HMC is required to remain operational over long periods of time without a restart or reboot. As a first step toward the autonomic computing goal of continuous operation, an automatic software resource monitoring program has been integrated into the SE and HMC to look for resource, performance, and operational problems and, when appropriate, to initiate recovery actions. This paper describes the embedded resource monitoring program in detail, including the types of resources monitored, the algorithms and frequency used for the monitoring, the information collected when a resource problem is detected, and the actions executed as a result. It also covers the types of problems the resource monitoring program has detected so far and the improvements that have been made on the basis of empirical evidence.
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.