NCAPS: application high availability in Unix computer clusters

Author:
L. A. Laranjeira, Commun. Products Group, Tandem Comput. Inc., Austin, TX, USA

The paper presents a solution for improving the availability of applications running on a Unix computer cluster with two or more nodes. Tandem's NCAPS (NonStop Clusters Application Protection System) consists of specialized system software capable of recovering applications after hardware, software, or operating system failures. The main component of NCAPS, the PPM (Process Pairs Manager), uses a primary and warm-backup approach to achieve recovery times in the range of 10 seconds (for nodes having access to all needed resources), regardless of the application initialization time. This is a clear improvement over the recovery times of existing high availability (HA) solutions, which are typically on the order of 1 minute plus the application reinitialization time. The PPM manages an application through a configurable, user-specified state model in which state changes are triggered by detected failures or system administrator commands. Upon a state transition, the PPM sends a state change command message to registered application processes. Communication between the application processes and the PPM is achieved through a set of API (application programming interface) calls provided by the OftLib (Open Fault Tolerance Library), also called FT-API. NCAPS is now available on Unix clusters composed of Tandem S4000 machines. A version that runs on Tandem's SSI (Single System Image) product, NSC (NonStop Clusters), for a cluster of Compaq ProLiant machines is under development.
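
The state model described in the abstract can be illustrated with a minimal Python sketch. It assumes a two-state model (primary and backup) and three invented event names. The class, its methods, and the transition table are all hypothetical and are not the OftLib/FT-API interface, which the abstract does not specify. The sketch only shows the general idea: a table-driven state model in which a failure or an administrator command triggers a transition, after which the manager notifies the registered process.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# Hypothetical transition table: (current state, event) -> new state.
# The abstract says the real model is configurable and user-specified;
# these state and event names are assumptions made for illustration.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("backup", "primary_failed"): "primary",     # detected failure
    ("backup", "admin_switchover"): "primary",   # administrator command
    ("primary", "admin_switchover"): "backup",   # administrator command
}


@dataclass
class PairManager:
    """Toy stand-in for a process-pair manager (not the real PPM)."""

    transitions: Dict[Tuple[str, str], str]
    states: Dict[str, str] = field(default_factory=dict)
    callbacks: Dict[str, Callable[[str], None]] = field(default_factory=dict)

    def register(self, name: str, initial_state: str,
                 on_state_change: Callable[[str], None]) -> None:
        """Record a process, its starting state, and its notification hook."""
        self.states[name] = initial_state
        self.callbacks[name] = on_state_change

    def handle_event(self, name: str, event: str) -> Optional[str]:
        """Apply an event to a registered process.

        Returns the new state, or None when the event does not trigger
        a transition from the process's current state.
        """
        key = (self.states[name], event)
        if key not in self.transitions:
            return None
        new_state = self.transitions[key]
        self.states[name] = new_state
        # Stands in for the "state change command message" sent to the process.
        self.callbacks[name](new_state)
        return new_state


# Usage: a warm backup is promoted when its primary is reported as failed.
received = []
manager = PairManager(TRANSITIONS)
manager.register("app-backup", "backup", received.append)
manager.handle_event("app-backup", "primary_failed")
```

Because the backup is already running when it receives the promotion message, a scheme of this shape does not pay the application's initialization time at failover, which is the property the abstract attributes to the warm-backup approach.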

Published in:

Fault-Tolerant Computing, 1998. Digest of Papers. Twenty-Eighth Annual International Symposium on

Date of Conference:

23-25 June 1998