Skip to Main Content
User perceived quality is the most important aspect of Internet applications. After a single negative experience, users tend to switch to one of the other myriad of alternatives available to them on the Internet. Two key components of Internet application quality are scalability and reliability. In this paper, we present the first general-purpose mechanism capable of maintaining reliability in the face of process, machine, and catastrophic failures. We define catastrophic failures as events that cause entire clusters of servers to become unavailable such as network partitioning, router failures, natural disasters, or even terrorist attacks. Our mechanism utilizes client-side tunneling, client-side redirection, and implicit redirection triggers to deliver reliable communication channels. We capitalize on previous work, redirectable sockets (RedSocks), that focuses on Internet application scalability. RedSocks are communication channels enhanced with a novel session layer aimed at modernizing network communication. We modify Red-Socks to create the first fault tolerant socket solution that can handle all server-side failures. Our mechanism is compatible with NATs and firewalls, scalable, application independent, and backwards compatible.