We study the patterns and predictability of Internet end-to-end service degradations, where a degradation is a significant deviation of the round-trip time (RTT) between a client and a server. We use simultaneous RTT measurements collected from several locations to a large representative set of Web sites and study the duration and extent of degradations. We combine these measurements with Border Gateway Protocol (BGP) cluster information to learn about the location of the cause. We evaluate a number of predictors based on hidden Markov models and Markov models. Predictors typically exhibit a tradeoff between two types of errors: false positives (a degradation is predicted but does not occur) and false negatives (a degradation occurs but is not predicted). The cost of each error type is application dependent, but we capture the entire spectrum using a precision versus recall tradeoff. Using this methodology, we learn what information is most valuable for prediction (recency versus quantity of past measurements). Surprisingly, we also conclude that predictors that utilize history in a very simple way perform as well as more sophisticated ones. One important application of prediction is gateway selection, which is applicable when a local-area network is connected through multiple gateways to one or several Internet service providers. Gateway selection can boost reliability and survivability by selecting, for each connection, the (hopefully) best gateway. We show that gateway selection using our predictors can reduce degradations to half of those obtained by routing all connections through the best gateway.
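To make the precision versus recall evaluation concrete, the following is a minimal sketch of a predictor that uses history in a very simple way, scored against a boolean degradation series. It is not the predictors evaluated in this work: the function names, the `window` and `threshold` parameters, and the sample series are all assumptions introduced for illustration. The predictor flags the next measurement as degraded when the fraction of degraded samples in the last `window` observations reaches `threshold`.

```python
from typing import List, Tuple


def predict_from_history(history: List[bool], window: int, threshold: float) -> bool:
    """Hypothetical predictor: flag a degradation if at least `threshold`
    of the last `window` observations (window >= 1) were degraded."""
    recent = history[-window:]
    if not recent:
        return False
    return sum(recent) / len(recent) >= threshold


def precision_recall(series: List[bool], window: int, threshold: float) -> Tuple[float, float]:
    """Replay the series, predicting each sample from the samples before it,
    and count true positives, false positives, and false negatives."""
    tp = fp = fn = 0
    for t in range(1, len(series)):
        predicted = predict_from_history(series[:t], window, threshold)
        actual = series[t]
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1  # degradation predicted but did not occur
        elif actual and not predicted:
            fn += 1  # degradation occurred but was not predicted
    # Convention assumed here: an undefined ratio is reported as 1.0.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Illustrative series only (True = degraded RTT sample); not measured data.
series = [False, False, True, True, True, False, False, True, False, False]
for window, threshold in [(1, 1.0), (2, 0.5)]:
    p, r = precision_recall(series, window, threshold)
    print(f"window={window} threshold={threshold}: precision={p:.2f} recall={r:.2f}")
```

Sweeping `threshold` (or `window`) over a range and plotting the resulting pairs would trace out a precision versus recall curve of the kind used to compare predictors without fixing application-specific error costs.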