Loss measurements are widely used in today's networks. There are existing standards and commercial products to perform these measurements. The missing element is a rigorous statistical methodology for their analysis. Indeed, most existing tools ignore the correlation between packet losses and severely underestimate the errors in the measured loss ratios. In this paper, we present a rigorous technique for analyzing performance measurements, in particular, for estimating confidence intervals of packet loss measurements. The task is challenging because Internet packet loss ratios are typically small and the packet loss process is bursty. Our approach, SAIL, is motivated by some simple observations about the mechanism of packet losses. Packet losses occur when the buffer in a switch or router fills, when there are major routing instabilities, or when the hosts are overloaded, and so we expect packet loss to proceed in episodes of loss, interspersed with periods of successful packet transmission. This can be modeled as a simple on/off process, and in fact, empirical measurements suggest that an alternating renewal process is a reasonable approximation to the real underlying loss process. We use this structure to build a hidden semi-Markov model (HSMM) of the underlying loss process and, from this, to estimate both loss ratios and confidence intervals on these loss ratios. We use both simulations and a set of more than 18 000 hours of real Internet measurements (between dedicated measurement hosts, PlanetLab hosts, Web and DNS servers) to cross-validate our estimates and show that they are better than any current alternative.
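The claim that ignoring loss correlation underestimates measurement error can be illustrated with a small simulation. The sketch below is not SAIL and does not fit an HSMM; it only generates an on/off loss process and compares the spread of the measured loss ratio across repeated experiments with the standard error that an independent-loss (binomial) assumption would report. The geometric run lengths, the mean run lengths of 5 and 495 packets (about 1% loss), and all function names are assumptions chosen for illustration and do not come from the paper.

```python
import math
import random
import statistics


def geometric(rng, mean):
    """Draw a run length >= 1 from a geometric distribution (requires mean > 1)."""
    p = 1.0 / mean
    return int(math.log(1.0 - rng.random()) / math.log(1.0 - p)) + 1


def simulate_loss_ratio(rng, n_packets, mean_loss_run, mean_ok_run):
    """Measured loss ratio for n_packets sent through an on/off loss process."""
    # Pick the starting state from the stationary distribution of the process.
    in_loss = rng.random() < mean_loss_run / (mean_loss_run + mean_ok_run)
    lost = 0
    sent = 0
    while sent < n_packets:
        run = geometric(rng, mean_loss_run if in_loss else mean_ok_run)
        run = min(run, n_packets - sent)  # truncate the final run
        if in_loss:
            lost += run
        sent += run
        in_loss = not in_loss  # alternate between loss and success periods
    return lost / n_packets


def compare_errors(n_runs=300, n_packets=20000,
                   mean_loss_run=5.0, mean_ok_run=495.0, seed=1):
    """Return (mean loss ratio, empirical std. error, naive binomial std. error)."""
    rng = random.Random(seed)
    ratios = [
        simulate_loss_ratio(rng, n_packets, mean_loss_run, mean_ok_run)
        for _ in range(n_runs)
    ]
    p_hat = statistics.mean(ratios)
    # Spread actually observed across independent repetitions.
    empirical_se = statistics.stdev(ratios)
    # Spread predicted if every packet were lost independently.
    naive_se = math.sqrt(p_hat * (1.0 - p_hat) / n_packets)
    return p_hat, empirical_se, naive_se


if __name__ == "__main__":
    p_hat, empirical_se, naive_se = compare_errors()
    print(f"loss ratio          : {p_hat:.4f}")
    print(f"empirical std. error: {empirical_se:.5f}")
    print(f"naive std. error    : {naive_se:.5f}")
```

With bursty losses, the empirical standard error should come out noticeably larger than the binomial one, so a confidence interval built on the independence assumption would be too narrow. Real loss processes need not have geometric run lengths, which is one reason a more general model than this toy is of interest.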