The topology of the Internet has been extensively studied in recent years, driving a need for increasingly complex measurement infrastructures. These measurements have produced detailed topologies with steadily increasing temporal resolution, but concerns remain about the ability of active measurement to capture the true Internet topology. Two issues must be addressed: the difficulty of ensuring the accuracy of every individual measurement when millions of measurements are made daily, and the bias that might result from measuring along the tree of routes from each vantage point to the wider reaches of the Internet. However, early discussions of these concerns were based mostly on synthetic data, oversimplified models, or data with limited or biased observer distributions. In this paper, we show the importance of extensive sampling from a broad distribution of vantage points for the resulting topology and its bias. We present two methods for designing and analyzing topology coverage by vantage points: the first, applicable when system-wide knowledge exists, provides a near-optimal assignment of measurements to vantage points; the second is suitable for an oblivious system and is purely probabilistic. The majority of the paper is devoted to a first look at the importance of the quality of the vantage point distribution. We show that diversity in the locations and types of vantage points is required for obtaining an unbiased topology. We analyze the effect that a broad distribution has on the convergence of various autonomous system topology characteristics. We show that although a diverse and broad distribution is not required for all inspected properties, it is required for some. Finally, we revisit some recent claims of bias in active traceroute sampling, and show empirically that a diverse and broad distribution calls their conclusions into question.