Despite advances in measurement technology, it is still challenging to reliably compile large-scale network datasets. For example, because of flaws in the measurement systems or difficulties posed by the measurement problem itself, missing, ambiguous, or indirect data are common. When such data have spatio-temporal structure, it is natural to leverage this structure to cope with these deficiencies. Our work on network datasets draws on ideas from compressive sensing and matrix completion, where sparsity is exploited in estimating quantities of interest. However, the standard results on compressive sensing 1) rely on conditions that generally do not hold for network datasets, and 2) do not allow us to exploit all we know about the spatio-temporal structure of these datasets. In this paper, we overcome these limitations with an algorithm that has at its heart the same ideas espoused in compressive sensing, but adapted to the problem of network datasets. We show how this algorithm can be used in a variety of ways, in particular on traffic data, to solve problems such as simple interpolation of missing values, traffic matrix inference from link data, prediction, and anomaly detection. The elegance of the approach lies in the fact that it unifies all of these tasks and allows them to be performed even when as much as 98% of the data are missing.