In scheduling connections at busy Web servers, it is commonly assumed that transmission duration (or time in system) is directly proportional to the size of the file transferred. For example, a scheduling discipline such as SRPT (shortest remaining processing time) could use this assumption to order connections according to the residual size of the transfer. However, with a diverse client population, network effects such as packet loss and heterogeneous end-to-end bandwidths and latencies render this assumption invalid. In this measurement study, we examine the relationship between file size and transfer time and investigate the predictive value of file size in determining transfer time. We use the publicly available sanitized cache access logs, collected daily as part of IRCache, the NLANR Web caching project, to explore this relationship for HTTP traffic serviced by the NLANR caches over a weeklong interval. Over this dataset, we first confirm an earlier finding: for small transfers of up to 30 KB, there is virtually no correlation between file size and transfer time; moreover, transfer times vary over five orders of magnitude. For larger files, file size and transfer time become increasingly well correlated as file size increases, but predictions of transfer time from file size alone are still not highly accurate. Our findings motivate further investigation of incorporating network-awareness into end-system scheduling disciplines.
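As an illustration of the kind of analysis described above, the following is a minimal sketch of how the correlation between file size and transfer time might be computed separately for small and large transfers. It is not the method used in the study. It assumes log lines in Squid's native access-log layout (elapsed time in milliseconds in the second field, bytes transferred in the fifth), and it applies Pearson correlation to log-transformed values, a choice made here only because the reported transfer times span several orders of magnitude. The function names and the use of 30 KB as a split point for the two classes are illustrative assumptions.

```python
import math


def parse_line(line):
    """Return (size_bytes, elapsed_ms) from one log line, or None if unusable.

    Assumed layout (Squid native format):
    timestamp elapsed_ms client code/status bytes method URL ...
    """
    fields = line.split()
    if len(fields) < 5:
        return None
    try:
        elapsed_ms = int(fields[1])
        size = int(fields[4])
    except ValueError:
        return None
    # Non-positive values cannot be log-transformed, so skip them.
    if elapsed_ms <= 0 or size <= 0:
        return None
    return size, elapsed_ms


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if it is undefined."""
    n = len(xs)
    if n < 2:
        return float("nan")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def correlation_by_size_class(lines, threshold=30 * 1024):
    """Correlate log10(size) with log10(time) for small and large transfers.

    `threshold` (bytes) is an illustrative split point, not a value
    prescribed by the study.
    """
    small, large = [], []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        size, elapsed_ms = record
        point = (math.log10(size), math.log10(elapsed_ms))
        (small if size <= threshold else large).append(point)

    def corr(points):
        return pearson([p[0] for p in points], [p[1] for p in points])

    return {"small": corr(small), "large": corr(large)}
```

Passing an open log file (an iterable of lines) to `correlation_by_size_class` returns one coefficient per size class. A coefficient near zero for the small class and a larger one for the large class would be consistent with the pattern the abstract reports.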