Prefetching is an important technique for tolerating Web access latency. Existing prefetching algorithms are mostly based on URL graphs. While they have been demonstrated to be effective in prefetching documents that are accessed often, few of them can prefetch documents whose URLs have never been accessed. We propose a semantics-based prefetching technique to overcome this limitation. It predicts future requests based on the semantic preferences exhibited by previously retrieved documents. We applied this technique to news reading and prototyped a client-side prefetching system, NewsAgent. The system extracts document semantics by identifying keywords in the anchor texts of URLs and relies on neural networks over the keyword set to predict future requests. We cross-examined the system in daily browsing of the ABC News, CNN, and MSNBC news sites over three months and demonstrate the effectiveness of the technique.
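
To make the keyword-based prediction step concrete, the following is a minimal sketch, not the NewsAgent implementation. It assumes a single perceptron unit with one weight per keyword as a stand-in for the neural networks mentioned above. The stopword list, learning rate, threshold, and all function and class names are hypothetical choices made for illustration only.

```python
import re

# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "at", "is"}


def keywords(anchor_text):
    """Lowercase the anchor text and keep alphabetic tokens that are not stopwords."""
    tokens = re.findall(r"[a-z]+", anchor_text.lower())
    return [t for t in tokens if t not in STOPWORDS]


class KeywordPerceptron:
    """Single perceptron unit with one weight per keyword (an assumed simplification)."""

    def __init__(self, learning_rate=0.5, threshold=0.0):
        self.weights = {}                 # keyword -> learned weight
        self.learning_rate = learning_rate
        self.threshold = threshold

    def score(self, anchor_text):
        """Sum the weights of the distinct keywords in the anchor text."""
        return sum(self.weights.get(w, 0.0) for w in set(keywords(anchor_text)))

    def should_prefetch(self, anchor_text):
        """Predict a future request when the score exceeds the threshold."""
        return self.score(anchor_text) > self.threshold

    def update(self, anchor_text, clicked):
        """Perceptron rule: shift keyword weights by the prediction error."""
        error = int(clicked) - int(self.should_prefetch(anchor_text))
        for w in set(keywords(anchor_text)):
            self.weights[w] = self.weights.get(w, 0.0) + self.learning_rate * error


# Usage: learn from two observed links, then rank links to unseen URLs.
model = KeywordPerceptron()
model.update("Senate election results", clicked=True)
model.update("Celebrity fashion week", clicked=False)
print(model.should_prefetch("Election recount in Florida"))
print(model.should_prefetch("Fashion trends for spring"))
```

Because the prediction depends only on anchor-text keywords, a link can be selected for prefetching even if its URL has never been requested before, which is the property that URL-graph approaches lack.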