File prefetching based on previous file access patterns has been shown to be an effective means of reducing file system latency by implicitly loading caches with files that are likely to be needed in the near future. Mistaken prefetching requests, however, can be very costly, adding overhead in the form of increased latency and bandwidth consumption. These misprediction costs are easily overlooked when access prediction algorithms are evaluated only in terms of their accuracy. We describe a novel algorithm that uses machine learning not only to improve overall prediction accuracy, but also to avoid those costly mispredictions. Our algorithm is fully adaptive to changing workloads and automatically refrains from offering predictions when they are likely to be mistaken. Our trace-based simulations show that our algorithm achieves prediction accuracies of up to 98%. While this appears to come at the expense of a very slight reduction in cache hit ratios, applying the algorithm actually results in substantial reductions in unnecessary (and costly) I/O operations.
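The idea of declining to predict when a prediction is likely to be wrong can be illustrated with a minimal sketch. This is not the algorithm described above, whose details are not given here: it swaps in a simple successor-frequency predictor with a fixed confidence gate in place of the machine learning component. The class name `AbstainingPredictor`, the `replay` helper, and the `min_confidence` and `min_observations` parameters are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class AbstainingPredictor:
    """Illustrative stand-in: predicts the most frequent successor of the
    last accessed file, but abstains when the evidence is weak.
    The thresholds are assumptions, not values from the paper."""

    def __init__(self, min_confidence=0.75, min_observations=2):
        self.min_confidence = min_confidence
        self.min_observations = min_observations
        # successors[f][g] = number of times g was accessed right after f
        self.successors = defaultdict(Counter)
        self.previous = None

    def predict(self):
        """Return the predicted next file, or None to abstain."""
        counts = self.successors.get(self.previous)
        if not counts:
            return None  # nothing observed yet for this file
        total = sum(counts.values())
        candidate, hits = counts.most_common(1)[0]
        if total < self.min_observations or hits / total < self.min_confidence:
            return None  # too little evidence: skip the prefetch
        return candidate

    def observe(self, name):
        """Record an actual access and update successor counts."""
        if self.previous is not None:
            self.successors[self.previous][name] += 1
        self.previous = name


def replay(trace, predictor):
    """Replay a trace; return (predictions offered, predictions correct)."""
    offered = correct = 0
    for name in trace:
        guess = predictor.predict()
        if guess is not None:
            offered += 1
            correct += guess == name
        predictor.observe(name)
    return offered, correct
```

Calling `replay(trace, AbstainingPredictor())` on a list of file names reports how many predictions were offered and how many were correct. Raising `min_confidence` makes the predictor offer fewer predictions, which trades some prefetch coverage (and hence cache hits) for fewer wasted I/O operations, the same trade-off the abstract describes.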