Skip to Main Content
In this paper, we present an informed prefetching technique called IPODS that makes use of application-disclosed access patterns to prefetch hinted blocks in distributed multi-level storage systems. We develop a prefetching pipeline in IPODS, where an informed prefetching process is divided into a set of independent prefetching steps among multiple storage levels in a distributed system. In the IPODS system, while data blocks are prefetched from hard disks to memory buffers in remote storage servers, data blocks buffered in the servers are prefetched through networks to clients' local cache. We show that these two prefetching steps can be handled in a pipelining manner to improve I/O performance of distributed storage systems. Our IPODS technique differs itself from existing prefetching schemes in two ways. First, IPODS reduces applications' I/O stalls by keeping hinted data in clients' local caches and storage servers' fast buffers (e.g., solid state disks). Second, in a prefetching pipeline, multiple informed prefetching mechanisms semi-dependently coordinate to fetch blocks (1) from low-level (slow) to high-level (fast) storage devices in servers and (2) from high-level devices in servers to clients' local cache. The prefetching pipeline in IPODS judiciously hides network latencies in distributed storage systems, thereby reducing the overall I/O access time in distributed systems. Using a wide range of real-world I/O traces, our experiments show that IPODS can improve noticeably I/O performance of distributed storage systems.