Data caches are a key hardware means of bridging the gap between processor and memory speeds, but only for programs that exhibit sufficient data locality in their memory accesses. A method for evaluating cache performance is therefore required, both to quantify cache misses and to guide data cache optimizations. Existing analytical models for data cache optimizations mainly target isolated perfect loop nests. We present an analytical model that can statically analyze not only loop nest fragments but also complete numerical programs with regular, compile-time predictable memory accesses. Central to the whole-program approach are abstract call inlining, memory access vectors, and parametric reuse analysis, which allow reuse and interference both within and across loop nests to be quantified precisely in a unified framework. Within this framework, the cache misses of a program are specified using mathematical formulas, and the miss ratio is predicted from these formulas using statistical sampling techniques. Our experimental results on kernels and whole programs indicate accurate cache miss estimates in substantially less time than simulation (typically several orders of magnitude faster).
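To make the sampling idea concrete, the following is a minimal sketch and not the model described above. It assumes a hypothetical toy kernel (a column-by-column read of a row-major n x n array) and a hypothetical direct-mapped cache whose parameters are chosen only for illustration. The function `is_miss` stands in for the paper's miss formulas: it decides whether a single access misses by looking backward for the most recent access to the same cache set, so the miss ratio can be estimated from a random sample of accesses instead of from the whole trace. All names and constants are assumptions.

```python
import random

# Assumed toy parameters (illustrative only, not taken from the paper).
LINE_BYTES = 32   # cache line size in bytes
NUM_SETS = 64     # direct-mapped cache: one line per set
ELEM_BYTES = 8    # size of one array element in bytes


def line_of(n, t):
    """Memory line touched by the t-th access of the toy kernel.

    The kernel reads a row-major n x n array column by column:
        for j in range(n):
            for i in range(n):
                read A[i][j]
    """
    j, i = divmod(t, n)
    return (i * n + j) * ELEM_BYTES // LINE_BYTES


def is_miss(n, t):
    """Classify access t alone, without simulating the whole trace."""
    line = line_of(n, t)
    # Scan backward for the most recent access to the same cache set.
    for s in range(t - 1, -1, -1):
        prev = line_of(n, s)
        if prev % NUM_SETS == line % NUM_SETS:
            return prev != line  # hit only if that access used the same line
    return True  # no earlier access to this set: cold miss


def simulated_miss_ratio(n):
    """Reference value: simulate the direct-mapped cache over every access."""
    resident = {}  # cache set -> memory line currently held
    misses = 0
    for t in range(n * n):
        line = line_of(n, t)
        cache_set = line % NUM_SETS
        if resident.get(cache_set) != line:
            misses += 1
            resident[cache_set] = line
    return misses / (n * n)


def sampled_miss_ratio(n, sample_size, seed=0):
    """Estimate the miss ratio from a random sample of access points."""
    rng = random.Random(seed)
    points = rng.sample(range(n * n), sample_size)  # without replacement
    return sum(is_miss(n, t) for t in points) / sample_size
```

For example, `sampled_miss_ratio(40, 400)` classifies only a quarter of the 1,600 accesses, and its result can be compared with `simulated_miss_ratio(40)`. In this toy the backward scan is itself costly; the speedup reported above depends on evaluating analytical miss formulas at each sampled point, which this sketch does not attempt.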