Emerging computing architectures expose a rich variety of controllable storage resources, and the allocation and management of these resources critically affect the performance of data-intensive applications. In this paper we describe how compiler data dependence analysis and execution modeling techniques can be combined to explore the use of data caching and software prefetching in hardware designs produced by high-level synthesis. We describe a design space exploration algorithm that selects between data caching and prefetching for array references along the critical paths of the computation, with the objective of minimizing overall execution time while meeting the architecture's storage and bandwidth constraints. We present preliminary results from applying the algorithm to a set of image and signal processing kernels on a commercial FPGA. The high precision of our execution model (94% on average) leads the algorithm to select the fastest design in every case.
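To make the selection problem concrete, the following is a minimal illustrative sketch, not the algorithm developed in this paper. It assumes each array reference has a small set of candidate options (no buffering, caching, prefetching), each with a hypothetical estimated cycle count, storage cost, and bandwidth cost, and it exhaustively searches for the feasible combination with the lowest estimated time. The names `Option` and `explore`, the additive cost model, and all numbers are assumptions introduced for illustration only.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Option:
    """One hypothetical implementation choice for a single array reference."""
    name: str       # "none", "cache", or "prefetch"
    cycles: int     # assumed cycles this reference adds to the critical path
    storage: int    # assumed on-chip storage used (arbitrary units)
    bandwidth: int  # assumed external memory bandwidth used (arbitrary units)


def explore(refs, storage_limit, bandwidth_limit):
    """Exhaustively pick one option per reference.

    Returns (estimated_cycles, {reference: option_name}) for the cheapest
    combination that fits both limits, or None if nothing fits.
    Simplifying assumption: costs of different references simply add up.
    """
    names = list(refs)
    best = None
    for combo in product(*(refs[n] for n in names)):
        if sum(o.storage for o in combo) > storage_limit:
            continue  # violates the storage constraint
        if sum(o.bandwidth for o in combo) > bandwidth_limit:
            continue  # violates the bandwidth constraint
        total = sum(o.cycles for o in combo)
        if best is None or total < best[0]:
            best = (total, {n: o.name for n, o in zip(names, combo)})
    return best


# Made-up example: two array references with invented costs.
refs = {
    "a[i]": [Option("none", 100, 0, 1),
             Option("cache", 40, 64, 1),
             Option("prefetch", 60, 8, 2)],
    "b[i+1]": [Option("none", 80, 0, 1),
               Option("cache", 35, 64, 1),
               Option("prefetch", 50, 8, 2)],
}
best = explore(refs, storage_limit=80, bandwidth_limit=3)
print(best)  # (90, {'a[i]': 'cache', 'b[i+1]': 'prefetch'})
```

In this made-up instance, caching both references would exceed the storage limit and prefetching both would exceed the bandwidth limit, so the search settles on a mixed assignment. Exhaustive enumeration is used here only because it is easy to verify; it grows exponentially with the number of references.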