Digital image processing algorithms are a good match for direct implementation on FPGAs, as current FPGA architectures naturally match the fine-grained parallelism in these applications. Typically, these algorithms are structured as a sequence of operations, expressed in high-level programming languages as tight loop nests. The loops usually define a shifting-window region over which the algorithm applies a simple localized operator (e.g., a differential gradient or a min/max). In this research we focus on developing fast yet accurate performance and area models for complete FPGA designs, combining analytical, empirical, and behavioral estimation techniques. We model the application of a set of important program transformations for image processing algorithms, namely loop unrolling, tiling, loop interchange, loop fission, and array privatization, and explore pipelined and non-pipelined execution modes. We take into consideration the impact of these transformations on the performance of a complete design implemented in an FPGA-based architecture, in the presence of limited I/O resources such as address generators and external memory data channels.
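As an illustration of the kind of loop nest described above, the following minimal sketch applies a max operator over a shifting 3 x 3 window, first as a plain loop nest and then with the column loop unrolled by a factor of 2. The function names, the window size, and the unroll factor are assumptions chosen for illustration; this is ordinary software, not the modeling approach or hardware design of this work. In a hardware implementation the two unrolled window computations could in principle proceed in parallel, at the cost of more simultaneous memory accesses, which is where limited I/O resources come into play.

```python
def window_max(img, k=3):
    """Baseline: tight loop nest applying max over a k x k shifting window."""
    rows, cols = len(img), len(img[0])
    out_rows, out_cols = rows - k + 1, cols - k + 1
    out = [[0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):              # window row position
        for j in range(out_cols):          # window column position
            m = img[i][j]
            for di in range(k):            # scan the window
                for dj in range(k):
                    m = max(m, img[i + di][j + dj])
            out[i][j] = m
    return out


def window_max_unrolled2(img, k=3):
    """Same operator with the column loop unrolled by 2 (illustrative factor)."""
    rows, cols = len(img), len(img[0])
    out_rows, out_cols = rows - k + 1, cols - k + 1
    out = [[0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):
        j = 0
        while j + 1 < out_cols:
            # Two adjacent windows are handled in one iteration; their
            # reductions are independent of each other.
            m0 = img[i][j]
            m1 = img[i][j + 1]
            for di in range(k):
                for dj in range(k):
                    m0 = max(m0, img[i + di][j + dj])
                    m1 = max(m1, img[i + di][j + 1 + dj])
            out[i][j] = m0
            out[i][j + 1] = m1
            j += 2
        if j < out_cols:                   # remainder when out_cols is odd
            m = img[i][j]
            for di in range(k):
                for dj in range(k):
                    m = max(m, img[i + di][j + dj])
            out[i][j] = m
    return out


image = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 0],
    [3, 1, 4, 1, 5],
    [9, 2, 6, 5, 3],
]
result = window_max(image)
result_unrolled = window_max_unrolled2(image)
```

Both functions produce the same output for the sample image, since unrolling changes only how the iterations are grouped, not what is computed. Note that the two adjacent windows share most of their pixels, so an unrolled design can reuse data already fetched rather than reading it again.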