Summary form only given. Reconfigurable computers (RCs) can leverage the synergism between conventional processors and FPGAs to provide low-level hardware functionality at the same level of programmability as general-purpose computers. In a large class of applications, the total I/O time is comparable to or even greater than the computation time. As a result, the rate of DMA transfer between the microprocessor memory and the on-board memory of the FPGA-based processor becomes the performance bottleneck. We perform a theoretical and experimental study of this specific performance limitation. The mathematical formulation of the problem has been experimentally verified on a state-of-the-art reconfigurable platform, the SRC-6E. We demonstrate and quantify a possible solution to this problem that exploits the system-level parallelism within reconfigurable machines.
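To illustrate why overlapping DMA transfers with FPGA computation can relieve an I/O bottleneck, the sketch below compares a strictly sequential schedule with a double-buffered one. It is a generic two-stage pipeline model, not the formulation used in this work: the function names, the assumption of equal-sized blocks with constant per-block transfer and compute times, and the omission of output transfers are all illustrative assumptions.

```python
def sequential_time(n_blocks, t_transfer, t_compute):
    """Total time when each block is transferred and then processed, with no overlap."""
    return n_blocks * (t_transfer + t_compute)


def overlapped_time(n_blocks, t_transfer, t_compute):
    """Total time when the transfer of block i+1 overlaps the computation on block i.

    Assumes a two-stage pipeline (hypothetical double buffering): the first
    transfer and the last computation cannot be hidden, and every step in
    between is paced by the slower of the two stages.
    """
    if n_blocks == 0:
        return 0.0
    return t_transfer + (n_blocks - 1) * max(t_transfer, t_compute) + t_compute


def speedup(n_blocks, t_transfer, t_compute):
    """Ratio of sequential to overlapped time for the same workload."""
    return (sequential_time(n_blocks, t_transfer, t_compute)
            / overlapped_time(n_blocks, t_transfer, t_compute))


if __name__ == "__main__":
    # Illustrative numbers only: transfer takes twice as long as computation.
    print(sequential_time(4, 2.0, 1.0))
    print(overlapped_time(4, 2.0, 1.0))
    print(speedup(100, 1.0, 1.0))
```

Under this model the speedup approaches 2 when transfer and computation times are equal and the number of blocks is large, and it shrinks toward 1 as one of the two stages dominates.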