High-performance, low-power very large-scale integrations (VLSIs) are required to implement complex media processing applications on mobile devices. Heterogeneous multicore processors are a promising way to achieve this objective. They contain multiple accelerator cores and CPU cores to increase the processing speed. Since media processing applications access a huge amount of data, fast address generation is very important. To increase the address generation speed, accelerator cores contain address generation units (AGUs). To reduce power consumption, the AGUs have limited hardware resources such as adders and counters. Therefore, the AGUs can generate only simple addressing patterns in which the address increases linearly in each clock cycle. Media processing applications frequently encounter addressing patterns in which the same data are accessed in different time slots. To implement such addressing patterns, the same data have to be allocated to multiple memory addresses in such a way that those addresses can be generated by the AGUs. Allocating the same data to multiple addresses is called “data duplication.” Data duplication significantly increases the data-transfer time and, in turn, the total processing time. To remove such data-transfer bottlenecks, this paper proposes a memory allocation method that exploits the temporal and spatial locality of memory accesses in media processing applications. We evaluate the proposed method on media processing applications to validate its effectiveness. According to the results, the proposed method reduces the total processing time by 14% to more than 85% compared with previous work.
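The data-duplication problem described above can be illustrated with a minimal sketch. It is not the proposed allocation method; it only shows, under simplifying assumptions, why an AGU limited to linearly increasing addresses forces duplicated data. The function names, the window size of 3, and the overlapping-window access pattern are hypothetical choices made for illustration.

```python
def linear_agu(base, stride, count):
    """Addresses a simple AGU can produce: base, base + stride, ..."""
    return [base + i * stride for i in range(count)]


def sliding_window_pattern(n, window):
    """Data indices read by overlapping windows over n elements (assumed pattern)."""
    return [start + k for start in range(n - window + 1) for k in range(window)]


def duplicate_for_linear_agu(data, pattern):
    """Lay the data out so that a linear address sweep reproduces `pattern`."""
    return [data[i] for i in pattern]


data = [10, 20, 30, 40, 50]
pattern = sliding_window_pattern(len(data), window=3)

# The same elements are accessed in different time slots, so they are copied
# to several addresses to keep the address sequence linear.
memory = duplicate_for_linear_agu(data, pattern)
addresses = linear_agu(base=0, stride=1, count=len(memory))
reads = [memory[a] for a in addresses]

print("original words:  ", len(data))
print("duplicated words:", len(memory))
```

In this toy case, five data words grow to nine words in memory, which is the extra data transfer that the abstract identifies as the bottleneck.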