Motion estimation is the most time-consuming stage of MPEG-family encoding, reportedly absorbing up to 90% of the total execution time of MPEG processing. We therefore propose a hardware/software co-design paradigm that uses a PIM module to execute motion estimation operations efficiently. The PIM module reduces the memory access penalty caused by the large number of memory accesses. We segment the PIM module into smaller modules so that each can execute operations in parallel. Parallel execution, however, incurs critical overheads: a huge amount of data must be replicated to many of these smaller PIM modules, and the replication requires not only a large number of additional memory accesses but also additional calculations to generate addresses. We therefore also present an efficient data distribution mechanism that supports parallel execution among the smaller PIM modules. With our paradigm, the host processor is relieved of the computationally intensive and data-intensive workloads of motion estimation. We observed up to a 2034× reduction in the number of memory accesses and up to a 439× performance improvement in the execution of motion estimation operations when using our computing paradigm.
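
To illustrate why motion estimation is dominated by memory accesses, the following is a minimal software sketch, not the proposed PIM design. It assumes full-search block matching with a sum-of-absolute-differences (SAD) cost, which the abstract does not specify; the function names `sad` and `full_search` and the `pixel_reads` counter are hypothetical and used only for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, r):
    """Exhaustively test every displacement within +/- r pixels.

    Returns (best_dx, best_dy, best_sad, pixel_reads), where pixel_reads
    counts frame reads (an assumed, simplified cost model: one read per
    pixel per frame per candidate, with no caching)."""
    height, width = len(ref), len(ref[0])
    best = None
    pixel_reads = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + n > width or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            pixel_reads += 2 * n * n  # n*n reads from each of the two frames
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    if best is None:
        raise ValueError("no valid candidate inside the reference frame")
    return best[0], best[1], best[2], pixel_reads


if __name__ == "__main__":
    # Synthetic 8x8 reference frame; the current frame is the reference
    # shifted so that the true motion vector is (dx, dy) = (2, 1).
    ref = [[16 * y + x * x for x in range(8)] for y in range(8)]
    cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
            for x in range(8)] for y in range(8)]
    print(full_search(cur, ref, bx=2, by=2, n=2, r=2))
```

Under this cost model, a single block with search range r and block size n needs up to 2·n²·(2r+1)² pixel reads. Because the search windows of neighboring blocks overlap heavily, running blocks in parallel on separate memory modules means either sharing or replicating much of the same reference data, which is the overhead the data distribution mechanism described above is meant to address.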