1 Introduction
The explosive scaling of computing power in recent decades has made main memory an increasingly critical performance and energy bottleneck. Processing-in-Memory (PIM) offers a promising solution to this issue by placing computational logic in or near memory devices, thereby reducing or eliminating data movement overheads [1], [2]. While the idea is not new [3], [4], with recent technology advances in memory architectures, its potential as a viable solution is being actively explored across a variety of applications and commercial hardware contexts [5], [6], [7].