As the number of cores per machine increases, memory architectures are being redesigned to avoid bus contention and sustain higher throughput requirements. The emergence of Non-Uniform Memory Access (NUMA) constraints has made the affinity between threads and buffers an important decision criterion for schedulers. Memory migration enables work and data to be distributed jointly and dynamically across the machine, but it requires high-performance data transfers as well as a convenient programming interface. We present improvements to the Linux migration primitives and an implementation of a Next-touch policy in the kernel, which give multithreaded applications an easy way to dynamically maintain thread-data affinity. Microbenchmarks show that our work enables high-performance synchronous and lazy memory migration within multithreaded applications. A threaded LU factorization then reveals the large improvement that our Next-touch policy may bring to applications with complex access patterns.
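
To make the Next-touch idea concrete, the following toy model sketches it in user space, assuming the usual reading of the policy: a buffer is marked, and each marked page migrates to the NUMA node of the first thread that subsequently touches it. This is an illustration only, not the kernel implementation; the names `Page`, `NextTouchMemory`, `mark_next_touch` and `touch` are hypothetical and do not correspond to any Linux interface.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    node: int                 # NUMA node currently holding the page
    next_touch: bool = False  # True while the page awaits lazy migration


@dataclass
class NextTouchMemory:
    """Toy model of a Next-touch policy (hypothetical, not kernel code)."""
    pages: list = field(default_factory=list)
    migrations: int = 0

    def allocate(self, count, node):
        """Allocate `count` pages on `node` and return their indices."""
        start = len(self.pages)
        self.pages.extend(Page(node) for _ in range(count))
        return range(start, start + count)

    def mark_next_touch(self, indices):
        """Flag pages so that their next access decides their placement."""
        for i in indices:
            self.pages[i].next_touch = True

    def touch(self, index, thread_node):
        """Access a page from a thread running on `thread_node`."""
        page = self.pages[index]
        if page.next_touch:
            # Lazy migration: the page moves only when it is actually used.
            if page.node != thread_node:
                page.node = thread_node
                self.migrations += 1
            page.next_touch = False
        return page.node


mem = NextTouchMemory()
buf = mem.allocate(4, node=0)
mem.mark_next_touch(buf)
mem.touch(buf[0], thread_node=1)  # migrates page 0 to node 1
mem.touch(buf[1], thread_node=1)  # migrates page 1 to node 1
mem.touch(buf[0], thread_node=0)  # flag already cleared: page 0 stays put
```

In this sketch, pages that are never touched after being marked are never moved, which is what makes the migration lazy: the cost is paid only for data a thread actually uses.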