High-performance cluster networks achieve very high throughput thanks to zero-copy techniques that require pinning application buffers in physical memory. The Open-MX stack implements message passing over generic Ethernet hardware and has similar needs. We present the design of an innovative pinning model in Open-MX based on decoupling memory pinning from the application. This decoupling eases the implementation of a reliable pinning cache in the kernel and enables full overlap of pinning with communication. The pinning cache improves performance when the application reuses the same buffers multiple times, while overlapped pinning also benefits applications that do not. Performance evaluation shows that together these optimizations improve throughput by 5% to 20%, depending on host and network performance.