This paper presents a strategy to speed up the simulation of processors with SIMD extensions using dynamic binary translation. The idea is simple: exploit the SIMD instructions of the host processor that is running the simulation. Realizing it is unfortunately not easy, as all but the simplest SIMD instructions differ widely from one manufacturer to another. To solve this issue, we propose an approach based on a simple three-address intermediate SIMD instruction set, to and from which most existing instructions are easy to map at translation time. To still support complex instructions, we use a form of threaded code. We detail our generic solution and demonstrate its applicability and effectiveness using a parameterized synthetic benchmark that uses the ARMv7 NEON extensions, executed on a Pentium with MMX/SSE extensions.
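As a rough illustration of the two mechanisms named above, the following sketch maps a guest SIMD instruction to a three-address intermediate operation and then to a two-operand host instruction, and falls back to a helper call for anything it cannot map directly. It is not the paper's implementation. The tables `GUEST_TO_IR` and `IR_TO_HOST`, the `IrOp` type, the `helper_*` naming, the scratch register `xmm7`, and the one-to-one mapping of guest register numbers onto `xmm` registers are all assumptions made for the example. Treating `vtbl.8` as a "complex" instruction is likewise only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IrOp:
    """Hypothetical three-address intermediate SIMD operation: dst = src1 op src2."""
    op: str
    dst: int
    src1: int
    src2: int

# Illustrative tables only: a few NEON mnemonics and their SSE2 counterparts.
GUEST_TO_IR = {"vadd.i32": "add32", "vsub.i32": "sub32", "vand": "and"}
IR_TO_HOST = {"add32": "paddd", "sub32": "psubd", "and": "pand"}

SCRATCH = 7  # assumed free host register, used when dst aliases src2

def translate(guest):
    """Translate (mnemonic, dst, src1, src2) into a list of host pseudo-instructions."""
    mnemonic, dst, src1, src2 = guest
    ir_name = GUEST_TO_IR.get(mnemonic)
    if ir_name is None:
        # No direct mapping: emit a call to a helper routine, in the spirit
        # of threaded code (generated code is a sequence of helper calls).
        helper = "helper_" + mnemonic.replace(".", "_")
        return [f"call {helper}({dst}, {src1}, {src2})"]

    ir = IrOp(ir_name, dst, src1, src2)
    host = IR_TO_HOST[ir.op]
    out = []
    rhs = ir.src2
    if ir.dst != ir.src1:
        if ir.dst == ir.src2:
            # Copying src1 into dst would clobber src2, so save it first.
            out.append(f"movdqa xmm{SCRATCH}, xmm{ir.src2}")
            rhs = SCRATCH
        # SSE instructions are two-operand, so copy src1 into dst first.
        out.append(f"movdqa xmm{ir.dst}, xmm{ir.src1}")
    out.append(f"{host} xmm{ir.dst}, xmm{rhs}")
    return out

if __name__ == "__main__":
    for insn in [("vadd.i32", 0, 1, 2), ("vand", 3, 3, 4), ("vtbl.8", 0, 1, 2)]:
        print(insn, "->", translate(insn))
```

The three-address form keeps the guest-side mapping trivial, since NEON data-processing instructions already name a destination and two sources. The only host-specific work is lowering to the two-operand SSE form, which is done here with an extra register copy when the destination differs from the first source.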