In recent years, multi-core systems have become mainstream in computer industry. The design of multi-cores takes advantage of thread-level parallelism in emerging applications that are computationally intensive and highly parallel. Energy efficiency is one of the biggest challenges in the design of multi-core systems, and workload imbalance among parallel threads is one of sources of energy inefficiency. Many techniques based on dynamic voltage frequency scaling (DVFS) are proposed to save energy consumptions on multi-cores, but all of them assume that each core in a multi-core system contains only one hardware context and only one thread can execute on one core at a time. However, mainstream multi-core systems are moving to have simultaneous multithreading (SMT) support in cores, and existing DVFS-based techniques are not effective to achieve maximum energy savings. In this paper, we present a novel technique called thread shuffling, which combines thread migration and DVFS to achieve maximum energy savings and maintain performance on a multi-core system supporting SMT. Thread shuffling is implemented and simulated in a cycle-accurate ×86 multi-core system. The experiments show that it achieves up to 56% energy savings without performance penalty for selected Recognition, Mining and Synthesis (RMS) applications from Intel Labs.