I. Introduction
With the rapid growth in the computing capability, memory capacity, and I/O bandwidth of massively multi-core Graphics Processing Units (GPUs), general-purpose computing on GPUs (GPGPU) has become popular [1]. The large number of cores in a GPGPU (the latest GTX590 contains 1024 cores) makes it efficient at processing large amounts of data with high parallelism. However, some specific properties of the GPGPU, such as its memory hierarchy, make it impossible for it to replace the CPU, which leads to the conclusion that the CPU-GPGPU heterogeneous architecture will last for a long time. In a computer with a CPU-GPGPU heterogeneous architecture, the CPU and GPGPU are integrated, and the CPU serves as the host processor.