In this article, we investigate using multi-core graphics processing units (GPUs) for video encoding and decoding. After an overview of video coding and GPUs, we review some previous work on structuring video coding modules so that the massive parallel processing capability of GPUs can be harnessed. We also review previous work on partitioning the video decoding flow between the central processing unit (CPU) and GPU. After that, we discuss in detail a GPU based fast motion estimation to illustrate some design considerations in using GPUs for video coding, and the tradeoff between speedup and rate-distortion performance. Our results highlight the importance to expose as much data parallelism as possible in designing algorithms for GPUs.