In this paper, we show how to estimate, accurately and efficiently, the 3D motion of a rigid object and the time-varying lighting in a dynamic scene. This is achieved in an inverse compositional tracking framework with a novel warping function that involves a 2D → 3D → 2D transformation. The same formulation allows us to extend traditional two-frame inverse compositional tracking to a sequence of frames, yielding further computational savings. We prove the theoretical convergence of the method and show that it leads to a significant reduction in computational burden. Experimental analysis on multiple video sequences shows a substantial speedup over existing methods while retaining a high level of accuracy.