In object-based video, encoding is decoupled into the separate encoding of shape, motion, and texture information, which enables functionalities such as content-based interactivity and scalability. However, the problem of how to jointly encode these separate signals for the best coding efficiency has not been thoroughly solved. In this paper, we present an operational rate-distortion optimal bit allocation scheme that addresses this problem. Our approach is based on Lagrangian relaxation and dynamic programming. Experimental results indicate that the proposed optimal encoding approach yields considerable gains over an ad hoc method without optimization. Furthermore, the proposed algorithm is much more efficient than exhaustive search.
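
As a rough illustration of the Lagrangian relaxation idea only, the following minimal sketch turns the constrained problem "minimize total distortion subject to a rate budget" into the unconstrained cost D + lambda * R and bisects on lambda. It is not the scheme of this paper: the names `OPTIONS`, `allocate`, and `optimise` are hypothetical, the rate-distortion points are invented for illustration, and the three components are assumed to be independent, so the dynamic programming part of the approach is not modelled.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (rate in bits, distortion)

# Hypothetical operating points, invented for illustration only.
OPTIONS: Dict[str, List[Point]] = {
    "shape":   [(100, 9.0), (200, 4.0), (400, 1.0)],
    "motion":  [(50, 6.0), (150, 2.0)],
    "texture": [(500, 20.0), (1000, 8.0), (2000, 2.0)],
}


def allocate(options: Dict[str, List[Point]], lam: float):
    """For a fixed multiplier, minimise D + lam * R for each component independently."""
    choice = {name: min(points, key=lambda p: p[1] + lam * p[0])
              for name, points in options.items()}
    rate = sum(p[0] for p in choice.values())
    dist = sum(p[1] for p in choice.values())
    return choice, rate, dist


def optimise(options: Dict[str, List[Point]], budget: int, iters: int = 50):
    """Bisect on lam for the smallest multiplier whose allocation fits the budget."""
    if sum(min(p[0] for p in pts) for pts in options.values()) > budget:
        raise ValueError("budget is below the minimum achievable rate")
    best = allocate(options, 0.0)
    if best[1] <= budget:
        return best  # the minimum-distortion choice already fits
    lo, hi = 0.0, 1.0
    while allocate(options, hi)[1] > budget:
        hi *= 2.0  # grow the multiplier until the allocation is feasible
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if allocate(options, mid)[1] <= budget:
            hi = mid  # feasible: try a smaller penalty on rate
        else:
            lo = mid  # infeasible: penalise rate more strongly
    return allocate(options, hi)


if __name__ == "__main__":
    choice, rate, dist = optimise(OPTIONS, budget=1350)
    print(choice, rate, dist)
```

Because a Lagrangian search only reaches operating points on the convex hull of the rate-distortion set, the returned allocation may use fewer bits than the budget allows. In this toy setup the total rate never increases as lambda grows, which is what makes bisection applicable.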