In object-based video encoding, the encoding of the video data is decoupled into the encoding of shape, motion, and texture information, which enables functionalities such as content-based interactivity and content-based scalability. However, the fundamental problem of how to jointly encode these separate kinds of information to reach the best coding efficiency has not been studied thoroughly. In this paper, we present an operational rate-distortion optimal scheme for the allocation of bits among shape, motion, and texture in object-based video encoding. Our approach is based on Lagrangian relaxation and dynamic programming. We implement our algorithm on the MPEG-4 video verification model, although it is applicable to any object-based video encoding scheme. The performance is assessed using a proposed metric that jointly captures the distortion due to the encoding of the shape and the texture. Experimental results demonstrate that the gains of lossy shape encoding depend on the fraction of the total bit budget occupied by the shape bits. For certain typical scenes, this gain may be small or may be realized at very low bit rates.
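
To illustrate the Lagrangian relaxation idea mentioned above, the following is a minimal sketch, not the scheme implemented in the paper. It assumes that shape, motion, and texture each offer a small set of independent (rate, distortion) operating points, and it searches for a multiplier `lam` such that minimizing D + lam * R per component meets a bit budget. The operating points in `DEMO_POINTS`, the function names, and the bisection search are all hypothetical, and the dynamic programming part of the approach is omitted.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (rate in bits, distortion); units are illustrative

# Hypothetical operating points, invented for this sketch only.
DEMO_POINTS: Dict[str, List[Point]] = {
    "shape": [(100, 8.0), (200, 3.0), (400, 1.0)],
    "motion": [(150, 6.0), (300, 2.0)],
    "texture": [(500, 20.0), (1000, 9.0), (2000, 4.0)],
}


def pick(points: List[Point], lam: float) -> Point:
    """Return the point minimizing D + lam * R (ties go to the lower rate)."""
    return min(points, key=lambda p: (p[1] + lam * p[0], p[0]))


def solve(components: Dict[str, List[Point]], lam: float):
    """Minimize the relaxed cost for a fixed multiplier, component by component."""
    choice = {name: pick(pts, lam) for name, pts in components.items()}
    rate = sum(r for r, _ in choice.values())
    dist = sum(d for _, d in choice.values())
    return choice, rate, dist


def allocate(components: Dict[str, List[Point]], budget: float, iters: int = 60):
    """Bisect on lam and return (choice, total_rate, total_distortion).

    Assumes the components are independent, which is a simplification.
    Only points on the convex hull of each R-D set can be selected.
    """
    best = solve(components, 0.0)
    if best[1] <= budget:  # the unconstrained minimum already fits
        return best
    lo, hi = 0.0, 1.0
    while solve(components, hi)[1] > budget:  # grow hi until feasible
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("budget is below the minimum achievable rate")
    best = solve(components, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        cand = solve(components, mid)
        if cand[1] <= budget:
            best, hi = cand, mid  # feasible: try a smaller multiplier
        else:
            lo = mid  # over budget: penalize rate more heavily
    return best
```

Calling `allocate(DEMO_POINTS, budget=1600)` returns the selected point per component together with the total rate and total distortion. Because the relaxed problem only reaches convex-hull points, the result may leave part of the budget unused.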