This paper addresses the problem of spatio-temporal segmentation of video sequences. An initial intensity segmentation (watershed segmentation) provides a set of initial segments, which are subsequently labeled, using a known number of labels, according to motion information. The label field is modeled as a Markov random field in which the statistical spatial and temporal interactions are expressed on the basis of the initial watershed segments. The labeling criterion is the maximization of the conditional a posteriori probability of the label field given the motion hypotheses, the estimate of the label field of the previous frame, and the image intensities. For the optimization, an iterative motion estimation-labeling algorithm is proposed, and experimental results are presented.
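
The labeling criterion can be written compactly as follows. The notation is assumed here for illustration only and is not taken from the paper: $x_t$ denotes the label field over the watershed segments of frame $t$, with each segment label in $\{1,\dots,K\}$ for a known $K$; $\Theta$ denotes the set of motion hypotheses; $\hat{x}_{t-1}$ is the label field estimated for the previous frame; and $I_t$ denotes the image intensities.

```latex
\hat{x}_t \;=\; \arg\max_{x_t}\; P\!\left(x_t \,\middle|\, \Theta,\ \hat{x}_{t-1},\ I_t\right)
```

Under this reading, the iterative motion estimation-labeling algorithm would alternate between updating the motion hypotheses $\Theta$ for the current labels and updating $x_t$ for the current $\Theta$; the exact update rules are not stated in this summary.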