This paper presents a new probabilistic method for detecting and tracking multiple faces in a video sequence. The proposed method integrates the face probabilities provided by the detector with the temporal information provided by the tracker to produce a method superior to the available detection and tracking methods. The three novel contributions of the paper are: 1) Accumulation of detection probabilities over a sequence. This leads to coherent detection over time and thus improves detection results. 2) Prediction of the detection parameters, namely position, scale, and pose. This guarantees the accuracy of the accumulation as well as continuous detection. 3) Representation of pose based on the combination of two detectors, one for frontal views and one for profiles. Face detection is fully automatic and is based on the method developed by Schneiderman and Kanade (2000). It uses local histograms of wavelet coefficients represented with respect to a coordinate frame fixed to the object. A probability of detection is obtained for each image position and at several scales and poses. The probabilities of detection are propagated over time using a Condensation filter and factored sampling. Prediction is based on a zero-order model for position, scale, and pose; the update uses the probability maps produced by the detection routine. The proposed method can handle multiple faces, faces that appear or disappear, and changes in scale and pose. Experiments carried out on a large number of sequences taken from commercial movies and the Web show a clear improvement over the results of frame-based detection (in which the detector is applied to each frame of the video sequence).
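To make the propagation step concrete, the following is a minimal sketch of one Condensation iteration with a zero-order prediction model, assuming a single face and a state reduced to image position (x, y). It is not the paper's implementation: the function names `condensation_step` and `synthetic_prob_map`, the Gaussian diffusion noise, the particle count, and the synthetic probability map that stands in for the detector output are all illustrative assumptions. Scale and pose, which the method also tracks, are omitted for brevity.

```python
import numpy as np


def condensation_step(particles, weights, prob_map, noise_std, rng):
    """One Condensation iteration over (x, y) particle states (illustrative).

    particles: (N, 2) float array of (x, y) positions
    weights:   (N,) non-negative weights summing to 1
    prob_map:  (H, W) detection probabilities for the current frame
    noise_std: assumed standard deviation of the diffusion noise, in pixels
    """
    n = len(particles)
    h, w = prob_map.shape

    # Factored sampling: draw particles in proportion to their weights.
    idx = rng.choice(n, size=n, p=weights)
    resampled = particles[idx]

    # Zero-order prediction: the state is assumed unchanged between frames,
    # so only random diffusion is added (the Gaussian noise is an assumption).
    predicted = resampled + rng.normal(0.0, noise_std, size=resampled.shape)
    predicted[:, 0] = np.clip(predicted[:, 0], 0, w - 1)
    predicted[:, 1] = np.clip(predicted[:, 1], 0, h - 1)

    # Update: weight each particle by the detection probability at its pixel.
    cols = np.rint(predicted[:, 0]).astype(int)
    rows = np.rint(predicted[:, 1]).astype(int)
    new_weights = prob_map[rows, cols] + 1e-12  # guard against all-zero weights
    new_weights /= new_weights.sum()
    return predicted, new_weights


def synthetic_prob_map(h, w, cx, cy, sigma):
    """Hypothetical stand-in for a detector output: one Gaussian peak."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


rng = np.random.default_rng(0)
prob_map = synthetic_prob_map(60, 80, cx=40, cy=25, sigma=4.0)

# Start with particles spread uniformly over the image and equal weights.
particles = rng.uniform([0, 0], [79, 59], size=(500, 2))
weights = np.full(500, 1.0 / 500)

# Reuse the same map for every "frame" to mimic a stationary face.
for _ in range(10):
    particles, weights = condensation_step(particles, weights, prob_map, 2.0, rng)

# Weighted mean of the particle set as the position estimate (x, y).
estimate = weights @ particles
```

In this sketch the weighted particle mean settles near the peak of the probability map after a few iterations. Extending the state vector with scale and pose would follow the same resample, diffuse, and reweight pattern, with the probability looked up in the map for the corresponding scale and pose.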