We present an approach for the joint segmentation and classification of a time series. The segmentation is based on a menu of candidate statistical models. Each model must be describable in terms of a sufficient statistic, but the sufficient statistics need not be the same across models, and they can be as complex as the application requires (for example, cepstral features or autoregressive coefficients). All that is needed is the probability density function (PDF) of each sufficient statistic under its own assumed model; presumably this comes from training data. It is particularly appealing that no joint statistical characterization of all the statistics is required. Nor is there any need for an a priori specification of the number of segments, since the approach applies an appropriate penalty to over-zealous segmentation. The scheme has two stages. In the first stage, rough segmentations are obtained sequentially using a piecewise generalized likelihood ratio (GLR); in the second stage, the first-stage results (both forward and backward) are refined. The computational burden is remarkably small, approximately linear in the length of the time series, and the method performs well in terms of both the discovered number of segments and segmentation accuracy. A hybrid of the approach with one based on Gibbs sampling is also presented; this combination is somewhat slower but considerably more accurate.
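To make the idea of penalized joint segmentation and classification concrete, the following is a minimal illustrative sketch. It is not the two-stage piecewise GLR procedure summarized above. It swaps in a simple dynamic program that picks, for every sample, one model from a menu while charging a fixed penalty for each segment it opens. Several things here are assumptions made purely for illustration: the function names `gaussian_logpdf` and `segment_and_classify`, the `penalty` parameter, the restriction to scalar Gaussian models, and the use of per-sample log-likelihoods in place of PDFs of model-specific sufficient statistics.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a scalar Gaussian (hypothetical stand-in for a model PDF)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - (x - mean) ** 2 / (2.0 * std * std)


def segment_and_classify(samples, models, penalty):
    """Label each sample with one model and return the resulting segments.

    samples: non-empty list of floats.
    models:  list of (mean, std) pairs, the "menu" (an assumption of this sketch).
    penalty: cost charged for every segment, which discourages over-segmentation.
    Returns a list of (start, end, model_index) tuples with end exclusive.
    """
    k = len(models)
    # best[j]: best penalized log-likelihood of the data so far, ending in model j.
    best = [gaussian_logpdf(samples[0], m, s) - penalty for m, s in models]
    back = []  # back[t - 1][j]: model used at time t - 1 when time t uses model j

    for x in samples[1:]:
        top = max(range(k), key=lambda j: best[j])  # best model to switch away from
        new_best, pointers = [], []
        for j, (m, s) in enumerate(models):
            stay = best[j]                 # continue the current segment
            switch = best[top] - penalty   # open a new segment, paying the penalty
            if stay >= switch:
                score, prev = stay, j
            else:
                score, prev = switch, top
            new_best.append(score + gaussian_logpdf(x, m, s))
            pointers.append(prev)
        best = new_best
        back.append(pointers)

    # Backtrack from the best final model to recover the per-sample labels.
    j = max(range(k), key=lambda i: best[i])
    labels = [j]
    for pointers in reversed(back):
        j = pointers[j]
        labels.append(j)
    labels.reverse()

    # Collapse runs of identical labels into (start, end, model_index) segments.
    segments, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            segments.append((start, t, labels[start]))
            start = t
    return segments
```

The cost of this sketch grows linearly with the series length for a fixed menu size, and the number of segments is never specified in advance: it follows from the trade-off between likelihood and `penalty`. A larger `penalty` makes the sketch ignore isolated outliers rather than opening short segments around them.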