A classical solution for matching two image patches is to use the cross-correlation coefficient. This works well if there is a lot of structure within the patches, but not so well if the patches are close to uniform. This means that some patches are matched with more confidence than others. By estimating this uncertainty, more weight can be put on the confident matches than on the uncertain ones. In this paper we present a system that can learn the distribution of the correlation coefficient from a video sequence of an empty scene. No manual annotation of the video is needed. Two distribution functions are learned for two different cases: i) the correlation between an estimated background image and the current frame showing that background, and ii) the correlation between an estimated background image and an unrelated patch. Using these two distributions, the patch matching problem is formulated as a binary classification problem and the probability of two patches matching is derived. The model depends on the signal-to-noise ratio. The noise level is reasonably invariant over time, while the signal level, represented by the amount of structure in the patch or its spatial variance, has to be measured for every frame. A common application where this is useful is feature point matching between different images. Another application is background/foreground segmentation. In this paper it is shown how the theory can be used to implement a very fast background/foreground segmentation by transforming the calculations to the DCT domain and processing a motion JPEG stream without decompressing it. This allows the algorithm to be embedded on a 150 MHz ARM-based network camera. It is also suggested to use recursive quantile estimation to estimate the background model. This gives very accurate background models even if there is a lot of foreground present during the initialisation of the model.
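As an illustration of two building blocks named above, the following is a minimal sketch in plain Python. It is not the implementation from the paper: the function names `correlation_coefficient` and `update_quantile`, the flat-list patch representation, and the fixed step size are all assumptions made for this example. The quantile update is a generic sign-based stochastic approximation, swapped in here because the exact recursive scheme is not specified in this text.

```python
import math


def correlation_coefficient(a, b):
    """Cross-correlation coefficient of two equally sized patches.

    Patches are flat lists of pixel intensities (an assumption of this
    sketch). Returns None when either patch is uniform, since the
    coefficient is undefined for zero variance.
    """
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal size")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    var_a = sum(x * x for x in da)
    var_b = sum(y * y for y in db)
    if var_a == 0.0 or var_b == 0.0:
        return None  # uniform patch: no structure to correlate
    cross = sum(x * y for x, y in zip(da, db))
    return cross / math.sqrt(var_a * var_b)


def update_quantile(estimate, sample, p=0.5, step=1.0):
    """One step of a simple recursive estimate of the p-quantile.

    The estimate moves up by step * p when the sample is above it and
    down by step * (1 - p) when the sample is below it, so it settles
    where a fraction p of the samples lie below. With p = 0.5 this
    tracks the median, which is robust to occasional foreground values.
    """
    if sample > estimate:
        return estimate + step * p
    if sample < estimate:
        return estimate - step * (1.0 - p)
    return estimate


if __name__ == "__main__":
    # Hypothetical pixel stream: background value 100, with a brighter
    # foreground object (200) covering the pixel a quarter of the time.
    background = 0.0
    for i in range(1000):
        sample = 200.0 if i % 4 == 3 else 100.0
        background = update_quantile(background, sample)
    print(background)

    print(correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8]))
    print(correlation_coefficient([5, 5, 5, 5], [1, 2, 3, 4]))
```

The sketch returns `None` for a uniform patch rather than a number, which mirrors the observation that near-uniform patches carry little information for matching. The median-style update needs only one stored value per pixel, which suits devices with limited memory.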