How to assess similarity effectively and efficiently is a long-standing and challenging research problem in various multimedia applications. For ranked retrieval in a collection of objects represented by series of multivariate observations (e.g., searching for video clips similar to a query example), many conventional similarity measures that aggregate element-to-element comparison results cannot achieve satisfactory performance. Correlation information among the individual elements has also been used to characterize each set of multidimensional points for comparison, but under the unwarranted assumption that the underlying data distribution has a particular parametric form. Motivated by these concerns, this paper approaches the similarity of multidimensional point sets from a novel collective perspective, by evaluating the probability that they are consistent with the same distribution. We propose using nonparametric hypothesis tests from statistics to compute the distributional discrepancy of samples, and thereby assess the degree of similarity between two ensembles of points. While our proposal is presented mainly in the context of video similarity search, it is flexible and extensible to other applications that involve multidimensional point set representations, such as motion capture retrieval.
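To make the idea of comparing two point sets through a nonparametric two-sample test concrete, a minimal sketch follows. The abstract does not name a specific test, so this example swaps in an energy-distance statistic with a permutation test as one possible choice; it is not necessarily the test used in the paper. The function names (`energy_distance`, `permutation_p_value`) and parameters are hypothetical, and each point set is assumed to be a NumPy array of shape (number of points, number of dimensions).

```python
import numpy as np


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all pairs (one point from a, one from b)."""
    diffs = a[:, None, :] - b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).mean()


def energy_distance(x, y):
    """Energy-distance statistic between two point sets (illustrative choice).

    It is zero when both sets are identical and grows as the two
    empirical distributions move apart.
    """
    return (2.0 * mean_pairwise_distance(x, y)
            - mean_pairwise_distance(x, x)
            - mean_pairwise_distance(y, y))


def permutation_p_value(x, y, n_permutations=200, seed=0):
    """Estimate the p-value of the null hypothesis 'same distribution'.

    Points are pooled and randomly re-split; the p-value is the fraction of
    re-splits whose statistic is at least as large as the observed one.
    A larger p-value can be read as a higher degree of similarity.
    """
    rng = np.random.default_rng(seed)
    observed = energy_distance(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = energy_distance(pooled[perm[:n]], pooled[perm[n:]])
        if stat >= observed:
            exceed += 1
    # The +1 terms keep the estimate strictly positive.
    return (exceed + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    query = rng.normal(size=(50, 8))               # e.g. frame features of a query clip
    similar = rng.normal(size=(60, 8))             # drawn from the same distribution
    different = rng.normal(loc=2.0, size=(60, 8))  # shifted distribution
    print("similar clip p-value:  ", permutation_p_value(query, similar))
    print("different clip p-value:", permutation_p_value(query, different))
```

In a retrieval setting, candidates could then be ranked by such a score, with no assumption about the parametric form of the underlying distribution. The quadratic cost of the pairwise distances makes this sketch suitable only for small point sets.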