Retrieval accuracy in content-based multimedia retrieval can be improved by using a distance metric learned from the distribution of features in the input feature space. One way to achieve this is dimension reduction via manifold learning, such as Locally Linear Embedding. While effective in improving retrieval accuracy, these algorithms have a high computational cost that depends on the feature dimensionality d and the number of training samples N. In this paper, we explore a clustering-based approach to reducing the number of training samples: it uses L cluster centers (L≪N) computed from the N input features as training samples. We propose to use an extremely randomized clustering tree for clustering. Experiments showed that the proposed approach produces better retrieval performance than random sampling, and that the randomized tree is much faster than the k-means algorithm.
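The sample-reduction idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a simplified randomized tree that splits on a randomly chosen dimension at a random threshold and takes the mean of each leaf as a cluster center. The names `centroid`, `random_tree_centers`, and `max_leaf_size` are hypothetical, and the synthetic Gaussian features stand in for real multimedia features.

```python
import random

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def random_tree_centers(points, max_leaf_size, rng):
    """Reduce N feature vectors to L cluster centers with a randomized tree.

    Assumption: a simplified stand-in for an extremely randomized
    clustering tree. Each node picks a random dimension and a random
    threshold within that dimension's range, with no optimization.
    """
    if len(points) <= max_leaf_size:
        return [centroid(points)]
    dims = list(range(len(points[0])))
    rng.shuffle(dims)  # try dimensions in random order
    for dim in dims:
        lo = min(p[dim] for p in points)
        hi = max(p[dim] for p in points)
        if lo == hi:
            continue  # no spread along this dimension, cannot split here
        t = rng.uniform(lo, hi)
        left = [p for p in points if p[dim] < t]
        right = [p for p in points if p[dim] >= t]
        if left and right:
            return (random_tree_centers(left, max_leaf_size, rng)
                    + random_tree_centers(right, max_leaf_size, rng))
    # No valid split found (e.g. all points identical): emit one center.
    return [centroid(points)]

# Synthetic stand-in data: N = 500 features of dimensionality d = 8.
rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(500)]
centers = random_tree_centers(features, max_leaf_size=10, rng=rng)
print(len(features), "input features ->", len(centers), "cluster centers")
```

In this sketch the L centers, rather than all N features, would then be passed as training samples to the manifold learner. Building the tree needs no iterative refinement of the kind k-means performs, since each split is drawn at random.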