An Enhanced Bag-of-Visual Word Vector Space Model to Represent Visual Content in Athletics Images

2 Author(s)
Kesorn, K. (Comput. Sci. & Inf. Technol. Dept., Naresuan Univ., Phitsanulok, Thailand); Poslad, S.

Images that differ in visual appearance may be semantically related at a higher level of conceptualization. However, image classification and retrieval systems tend to rely only on the low-level visual structure within images. This paper presents a framework that addresses this semantic gap by exploiting the well-known bag-of-visual-words (BVW) model to represent visual content. The novelty of this paper is threefold. First, the quality of visual words is improved by constructing them from representative keypoints. Second, domain-specific “non-informative visual words” are detected; these are useless for representing the content of visual data and can degrade the categorization capability. Distinct from existing frameworks, two main characteristics of non-informative visual words are defined: a high document frequency (DF) and a small statistical association with all the concepts in the collection. Third, a novel method restructures the vector space model of visual words with respect to a structural ontology model in order to resolve visual synonym and polysemy problems. The experimental results show that the method can disambiguate visual word senses effectively and can significantly improve classification, interpretation, and retrieval performance for athletics images.
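The two criteria for non-informative visual words (high DF, weak association with every concept) can be illustrated with a minimal sketch. The abstract does not name the association statistic or any thresholds, so the chi-square score, the function name `non_informative_words`, and the cutoffs `df_thresh` and `chi2_thresh` below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def non_informative_words(counts, labels, df_thresh=0.8, chi2_thresh=1.0):
    """Flag visual words with high DF and low association with all concepts.

    counts: (n_images, n_words) visual-word histogram per image.
    labels: (n_images,) concept id per image.
    Thresholds are hypothetical; chi-square is an assumed stand-in statistic.
    """
    counts = np.asarray(counts)
    labels = np.asarray(labels)
    present = counts > 0                     # does word j occur in image i?
    n_images = present.shape[0]
    df = present.mean(axis=0)                # document frequency as a fraction

    max_chi2 = np.zeros(present.shape[1])
    for concept in np.unique(labels):
        in_c = labels == concept
        a = present[in_c].sum(axis=0).astype(float)   # present, in concept
        b = present[~in_c].sum(axis=0).astype(float)  # present, other concepts
        c = in_c.sum() - a                            # absent, in concept
        d = (~in_c).sum() - b                         # absent, other concepts
        num = n_images * (a * d - b * c) ** 2
        den = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means the word carries no signal; score it 0.
        chi2 = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
        max_chi2 = np.maximum(max_chi2, chi2)

    # Non-informative: frequent everywhere AND weakly tied to every concept.
    return np.flatnonzero((df >= df_thresh) & (max_chi2 <= chi2_thresh))
```

Taking the maximum score over concepts is one way to encode "small association with all the concepts": a word is dropped only if even its strongest association is weak.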

Published in:

IEEE Transactions on Multimedia (Volume: 14, Issue: 1)