Cart (Loading....) | Create Account
Close category search window
 

Nearest neighbors by neighborhood counting

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

1 Author(s)
Hui Wang ; Fac. of Eng., Ulster Univ., Jordanstown, UK

Finding nearest neighbors is a general idea that underlies many artificial intelligence tasks, including machine learning, data mining, natural language understanding, and information retrieval. This idea is explicitly used in the k-nearest neighbors algorithm (kNN), a popular classification method. In this paper, this idea is adopted in the development of a general methodology, neighborhood counting, for devising similarity functions. We turn our focus from neighbors to neighborhoods, a region in the data space covering the data point in question. To measure the similarity between two data points, we consider all neighborhoods that cover both data points. We propose to use the number of such neighborhoods as a measure of similarity. Neighborhood can be defined for different types of data in different ways. Here, we consider one definition of neighborhood for multivariate data and derive a formula for such similarity, called neighborhood counting measure or NCM. NCM was tested experimentally in the framework of kNN. Experiments show that NCM is generally comparable to VDM and its variants, the state-of-the-art distance functions for multivariate data, and, at the same time, is consistently better for relatively large k values. Additionally, NCM consistently outperforms HEOM (a mixture of Euclidean and Hamming distances), the "standard" and most widely used distance function for multivariate data. NCM has a computational complexity in the same order as the standard Euclidean distance function and NCM is task independent and works for numerical and categorical data in a conceptually uniform way. The neighborhood counting methodology is proven sound for multivariate data experimentally. We hope it works for other types of data.

Published in:

Pattern Analysis and Machine Intelligence, IEEE Transactions on  (Volume:28 ,  Issue: 6 )

Date of Publication:

June 2006

Need Help?


IEEE Advancing Technology for Humanity About IEEE Xplore | Contact | Help | Terms of Use | Nondiscrimination Policy | Site Map | Privacy & Opting Out of Cookies

A not-for-profit organization, IEEE is the world's largest professional association for the advancement of technology.
© Copyright 2014 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.