Nearest Neighbor
Nearest neighbor classification includes the Nearest Neighbor, K-Nearest Neighbor, and Minimum Mean Distance algorithms. The most intuitive way to determine the class of a feature vector is to measure its proximity to a class, or to the features of a class, using a distance function. Depending on how this proximity is defined, several different algorithms result, as follows.
Distance Metrics
The Particle Classifier and the Color Classifier provide three distance metrics: Euclidean distance, Sum distance, and Maximum distance.
Let $X = [x_1, x_2, \ldots, x_n]$ and $Y = [y_1, y_2, \ldots, y_n]$ be the feature vectors.
| Distance metric | Definition |
| --- | --- |
| Euclidean distance (L2) | $d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ |
| Sum distance, also known as the City-Block or Manhattan metric (L1) | $d(X, Y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert$ |
| Maximum distance (L∞) | $d(X, Y) = \max_{i} \lvert x_i - y_i \rvert$ |
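For concreteness, the following is a minimal Python sketch of the three metrics. The function names are illustrative assumptions and are not part of the Particle Classifier or Color Classifier API.

```python
import numpy as np

def euclidean_distance(x, y):
    """L2 metric: square root of the sum of squared component differences."""
    return float(np.sqrt(np.sum((np.asarray(x) - np.asarray(y)) ** 2)))

def sum_distance(x, y):
    """L1 (City-Block / Manhattan) metric: sum of absolute component differences."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y))))

def maximum_distance(x, y):
    """L-infinity metric: largest absolute component difference."""
    return float(np.max(np.abs(np.asarray(x) - np.asarray(y))))

# The three metrics can rank the same pair of vectors differently.
x = [1.0, 2.0, 3.0]
y = [4.0, 2.0, 1.0]
print(euclidean_distance(x, y))  # sqrt(9 + 0 + 4) ≈ 3.606
print(sum_distance(x, y))        # 3 + 0 + 2 = 5
print(maximum_distance(x, y))    # max(3, 0, 2) = 3
```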
Nearest Neighbor Classifier
In Nearest Neighbor classification, the distance of an input feature vector X of unknown class to a class $C_j$ is defined as the distance to the closest sample that is used to represent the class:

$$d(X, C_j) = \min_{i} \, d(X, X_i^j)$$

where $d(X, X_i^j)$ is the distance between $X$ and $X_i^j$, the i-th sample representing class $C_j$.
The classification rule assigns a pattern X of unknown class to the class of its nearest neighbor.
Nearest neighbor classification is the most intuitive approach for classification. If representative feature vectors for each class are available, Nearest Neighbor classification works well in most classification applications.
In some classification applications, a class may be represented by multiple samples that are not in the same cluster, as shown in the following figure. In such applications, the Nearest Neighbor classifier is more effective than the Minimum Mean Distance classifier.
(Figure: sample feature patterns of two classes, where o = Class 1 and x = Class 2.)
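As a rough sketch of this rule (not the classifier's actual API), the Python code below uses Euclidean distance and assigns an input to the class containing its single closest sample. The function name nearest_neighbor_classify and the dict layout are assumptions for illustration; note that a class represented by several separate clusters is still handled correctly, because only the closest sample matters.

```python
import numpy as np

def nearest_neighbor_classify(x, class_samples, distance=None):
    """Assign x to the class whose closest training sample is nearest to x.

    class_samples: dict mapping a class label to a list of feature vectors.
    distance: distance function d(a, b); Euclidean (L2) by default.
    """
    if distance is None:
        distance = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
    best_label, best_dist = None, float("inf")
    for label, samples in class_samples.items():
        # Distance from x to a class = distance to that class's closest sample.
        d = min(distance(x, s) for s in samples)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Class 2 is represented by two separate clusters; the input is still
# classified correctly because only the single closest sample is used.
samples = {
    "Class 1": [[0.0, 0.0], [0.2, 0.1]],
    "Class 2": [[5.0, 5.0], [5.1, 4.9], [-4.0, -4.0]],
}
print(nearest_neighbor_classify([-3.8, -4.2], samples))  # -> "Class 2"
```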
K-Nearest Neighbor Classifier
In K-Nearest Neighbor classification, an input feature vector X is classified into class Cj based on a voting mechanism. The classifier finds the K nearest samples from all of the classes. The input feature vector of the unknown class is assigned to the class with the majority of the votes in the K nearest samples.
Outlier feature patterns caused by noise in real-world applications can produce erroneous classifications when Nearest Neighbor classification is used. As the following figure illustrates, K-Nearest Neighbor classification is more robust to such noise than Nearest Neighbor classification: with X as the input, K = 1 outputs Label 1, whereas K = 3 outputs Label 2.
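A minimal Python sketch of the voting mechanism is shown below, assuming parallel lists of samples and labels and Euclidean distance. The function name k_nearest_neighbor_classify is an illustrative assumption, not the classifier's API; the example data mimics the figure, where a single noisy outlier wins at K = 1 but is outvoted at K = 3.

```python
from collections import Counter
import numpy as np

def k_nearest_neighbor_classify(x, samples, labels, k=3):
    """Assign x to the class with the majority vote among its k nearest samples."""
    x = np.asarray(x, dtype=float)
    dists = [float(np.linalg.norm(x - np.asarray(s, dtype=float))) for s in samples]
    # Indices of the k closest training samples.
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

samples = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.1], [5.0, 4.8], [5.3, 5.0], [4.8, 5.2]]
labels  = ["Label 1", "Label 1", "Label 1", "Label 2", "Label 2", "Label 2"]
x = [5.0, 5.0]  # the third sample is a noisy Label 1 outlier right next to x
print(k_nearest_neighbor_classify(x, samples, labels, k=1))  # -> "Label 1" (outlier wins)
print(k_nearest_neighbor_classify(x, samples, labels, k=3))  # -> "Label 2" (majority vote)
```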
Minimum Mean Distance Classifier
Let $\{X_1^j, X_2^j, \ldots, X_{n_j}^j\}$ be the $n_j$ feature vectors that represent class $C_j$. Each feature vector has the label of class j that you have selected to represent the class. The center of class j is defined as

$$M_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_i^j$$
The classification phase classifies an input feature vector X of unknown class based on its distance to each class center, assigning X to the class $C_k$ whose center is closest:

$$d(X, M_k) = \min_{j} \, d(X, M_j)$$

where $d(X, M_j)$ is the distance function based on the distance metric selected during the training phase.
In applications with little feature pattern variability and little noise, the feature patterns of each class tend to cluster tightly around the class center. Under these conditions, Minimum Mean Distance classifiers perform effectively: during real-time classification, only the distances from the input vector to the class centers need to be computed, instead of the distances to all of the representative samples.
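A minimal Python sketch of both phases follows, assuming Euclidean distance; the function names train_class_centers and minimum_mean_distance_classify are illustrative assumptions, not the classifier's API. Note that classification touches one center per class rather than every training sample, which is the speed advantage described above.

```python
import numpy as np

def train_class_centers(class_samples):
    """Training phase: the center M_j of each class is the mean of its samples."""
    return {label: np.mean(np.asarray(samples, dtype=float), axis=0)
            for label, samples in class_samples.items()}

def minimum_mean_distance_classify(x, centers):
    """Classification phase: assign x to the class with the nearest center M_j."""
    x = np.asarray(x, dtype=float)
    return min(centers, key=lambda label: float(np.linalg.norm(x - centers[label])))

# Tightly clustered classes: one distance per class is computed at
# classification time, instead of one distance per training sample.
training = {
    "Class 1": [[0.0, 0.0], [0.1, 0.2], [-0.1, 0.1]],
    "Class 2": [[5.0, 5.0], [5.2, 4.9], [4.9, 5.1]],
}
centers = train_class_centers(training)
print(minimum_mean_distance_classify([4.7, 5.3], centers))  # -> "Class 2"
```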