In wifi fingerprinting, the k-Nearest Neighbors algorithm (knn) selects the k closest training examples in the feature space. This particular implementation of the algorithm works as follows:

The training and validation sets are composed of several vectors in a multidimensional feature space. These features represent the received signal strength indicator (RSSI) for each wireless access point (WAP) detected. Every one of these vectors corresponds to a position in the space, given by its longitude and latitude.

Every observation from the validation set is compared with all the vectors from the training set. The algorithm selects the k more similar to the observation. The similarity is measured tipically by computing the euclidean distance between the two feature vectors, although this parameter can be customized to use any other distance measure (manhattan, bray, etc.)

Once the k nearest neighbors are selected, the predicted position for the observation is estimated as the weighted average of the positions of the selected neighbors. The weights to the contributions of the neighbors are assigned for the nearer neighbors to contribute more to the average than the more distant ones.

Implementation

Let’s see how to implement the algorithm with an example. Generally, RSSI values are represented in a negative form, where the closer the value is to 0, the stronger the received signal is. For convenience, in this example RSSI is stored as a positive integer value ranging from 0 (no signal received) to 100.


This is the training set:

##    WAP1 WAP2 WAP3 WAP4 WAP5 WAP6 WAP7 WAP8 WAP9 WAP10 LONGITUDE LATITUDE
## 1     0    0   80   70    0    0   15   30    0     0      1050     3200
## 2    20   10   30   10    0    0    0   10    0    10      1063     3210
## 3     0    0   60   60    0   20    0   20   10     0      1052     3198
## 4    30    0   10    0   10    0   30   10    0     0      1071     3190
## 5    20   20   20   30   10   10   10   40    0    20      1065     3201
## 6    40   10    0   30    0   10   10   40    0     0      1032     3192
## 7    30    0   30   20   10   20    0   30    0     0      1035     3196
## 8    20    0   60   60    0   40   20    0    0     0      1055     3173
## 9    10    0   20   30    0   30   20   10    0     0      1070     3180
## 10   30   10   40   20   10    0    0   20   20    10      1037     3220


and this is the validation example:

##   WAP1 WAP2 WAP3 WAP4 WAP5 WAP6 WAP7 WAP8 WAP9 WAP10 LONGITUDE LATITUDE
## 1    0    0   60   50    0   40   15   20    0     0      1054     3180


This plot shows the position of the measurements for both training and validation sets.