This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
== Green Man POC | |
:neo4j-version: neo4j-2.1 | |
:author: John Swain | |
:twitter: @swainjo | |
:tags: domain:POC | |
=== Load Sample Data | |
//setup | |
//hide |
The k-nearest neighbors (k-NN) algorithm is among the simplest algorithms in the data mining field. Distances / similarities are calculated between each element in the data set using some distance / similarity metric ^[1]^ that the researcher chooses (there are many distance / similarity metrics), where the distance / similarity between any two elements is calculated based on the two elements' attributes. A data element’s k-NN are the k closest data elements according to this distance / similarity.
1. A distance metric measures distance; the higher the distance the further apart the neighbors. A similarity metric measures similarity; the higher the similarity the closer the neighbors.