@noemi-dresden
Last active June 18, 2018 12:47
Clustering algorithms overview
| Name | Definition | Example | Characteristics | Organization |
|---|---|---|---|---|
| Connectivity models | Data points closer together in data space are more similar than those farther apart | Hierarchical clustering | Easy to interpret, but does not scale well to large datasets | Hierarchical |
| Centroid models | Iterative; similarity is interpreted as the proximity of a data point to a cluster centroid | K-means | The number of clusters must be specified in advance | Non-hierarchical |
| Distribution models | Based on the probability that the data points in a cluster belong to the same distribution | EM algorithm (Expectation-Maximization) | Prone to overfitting | Non-hierarchical |
| Density models | Isolate regions of differing density as the basis for clustering | Density-Based Spatial Clustering of Applications with Noise (DBSCAN) | Performs poorly on high-dimensional data and on clusters with varying densities | Non-hierarchical |
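
To make the centroid-model row concrete, here is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It is an illustration, not a reference implementation: the function name `kmeans`, its parameters (`iterations`, `seed`), and the sample `points` are assumptions chosen for this example. Note that `k` has to be passed in, which is the characteristic listed in the table.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster equal-length numeric tuples into k groups (Lloyd's algorithm)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct points picked at random from the data.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical sample data: two well-separated groups in 2D.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

On data like this the first three points should end up sharing one label and the last three the other. Which group receives label 0 depends on the random initialization.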