@wabarr
Created September 19, 2015 13:51
testing if kmeans clustering is deterministic or not

kmeans testing

Andrew Barr
September 19, 2015

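library(ggplot2)

# Two independent standard-normal variables, so there is no real cluster structure.
groups <- factor(rep(c("A", "B"), 100))
x <- rnorm(200)
y <- rnorm(200)

# as.matrix(x, y) silently ignores y, so bind the two columns explicitly.
xy <- cbind(x, y)

# Run kmeans four times on identical data; only the random starting centers differ.
clusters <- kmeans(xy, 2)
clusters2 <- kmeans(xy, 2)
clusters3 <- kmeans(xy, 2)
clusters4 <- kmeans(xy, 2)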

table(clusters$cluster)
## 
##   1   2 
## 115  85
table(clusters2$cluster)
## 
##   1   2 
## 115  85
table(clusters3$cluster)
## 
##   1   2 
##  85 115
table(clusters4$cluster)
## 
##   1   2 
## 115  85
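
The third run reports the same two group sizes with the labels 1 and 2 swapped, which suggests the same partition under a different numbering. Equal sizes alone do not prove that, so a cross-tabulation is a more direct check. The sketch below is not part of the original gist: `same_partition` is a hypothetical helper name, and it assumes both label vectors cover the same observations in the same order.

```r
# Hypothetical helper (not from the original gist): TRUE when two label
# vectors describe the same partition, ignoring which number each cluster got.
same_partition <- function(a, b) {
  tab <- table(a, b)
  # Identical partitions leave exactly one non-zero cell in every row and column.
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

same_partition(c(1, 1, 2, 2), c(2, 2, 1, 1))  # TRUE: labels swapped, same grouping
same_partition(c(1, 1, 2, 2), c(1, 2, 1, 2))  # FALSE: different grouping
```

Applied to the runs above, `same_partition(clusters$cluster, clusters3$cluster)` would indicate whether the third run differs only in labeling.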
forPlot <- data.frame(x, y)
# guides(color = "none") replaces the deprecated guides(color = FALSE).
qplot(x, y, color=as.factor(clusters$cluster), size=I(4), data=forPlot, main="First Clustering") + guides(color="none")

qplot(x, y, color=as.factor(clusters2$cluster), size=I(4), data=forPlot, main="Second Clustering") + guides(color="none")

qplot(x, y, color=as.factor(clusters3$cluster), size=I(4), data=forPlot, main="Third Clustering") + guides(color="none")

qplot(x, y, color=as.factor(clusters4$cluster), size=I(4), data=forPlot, main="Fourth Clustering") + guides(color="none")
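
The run-to-run differences come from the random choice of starting centers that `kmeans` makes when it is given only a cluster count. A minimal sketch of pinning that down, assuming the default algorithm: resetting the seed immediately before each call should yield identical assignments. The seed values and the names `pts`, `fit_a`, `fit_b`, and `fit_multi` are illustrative and do not appear in the original gist.

```r
set.seed(42)                          # arbitrary seed, for illustration only
pts <- cbind(rnorm(200), rnorm(200))  # same shape of data as in the gist

# Resetting the seed before each call gives both runs the same starting centers.
set.seed(1)
fit_a <- kmeans(pts, centers = 2)
set.seed(1)
fit_b <- kmeans(pts, centers = 2)

identical(fit_a$cluster, fit_b$cluster)  # TRUE

# nstart tries several random starts and keeps the best one, which makes the
# result less sensitive to any single unlucky start.
fit_multi <- kmeans(pts, centers = 2, nstart = 25)
```

Seeding makes a single run repeatable, while `nstart` addresses the quality of the solution; the two can be combined.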
