@timcdlucas
Created October 4, 2019 18:47
pca with log + eps
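# PCA on log-transformed data, and the effect of adding a small eps before the log.
set.seed(1)  # make the random draws reproducible
# Draw x away from zero so that y = x + noise stays positive and log(y) is defined.
d <- data.frame(x = runif(100, 0.1, 1))
d$y <- d$x + runif(100, -0.01, 0.01)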
plot(y ~ x, data = d)
# x and y are very highly correlated, so PC1 should lie along the diagonal.
m <- prcomp(x = log(d))
# The x and y loadings on PC1 are (nearly) equal in magnitude, i.e. PC1 lies along the diagonal.
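# A minimal sketch, not part of the original analysis: one way to check the
# claim above is to compare the absolute PC1 loadings directly. The names
# d_chk, m_chk and pc1_chk are hypothetical, and x is drawn from [0.1, 1]
# here (an assumption) so that y stays positive and the log is defined.
set.seed(1)
d_chk <- data.frame(x = runif(100, 0.1, 1))
d_chk$y <- d_chk$x + runif(100, -0.01, 0.01)
m_chk <- prcomp(x = log(d_chk))
# Loadings are only defined up to sign, so compare absolute values.
pc1_chk <- abs(m_chk$rotation[, 1])
print(pc1_chk)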
# Now add a row with a single zero (in x).
d <- rbind(d, c(0, 0.5))
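# A small illustration of why an offset is needed at all (d_zero and res_zero
# are hypothetical names): log(0) is -Inf, and prcomp() stops with an error
# when the matrix contains infinite values, hence the eps added before the log.
d_zero <- data.frame(x = c(0.2, 0.4, 0), y = c(0.21, 0.39, 0.5))
print(log(d_zero$x))
# Capture the error instead of halting the script.
res_zero <- tryCatch(prcomp(x = log(d_zero)), error = function(e) e)
print(inherits(res_zero, "error"))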
# Make vector of small numbers to add.
eps_vec <- exp(seq(-30, 1, length.out = 20))
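# For each eps, store the ratio of the x loading to the y loading on PC1.
ratio <- rep(NA_real_, length(eps_vec))
for (i in seq_along(eps_vec)) {
  m <- prcomp(x = log(d + eps_vec[i]))
  ratio[i] <- m$rotation[1, 1] / m$rotation[2, 1]
}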
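# When eps is around 1 (the largest values tried), PC1 is still roughly equal parts x and y.
# When eps is small, x dominates: the zero becomes log(eps) (e.g. log(1e-10) is about -23),
# an extreme outlier that makes x by far the highest-variance column.
# eps spans many orders of magnitude, so use a log-scaled x axis.
plot(ratio ~ eps_vec, log = "x")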
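# A hedged sketch of the reasoning above (names ending in _demo are
# hypothetical, and x is drawn from [0.1, 1] as an assumption so the logs are
# defined): as eps shrinks, the single zero in x maps to log(eps), a large
# negative outlier, so the spread of log(x + eps) grows while the spread of
# log(y + eps) changes far less.
set.seed(1)
d_demo <- data.frame(x = runif(100, 0.1, 1))
d_demo$y <- d_demo$x + runif(100, -0.01, 0.01)
d_demo <- rbind(d_demo, c(0, 0.5))
eps_demo <- c(1e-10, 1)
# Standard deviation of each log-transformed column, one matrix column per eps.
sd_demo <- sapply(eps_demo, function(eps) apply(log(d_demo + eps), 2, sd))
colnames(sd_demo) <- paste0("eps = ", eps_demo)
print(sd_demo)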