@explodecomputer
Created December 4, 2015 12:37
missingness and PCs
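# Replace missing values in each column with that column's mean
impute <- function(X)
{
  apply(X, 2, function(x) {
    # na.rm (not na.rn) is required, otherwise the mean itself is NA
    x[is.na(x)] <- mean(x, na.rm=TRUE)
    return(x)
  })
}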
# Original matrix
a <- matrix(rnorm(1700*20000), 1700)
# Missing at random
b <- a
b[sample(1:length(a), 500)] <- NA
b_imputed <- impute(b)
# Non-random missingness, e.g. set the highest value of every column to missing
c <- apply(a, 2, function(x) {
x[which.max(x)] <- NA
return(x)
})
# Impute with 0, the expected value of the simulated data (not the column mean)
c_imputed <- c
c_imputed[is.na(c_imputed)] <- 0
# PCs
a_pc <- prcomp(a)
b_pc <- prcomp(b_imputed)
c_pc <- prcomp(c_imputed)
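# Correlation of each of the first 10 PCs with its counterpart from the original
# matrix. The sign of a PC is arbitrary, so judge agreement by absolute value.
diag(cor(a_pc$x[,1:10], b_pc$x[,1:10]))
diag(cor(a_pc$x[,1:10], c_pc$x[,1:10]))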
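# A smaller, reproducible sketch of the same comparison. This is an
# illustrative assumption, not part of the original analysis: the seed, the
# reduced dimensions (200 x 500, 50 missing cells) and the helper name
# pc_agreement are all hypothetical choices made to keep the run fast.
set.seed(1)
small <- matrix(rnorm(200 * 500), 200)
# Missing at random, then column-mean imputation
small_mar <- small
small_mar[sample(length(small), 50)] <- NA
small_mar_imputed <- apply(small_mar, 2, function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})
# Absolute correlation between matching PCs of two matrices, so that an
# arbitrary sign flip of a component does not read as disagreement
pc_agreement <- function(ref, other, k = 10) {
  abs(diag(cor(prcomp(ref)$x[, 1:k], prcomp(other)$x[, 1:k])))
}
small_agreement <- pc_agreement(small, small_mar_imputed)
small_agreement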