@AndrewLJackson
Created December 9, 2014 10:58
Example code showing how transformations to the raw data affect estimation of covariance matrices, and hence ellipse size and shape.
rm(list=ls())
graphics.off()
library(mnormt)
set.seed(1)
# Notes on re-scaling:
# if x and y are two correlated variables with covariance matrix S, then
# let x.z and y.z be their independently z-score transformed values.
# simulate bivariate normal data with means (10, 20), variances 1 and 5,
# and covariance 2
n <- 50
rr <- rmnorm(n, c(10, 20), matrix(c(1, 2, 2, 5), 2, 2))
# extract and label them x and y
x <- rr[,1]
y <- rr[,2]
# some summary statistics on our raw data
mu.x <- mean(x)
mu.y <- mean(y)
var.x <- var(x)
var.y <- var(y)
sd.x <- sd(x)
sd.y <- sd(y)
# and the sample covariance matrix
S <- cov(data.frame(x,y))
# our z-score transformed data
x.z <- (x - mu.x) / sd.x
y.z <- (y - mu.y) / sd.y
# ---------------------------------------------------------------------
# Note that their correlation is preserved in this transform
dev.new()
par(mfrow = c(1,2))
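plot(x, y,
     main = paste("raw data, cor = ", round(cor(x, y), digits = 2), sep=""))
# compute the second title from the z-scored data so the two panels
# actually demonstrate that the correlation is unchanged
plot(x.z, y.z,
     main = paste("z-score data, cor = ", round(cor(x.z, y.z), digits = 2), sep=""))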
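# A quick numerical check of the claim above (a minimal sketch added for
# illustration; cor.raw and cor.z are names introduced here, not part of
# the original notes). The titles only show two decimal places, so we
# compare the two correlations to within floating-point tolerance.
cor.raw <- cor(x, y)
cor.z <- cor(x.z, y.z)
print(all.equal(cor.raw, cor.z))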
# ---------------------------------------------------------------------
# Now estimate the covariance matrix on the z-scored data and
# back-transform it to the original scale. Although the correlation
# coefficient is unchanged by the z-score transform, the covariance
# matrices S and S.z are not the same, and hence an ellipse fitted to
# each will differ in size and shape. We can back-transform S.z though.
# First compare S with S.z:
S.z <- cov(data.frame(x.z,y.z))
print(S)
print(S.z)
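# Why S.z differs from S (a sketch added for illustration; R.raw is a
# name introduced here). Each z-scored variable has variance 1, and the
# covariance of the z-scores is cov(x, y) / (sd.x * sd.y), so S.z should
# equal the correlation matrix of the raw data. The dimnames differ
# (x.z, y.z versus x, y), so attributes are ignored in the comparison.
R.raw <- cor(data.frame(x, y))
print(all.equal(S.z, R.raw, check.attributes = FALSE))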
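# back-transform: scale each variance by the raw variance, and the
# covariance by the product of the raw standard deviations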
S.back <- matrix(0, 2, 2)
S.back[1,1] <- S.z[1,1] * var.x
S.back[2,2] <- S.z[2,2] * var.y
S.back[1,2] <- S.z[1,2] * sd.x * sd.y
S.back[2,1] <- S.back[1,2]
# And now our back-transformed covariance matrix is the same
# as the covariance matrix calculated on the raw data.
# Hence, the ellipses fitted to both of these will be identical.
print(S.back)
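# The element-wise back-transform above can also be written in matrix
# form as D %*% S.z %*% D, where D is a diagonal matrix holding the raw
# standard deviations. This is a sketch added for illustration; D and
# S.back.mat are names introduced here. The checks confirm that both
# routes agree with each other and with S (S carries dimnames and S.back
# does not, so attributes are ignored in the comparison).
D <- diag(c(sd.x, sd.y))
S.back.mat <- D %*% S.z %*% D
print(all.equal(S.back.mat, S.back, check.attributes = FALSE))
print(all.equal(S.back, S, check.attributes = FALSE))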