@explodecomputer
Created November 21, 2015 12:34
aries data
load("aries_funnorm.randomeffect.pc10.160915.Robj")
load("samplesheet_aries.160915.Robj")
# Make sure that all the sample_names in samplesheet match the colnames in norm.beta.random
all(samplesheet$Sample_Name == colnames(norm.beta.random))
# This is not the case so we need to match things up
# Get the intersect
ids <- intersect(samplesheet$Sample_Name, colnames(norm.beta.random))
# Keep only the common IDs
samplesheet <- subset(samplesheet, Sample_Name %in% ids)
norm.beta.random <- norm.beta.random[, colnames(norm.beta.random) %in% ids]
# Now make sure that the IDs in samplesheet and norm.beta.random are the same
samplesheet <- samplesheet[match(colnames(norm.beta.random), samplesheet$Sample_Name), ]
all(samplesheet$Sample_Name == colnames(norm.beta.random))
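# A minimal sketch of the alignment steps above as one reusable function.
# align_samples() is a hypothetical helper, not part of any package. It assumes
# the samplesheet has a Sample_Name column with no duplicated values.
align_samples <- function(sheet, beta) {
  sheet_ids <- as.character(sheet$Sample_Name)
  ids <- intersect(sheet_ids, colnames(beta))
  # Keep only the shared columns; drop = FALSE keeps a matrix even with one column
  beta <- beta[, colnames(beta) %in% ids, drop = FALSE]
  # match() gives, for each remaining column name, its row position in the samplesheet
  sheet <- sheet[match(colnames(beta), sheet_ids), , drop = FALSE]
  stopifnot(identical(as.character(sheet$Sample_Name), colnames(beta)))
  list(samplesheet = sheet, beta = beta)
}
# e.g. aligned <- align_samples(samplesheet, norm.beta.random)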
# Ok so we probably wanna remove samples that we think are dodgy, e.g.
to_remove_kids <- samplesheet$genotypeQCkids %in% c("ETHNICITY", "HZT;ETHNICITY") | samplesheet$Pass == "No" | samplesheet$QLET == "B"
to_remove_mums <- samplesheet$genotypeQCmums == "/strat" | samplesheet$Pass == "No"
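to_remove <- which(to_remove_kids | to_remove_mums)
length(to_remove) # We are gonna remove all these samples
# Index by what we keep rather than by -to_remove: if nothing were flagged,
# x[-integer(0), ] would select zero rows and silently drop every sample
to_keep <- setdiff(seq_len(nrow(samplesheet)), to_remove)
samplesheet <- samplesheet[to_keep, ]
norm.beta.random <- norm.beta.random[, to_keep, drop = FALSE]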
all(samplesheet$Sample_Name == colnames(norm.beta.random))
# So now the samplesheet and norm.beta.random have all the IDs we wanna keep, and the IDs match up
# Get the cord samples:
m_cord <- norm.beta.random[, samplesheet$Sample_Name[samplesheet$time_point == "cord"]]
dim(m_cord)
colnames(m_cord) <- samplesheet$ALN[samplesheet$time_point == "cord"]
# Get the 15up samples:
m_15up <- norm.beta.random[, samplesheet$Sample_Name[samplesheet$time_point == "15up"]]
dim(m_15up)
colnames(m_15up) <- samplesheet$ALN[samplesheet$time_point == "15up"]
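# The two extractions above follow the same pattern, so they could be wrapped in
# a small function. This is a sketch only: get_time_point() is a hypothetical
# helper, and it assumes the samplesheet rows are already in the same order as
# the columns of the beta matrix (as checked above).
get_time_point <- function(beta, sheet, tp) {
  keep <- which(sheet$time_point == tp) # which() skips samples with an NA time point
  out <- beta[, keep, drop = FALSE]
  colnames(out) <- sheet$ALN[keep]
  out
}
# e.g. m_cord <- get_time_point(norm.beta.random, samplesheet, "cord")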