@PeteHaitch
Created June 25, 2014 08:37
Merge a list of data.tables
# Could be generalised to handle the full arguments of merge.data.table but I've kept it simple.
# mergeDT based on http://r.789695.n4.nabble.com/merge-multiple-data-frames-td4331089.html
# Takes a (named) list of data.tables (lodt) where all columns are common to all data.tables
# The key of each table is the same but is only a subset of the columns, e.g. (chr, pos1, pos2)
# The remaining columns of each data.table are the "counts", e.g. (MM, MU, UM, UU)
# We append the name of each sample (the names of lodt) to its "counts" columns so that we can
# keep track of which sample each count came from.
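library(data.table)

mergeAll <- function(lodt) {
  # Copy first: setnames() works by reference and would otherwise rename
  # the columns of the caller's tables as a side effect.
  lodt <- lapply(lodt, copy)
  keyCols <- key(lodt[[1]])
  # Non-key columns shared by every table; these receive the sample suffix.
  # setdiff() is used because `!=` against a multi-column key recycles the key
  # element-wise instead of testing membership.
  repNames <- setdiff(Reduce(intersect, lapply(lodt, names)), keyCols)
  for (i in seq_along(lodt)) {
    wn <- which(names(lodt[[i]]) %in% repNames)
    setnames(lodt[[i]], wn,
             paste(names(lodt[[i]])[wn], names(lodt)[[i]], sep = "."))
  }
  # Full outer join on the shared key, folding the list from left to right.
  Reduce(function(x, y) merge(x, y, all = TRUE), lodt)
}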
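
# Usage sketch (not part of the original gist). The two samples s1 and s2 and
# all of their values are made up for illustration; both tables are keyed on
# (chr, pos1, pos2) and carry MM and UU counts, as described above.
library(data.table)  # needed by mergeAll(); harmless if already loaded
s1 <- data.table(chr = c("chr1", "chr1"), pos1 = c(10L, 20L), pos2 = c(15L, 25L),
                 MM = c(3L, 1L), UU = c(0L, 4L), key = c("chr", "pos1", "pos2"))
s2 <- data.table(chr = c("chr1", "chr2"), pos1 = c(10L, 5L), pos2 = c(15L, 9L),
                 MM = c(2L, 7L), UU = c(1L, 0L), key = c("chr", "pos1", "pos2"))
merged <- mergeAll(list(s1 = s1, s2 = s2))
# merged should hold one row per distinct key across both samples, with the
# count columns suffixed by sample name (MM.s1, UU.s1, MM.s2, UU.s2) and NA
# wherever a sample has no row for that key.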