@kcha
Created March 13, 2016 18:24
Split a data frame into k evenly divided (or close to even) subsets
# Split a data frame into k subsets
#
# Returns a list of data frames of equal (or as close to equal as possible)
# size. If the number of rows is not divisible by k, the remainder rows are
# added to the last (k-th) subset. The caller can instead request that the
# remainder be placed in an additional (k+1)-th subset.
#
# This is basically a wrapper around split(), but helps calculate the remainder,
# if necessary.
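split_k <- function(x, k, remainder_as_additional = FALSE) {
  if (k < 2) {
    stop("k needs to be 2 or more")
  }
  n <- nrow(x)
  div <- n %/% k  # rows per subset
  rem <- n %% k   # leftover rows
  if (rem == 0) {
    # Even split: k contiguous blocks of div rows each. `each` keeps this
    # consistent with the uneven branches below (without it, rows would be
    # assigned round-robin instead of in blocks).
    split(x, rep(1:k, each = div))
  } else if (remainder_as_additional) {
    # Put the leftover rows in an extra (k+1)-th subset
    split(x, c(rep(1:k, each = div), rep(k + 1, rem)))
  } else {
    # Fold the leftover rows into the k-th subset
    split(x, c(rep(1:(k - 1), each = div), rep(k, div + rem)))
  }
}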
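
The remainder arithmetic above may be easier to follow on a concrete case. The sketch below builds the two grouping vectors by hand for a made-up 10-row data frame and k = 3, then passes them to split(). The data frame `df` and the names `g_last`, `g_extra`, `parts_last`, and `parts_extra` are illustrative assumptions and are not part of the gist.

```r
# Toy data frame with 10 rows (made up for illustration)
df <- data.frame(id = 1:10, value = letters[1:10])
k <- 3

div <- nrow(df) %/% k  # rows per subset: 3
rem <- nrow(df) %% k   # leftover rows: 1

# Remainder folded into the last subset: 1 1 1 2 2 2 3 3 3 3
g_last <- c(rep(1:(k - 1), each = div), rep(k, div + rem))

# Remainder placed in an extra (k+1)-th subset: 1 1 1 2 2 2 3 3 3 4
g_extra <- c(rep(1:k, each = div), rep(k + 1, rem))

parts_last <- split(df, g_last)
parts_extra <- split(df, g_extra)

sapply(parts_last, nrow)   # subset sizes: 3 3 4
sapply(parts_extra, nrow)  # subset sizes: 3 3 3 1
```

These are the same vectors that split_k(df, 3) and split_k(df, 3, remainder_as_additional = TRUE) pass to split() when the row count is not divisible by k.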