@whatever
Created January 11, 2013 16:26
Function to divide a data.frame by unique values of a column
# Divide a Data Frame or Matrix Into Subsets
# @param obj a data.frame or matrix to be split into subsets, divided by the
#   categorical variable
# @param by a character-string, specifying the column to subset
# @return a list, containing the subsetted data sets. The names of the list
#   correspond to the values of the column used to split
divide <- function (obj, by) {
  # Extract the splitting column once
  column <- obj[, by]
  # Get the set of possible values (NA is skipped, since == cannot match it)
  column.levels <- if (is.factor(column)) {
    levels(column)
  } else {
    unique(column[!is.na(column)])
  }
  # A list used to store each individual data.frame
  res <- list()
  # Iterate through all possible values and store each subset in a separate
  # entry in the list
  for (val in column.levels) {
    # Determine which rows match this value; which() ignores NA comparisons
    hits <- which(column == val)
    # Store data set temporarily in a local value; drop = FALSE keeps a
    # one-row matrix from collapsing into a vector
    data.set <- obj[hits, , drop = FALSE]
    # Attach the full set of values as a "levels" attribute. Factor columns
    # already carry their levels; other columns are not converted to factors.
    levels(data.set[, by]) <- column.levels
    # Store data set in list, keyed by name so that numeric values are not
    # treated as list positions
    res[[as.character(val)]] <- data.set
  }
  # Return list
  res
}
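For comparison, base R's split() builds the same kind of named list when the input is a data.frame. This is a minimal sketch, not part of the gist: it uses the iris dataset bundled with R, and `parts` is just an illustrative name. Note that split() takes the grouping vector itself rather than a column name, and it is shown here for data frames only.

```r
# Sketch: split() returns a named list with one data.frame per group.
# iris ships with R; Species is a factor with three levels.
parts <- split(iris, iris$Species)

names(parts)           # one entry per level of Species
sapply(parts, nrow)    # number of rows in each subset

# Each entry is itself a data.frame with the same columns as iris
head(parts[["setosa"]], 3)
```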
@rupesh1219

I have 7 distinct values in my column and I used this function. A list is created with all 7 distinct values, but in the end only one dataset is created! How should I create 7 data frames?
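
One possible reading of the question above: the function returns a single list that holds every subset, so only one object shows up in the workspace. A rough sketch of pulling the pieces out of such a list follows. It assumes a list shaped like the function's return value, built here with split() on the bundled mtcars data so the snippet runs on its own. The names `subsets`, `four_cyl`, `env` and the "cyl_" prefix are made up for illustration.

```r
# Stand-in for the list returned by divide(); mtcars$cyl has three
# distinct values (4, 6, 8), so the list has three entries.
subsets <- split(mtcars, mtcars$cyl)

length(subsets)    # number of subsets held in the list
names(subsets)     # the value each subset corresponds to

# Pull out a single subset by name
four_cyl <- subsets[["4"]]

# Optionally create one variable per subset in a fresh environment.
# The "cyl_" prefix is a hypothetical naming choice.
env <- new.env()
list2env(setNames(subsets, paste0("cyl_", names(subsets))), envir = env)
ls(env)
```

Keeping the subsets in the list and looping over it with lapply() is usually easier to manage than creating many separate variables.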

TheAlchemistNerd commented Dec 25, 2021

You can use dplyr::group_split() for this. Here is the link to the documentation:
https://dplyr.tidyverse.org/reference/group_split.html
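
A short sketch of what that could look like, assuming the dplyr package is installed. The mtcars data and the names `pieces` and `keys` are only for illustration. group_split() returns an unnamed list, so the group values are attached afterwards using group_keys().

```r
library(dplyr)

# group_split() returns a list of tibbles, one per group of cyl
pieces <- mtcars %>% group_split(cyl)
length(pieces)

# The list is unnamed; group_keys() lists the group values in the same order
keys <- mtcars %>% group_by(cyl) %>% group_keys()
names(pieces) <- as.character(keys$cyl)

# Access one group by its value
pieces[["4"]]
```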
