@mrdwab
Created May 21, 2011 17:06
R stratified random sampling from a data frame
stratified = function(df, group, size) {
  # USE: * Specify your data frame and grouping variable (as column
  #        number) as the first two arguments.
  #      * Decide on your sample size. For a sample proportional to the
  #        population, enter "size" as a decimal below 1. For an equal
  #        number of samples from each group, enter "size" as a whole
  #        number.
  #
  # Example 1: To sample 10% of each group from a data frame named "z",
  #   where the grouping variable is the fourth variable, use:
  #
  #     > stratified(z, 4, .1)
  #
  # Example 2: To sample 5 observations from each group from a data frame
  #   named "z", where the grouping variable is the third variable, use:
  #
  #     > stratified(z, 3, 5)
  #
  library(sampling)  # provides strata() and getdata(); stops if not installed
  # Sort by group so the per-group sizes line up with the order of the strata
  temp = df[order(df[[group]]), ]
  if (size < 1) {
    # Proportional: a fraction of each group, rounded up
    size = ceiling(table(temp[[group]]) * size)
  } else {
    # Fixed: the same number of observations from every group
    size = rep(size, times = length(table(temp[[group]])))
  }
  # Simple random sampling without replacement within each stratum
  strat = strata(temp, stratanames = names(temp[group]),
                 size = size, method = "srswor")
  getdata(temp, strat)
}
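
The size rule described in the header comment (a proportion when "size" is below 1, a fixed count otherwise) can also be sketched in base R, which may help when the sampling package is not available. This is an illustration only, not the gist author's implementation: the function name `stratified_base` and the example data frame are hypothetical, and the sketch stops with an error if a group has fewer rows than requested.

```r
# Hypothetical base-R counterpart of stratified(); not the gist author's code.
stratified_base <- function(df, group, size) {
  # Split row indices by the grouping column (column number or name).
  rows_by_group <- split(seq_len(nrow(df)), df[[group]], drop = TRUE)
  picked <- lapply(rows_by_group, function(rows) {
    # size < 1: proportion of the group, rounded up; otherwise a fixed count.
    n <- if (size < 1) ceiling(length(rows) * size) else size
    if (n > length(rows)) stop("a group has fewer rows than the requested size")
    # Sample positions, then index into rows, so that a one-row group
    # is not misread by sample() as the range 1:rows.
    rows[sample.int(length(rows), n)]
  })
  df[unlist(picked, use.names = FALSE), , drop = FALSE]
}

set.seed(1)  # only to make the draw repeatable
z <- data.frame(id = 1:60, grp = rep(c("a", "b", "c"), times = c(10, 20, 30)))
table(stratified_base(z, 2, 0.1)$grp)  # 10% of each group: 1, 2 and 3 rows
table(stratified_base(z, 2, 5)$grp)    # 5 rows from each group
```

Sampling is without replacement within each group, matching the "srswor" method used above. Unlike getdata(), the sketch returns only the original columns and adds no sampling metadata.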
mrdwab commented Mar 15, 2012

To use this function, either copy and paste the definition above or source it directly from the gist:

require(RCurl)
temp = getURL("https://raw.github.com/gist/984691/fb8e0483b093caa871444db162ed11210a1bac5b/Stratified.R")
source(textConnection(temp))

mrdwab commented Sep 29, 2014
