## Top Charities for Hurricane Harvey Relief
## According to both Charity Navigator and Charity Watch
## Approach:
## Scrape data from Charity Navigator and Charity Watch.
## Merge and display the intersection (common entries) of
## the two data sets.
## ** BROKEN ** As of 2017-10-29, Charity Navigator has changed their page
## and the organization of the table of charities.
## Libraries ####
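The merge step described in the header comment can be sketched with base R. This is only an illustration: the two data frames, their column names, and their values are hypothetical stand-ins for the scraped tables, not real ratings.

```r
# Hypothetical stand-ins for the two scraped tables (not real ratings).
navigator <- data.frame(
  charity = c("Charity A", "Charity B", "Charity C"),
  navigator_score = c(95, 90, 88),
  stringsAsFactors = FALSE
)
watch <- data.frame(
  charity = c("Charity B", "Charity C", "Charity D"),
  watch_grade = c("A", "A-", "B+"),
  stringsAsFactors = FALSE
)

# merge() performs an inner join by default, so only charities
# present in both tables survive.
both <- merge(navigator, watch, by = "charity")
print(both)
```

In practice the charity names from two sites rarely match character for character, so some name cleaning before the merge is likely to be needed.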
## Height and Weight of 18 year olds
## from Hong Kong 1993 Growth Survey data,
## simulated by SOCR from reported summary statistics
## Heights in inches
## Weights in pounds
## Explanation \url{http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_Dinov_020108_HeightsWeights}
## Data \url{http://socr.ucla.edu/docs/resources/SOCR_Data/SOCR_Data_Dinov_020108_HeightsWeights.html}
## Libraries ####
library(rvest) # Web scraping
## Download growth chart summary statistics for Hong Kong children, ages 6 to 18, for 1963, 1993, 2005/6
## Data from
## So, Hung-Kwan et al. “Secular Changes in Height, Weight and Body Mass Index in Hong Kong Children.” BMC Public Health 8 (2008): 320. PMC. Web. 29 Oct. 2017.
## Article at \url{https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2572616/}
## PMC Copyright and reuse terms: \url{https://www.ncbi.nlm.nih.gov/pmc/about/copyright/}
## Heights in cm
## Weights in kg
## Libraries ####
library(rvest)
#' Model Fit Statistics
#' @description Returns lm model fit statistics R-squared, adjusted R-squared,
#' predicted R-squared and PRESS.
#' Thanks to John Mount for his 6-June-2014 blog post, "R style tip: prefer functions that return data frames", for
#' the idea \url{http://www.win-vector.com/blog/2014/06/r-style-tip-prefer-functions-that-return-data-frames}
#' @param ... One or more \code{lm()} models.
#' @return A data frame with rows for R-squared, adjusted R-squared, predicted R-squared and PRESS statistics, and a column for each model passed to the function.
model_fit_stats <- function(...) {
  var_names <- as.character(match.call())[-1]
  dots <- list(...)
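The rest of the function is not shown here. As a rough illustration of the statistics named in the description, the sketch below computes them for a single model. `press_sketch` is a hypothetical helper, not the gist's own code, and it assumes the usual definitions: PRESS as the sum of squared leave-one-out residuals, and predicted R-squared as 1 - PRESS / total sum of squares.

```r
# Hypothetical helper, not the gist's implementation.
press_sketch <- function(model) {
  # Leave-one-out residuals from the ordinary residuals and the leverages
  loo_resid <- residuals(model) / (1 - hatvalues(model))
  press <- sum(loo_resid^2)

  # Total sum of squares of the response
  y <- model.response(model.frame(model))
  tss <- sum((y - mean(y))^2)

  model_summary <- summary(model)
  data.frame(
    statistic = c("R-squared", "Adjusted R-squared",
                  "Predicted R-squared", "PRESS"),
    value = c(model_summary$r.squared, model_summary$adj.r.squared,
              1 - press / tss, press)
  )
}

fit <- lm(mpg ~ wt, data = mtcars)
stats <- press_sketch(fit)
print(stats)
```

The leverage shortcut avoids refitting the model once per observation, which is why PRESS is cheap to compute for `lm()` fits.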
#' Remove rows from data frame containing only NA in pipe-friendly manner
#' @description Accepts a data frame and strips out any rows
#' containing only \code{NA} values, then returns the resulting data frame.
#' @param the_df A data frame
#' @return A data frame
#' @source \url{http://stackoverflow.com/a/6437778}
strip_na_rows <- function(the_df) {
  # Keep rows whose NA count is less than the number of columns;
  # drop = FALSE keeps a single-column input as a data frame.
  the_df[rowSums(is.na(the_df)) != ncol(the_df), , drop = FALSE]
}
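The row filter can be checked on a small data frame. The data below are hypothetical and chosen so that one row is entirely `NA` and another is only partly `NA`.

```r
# Hypothetical data: row 2 is entirely NA, row 3 is only partly NA.
example_df <- data.frame(
  a = c(1, NA, 3),
  b = c("x", NA, NA),
  stringsAsFactors = FALSE
)

# A row is dropped only when its NA count equals the number of columns.
keep <- rowSums(is.na(example_df)) != ncol(example_df)
cleaned <- example_df[keep, , drop = FALSE]
print(cleaned)
```

Row 3 survives because it still carries a value in column `a`; only row 2 is removed.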
df %>%
  group_by(id) %>%
  mutate(cumsum = cumsum(value)) %>%
  ungroup()
# from \url{http://stackoverflow.com/a/21818500/393354}
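A self-contained version of the grouped cumulative sum, assuming dplyr is installed. The data and the output column name `running_total` are hypothetical; the name is chosen here to avoid reusing `cumsum` as a column name.

```r
library(dplyr)

# Hypothetical data with two groups.
df <- tibble(
  id = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 10, 20, 30)
)

# The cumulative sum restarts at the first row of each id.
running <- df %>%
  group_by(id) %>%
  mutate(running_total = cumsum(value)) %>%
  ungroup()

print(running)
```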
##' Modifies 'data' by adding new values supplied in newDataFileName
##'
##' newDataFileName is expected to have columns
##' c(lookupVariable,lookupValue,newVariable,newValue,source)
##'
##' Within the column 'newVariable', replace values that
##' match 'lookupValue' within column 'lookupVariable' with the value
##' 'newValue'. If 'lookupVariable' is NA, then replace *all* elements
##' of 'newVariable' with the value 'newValue'.
##'
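The function body is not shown, so the following is only a sketch of the replacement rule described above. `apply_new_data_sketch` is a hypothetical name, and the sketch makes three assumptions: it takes the lookup table as a data frame rather than a file name, the target column already exists in `data`, and all values are character.

```r
# Hypothetical sketch of the rule in the comment; not the gist's code.
apply_new_data_sketch <- function(data, new_data) {
  for (i in seq_len(nrow(new_data))) {
    lookup_var <- new_data$lookupVariable[i]
    new_var <- new_data$newVariable[i]
    new_value <- new_data$newValue[i]

    if (is.na(lookup_var)) {
      # No lookup column given: overwrite every element of the target column
      data[[new_var]] <- rep(new_value, nrow(data))
    } else {
      # Overwrite only the rows whose lookup column matches lookupValue
      hit <- !is.na(data[[lookup_var]]) &
        data[[lookup_var]] == new_data$lookupValue[i]
      data[[new_var]][hit] <- new_value
    }
  }
  data
}

data <- data.frame(
  country = c("US", "UK", "FR"),
  region = c("x", "x", "x"),
  status = c("old", "old", "old"),
  stringsAsFactors = FALSE
)
new_data <- data.frame(
  lookupVariable = c("country", NA),
  lookupValue = c("UK", NA),
  newVariable = c("region", "status"),
  newValue = c("Europe", "checked"),
  source = c("manual", "manual"),
  stringsAsFactors = FALSE
)

updated <- apply_new_data_sketch(data, new_data)
print(updated)
```

The first rule changes `region` only where `country` is "UK"; the second, with `lookupVariable` set to `NA`, overwrites the whole `status` column.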
# Based on a post at \url{http://www.walkingrandomly.com/?p=5254}
library(dplyr)
library(ggplot2)
library(minpack.lm)
# The data to fit
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
my_df <- tibble(x = c(0,15,45,75,105,135,165,195,225,255,285,315),
                y = c(0,0,0,4.5,19.7,39.5,59.2,77.1,93.6,98.7,100,100))
# EDA to see the trend
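The fitting step is not shown here. The sketch below is one way such a fit might look with `nlsLM()` from minpack.lm, which this script loads. The three-parameter logistic model, the parameter names `asym`, `rate` and `mid`, and the starting values are all assumptions made for illustration, picked because the data rise from 0 to a plateau near 100.

```r
library(minpack.lm)

# Same data as above, repeated so the sketch runs on its own.
my_df <- data.frame(
  x = c(0, 15, 45, 75, 105, 135, 165, 195, 225, 255, 285, 315),
  y = c(0, 0, 0, 4.5, 19.7, 39.5, 59.2, 77.1, 93.6, 98.7, 100, 100)
)

# Assumed model: logistic curve with plateau asym, steepness rate,
# and midpoint mid. Starting values are rough guesses read off the data.
fit <- nlsLM(
  y ~ asym / (1 + exp(-rate * (x - mid))),
  data = my_df,
  start = list(asym = 100, rate = 0.03, mid = 150)
)

print(coef(fit))
```

Levenberg-Marquardt fitting tends to be more forgiving of poor starting values than the default Gauss-Newton method in `nls()`, which is a common reason to reach for minpack.lm.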
# from Conrad Hackett
# Median hourly earnings
# \url{https://twitter.com/conradhackett/status/748884076493475840}
# makeover: convert from two groups of side-by-side vertical bar charts to a more readable dot plot
# Demonstrates:
# Use of in ggplot2
# Creating dot plots
# Combining color and shape in a single legend
# Sorting a dataframe so that categorical data in one column is ordered by a second numerical column
# Note: resulting graph displays best at about 450 pixels x 150 pixels
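The last item in the list, ordering a categorical column by a numeric one, can be sketched with base R's `reorder()`. The data frame, group labels, and values are hypothetical and are not the figures from the tweet.

```r
# Hypothetical data, not the figures from the tweet.
earnings <- data.frame(
  group = c("Group A", "Group B", "Group C"),
  median_hourly = c(15, 21, 12),
  stringsAsFactors = FALSE
)

# reorder() turns group into a factor whose levels follow the numeric
# column, so a plot axis built from it is sorted by value.
earnings$group <- reorder(earnings$group, earnings$median_hourly)

print(levels(earnings$group))
```

ggplot2 draws discrete axes in factor-level order, so setting the levels this way is usually enough to sort a dot plot without reordering the rows themselves.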
library(dplyr)
# create a dummy dataframe with 100,000 groups and 1,000,000 rows
# and partition by group_id
df <- data.frame(group_id = sample(1:1e5, 1e6, replace = TRUE),
                 val = sample(1:100, 1e6, replace = TRUE)) %>%
  group_by(group_id)
# filter rows with a value of 1 naively
system.time(df %>% filter(val == 1))
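The comparison that the word "naively" points to is not shown. One plausible alternative, offered here purely as an assumption, is to drop the grouping before filtering and restore it afterwards, since the condition does not depend on the group. The sketch uses a smaller hypothetical data frame so it runs quickly, and checks that both routes keep the same rows.

```r
library(dplyr)

set.seed(1)
# Smaller hypothetical data: about 1,000 groups and 10,000 rows.
small_df <- data.frame(group_id = sample(1:1e3, 1e4, replace = TRUE),
                       val = sample(1:100, 1e4, replace = TRUE)) %>%
  group_by(group_id)

# Naive route: filter while the data frame is still grouped.
naive <- small_df %>% filter(val == 1)

# Assumed alternative: ungroup, filter once, then regroup.
regrouped <- small_df %>%
  ungroup() %>%
  filter(val == 1) %>%
  group_by(group_id)

# Both routes should return the same rows in the same order.
print(identical(naive$val, regrouped$val))
```

Whether the ungrouped route is actually faster depends on the dplyr version and the number of groups, so it is worth timing both with `system.time()` on the full-size data.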