Created
October 29, 2017 21:51
Web scrape and display top charities for hurricane relief.
## Top Charities for Hurricane Harvey Relief
## According to both Charity Navigator and Charity Watch
## Approach:
##   Scrape data from Charity Navigator and Charity Watch.
##   Merge and display the intersection (common entries) of
##   the two data sets.
## ** BROKEN ** As of 2017-10-29, Charity Navigator has changed their page
##   and the organization of the table of charities.
## Libraries ####
library(rvest) # Web scraping
library(dplyr) # Data wrangling
## Download and clean the data ####
cn_url <- "https://www.charitynavigator.org/index.cfm?bay=content.view&cpid=5356&from=homepage"
cw_url <- "https://www.charitywatch.org/charitywatch-hot-topic/hurricane-maria-relief/81"
cn_df <- read_html(cn_url) %>%
  html_node(xpath = '//*[@id="list-right"]/table') %>%
  html_table() %>%
  setNames(c("Charity", "Rating")) %>%
  arrange(Charity)
cw_df <- read_html(cw_url) %>%
  html_node(xpath = '//*[@id="main_wrapper"]/div/table') %>%
  html_table() %>%
  setNames(c("Charity", "Rating")) %>%
  arrange(Charity)
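## Sketch (an addition, not the original author's code): fail early with a
## clear message when a page layout changes and the XPath matches nothing,
## which is the failure described in the BROKEN note above.
## scrape_table() is a hypothetical helper. It assumes html_node() returns
## an object of class "xml_missing" when no node matches.
library(rvest)
scrape_table <- function(page, xpath) {
  node <- html_node(page, xpath = xpath)
  if (inherits(node, "xml_missing")) {
    stop("No table found at XPath: ", xpath, call. = FALSE)
  }
  html_table(node)
}
## Offline usage on an inline HTML snippet (made-up content, not a live page)
demo_page <- read_html('<div id="list-right"><table><tr><th>Charity</th><th>Rating</th></tr><tr><td>Example Org</td><td>4</td></tr></table></div>')
demo_df <- scrape_table(demo_page, '//*[@id="list-right"]/table')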
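## Manually fix mismatched charity names
rename_vec <- c(`Direct Relief & Direct Relief Foundation` = "Direct Relief")
## Match with %in% rather than ==, so the lookup stays correct if more names are added
to_rename <- cw_df$Charity %in% names(rename_vec)
cw_df$Charity[to_rename] <- unname(rename_vec[cw_df$Charity[to_rename]])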
## Display intersection of results ####
cn_df %>%
  inner_join(cw_df, by = "Charity") %>%
  select(Charity)
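## Offline sketch (an addition, not part of the original scrape) ####
## Illustrates the rename-and-intersect steps on two small hand-made data
## frames, so the logic can be checked without the live pages.
## All names and ratings in toy_cn, toy_cw and toy_rename are hypothetical
## placeholders, not scraped data. Unlike the join above, this variant keeps
## both rating columns by giving them suffixes.
library(dplyr)
toy_cn <- data.frame(Charity = c("Charity A", "Charity B", "Charity C"),
                     Rating = c("4 stars", "3 stars", "4 stars"),
                     stringsAsFactors = FALSE)
toy_cw <- data.frame(Charity = c("Charity A & Charity A Foundation", "Charity C", "Charity D"),
                     Rating = c("A", "A-", "B+"),
                     stringsAsFactors = FALSE)
## Map the longer name onto the shorter one so that the join keys agree
toy_rename <- c(`Charity A & Charity A Foundation` = "Charity A")
toy_idx <- toy_cw$Charity %in% names(toy_rename)
toy_cw$Charity[toy_idx] <- unname(toy_rename[toy_cw$Charity[toy_idx]])
## Keep only charities present in both lists, with one rating column per source
toy_common <- inner_join(toy_cn, toy_cw, by = "Charity", suffix = c("_cn", "_cw"))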