@thoughtfulbloke
thoughtfulbloke / Waitangi15.R
Last active October 6, 2019 22:30
Change in opinion about the Treaty of Waitangi over 2002-2017
library(foreign) # read the NZES SPSS files
library(dplyr) # general data manipulation
library(tidyr) # restructuring
library(ggplot2)
library(ggthemes)
# NZES files in nzes subfolder
nz17 <- suppressWarnings(read.spss("nzes/NZES2017Release14-07-19.sav",
to.data.frame = TRUE, add.undeclared.levels = "sort"))
nz17meta <- data.frame(varnames = attributes(nz17)$names, eyear=2017,
library(streamgraph)
ctext ="egroup, cyear, n
European, 2001, 2871432
Maori, 2001, 526281
Pacific Peoples, 2001, 231801
Asian, 2001, 238176
Middle Eastern/Latin American/African, 2001, 24084
Other ethnicity, 2001, 801
European, 2006, 2609589
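The preview above is cut off partway through the census counts, and the gist's own parsing step is not shown. As a minimal sketch of how a comma-separated string like `ctext` can be turned into a data frame, the following assumes base R `read.csv(text = ...)` and uses only the first three rows visible in the preview. The names `census_text` and `census` are hypothetical.

```r
# First three rows copied from the preview; the full string is longer.
census_text <- "egroup, cyear, n
European, 2001, 2871432
Maori, 2001, 526281
Pacific Peoples, 2001, 231801"

# strip.white = TRUE removes the spaces that follow each comma.
census <- read.csv(text = census_text, strip.white = TRUE,
                   stringsAsFactors = FALSE)
str(census)
```

Keeping small lookup tables inline like this avoids shipping an extra data file alongside a gist.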
thoughtfulbloke / CJK_example.R
Created September 19, 2019 23:24
Example of identifying CJK characters in text data in R
library(stringr)
library(dplyr)
library(ggplot2)
library(lubridate) # provides floor_date(), used below
cjk <- regex("[\U00004E00-\U00009FFF\U00003400-\U00004DBF\U00020000-\U0002A6DF\U0002A700-\U0002B73F\U0002B740-\U0002B81F\U0000F900-\U0000FAFF\U0002F800-\U0002FA1F\U00009FA6-\U00009FCB]")
mentdf %>% mutate(hourly = floor_date(time_NZ, unit="hour")) %>%
filter(str_detect(description, cjk),
time_NZ > ISOdatetime(2019,08,28,0,0,0, tz="Pacific/Auckland")) %>%
count(hourly) %>%
ggplot(aes(x=hourly, y=n)) + geom_col() +
ggtitle("Hourly Number of Tweets received by NZ politicians by people with cjk characters in their Twitter Profile")
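The character-class approach in the gist above can be tried on a toy vector. This is a minimal sketch, not the gist's code: it uses base R `grepl()` with `perl = TRUE` instead of stringr's `regex()`, and it covers only the main CJK Unified Ideographs block (U+4E00 to U+9FFF). The `descriptions` vector is made up for illustration.

```r
# Hypothetical profile descriptions (invented for illustration).
descriptions <- c("Kiwi politics watcher",
                  "\u65b0\u897f\u5170 news",  # contains CJK ideographs
                  "caf\u00e9 owner")          # non-ASCII, but not CJK

# Main CJK Unified Ideographs block only; the gist's pattern lists more ranges.
cjk_pattern <- "[\\x{4E00}-\\x{9FFF}]"

# TRUE for each description containing at least one character in the range.
has_cjk <- grepl(cjk_pattern, descriptions, perl = TRUE)
print(has_cjk)
```

Matching on Unicode ranges flags the script used in the text, not the language, so a profile containing Japanese kanji would also match.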
thoughtfulbloke / gather_data_for_analysis_of_twitter.R
Last active January 15, 2020 03:48
My current code for gathering data on accounts for analysing Twitter
############ collection
## Assumes the working directory is the folder containing this script, and that it is set the same way on future runs.
# Also assumes you have Twitter developer credentials and have run the create_token() function in the rtweet package
# to authorise R to access Twitter. This stores the credentials in an environment variable loaded at startup, so they are
# not exposed in the script.
# Alternatively, on a Mac or PC with the httpuv package installed, you can authorise the script interactively
# at run time.
# These packages need to be installed already so that they can be loaded and used.
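The comments above describe keeping credentials in an environment variable so they never appear in the script. A minimal base R sketch of that general pattern follows. It does not show how rtweet itself stores its token. The variable name `EXAMPLE_API_TOKEN` and its value are hypothetical placeholders.

```r
# Hypothetical variable name and placeholder value, for illustration only.
# In real use this line would be replaced by an entry in ~/.Renviron.
Sys.setenv(EXAMPLE_API_TOKEN = "not-a-real-token")

# Read the credential at run time instead of hard-coding it in the script.
token <- Sys.getenv("EXAMPLE_API_TOKEN", unset = NA)
if (is.na(token)) {
  stop("EXAMPLE_API_TOKEN is not set; add it to ~/.Renviron and restart R")
}

# TRUE when a non-empty value was found.
token_is_set <- nzchar(token)
```

Failing early with a clear message is easier to debug than an authentication error raised later, in the middle of a collection run.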
thoughtfulbloke / ESR_measles_report_pdfs_to_excel
Created September 15, 2019 02:26
Code for converting the tables in the ESR measles outbreak weekly report PDFs into Excel files
library(rvest)
library(janitor)
library(pdftools)
library(dplyr)
library(tabulizer) # needs Java, see https://github.com/ropensci/tabulizer
library(writexl)
library(purrr)
## store pdfs locally in folder ESR_measles
if(!dir.exists("ESR_measles")){dir.create("ESR_measles")}
thoughtfulbloke / Auckland_PDF.R
Created September 2, 2019 23:55
Extract the text of a PDF as a Word document and its tables as Excel files
library(officer)
library(pdftools)
library(tabulizer) #needs JDK installed
library(writexl)
target <- file.choose()
target_no_suffix <- gsub("\\.pdf$","", target, ignore.case = TRUE)
PDFtext <- pdf_text(target) # from the pdftools package
# pdf_text() returns a character vector with one string per page; officer wants paragraphs
PDFparas <- unlist(strsplit(PDFtext, "\n"))
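The two string steps in this preview can be checked without a PDF. The sketch below uses an invented two-page vector as a stand-in for `pdf_text()` output. The names `pages`, `paras`, and `stem` are hypothetical.

```r
# Stand-in for pdf_text() output: one string per page (invented text).
pages <- c("Title\nFirst paragraph", "Second page line\nLast line")

# Split each page on newlines, then flatten the per-page list into one vector.
paras <- unlist(strsplit(pages, "\n"))
print(paras)

# Drop a trailing .pdf (any case) to get a stem for naming output files.
stem <- gsub("\\.pdf$", "", "Report.PDF", ignore.case = TRUE)
print(stem)
```

Splitting on `"\n"` yields one element per text line, so a paragraph that wraps across lines in the PDF ends up as several elements.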
thoughtfulbloke / three_var_fake_x_axis.R
Created August 31, 2019 02:21
Using gganimate to appear to replace an x axis
library(scales)
library(dplyr)
library(ggplot2)
library(gganimate)
library(ggthemes)
co2 = read.table("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt")
temp = read.table("https://climate.nasa.gov/system/internal_resources/details/original/647_Global_Temperature_Data_File.txt",
header=FALSE, skip=5)
library(scales)
library(dplyr)
library(ggplot2)
library(gganimate)
co2 = read.table("ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt")
temp = read.table("https://climate.nasa.gov/system/internal_resources/details/original/647_Global_Temperature_Data_File.txt",
header=FALSE, skip=5)
names(co2) = c("year", "co2", "unc")
library(dplyr)
library(tidyr)
library(ggplot2)
library(forcats)
library(sjlabelled)
library(ggthemes)
w5 <- readRDS("wvs5_2005_2009.rds")
wv5 <- w5 %>%
mutate(Country = as.character(as_label(V2)),
thoughtfulbloke / unHKmigration.R
Created July 29, 2019 09:23
Working with UN migration spreadsheet
library(readxl)
library(janitor)
library(dplyr)
library(tidyr)
library(ggplot2)
# UN migration data from
y1990 <- read_excel("UN_MigrantStockByOriginAndDestination_2015.xlsx", "Table 1") %>%
clean_names() %>% select(Country=x2, Code=x4, HK1990=x53) %>%
mutate(Code = as.numeric(Code)) %>% filter(!is.na(Code))
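The last line of the preview coerces `Code` to numeric and then drops the rows where that fails. A minimal base R sketch of the same pattern follows, using invented rows. Which rows in the UN spreadsheet are actually non-numeric is an assumption, and the gist itself uses dplyr, not the base indexing shown here.

```r
# Invented rows: one heading-like row with a non-numeric code, two data rows.
raw <- data.frame(Country = c("Some region heading", "Country A", "Country B"),
                  Code = c("n/a", "101", "102"),
                  stringsAsFactors = FALSE)

# Non-numeric strings become NA; suppress the coercion warning on purpose.
raw$Code <- suppressWarnings(as.numeric(raw$Code))

# Keep only the rows whose code was parsed successfully.
cleaned <- raw[!is.na(raw$Code), ]
print(cleaned)
```

This is a compact way to strip notes and heading rows from a spreadsheet, though it also silently drops any real row whose code is mistyped.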