Thomas Jc tmasjc

@tmasjc
tmasjc / openaq.R
Created February 11, 2018 09:18
Get data from the OpenAQ API. #rstats #httr
library(httr)
library(dplyr)
# Find out which cities are being covered in China
china <- "https://api.openaq.org/v1/cities?country=CN"
china_df <- GET(china) %>% content("parsed") %>% `[[`('results') %>% bind_rows()
# Main interest
cities <- c("Beijing", "Shanghai", "Guangzhou")
china_df %>% filter(city %in% cities)
@tmasjc
tmasjc / gdp_edu_cor.R
Last active February 13, 2018 10:01
Visualize GDP per capita ~ enrolment in sec education pct. #rstats #wdi #econs
# Mon Feb 12 16:48:40 2018 ------------------------------
library(tidyverse)
library(wbstats)
set.seed(1234)
# GDP per capita
gdp <- wb(indicator = "NY.GDP.PCAP.CD") %>% as_tibble()
skimr::skim(gdp)
@tmasjc
tmasjc / openaq_dump.R
Last active September 3, 2018 04:30
OpenAq dump collector. #rstats #openaq
library(tidyverse)
library(rvest)
# basic parameter
baseURL <- "https://openaq-data.s3.amazonaws.com/"
startDate <- as.Date("2015-07-01")
endDate <- as.Date("2018-04-01")
# from url endpoints
filenames <- paste0(seq.Date(from = startDate, to = endDate, by = "day"), ".csv")
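The preview ends once the daily file names are built. As a rough sketch of the next step, the names could be joined to the base URL to get one download link per day. This assumes each CSV sits directly under the bucket root, which the preview does not confirm; `urls` is a hypothetical name.

```r
# Same parameters as in the gist preview above.
baseURL <- "https://openaq-data.s3.amazonaws.com/"
startDate <- as.Date("2015-07-01")
endDate <- as.Date("2018-04-01")
filenames <- paste0(seq.Date(from = startDate, to = endDate, by = "day"), ".csv")

# Assumption: one CSV per day directly under the bucket root.
urls <- paste0(baseURL, filenames)

head(urls, 3)
length(urls)
```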
@tmasjc
tmasjc / wuhan_eda.R
Last active April 23, 2018 15:15
Wuhan: sudden drop in median price.
library(tidyverse)
# read raw data
raw <- read_csv("~/Documents/Gitlab/wuhanzflog.csv")
skimr::skim(raw)
# Clean Up Data -----------------------------------------------------------
dat <- raw %>%
@tmasjc
tmasjc / repo_struc.md
Created April 13, 2018 04:47
Standard repository structure and style.

Checklist

- Does the project contain an R script?
- Does the project contain data?
- Does the project contain a test script?
- Do the scripts contain proper comments?
- Is the project organised in a proper structure?
- Do the scripts follow the R style guide (http://style.tidyverse.org/)?
- Does the README display the project structure?
# using Alt + A as prefix
unbind C-b
set -g prefix M-a
bind-key M-a send-prefix
# reopen last window
unbind /
bind / last-window
# split panes using | and -
@tmasjc
tmasjc / remove_na.R
Created April 23, 2018 03:54
Remove NAs at once using dplyr mutate.
# Mon Apr 23 10:44:54 2018 ------------------------------
library(dplyr)  # provides %>% and bind_cols()
# make dummy data frame with 1s and NAs
vec <- purrr::map(1:3, ~c(1, NA) %>% rep(times = 5) %>% sample())
names(vec) <- paste0("Col_", 1:3)
df <- vec %>% as.data.frame()
df
# Extra column
df <- df %>% bind_cols(lettrs = letters[1:nrow(df)])
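The preview stops before the NA handling itself. Here is a minimal sketch of what the description might mean. It assumes dplyr 1.0 or later (for `across()`) and assumes that "remove" means replacing each NA with 0. Neither is confirmed by the gist, and the stand-in data frame and the name `cleaned` are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for the data frame built above: three numeric
# columns of 1s and NAs plus one character column.
set.seed(42)
df <- data.frame(
  Col_1 = sample(rep(c(1, NA), times = 5)),
  Col_2 = sample(rep(c(1, NA), times = 5)),
  Col_3 = sample(rep(c(1, NA), times = 5)),
  lettrs = letters[1:10]
)

# Replace every NA in the numeric columns with 0 in a single mutate() call.
# The non-numeric column is left untouched.
cleaned <- df %>%
  mutate(across(where(is.numeric), ~ coalesce(.x, 0)))

cleaned
```

If dropping rows is what is wanted instead, `tidyr::drop_na()` would be the more direct tool.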
@tmasjc
tmasjc / filter_*.R
Last active April 24, 2018 12:44
Various filtering techniques in dplyr.
# Tue Apr 24 01:21:14 2018 ------------------------------
library(tidyverse)
# Simulate some samples ---------------------------------------------------
# tibble() replaces the deprecated data_frame()
dat <- tibble(
  col_a = rep(c(letters[1:3], NA), 5) %>% sample(),
  col_aa = col_a %>% sample(),
  col_n = rep(c(1:3, NA), 5) %>% sample()
)
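The preview ends after the sample is simulated. The filters below are illustrative guesses at the kind of techniques the description refers to, not the gist's own code. `if_all()` assumes dplyr 1.0.4 or later, and the result names are hypothetical.

```r
library(dplyr)

# Hypothetical sample in the same shape as `dat` above.
set.seed(1)
dat <- tibble::tibble(
  col_a  = sample(rep(c(letters[1:3], NA), 5)),
  col_aa = sample(col_a),
  col_n  = sample(rep(c(1:3, NA), 5))
)

# 1. Drop rows where a single column is missing.
no_na_a <- dat %>% filter(!is.na(col_a))

# 2. Keep only rows that are complete across every column.
complete_rows <- dat %>% filter(if_all(everything(), ~ !is.na(.x)))

# 3. Combine conditions: filter() treats comma-separated conditions as AND
#    and drops rows where a condition evaluates to NA.
a_and_big <- dat %>% filter(col_a == "a", col_n >= 2)
```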
@tmasjc
tmasjc / mongolite_demo.R
Created April 26, 2018 06:06
Mongolite basic demo
library(mongolite)
library(dplyr)
# connect to database
con <- mongo(
  collection = "relet_price",
  db = "bizops",
  url = "mongodb://localhost",
  verbose = TRUE
)
# some random id based on datetime
gen_id <- function(){
c <- gsub("(\\:|\\-|\\s)", "", as.character(Sys.time()))
strsplit(c, "") %>%
@tmasjc
tmasjc / tidy_filter.R
Created April 27, 2018 08:39
Filtering in a tidy eval manner.
library(tidyverse)
set.seed(1122)
vecs <- lapply(X = 1:2, function(x) rep(c(1, 2, 3), times = 10) %>% sample() %>% head(10))
names(vecs) <- paste0("col_", 1:2)
dat <- vecs %>% as.data.frame()
dat
# Which col has repeated value more than 3 appearances?
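The preview cuts off at the question. One possible tidy eval answer is sketched here, assuming rlang 0.4.0 or later for the `{{ }}` operator. The helper `repeated_values()` and the stand-in data are hypothetical and not taken from the gist.

```r
library(dplyr)

# Hypothetical helper: count the values of one column and keep those that
# appear more than `n` times. `{{ col }}` forwards the unquoted column name
# into the dplyr verbs.
repeated_values <- function(data, col, n = 3) {
  data %>%
    count({{ col }}, name = "appearances") %>%
    filter(appearances > n)
}

# Stand-in data in the same shape as `dat` above.
dat <- data.frame(
  col_1 = c(1, 1, 1, 1, 2, 2, 3, 3, 3, 2),
  col_2 = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)
)

repeated_values(dat, col_1)
repeated_values(dat, col_2)
```

Passing the column unquoted keeps the call site close to ordinary dplyr code, while the threshold stays an ordinary argument.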