Michael Chirico MichaelChirico

@MichaelChirico
MichaelChirico / gist:0fb318392d1b18d6a966e4785d49cac9
Last active October 29, 2022 21:29
Scraping & munging to answer: Has any team before the 2022 Phillies won all of their playoff game 1s on the road?
# SOURCE: retrosheet.org - https://www.retrosheet.org/gamelogs/index.html
library(withr)
library(data.table)
# one file for the full history of each round
playoff_logs <- c(
wild_card = "https://www.retrosheet.org/gamelogs/glwc.zip",
lds = "https://www.retrosheet.org/gamelogs/gldv.zip",
lcs = "https://www.retrosheet.org/gamelogs/gllc.zip",
world_series = "https://www.retrosheet.org/gamelogs/glws.zip"
@MichaelChirico
MichaelChirico / get_r_translation_status.R
Created April 26, 2022 07:12
Execute a run of snapshot summaries for translations, then visualize
library(withr)
library(data.table)
r_dir <- "~/github/r-svn"
all_commits <- with_dir(r_dir, system('git log --pretty="format:%H %b"', intern=TRUE))
all_commits <- all_commits[nzchar(all_commits)]
hashes <- substr(grep("^[0-9a-f]{40} ", all_commits, value=TRUE), 1, 40)
trunk_hash <- head(hashes, 1L)
# grep("trunk@33000", all_commits, value = TRUE)
@MichaelChirico
MichaelChirico / get_r_translation_status_for_revision.R
Created April 26, 2022 07:11
Summarize the state of translation for a snapshot of r-devel
#!/usr/local/bin/Rscript
library(potools)
suppressPackageStartupMessages(library(data.table))
script_wd = setwd("~/github/r-svn")
GIT_COMMIT = if (interactive()) readline('git commit: ') else commandArgs(TRUE)
setwd('src/library')
get_po_messages <- potools:::get_po_messages
@MichaelChirico
MichaelChirico / r_translatable_messages.R
Created July 6, 2021 09:57
Make a barplot showing trend of R translatable messages over time
library(data.table)
# Export the table on page 2 to .csv:
# https://docs.google.com/document/d/1XbfOf3CLVb2UFyUZGJoVLkBUDZ6Hs3APCDW8UzuOvZk
DT=fread('~/Desktop/r-translations.csv')
DT[ , {
par(cex = 2)
y = `# Messages`
xx = barplot(y, names.arg = `R Version`, space=0, col='#2268bc', yaxt='n', main='Translatable messages by R version', las=2L)
text(xx[1L], y[1L]/2, srt=90, prettyNum(y[1L], big.mark = ','), col='white')
text(xx[.N], y[.N]/2, srt=90, prettyNum(y[.N], big.mark = ','), col='white')
@MichaelChirico
MichaelChirico / dt_usage_revdeps.R
Created March 15, 2021 06:28
Count usages of data.table functions in reverse dependencies that list it in Imports
#!/usr/local/bin/Rscript
library(data.table)
imports_dt <- tools::dependsOnPkgs(
"data.table", "Imports",
recursive = TRUE,
installed = available.packages()
)
tar_dir <- "/media/michael/69913553-793b-4435-ac82-0e7df8e34b9f/cran-mirror/src/contrib"
@MichaelChirico
MichaelChirico / sg_mrt_per_district.R
Created March 5, 2021 08:31
Count the number of MRT/LRT stations in each postal district
library(sp)
library(rgdal)
# better to download & read the files programmatically, but something
# funky is happening on that route, not bothering to debug too deeply.
# for reproducibility:
# (1) Click Download at the URL
# (2) Unzip the file; it contains two more zip archives, one KML and one SHP
# (3) Unzip the .shp zip to ~/Downloads/mrt
@MichaelChirico
MichaelChirico / ca_santa_clara_covid_map.R
Created February 25, 2021 04:07
Generate the travel radius around Santa Clara County, California, for COVID restrictions
library(data.table)
library(sp)
library(rgdal)
library(rgeos)
# Via CA GIS Data site
ca = readOGR("~/Downloads/CA_Counties", "CA_Counties_TIGER2016")
travel_zone = gBuffer(
ca[ca$NAME == "Santa Clara", ],
@MichaelChirico
MichaelChirico / r_squared_1.R
Last active February 25, 2021 00:27
Get an R^2 of 1
library(data.table)
DT = fread("~/Downloads/spotifyclass.csv")
# add new columns. they're great predictors!
DT[ , paste0("V", 1:nrow(DT)) := replicate(.N, rnorm(.N), simplify = FALSE)]
summary(lm(DT$target ~ ., data = DT[ , .SD, .SDcols = patterns("^V")]))$r.squared
@MichaelChirico
MichaelChirico / spelling_bee
Created September 23, 2020 03:28
Cheating on NYT SpellingBee
# caveat: /usr/share/dict/words does not overlap 100% with the puzzle's dictionary
CENTER=u
LETTERS=${CENTER}cfinot
grep "$CENTER" /usr/share/dict/words | grep -E "^[$LETTERS]{4,}$"
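To see how the two-stage filter in the snippet above behaves, here is a minimal sketch that runs the same pipeline against a tiny, made-up word list rather than /usr/share/dict/words. The temporary file, its contents, and the `result` variable are assumptions added for illustration only; they are not part of the original gist.

```shell
#!/bin/sh
# Hypothetical stand-in for /usr/share/dict/words so the sketch runs anywhere.
words=$(mktemp)
printf '%s\n' unction function fiction count unit tofu cut build Tofu > "$words"

CENTER=u
LETTERS="${CENTER}cfinot"

# Stage 1: keep words containing the center letter.
# Stage 2: keep words built only from the seven allowed letters, with at
# least four of them.
result=$(grep "$CENTER" "$words" | grep -E "^[$LETTERS]{4,}$")
printf '%s\n' "$result"
# prints: unction, function, count, unit, tofu (one per line)

rm -f "$words"
```

Here "fiction" is dropped because it lacks the center letter, "cut" because it is too short, "build" because it uses letters outside the set, and "Tofu" because the character class is case-sensitive.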
@MichaelChirico
MichaelChirico / flop_mirror_gif.sh
Last active September 22, 2020 03:44
flop (horizontal flip) + mirror a gif
#!/bin/sh
# built on ImageMagick tools via convert; constituent SO answers:
# https://askubuntu.com/a/101527/362864
# https://askubuntu.com/a/1052902/362864
# https://stackoverflow.com/a/20075227/3576984
# https://unix.stackexchange.com/a/24019/112834
# INPUT: foo.gif
# step 0: isolate foo to its own folder
TMPDIR=/tmp/__flop_mirror__