Michael Chirico MichaelChirico

@MichaelChirico
MichaelChirico / rcv_bootstrap.R
Created June 6, 2025 14:19
Code for running bootstraps of ranked-choice voting elections
library(data.table)
library(ggplot2)
# Downloaded from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/AMK8PJ
dir="/media/michael/ab3f2700-872c-4b29-95f2-9a700166bc52/dataverse_files"
run_rcv_election = function(ballots) {
  # TODO(michaelchirico): there should be a way to safely avoid a full copy, is it worth it?
  ballots = copy(ballots[, .(rank, candidate)])
  # _don't_ rely on ballots$voterid from input -- in bootstrap, under resampling, we will have
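The preview above is cut off, so the author's data.table implementation is not visible here. As a rough illustration of the idea in the gist description (bootstrapping a ranked-choice election), the following Python sketch resamples whole ballots with replacement and reruns an instant-runoff count on each resample. Everything in it is an assumption made for illustration: the tuple-per-ballot representation, the helper name `bootstrap_winners`, the alphabetical tie-break, and the toy ballots. Because a resample can contain the same ballot several times, the sketch never relies on a voter identifier, which matches the caution in the comment above.

```python
import random
from collections import Counter


def run_rcv_election(ballots):
    """Instant-runoff count. Each ballot is a tuple of candidates, most preferred first.

    Assumes at least one ballot ranks at least one candidate.
    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    remaining = {c for ballot in ballots for c in ballot}
    while True:
        tallies = Counter({c: 0 for c in remaining})
        for ballot in ballots:
            # Count the ballot for its highest-ranked candidate still in the race.
            for c in ballot:
                if c in remaining:
                    tallies[c] += 1
                    break
        total = sum(tallies.values())
        leader, leader_votes = max(tallies.items(), key=lambda kv: (kv[1], kv[0]))
        if leader_votes * 2 > total or len(remaining) == 1:
            return leader
        # Eliminate the candidate with the fewest current votes and recount.
        loser = min(tallies.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.discard(loser)


def bootstrap_winners(ballots, n_boot=200, seed=0):
    """Resample ballots with replacement n_boot times and tally the winners."""
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_boot):
        resample = rng.choices(ballots, k=len(ballots))
        wins[run_rcv_election(resample)] += 1
    return wins


# Toy ballots, invented for illustration only.
toy_ballots = [("A", "B")] * 4 + [("B", "A")] * 3 + [("C", "B")] * 2
print(run_rcv_election(toy_ballots))
print(bootstrap_winners(toy_ballots, n_boot=100, seed=1))
```

The share of resamples won by each candidate gives a crude sense of how sensitive the outcome is to sampling variation in the ballots.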
MichaelChirico / rbind.fill_vs_bind_rows.R
Last active June 3, 2025 23:21
Comparing plyr::rbind.fill and dplyr::bind_rows()
# quick look sheet for comparing plyr::rbind.fill --> dplyr::bind_rows()
# NB: I am only interested in migrating rbind.fill-->bind_rows(), so
# features of bind_rows() absent from rbind.fill(), e.g. .id=, are not examined.
rbind.fill = plyr::rbind.fill
bind_rows = dplyr::bind_rows
DF1 = data.frame(a = 1, b = 2)
DF2 = data.frame(a = 1, b = 2)
all.equal(rbind.fill(DF1, DF2), bind_rows(DF1, DF2))
MichaelChirico / h1_h2_h3_news.R
Created May 6, 2025 06:26
Demonstrate equivalence of h1/h2 and h2/h3 hierarchies for utils::news
news_fmt <- "%1$s My NEWS
%1$s# pkg 1.0.0 (2020-01-01)
%1$s## Categ 1
%1$s# pkg 0.4.0 (2019-01-01)
%1$s## Categ 1
MichaelChirico / r_devel_path_cran_test.R
Last active August 18, 2024 07:16
r-devel CRAN sample
# Script to test a patched version of r-devel against a selection of CRAN packages
# Useful for detecting possible breaking changes by examining how changes affect actual packages
# Does not require a local CRAN mirror -- idea is to only test a small fraction of CRAN -->
# relatively small I/O cost of downloading the packages on the fly.
# This script is used for the patch here:
# https://github.com/r-devel/r-svn/pull/177
# https://bugs.r-project.org/show_bug.cgi?id=18782
# https://bugs.r-project.org/show_bug.cgi?id=17672
PACKAGES_TO_TEST = c(
MichaelChirico / recommended_partial_issues.R
Last active April 25, 2024 06:43
Check all Recommended packages for partial matching issues
MichaelChirico / taxes2023.R
Last active April 13, 2024 19:28
Tax calculation functions for 2023
medicare_tax = function(medi_income, filing_status="joint") {
  addl_floor = switch(filing_status, joint=250000, mfs=125000, single=200000)
  .0145*medi_income + pmax(0, .009*(medi_income-addl_floor))
}
social_security_tax = function(income) .062 * pmin(income, 160200)
tax_with_deductible_brackets = function(income, deductible, bracket_rates, bracket_mins) {
  sum(pmax(0, diff(bracket_rates) * (income - deductible - bracket_mins)))
}
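The bracket function above computes a marginal-rate tax without looping over brackets: each bracket contributes its rate increment over the previous bracket, applied to all taxable income above that bracket's floor. The Python sketch below checks that identity against a bracket-by-bracket calculation. It assumes `bracket_rates` starts with a leading 0 so that the rate differences line up one-to-one with `bracket_mins`, and that rates are non-decreasing; the preview does not show how the author calls the function, so both points are guesses. The rates and thresholds are made up for illustration and are not real tax brackets.

```python
def tax_with_deductible_brackets(income, deductible, bracket_rates, bracket_mins):
    """Marginal-rate tax as a sum of rate increments.

    Assumes bracket_rates has a leading 0.0, so that
    len(bracket_rates) == len(bracket_mins) + 1, and rates never decrease.
    """
    taxable = income - deductible
    increments = [hi - lo for lo, hi in zip(bracket_rates, bracket_rates[1:])]
    return sum(max(0.0, d * (taxable - m)) for d, m in zip(increments, bracket_mins))


def tax_bracket_by_bracket(income, deductible, rates, bracket_mins):
    """Direct calculation for comparison: tax each slice of income at its own rate."""
    taxable = max(0.0, income - deductible)
    uppers = list(bracket_mins[1:]) + [float("inf")]
    total = 0.0
    for rate, lo, hi in zip(rates, bracket_mins, uppers):
        total += rate * max(0.0, min(taxable, hi) - lo)
    return total


# Hypothetical brackets: 10% from 0, 20% from 10,000, 30% from 50,000.
RATES = [0.10, 0.20, 0.30]
MINS = [0, 10_000, 50_000]

via_increments = tax_with_deductible_brackets(70_000, 10_000, [0.0] + RATES, MINS)
via_slices = tax_bracket_by_bracket(70_000, 10_000, RATES, MINS)
print(via_increments, via_slices)
```

With 60,000 of taxable income, both routes give about 12,000, up to floating-point rounding: 1,000 + 8,000 + 3,000 by slices, or 6,000 + 5,000 + 1,000 by increments.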
MichaelChirico / select_left.py
Last active March 31, 2024 18:02
Select from 'left' file to resolve conflicts
def select_left(file):
    with open(file) as f:
        contents=f.read()
    lines = contents.split('\n')
    n_conflicts = sum([l.startswith('<<<<') for l in lines])
    keep=True
    outfile=[]
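The preview stops before the loop that does the actual filtering. One way such a function could proceed is sketched below; it is not the author's code. The helper name `select_left_text` is hypothetical, it works on a string rather than a file path so it is easy to try out, and it assumes standard Git conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`, plus the optional `|||||||` base section from diff3-style conflicts).

```python
def select_left_text(contents):
    """Return (resolved_text, n_conflicts), keeping only the 'left' side of each conflict.

    Assumes standard Git conflict markers at the start of a line.
    """
    out = []
    keep = True
    in_conflict = False
    n_conflicts = 0
    for line in contents.split('\n'):
        if not in_conflict and line.startswith('<<<<<<<'):
            # Start of a conflict: the left side follows.
            in_conflict = True
            keep = True
            n_conflicts += 1
        elif in_conflict and line.startswith(('|||||||', '=======')):
            # Base or right side begins: stop keeping lines.
            keep = False
        elif in_conflict and line.startswith('>>>>>>>'):
            # End of the conflict: resume keeping lines.
            in_conflict = False
            keep = True
        elif keep:
            out.append(line)
    return '\n'.join(out), n_conflicts


sample = "a\n<<<<<<< HEAD\nleft\n=======\nright\n>>>>>>> branch\nb\n"
resolved, n = select_left_text(sample)
print(n)
print(resolved)
```

Tracking `in_conflict` keeps a stray `=======` line outside a conflict (for example a text underline) from being mistaken for a separator. A left-side line that itself begins with `=======` would still be misread, which is a limit of any purely line-based approach.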
MichaelChirico / cran_readchar_args.R
Created January 26, 2024 15:24
readChar() usage on CRAN
library(jsonlite)
library(data.table)
library(lintr)
read_page <- function(page) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  system2("curl",
    c("--location", "--request",
      "GET", sprintf("'https://api.github.com/search/code?q=readChar+org:cran+language:R+-path:.Rd&per_page=100&page=%d'", page),
MichaelChirico / gha_results.R
Last active November 17, 2023 16:47
GHA results
library(jsonlite)
library(data.table)
headers <- shQuote(c(
  "-H", "Accept: application/vnd.github+json",
  "-H", sprintf("Authorization: Bearer %s", Sys.getenv("GITHUB_PAT")),
  "-H", "X-GitHub-Api-Version: 2022-11-28"
))
url_fmt <- "https://api.github.com/repos/r-lib/lintr/actions/runs?per_page=%d&page=%d"
MichaelChirico / gist:7706fe44cf29b5e12007bf4e1c1f73ce
Created November 14, 2023 00:49
grep() slower than which(grepl())??
library(microbenchmark)
N = 1e4
v = sample(letters, N, TRUE)
microbenchmark(times = 200L, grep("a", v), which(grepl("a", v)), grep("[a-m]", v), which(grepl("[a-m]", v)), grep("[a-z]", v), which(grepl("[a-z]", v)))
# Unit: microseconds
#                  expr     min       lq     mean   median       uq      max neval cld
#          grep("a", v) 630.561 640.4110 670.5257 655.4915 680.6115  866.132   200   b
#  which(grepl("a", v)) 598.802 609.5220 637.0542 621.0965 652.9120  839.231   200   a
#      grep("[a-m]", v) 692.601 708.4215 735.6788 721.3065 744.6365 1207.722   200   d