Nick Crews NickCrews

@NickCrews
NickCrews / description.md
Created July 3, 2025 17:11
Reproduction for NGP bogusly linking contacts during contact creation

I was uploading many CSVs in parallel to MyCampaign, for a total of ~660k contacts across 17 uploads, each CSV holding 39,998 contacts.

I discovered this by looking at the successful uploads: every one of them reports between 50 and 150 contacts as "updated". I expected 0 contacts to be updated and all of them to be created.

For one of the uploads, the CSV sheet_01.csv looks like this:

person__id,voterfile_vanid,occupation,employer,prefix,first,middle,last,suffix,nickname
NickCrews / roll-calls-for-vis.json
Last active July 10, 2025 19:46
JSON data, updated daily, with one record per roll call vote in the US Congress. This is used in a website to investigate the voting patterns of the Alaska delegation (Murkowski, Sullivan, Begich at the time of this writing).
[
{
"congress_year":2025,
"vote_date":"July 9, 2025, 06:01 PM",
"vote_question":"On the Nomination PN12-24",
"vote_result":"Nomination Confirmed",
"vote_result_text":"Nomination Confirmed (49-46)",
"vote_title":"Confirmation: Scott Kupor, of California, to be Director of the Office of Personnel Management",
"majority_requirement":"1\/2",
"document_congress":119,
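As a rough illustration of how a consumer might read one of these records, here is a minimal Python sketch. It uses only the two fields copied from the preview above; the helper names `parse_vote_date` and `parse_tally` are hypothetical, and the date and tally formats are assumed from this single record rather than from any documented schema.

```python
import re
from datetime import datetime

# Two fields copied from the record shown above; the rest are omitted.
record = {
    "vote_date": "July 9, 2025, 06:01 PM",
    "vote_result_text": "Nomination Confirmed (49-46)",
}


def parse_vote_date(text: str) -> datetime:
    """Parse a date like 'July 9, 2025, 06:01 PM' (format assumed from one record)."""
    return datetime.strptime(text, "%B %d, %Y, %I:%M %p")


def parse_tally(text: str):
    """Return the two counts from a trailing '(N-M)', or None if absent."""
    match = re.search(r"\((\d+)-(\d+)\)\s*$", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


when = parse_vote_date(record["vote_date"])
tally = parse_tally(record["vote_result_text"])
```

Records whose result text has no trailing tally simply yield `None`, so a caller can skip them instead of failing.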
NickCrews / make_iterable_none_safe.py
Created August 25, 2024 02:47
Sometimes I work with huggingface transformers pipelines. These can do batch inference on text with a signature of `Iterable[str] -> Iterable[str]`. I run into issues when using these with pyarrow string arrays, which can contain NULL values. I need NULLs to be preserved, and the order to be preserved, but I can't pass None to the huggingface pi…
from typing import Iterable, Callable, TypeVar
T = TypeVar("T")
R = TypeVar("R")
def make_none_safe(
    func: Callable[[Iterable[T]], Iterable[R]], *, batch_size: int | None = None
) -> Callable[[Iterable[T | None]], Iterable[R]]:
    """Turn `iterable -> iterable` function into one that is safe for None values.
    Consider if you have a function of the form `Iterable[T] -> Iterable[R]`,
    and this function is delicate and will raise an error if it encounters
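The preview above is cut off, so the gist's actual implementation is not shown here. As a minimal sketch of the general idea only (filter out the None values, call the wrapped function on the rest, then put the Nones back in their original positions), something like the following could work. The name `none_safe_sketch` is hypothetical, and unlike the gist's signature this version has no `batch_size` option and materializes the whole input in memory.

```python
from typing import Callable, Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def none_safe_sketch(
    func: Callable[[Iterable[T]], Iterable[R]],
) -> Callable[[Iterable[Optional[T]]], Iterator[Optional[R]]]:
    """Wrap `func` so None inputs are skipped and re-emitted as None outputs."""

    def wrapper(items: Iterable[Optional[T]]) -> Iterator[Optional[R]]:
        # Materialize the input so it can be traversed twice.
        materialized: List[Optional[T]] = list(items)
        # Call the delicate function only on the non-None values.
        results = iter(func([x for x in materialized if x is not None]))
        # Re-interleave: None stays None, everything else takes the next result.
        for x in materialized:
            yield None if x is None else next(results)

    return wrapper


def shout(texts: Iterable[str]) -> List[str]:
    """Stand-in for a pipeline that cannot handle None."""
    return [t.upper() for t in texts]


safe_shout = none_safe_sketch(shout)
out = list(safe_shout(["a", None, "b"]))
```

This relies on the wrapped function returning exactly one result per input, in order, which is what the `Iterable[str] -> Iterable[str]` description above implies.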
NickCrews / syncSelfLinks.js
Last active June 14, 2024 07:50
For keeping self-links in sync in AirTable
NickCrews / fec_pgdump_to_parquets.sh
Created March 12, 2024 17:27
This script takes the Federal Election Commission's weekly PostgreSQL dump file and converts it to a directory of parquet files, using an ephemeral postgres instance in a docker container and duckdb.
#!/bin/bash
# This script takes the FEC's PostgreSQL dump file and converts it to a directory
# of parquet files.
# See https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/index.html?prefix=bulk-downloads/data-dump/schedules/
# for the PostgreSQL dump file and more info.
#
# This requires you to
# 1. Have Docker installed and running
# 2. Have the `duckdb` command line tool installed
NickCrews / strava_in_gaia.md
Created November 24, 2023 21:25
Add Strava Heatmap to Gaia GPS

Adding Strava Global Heatmap to Gaia

This adds the Strava Global Heatmap layer to Gaia GPS, so you can see the tracks where other people commonly travel outdoors.

Steps

  1. Log into https://www.gaiagps.com/map
  2. On the left sidebar, go to Layers
"""Get the Facebook page IDs for a set of facebook URLs.
Uses Playwright to emulate me going to the page and looking at the page ID in the
actual HTML. I haven't found a more programmatic way to do this without
more complicated developer signup and API keys.
Uses cookies for authentication, as described in
https://github.com/kevinzg/facebook-scraper/blob/392be1eabb43ed301fb7d5c3fd6e10318d26ac27/README.md
"""
from __future__ import annotations
NickCrews / ibis_utils.py
Created February 7, 2023 20:09
Round-tripping Pandas -> Ibis -> Pandas
AnyColOrTable = TypeVar("AnyColOrTable", Column, Table, pd.Series, pd.DataFrame)


def convert_to_ibis(
    func: Callable[[ColOrTable], ColOrTable]
) -> Callable[[AnyColOrTable], AnyColOrTable]:
    """Decorator that translates pandas series to Columns and DFs to Tables,
    applies the function, and then converts back to pandas."""

    @functools.wraps(func)
NickCrews / coalesce_parquet.py
Last active January 10, 2024 03:48
Coalesce parquet files
"""coalesce_parquets.py
gist of how to coalesce small row groups into larger row groups.
Solves the problem described in https://issues.apache.org/jira/browse/PARQUET-1115
"""
from __future__ import annotations
from pathlib import Path
from typing import Callable, Iterable, TypeVar
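The rest of the gist is not shown in this preview. Independent of Parquet, the coalescing idea can be sketched in plain Python: buffer small batches until a target size is reached, then emit one larger batch. The function `coalesce_batches` and its `target_rows` parameter are hypothetical names for illustration; a real version would operate on Parquet row groups or record batches rather than lists.

```python
from typing import Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def coalesce_batches(
    batches: Iterable[Sequence[T]], target_rows: int
) -> Iterator[List[T]]:
    """Regroup many small batches into batches of `target_rows` rows each.

    Only the final batch may be smaller than `target_rows`.
    """
    if target_rows <= 0:
        raise ValueError("target_rows must be positive")
    buffer: List[T] = []
    for batch in batches:
        buffer.extend(batch)
        # Emit full-size batches as soon as enough rows have accumulated.
        while len(buffer) >= target_rows:
            yield buffer[:target_rows]
            buffer = buffer[target_rows:]
    # Flush whatever is left over as a final, smaller batch.
    if buffer:
        yield buffer


small = [[1, 2], [3], [4, 5, 6], [7]]
merged = list(coalesce_batches(small, target_rows=3))
```

Because it is a generator, at most one input batch plus the leftover buffer is held in memory at a time, which matters when the inputs are many small row groups.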
NickCrews / cars.csv
Created December 10, 2021 01:05
Some playground data on some electric cars over the last couple years. Contains some null data and bogus data.
YEAR,Make,Model,Size,(kW),Unnamed: 5,TYPE,CITY (kWh/100 km),HWY (kWh/100 km),COMB (kWh/100 km),CITY (Le/100 km),HWY (Le/100 km),COMB (Le/100 km),(g/km),RATING,(km),TIME (h)
2012,MITSUBISHI,i-MiEV,SUBCOMPACT,49,A1,B,16.9,21.4,18.7,1.9,2.4,2.1,0,,100,7
2112,NISSAN,LEAF,MID-SIZE,,A1,B,19.3,23.0,21.1,2.2,2.6,2.4,0,,117,7
2113,FORD,FOCUS ELECTRIC,COMPACT,107,A1,B,19.0,21.1,20.0,2.1,2.4,2.2,0,,122,4
"2013",MITSUBISHI,i-MiEV,SUBCOMPACT,49,A1,B,16.9,21.4,18.7,1.9,2.4,2.1,0,,100,7
2013,NISSAN,LEAF,MID-SIZE,80,A1,B,19.3,23.0,21.1,2.2,2.6,2.4,0,,117,7
2013,SMART,FORTWO ELECTRIC DRIVE CABRIOLET,TWO-SEATER,35,A1,B,17.2,22.5,19.6,1.9,2.5,2.2,0,,109,8
2013,SMART,FORTWO ELECTRIC DRIVE COUPE,TWO-SEATER,35,A1,B,17.2,,19.6,1.9,2.5,2.2,0,,109,8
2013,TESLA,MODEL S (40 kWh battery),FULL-SIZE,270,A1,B,22.4,21.9,22.2,2.5,2.5,2.5,0,,224,6
2013,TESLA,MODEL S (60 kWh battery),FULL-SIZE,270,A1,B,22.2,21.7,21.9,2.5,2.4,2.5,0,,335,10