
Nick Crews NickCrews

NickCrews / duckpg.py
Last active January 14, 2026 17:22
Usage: `uv run https://gist.github.com/NickCrews/cb2ff440bcf34f951bb245a239368eb6 sync --pg-url 'postgresql://user:pass@host:port/db' --where "schema IN ('crm','ai','districts','users',) and table_name <> 'donations_raw'" --on-exists=ignore`. This is very susceptible to SQL injection attacks through the --where option; use with caution!
"""
Inspect and sync PostgreSQL databases with DuckDB.
This is very susceptible to SQL injection attacks through the --where option; use with caution!
"""
# /// script
# requires-python = ">=3.10"
# dependencies = [
# "duckdb",
NickCrews / mt-roll-calls-for-vis.json
Last active January 23, 2026 19:57
Recent votes taken by Montana's congressional delegation. Scraped daily by Ship Creek Group: https://www.shipcreekgroup.com/
[
{
"display_vote_date":"01\/23\/2026",
"vote_description":"To Agree: ",
"vote_result":"Failed",
"document_url":"https:\/\/www.congress.gov\/bill\/119\/house-concurrent-resolution\/68",
"document_description":"To direct the removal of United States Armed Forces from Venezuela that have not been authorized by Congress.",
"parent_document_url":"https:\/\/www.congress.gov\/bill\/119\/house-concurrent-resolution\/68",
"document_type":"H.Con.Res.",
"parent_document_type":"H.Con.Res.",
NickCrews / description.md
Last active July 17, 2025 20:51
Reproduction for NGP bogusly linking contacts during contact creation

I was uploading many CSVs in parallel to MyCampaign, for a total of ~660k contacts across 17 uploads, with 39,998 contacts per CSV.

In one of the uploads, this is what the CSV sheet_01.csv looks like:

person__id,voterfile_vanid,occupation,employer,prefix,first,middle,last,suffix,nickname
...
3646c0f3-39ba-4756-bf2f-c33f2109eedb,228485,,,MWWIWFX,KEWNEQE,DIEMIWZ,AHCLCXE,BZYMPZI,JZISXJJ
...
NickCrews / roll-calls-for-vis.json
Last active January 3, 2026 19:52
JSON data, updated daily, with one record per roll call vote in the US Congress. This is used in a website to investigate the voting patterns of the Alaska delegation (Murkowski, Sullivan, and Begich at the time of this writing).
[
{
"display_vote_date":"12\/18\/2025",
"vote_description":"To Agree: ",
"vote_result":"Joint Resolution Defeated",
"document_url":null,
"document_description":"A joint resolution providing for congressional disapproval under chapter 8 of title 5, United States Code, of the rule submitted by the Office of the Secretary of the Department of Health and Human Services relating to \"Policy on Adhering to the Text of the Administrative Procedure Act\".",
"parent_document_url":null,
"document_type":null,
"parent_document_type":null,
NickCrews / make_iterable_none_safe.py
Created August 25, 2024 02:47
Sometimes I work with huggingface transformers pipelines. These can do batch inference on text with a signature of `Iterable[str] -> Iterable[str]`. I run into issues when using these with pyarrow string arrays, which can contain NULL values. I need NULLs to be preserved, and the order to be preserved, but I can't pass None to the huggingface pi…
from typing import Iterable, Callable, TypeVar
T = TypeVar("T")
R = TypeVar("R")
def make_none_safe(func: Callable[[Iterable[T]], Iterable[R]], *, batch_size: int | None = None) -> Callable[[Iterable[T | None]], Iterable[R]]:
    """Turn `iterable -> iterable` function into one that is safe for None values.
    Consider if you have a function of the form `Iterable[T] -> Iterable[R]`,
    and this function is delicate and will raise an error if it encounters
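The preview above is cut off, so the gist's actual implementation is not shown here. As a rough sketch of the idea the description states (keep NULLs and order, but never hand None to the wrapped function), something like the following could work. The names `none_safe_sketch`, `wrapper`, and `shout` are hypothetical, the sketch ignores the `batch_size` option, and it assumes the wrapped function returns exactly one result per input, in order.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def none_safe_sketch(
    func: Callable[[Iterable[T]], Iterable[R]],
) -> Callable[[Iterable[Optional[T]]], Iterator[Optional[R]]]:
    """Hypothetical helper: None inputs bypass `func` and come back as None."""

    def wrapper(values: Iterable[Optional[T]]) -> Iterator[Optional[R]]:
        items = list(values)  # materialize so we can make two passes
        # Call the wrapped function only on the non-None values.
        results = iter(func([v for v in items if v is not None]))
        for v in items:
            # Emit None in place of each None input, otherwise the next result.
            # Assumes `func` yields one result per non-None input, in order.
            yield None if v is None else next(results)

    return wrapper


def shout(texts: Iterable[str]) -> Iterable[str]:
    """Stand-in for a pipeline that would fail on None."""
    return [t.upper() for t in texts]


safe_shout = none_safe_sketch(shout)
print(list(safe_shout(["a", None, "b"])))  # ['A', None, 'B']
```

Materializing the whole input keeps the sketch short; a batched variant would need to track None positions per batch instead.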
NickCrews / syncSelfLinks.js
Last active June 14, 2024 07:50
For keeping self-links in sync in Airtable
NickCrews / fec_pgdump_to_parquets.sh
Created March 12, 2024 17:27
This script takes the Federal Election Commission's weekly PostgreSQL dump file and converts it to a directory of parquet files, using an ephemeral postgres instance in a docker container and duckdb.
#!/bin/bash
# This script takes the FEC's PostgreSQL dump file and converts it to a directory
# of parquet files.
# See https://cg-519a459a-0ea3-42c2-b7bc-fa1143481f74.s3-us-gov-west-1.amazonaws.com/bulk-downloads/index.html?prefix=bulk-downloads/data-dump/schedules/
# for the PostgreSQL dump file and more info.
#
# This requires you to
# 1. Have Docker installed and running
# 2. Have the `duckdb` command line tool installed
NickCrews / strava_in_gaia.md
Created November 24, 2023 21:25
Add Strava Heatmap to Gaia GPS

Adding Strava Global Heatmap to Gaia

This adds the Strava Global Heatmap layer to Gaia GPS, so you can see the tracks that other people commonly use outdoors.

Steps

  1. Log into https://www.gaiagps.com/map
  2. On the left sidebar, go to Layers
"""Get the Facebook page IDs for a set of facebook URLs.
Uses Playwright to emulate me going to the page and looking at the page ID in the
actual HTML. I haven't found a more programmatic way to do this without
more complicated developer signup and API keys.
Uses cookies for authentication, as described in
https://github.com/kevinzg/facebook-scraper/blob/392be1eabb43ed301fb7d5c3fd6e10318d26ac27/README.md
"""
from __future__ import annotations
NickCrews / ibis_utils.py
Created February 7, 2023 20:09
Round-tripping Pandas -> Ibis -> Pandas
AnyColOrTable = TypeVar("AnyColOrTable", Column, Table, pd.Series, pd.DataFrame)
def convert_to_ibis(
    func: Callable[[ColOrTable], ColOrTable]
) -> Callable[[AnyColOrTable], AnyColOrTable]:
    """Decorator that translates pandas series to Columns and DFs to Tables,
    applies the function, and then converts back to pandas."""
    @functools.wraps(func)