Marcel Bollmann mbollmann

# Produced by running https://gist.github.com/mbollmann/827a079023ebdd18b4d06c28566fac0d
# with flags -o -e -c -w on commit 5a875471
1993.tmi.yaml['1993.tmi-1.17']: Value of root['author_string'] changed from "Pierre Isabelle, Marc Dymetman, George Foster, Jean-Marc Jutras, Elliott" to "Pierre Isabelle, Marc Dymetman, George Foster,
Jean-Marc Jutras, Elliott".
1993.tmi.yaml['1993.tmi-1.22']: Value of root['author_string'] changed from "Masaru Tomita, Masako Shirai, Junya Tsutsumi, Miki Matsumura, Yuki" to "Masaru Tomita, Masako Shirai, Junya Tsutsumi, Miki
Matsumura, Yuki".
2005.iwslt.yaml['2005.iwslt-1.6']: Value of root['author_string'] changed from "Sanjika Hewavitharana, Bing Zhao, Hildebrand, Almut Silja, Matthias Eck, Chiori Hori, Stephan Vogel, Alex Waibel" to
"Sanjika Hewavitharana, Bing Zhao, Hildebrand, Almut Silja, Matthias Eck, Chiori Hori, Stephan Vogel, Alex Waibel".
2006.amta.yaml['2006.amta-panel1.0']: Value of root['url'] changed from "https://aclanthology.org/2006.amta-panels.0/" to "h
mbollmann / implicit.tsv
Created July 18, 2025 14:29
Names that are automatically matched to entries in name_variants.yaml (without being listed there)
ID	Canonical	Implicit variant
abdelmajid-ben-hamadou	Ben Hamadou, Abdelmajid	Hamadou, Abdelmajid Ben
adolfo-hernandez-h	Hernández H., Adolfo	H., Adolfo Hernández
adria-de-gispert	de Gispert, Adrià	De Gispert, Adria
adria-de-gispert	de Gispert, Adrià	de Gispert, Adria
ahmed-aburaed	AbuRa’ed, Ahmed	Abura’Ed, Ahmed
alberto-bugarin-diz	Bugarín Diz, Alberto	Bugarín-Diz, Alberto
alberto-bugarin-diz	Bugarín Diz, Alberto	Diz, Alberto Bugarín
alexander-g-hauptmann	Hauptmann, Alexander G.	Hauptmann, Alexander G
alexander-m-rush	Rush, Alexander M.	Rush, Alexander M
mbollmann / dump_items.py
Created August 11, 2025 19:33
Script to dump all namespec–person associations in the ACL Anthology
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# Copyright 2025 Marcel Bollmann <marcel@bollmann.me>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
mbollmann / brew-ratios.typ
Created August 31, 2025 12:19
Table of coffee brew ratios, made in Typst with dynamic calculation of values
#import "@preview/zero:0.5.0": num, format-table
// Configure output
#set text(font: "Libertinus Serif")
#show math.equation: set text(font: "Libertinus Math")
#let min-water = 200
#let max-water = 500
#let min-ratio = 13
#let max-ratio = 20
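
The calculation behind such a table can be sketched in Python. This is only an illustration, not the gist's Typst code: it assumes a ratio of N means one part coffee to N parts water, so the coffee dose is water divided by ratio, and it reuses the four bounds shown above. The function name `brew_table`, the `water_step` of 50, and the rounding to one decimal are hypothetical choices.

```python
def brew_table(min_water=200, max_water=500, min_ratio=13, max_ratio=20, water_step=50):
    """Map each water amount to {ratio: coffee dose}, rounded to 0.1.

    Assumption: a ratio of N means 1 part coffee to N parts water,
    so dose = water / N. The step size is a hypothetical choice.
    """
    return {
        water: {ratio: round(water / ratio, 1) for ratio in range(min_ratio, max_ratio + 1)}
        for water in range(min_water, max_water + 1, water_step)
    }


table = brew_table()
ratios = sorted(next(iter(table.values())))

# Header row: one column per ratio, then one row per water amount.
print("water".rjust(6) + "".join(f"1:{r}".rjust(7) for r in ratios))
for water, row in table.items():
    print(str(water).rjust(6) + "".join(f"{row[r]:.1f}".rjust(7) for r in ratios))
```

Changing any of the bounds regenerates every cell, which is the same idea as computing the table values dynamically rather than typing them in by hand.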
mbollmann / print_author_assignments.py
Created January 18, 2026 12:31
Script for acl-org/acl-anthology to dump all paper–author assignments into a file
# -*- coding: utf-8 -*-
#
# Copyright 2026 Marcel Bollmann <marcel@bollmann.me>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
mbollmann / yaml_vs_json_parsing.py
Created February 20, 2026 16:22
Timing YAML vs. JSON parsing of the ACL Anthology people database
import msgspec
import timeit
import yaml
from yaml import CLoader
yaml_path = "../data/yaml/people.yaml"
json_path = "../data/yaml/people.json"
def load_yaml():
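
The preview ends here, so the rest of the script is not shown. As a separate, minimal sketch of how such a timing comparison might be set up with `timeit`: the helper `time_parsers`, the synthetic `sample` data, and the run count are all hypothetical, the standard-library `json` module stands in for `msgspec`, and PyYAML is treated as optional. It does not read the `people.yaml` or `people.json` files named above.

```python
import json
import timeit
from typing import Callable, Dict


def time_parsers(parsers: Dict[str, Callable[[], object]], number: int = 5) -> Dict[str, float]:
    """Return the average seconds per call for each named zero-argument parser."""
    return {name: timeit.timeit(fn, number=number) / number for name, fn in parsers.items()}


# Hypothetical stand-in data; the real people database is not reproduced here.
sample = {f"person-{i}": {"names": [{"first": "A", "last": f"B{i}"}]} for i in range(200)}
json_text = json.dumps(sample)

parsers: Dict[str, Callable[[], object]] = {"json": lambda: json.loads(json_text)}

try:
    import yaml  # PyYAML; only timed if it is installed

    yaml_text = yaml.safe_dump(sample)
    parsers["yaml"] = lambda: yaml.safe_load(yaml_text)
except ImportError:
    pass

results = time_parsers(parsers)
for name, seconds in results.items():
    print(f"{name}: {seconds * 1000:.3f} ms per parse")
```

Passing the parsers as zero-argument callables keeps file reading and setup out of the timed region, so only the parse itself is measured.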