nuff: one tiny file of reusable Python tricks — attribute-dicts, typed CSV, pretty-print, seeded randomness, non-parametric stats, minimal column summaries, and row distances. Pure stdl
Example text-mining datasets: four software-engineering
systematic-review "reading" corpora (Hall, Wahono, Radjenovic,
Kitchenham). Each ships two ways — NAME.csv (processed feature
table) and NAME_raw.csv (raw abstracts). Self-describing CSV
headers; data only, no code. Feeds the textmine demos in
ezr.
ezr — explainable multi-objective optimization. Two files, ~1100 lines, zero dependencies, pure Python stdlib. An experiment in "how low can you go?": active learning labels a few
KAH: one file of ~50 short Lua functions that kept reappearing across my other Lua projects, covering lists, strings, random, csv, stats (incl. effect-size tests + confusion matrix), objects, and tests. No dependencies beyond Lua 5.3+. Every function: one line of comment, a few lines of code, 65 columns max.
Example classification datasets: 73 CSV files (anneal, audiology, COMPAS, diabetes, soybean, vote, ...) with self-describing headers — the klass column ends in '!', so no separate schema files are needed. Data only, no code.
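A minimal sketch of how a reader might find the klass column from the header alone, relying only on the '!' suffix rule stated above. The helper names, the toy header, and the rows are hypothetical, not taken from any of the 73 files. The check for exactly one klass column is an assumption of this sketch.

```python
import csv
import io

def klass_index(header):
    """Return the index of the column whose name ends in '!'.
    Assumption: exactly one such column exists per file."""
    hits = [i for i, name in enumerate(header) if name.strip().endswith("!")]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one klass column, found {len(hits)}")
    return hits[0]

def split_rows(text):
    """Yield (features, label) pairs from CSV text with a self-describing header."""
    rows = csv.reader(io.StringIO(text))
    k = klass_index(next(rows))
    for row in rows:
        if row:  # skip blank lines
            yield row[:k] + row[k + 1:], row[k]

# Hypothetical toy data, for illustration only.
SAMPLE = "age,plasma,class!\n50,148,positive\n31,85,negative\n"
pairs = list(split_rows(SAMPLE))
```

Because the header carries the schema, the same loader should work on any file in the collection without a side-car schema file.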
Example regression datasets: 61 CSV files (abalone, auto93, housing, wine quality, ...) with self-describing headers — column names encode type and goal, so no separate schema files are needed. Data only, no code.
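A rough sketch of decoding type and goal from a column name. The description above does not spell out the encoding, so the rules here are an assumption made for illustration: a leading uppercase letter marks a numeric column, and a trailing '+' or '-' marks a goal to maximize or minimize. The function name and the toy header are hypothetical; check the real headers before relying on this.

```python
def describe(name):
    """Decode one column name under the ASSUMED convention:
    leading uppercase -> numeric, otherwise symbolic;
    trailing '+' -> maximize, trailing '-' -> minimize, else no goal."""
    name = name.strip()
    kind = "num" if name[:1].isupper() else "sym"
    goal = {"+": "maximize", "-": "minimize"}.get(name[-1:])
    return {"name": name, "kind": kind, "goal": goal}

# Hypothetical header, for illustration only.
HEADER = ["Cylinders", "origin", "Lbs-", "Mpg+"]
schema = [describe(n) for n in HEADER]

# Columns that carry an optimization goal under the assumed rules.
goals = {c["name"]: c["goal"] for c in schema if c["goal"]}
```

Under these assumed rules, `goals` holds only the columns that end in '+' or '-', and every other column is treated as an input.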
How to convert source code into a textbook. Prompts and exemplars
for generating "genetic stanza" REPL
tutorials: source files that read top-to-bottom as a short paper
and load top-to-bottom as code, narrated by numbered REPL traces
([1]> ... [48]>). Includes the authoring prompt, a worked
example (Lisp data mining), and the K&R tone
lithp is less library: the smallest set of Common Lisp
add-ons that measurably shrinks application code — five plain
constructs plus a handful of utilities, ~130 lines total, no
luk is the .luk language: a tiny indentation-based dialect that transpiles to Lua via luk.lua (~100-line module). luk.lua returns a single function: `local lua_src = r
Standard fairness-benchmark CSVs in fft column-suffix format: Adult, German, Bank, COMPAS. Each has a sensitive attribute (race / sex / age) and a binary klass. Use with fft.
# install and test
git clone http://tiny.cc/fairnez && git clone http://tiny.cc/fft fft
c
