Chris Bowdon cbowdon

SAVE YOUR SANITY

Things I know but need a checklist to ensure I address them systematically.

Are the train, dev and test datasets balanced? Do you have approximately balanced classes in each?
Is each dataset distinct? Have you checked for duplicates within and across datasets?
Is the data shuffled?
Spot check the data. Are the classifications consistent and in line with the objective?
Score the model before training. Is its accuracy close to random?
Train the simplest available model (no pretrained vectors) on a small subset of the data and try to overfit it. Does its loss improve? Does its accuracy improve to something better than random?
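
The data checks above (balance, duplicates, shuffling, the random baseline) are easy to script. This is a minimal sketch using only the standard library. It assumes each dataset is a list of (text, label) pairs; the function names and the toy data are hypothetical, not from any particular project.

```python
from collections import Counter
from itertools import combinations


def class_fractions(labels):
    """Return the share of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def duplicates_within(examples):
    """Return the set of examples that occur more than once in one dataset."""
    counts = Counter(examples)
    return {ex for ex, n in counts.items() if n > 1}


def duplicates_across(splits):
    """Return {(name_a, name_b): shared examples} for each overlapping pair of datasets."""
    overlaps = {}
    for (name_a, a), (name_b, b) in combinations(splits.items(), 2):
        shared = set(a) & set(b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


def longest_label_run(labels):
    """Length of the longest run of identical consecutive labels.

    A run that is long relative to the dataset size suggests unshuffled data.
    """
    longest = current = 0
    previous = object()  # sentinel that equals no real label
    for label in labels:
        current = current + 1 if label == previous else 1
        longest = max(longest, current)
        previous = label
    return longest


def chance_accuracy(labels):
    """Accuracy expected from uniform random guessing over the observed classes."""
    return 1 / len(set(labels))


# Hypothetical toy data: (text, label) pairs.
train = [("good film", 1), ("bad film", 0), ("great", 1), ("awful", 0), ("good film", 1)]
dev = [("fine", 1), ("poor", 0)]
test = [("awful", 0), ("superb", 1)]
splits = {"train": train, "dev": dev, "test": test}

for name, rows in splits.items():
    labels = [label for _, label in rows]
    print(name, class_fractions(labels), "longest run:", longest_label_run(labels))

texts = {name: [text for text, _ in rows] for name, rows in splits.items()}
print("duplicates within train:", duplicates_within(texts["train"]))
print("duplicates across datasets:", duplicates_across(texts))
print("chance accuracy:", chance_accuracy([label for _, label in train]))
```

Exact-match duplicate detection is the simplest choice here; near-duplicates (whitespace, casing) would need normalising first. The chance figure is only a fair baseline when classes are roughly balanced, which is why the balance check comes first.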

	(defmacro template (text)
	  "Expand text like \"Hello <<name>>\" to (format \"Hello %s\" name)."
	  (let ((pattern "<<\\(.*?\\)>>"))
	    ;; The regexp matches anything between delimiters, non-greedily
	    (with-temp-buffer
	      (save-excursion (insert text))
	      (let ((matches '()))
	        (while (re-search-forward pattern nil t)
	          (push (match-string 1) matches)
	          (replace-match "%s" t t))

	(defun org-element-path (path list-of-asts)
	  "Select all contents of LIST-OF-ASTS matching PATH."
	  ;; `message' takes a format string itself, so no inner `format' is needed
	  (message "%s" path)
	  (defun --subpath-match (p)
	    (org-element--select-contents (car p) list-of-asts))
	  (if (cadr path)
	      (org-element-path (cdr path) (--subpath-match path))
	    (--subpath-match path)))

	(defun --flatten (list-of-lists)

	graph "graph" {
	"fit[NN]" -- "monster[NN]" ["weight"="1"]
	"presence[NN]" -- "recollection[NN]" ["weight"="1"]
	"presence[NN]" -- "home[NN]" ["weight"="1"]
	"presence[NN]" -- "scene[NNS]" ["weight"="1"]
	"presence[NN]" -- "Clerval[NNP]" ["weight"="1"]
	"presence[NN]" -- "father[NN]" ["weight"="1"]
	"presence[NN]" -- "Elizabeth[NNP]" ["weight"="1"]
	"presence[NN]" -- "delight[NN]" ["weight"="1"]
	"presence[NN]" -- "nothing[NN]" ["weight"="1"]

	#!/usr/bin/env bash

	# Thanks to https://blog.hawkhost.com/2009/12/12/using-netcat-as-an-intercepting-proxy/

	server=${1:-localhost}
	localport=${2:-8080}
	remoteport=${3:-$localport}

	pipe=intercepting-proxy.pipe

	#########
	# Static typing for Polars dataframes
	#
	# Motivation: to have some visibility and static checks on the schema of DataFrames at boundaries
	# Approach: subclass of DataFrame with schema type parameter
	# Benefit: structural typing, runtime checks for agreement with declared types
	# Tradeoffs: types not inferred from arbitrary transforms, must always declare

	from polars import DataFrame, col, lit
	from typing import TypedDict, TypeVar, Mapping

	/**
	* This file demonstrates how to use Bing search for RAG with Claude.
	*
	* It is a very simple solution for doing RAG-assisted chat responses about a particular website.
	* Rather than having your own vector database etc., a quick-and-dirty approach is to call out to
	* a search engine like Bing. This means you cede control of the retrieval process and rely on Bing's
	* search result quality. In exchange for this, you can forget about embeddings, vector DBs, chunking,
	* etc. and just worry about your chat interface.
	*/