@cbowdon
cbowdon / ai.ts
Created November 18, 2024 16:29
Example of leaning on Bing search to do RAG
/**
* This file demonstrates how to use Bing search for RAG with Claude.
*
* It is a very simple solution for doing RAG-assisted chat responses about a particular website.
* Rather than having your own vector database etc., a quick-and-dirty approach is to call out to
* a search engine like Bing. This means you cede control of the retrieval process and rely on Bing's
* search result quality. In exchange for this, you can forget about embeddings, vector DBs, chunking,
* etc. and just worry about your chat interface.
*/
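The approach described in the comment above can be sketched roughly as follows. This is a minimal Python illustration, not the gist's TypeScript code: `search` and `chat` are hypothetical callables standing in for a search client and a chat-model client, and the result shape (`title`, `url`, `snippet`) is an assumption. The sketch relies only on the `site:` search operator to restrict results to one website.

```python
from typing import Callable, Dict, List

# Hypothetical result shape: {"title": ..., "url": ..., "snippet": ...}
SearchResult = Dict[str, str]


def site_query(question: str, site: str) -> str:
    """Restrict a search query to one website with the `site:` operator."""
    return f"site:{site} {question}"


def build_prompt(question: str, results: List[SearchResult]) -> str:
    """Paste the search snippets into the prompt as numbered context."""
    context = "\n\n".join(
        f"[{i}] {r['title']} ({r['url']})\n{r['snippet']}"
        for i, r in enumerate(results, start=1)
    )
    return (
        "Answer the question using only the search results below.\n\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(
    question: str,
    site: str,
    search: Callable[[str], List[SearchResult]],  # assumed search client
    chat: Callable[[str], str],                   # assumed chat-model client
) -> str:
    """Retrieve via the search engine, then hand the context to the model."""
    results = search(site_query(question, site))
    return chat(build_prompt(question, results))
```

Passing the clients in as parameters keeps the retrieval step swappable, which matters here because the search engine, not your own index, decides what the model sees.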
cbowdon / typed_dataframe.py
Created September 26, 2024 15:03
Static typing for Polars dataframes
#########
# Static typing for Polars dataframes
#
# Motivation: to have some visibility and static checks on the schema of DataFrames at boundaries
# Approach: subclass of DataFrame with schema type parameter
# Benefit: structural typing, runtime checks for agreement with declared types
# Tradeoffs: types not inferred from arbitrary transforms, must always declare
from polars import DataFrame, col, lit
from typing import TypedDict, TypeVar, Mapping
cbowdon / sanity.md
Created April 27, 2023 09:57
ML classifier sanity checklist

SAVE YOUR SANITY

Things I know but need a checklist to ensure I address them systematically.

  1. Are the datasets balanced? Check dev, train and test: do you have approximately balanced classes in each?
  2. Is each dataset distinct? Have you checked for duplicates within and across datasets?
  3. Is the data shuffled?
  4. Spot check the data. Are the classifications consistent and in line with the objective?
  5. Score the model before training. Is its accuracy close to random?
  6. Train the simplest available model (no pretrained vectors) on a small subset of data and let it overfit. Does its loss improve? Does its accuracy improve to something better than random?
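
Items 1 and 2 of the checklist can be partly automated. The following is a minimal sketch, assuming each dataset is a plain list of hashable examples and labels; the function names are illustrative and not taken from any library.

```python
from collections import Counter


def class_balance(labels):
    """Return each class's share of the labels, e.g. {'pos': 0.5, 'neg': 0.5}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def duplicates_within(examples):
    """Return the set of examples that occur more than once in one dataset."""
    counts = Counter(examples)
    return {ex for ex, n in counts.items() if n > 1}


def overlap_across(splits):
    """Return the examples shared by each pair of named splits.

    `splits` maps a split name (e.g. 'train') to its list of examples.
    Only pairs that actually share examples appear in the result.
    """
    names = sorted(splits)
    shared = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            common = set(splits[a]) & set(splits[b])
            if common:
                shared[(a, b)] = common
    return shared
```

Run `class_balance` on the labels of each split separately, and treat any non-empty result from `duplicates_within` or `overlap_across` as something to investigate before training.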
cbowdon / intercepting-proxy.sh
Created January 4, 2019 17:20
Poor man's intercepting proxy
#!/usr/bin/env bash
# Thanks to https://blog.hawkhost.com/2009/12/12/using-netcat-as-an-intercepting-proxy/
server=${1:-localhost}
localport=${2:-8080}
remoteport=${3:-$localport}
pipe=intercepting-proxy.pipe
cbowdon / frankenstein_chapter_5.dot
Created December 31, 2018 18:31
Graph of words for Frankenstein Chapter 5
graph "graph" {
"fit[NN]" -- "monster[NN]" ["weight"="1"]
"presence[NN]" -- "recollection[NN]" ["weight"="1"]
"presence[NN]" -- "home[NN]" ["weight"="1"]
"presence[NN]" -- "scene[NNS]" ["weight"="1"]
"presence[NN]" -- "Clerval[NNP]" ["weight"="1"]
"presence[NN]" -- "father[NN]" ["weight"="1"]
"presence[NN]" -- "Elizabeth[NNP]" ["weight"="1"]
"presence[NN]" -- "delight[NN]" ["weight"="1"]
"presence[NN]" -- "nothing[NN]" ["weight"="1"]
cbowdon / org-element-path-api.el
Created January 14, 2016 22:06
Some lisp helper functions for doing XPATH-like queries on org-mode files
(defun org-element-path (path list-of-asts)
  "Select all contents of LIST-OF-ASTS matching PATH."
  ;; Log the remaining path; pass it as an argument so any % in it is not
  ;; treated as a format directive.
  (message "%s" path)
  (defun --subpath-match (p)
    (org-element--select-contents (car p) list-of-asts))
  (if (cadr path)
      (org-element-path (cdr path) (--subpath-match path))
    (--subpath-match path)))
(defun --flatten (list-of-lists)
cbowdon / template.el
Created November 1, 2015 20:11
Macro to add string interpolation to Emacs Lisp
(defmacro template (text)
  "Expand text like \"Hello <<name>>\" to (format \"Hello %s\" name)."
  (let ((pattern "<<\\(.*?\\)>>"))
    ;; The regexp matches anything between delimiters, non-greedily
    (with-temp-buffer
      (save-excursion (insert text))
      (let ((matches '()))
        (while (re-search-forward pattern nil t)
          (push (match-string 1) matches)
          (replace-match "%s" t t))