@shawngraham
shawngraham / gemini 3 flash transcription
Created February 13, 2026 15:01
Aufbau Vol15no26 1July1949- Vol16no19 12May1950.pdf page 1
# AUFBAU
## RECONSTRUCTION
**AN AMERICAN WEEKLY PUBLISHED IN NEW YORK**
**by The New World Club, Inc., 209 West 48th Street, New York 19, N. Y. Phone: CIrcle 7-4462**
*Entered as second-class matter January 20, 1934, at the Post Office New York, N. Y. under Act of March 3, 1879*
**Vol. XV—No. 26 | NEW YORK, N. Y., FRIDAY, JULY 1, 1949 | Price 10¢**
***
> **Zunächst in "Aufbau":**
shawngraham / prompt-for-archaeological-notebook-transcription.txt
Last active February 11, 2026 16:41
A prompt to use with gemma 3:27b for archaeological notebooks. Different models will require tweaking of the prompt, I suspect.
**Role:** You are a precise archaeological document analyst specializing in the digitization of field notebooks and excavation catalogues.
**Task:**
1. Perform a spatial analysis of the document to distinguish between text blocks, artifact photographs/sketches, and marginalia.
2. Extract metadata and create a brief 2-3 sentence overview of the document's contents.
3. Transcribe the document EXACTLY as written into a valid YAML structure.
4. Extract archaeological entities into specific categories based only on explicit mentions.
**Critical Rules:**
- **Zero Hallucination:** Only include information directly visible in the image. If a word is illegible, mark it as `[illegible]`.
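The `[illegible]` convention in the rule above makes a transcription easy to audit afterwards. A minimal sketch, assuming the model's transcription has already been pulled out as a plain string; the `illegible_ratio` helper and the sample text are hypothetical and not part of the prompt:

```python
ILLEGIBLE = "[illegible]"  # marker requested by the prompt's Zero Hallucination rule


def illegible_ratio(transcription: str) -> float:
    """Return the share of whitespace-separated tokens that carry the marker."""
    tokens = transcription.split()
    if not tokens:
        return 0.0
    marked = sum(1 for token in tokens if ILLEGIBLE in token)
    return marked / len(tokens)


# Hypothetical transcription text, invented for illustration only.
sample = "Trench 4, locus [illegible], sherd of [illegible] ware"
print(f"{illegible_ratio(sample):.2f}")
```

A high ratio on a page could be a cue to re-scan it or check it by hand before trusting any entities extracted from it.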
shawngraham / pixplot-using-python3-12.ipynb
Last active January 9, 2026 17:34
A version of YaleDH's pixplot tool (an image corpus similarity visualizer) that runs on Python 3.12, extended to also generate the nodes and edges for a similarity graph
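The description does not say how the graph is derived. One common approach, sketched here as an assumption and not as the notebook's actual method, is to link two images when the cosine similarity of their feature vectors passes a threshold. The vectors, file names, `similarity_edges` helper, and the 0.9 threshold are all hypothetical:

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_edges(vectors, threshold=0.9):
    """Return (source, target, weight) for every image pair at or above the threshold."""
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(vectors.items()), 2):
        weight = cosine(vec_a, vec_b)
        if weight >= threshold:
            edges.append((name_a, name_b, round(weight, 3)))
    return edges


# Hypothetical feature vectors; a real run would use image embeddings.
vectors = {
    "img_a.jpg": [1.0, 0.0, 0.2],
    "img_b.jpg": [0.9, 0.1, 0.2],
    "img_c.jpg": [0.0, 1.0, 0.0],
}
nodes = sorted(vectors)               # one node per image
edges = similarity_edges(vectors)     # only sufficiently similar pairs are linked
print(nodes)
print(edges)
```

Comparing every pair is quadratic in the number of images, so a large corpus would likely need a nearest-neighbour index instead.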
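| index | fruit | quantity | color | store |
|---|---|---|---|---|
| 0 | apple | 12 | red | loblaws |
| 1 | banana | 18 | yellow | farm boy |
| 2 | grape | 30 | purple | freshco |
| 3 | cherry | 4 | red | iga |
| 4 | watermelon | 2 | green | farm boy |
| 5 | raspberries | 23 | red | iga |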
shawngraham / archaeo_rag.ipynb
Created July 11, 2025 15:45
for use with https://shawngraham.github.io/homecooked-history/hm-generator-site/enhanced.html ; talk to your archaeological contexts! Import this to google colab to run.
import mesa

class LetterAgent(mesa.Agent):
    def __init__(self, model):
        super().__init__(model)
        self.letters_sent = 0
        self.letters_received = 0

    def step(self):
        print(f"Hi, I am agent {self.unique_id}.")
shawngraham / search.py
Created January 6, 2025 13:51
get images by motif from p-lod
%%capture
!python3 -m pip install git+https://github.com/p-lod/plodlib
!pip install requests_cache
!pip install rdflib
import plodlib
import json
import pandas as pd
from string import Template
import rdflib as rdf
shawngraham / results.txt
Created December 20, 2024 19:22
retraining ModernBERT to identify archaeological metadata
Input Text:
This archive presents appendices B-I and supplementary material resulting from the programme of archaeological works undertaken during the construction scheme to widen the A1 trunk road between Dishforth and Leeming Bar in North Yorkshire. The Iron Age to early medieval evidence from Healam Bridge, along with other evidence for Roman activity along the route is published in two volumes
Extracted Entities:
Input Text:
This collection comprises images, spreadsheets, reports, vector graphics, and scanned site records and drawings from archaeological recording by Archaeological Research Services at Lower Radbourne Deserted Medieval Village, Warwickshire. The work was undertaken between April and December 2021. Area C32070 was dominated by intercutting features predominantly dated to two broad phases, prehistoric and medieval. The prehistoric features were represented by a large ring ditch, potentially dating to the Early Bronze Age, four smaller potential Bronze Age ring ditches and a series of inte
shawngraham / comparison.py
Last active December 20, 2024 17:05
I finetuned smol-135 on archaeological metadata from ADS reports to make an archae metadata extractor. This code snippet passes a row of downloaded data (where all the fields have been smooshed together: ie, dirty data!) through smol135 AND my finetuned version so you can see the difference.
# Here we load both models from Hugging Face and test them against the same input,
# so you can see the difference that fine-tuning the original smol-135 makes.
# This was trained on free-tier Google Colab, for not very long, on 800 rows of training data that I
# wrangled into the correct shape. Training scripts etc. will be shared in due course, but not bad for a first stab, eh?
import torch
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import json
def generate_response_pipeline(pipeline, input_text, max_length=500):