tokenbender

Paper OCR System - Complete Architecture

Overview

A nested agent architecture for high-accuracy PDF OCR and structured note generation.

Location

name	description
pdf-ocr-feedback	High-accuracy OCR pipeline using Maj@K consensus voting, structured self-evaluation, and adaptive compute budgets to achieve ≥95% transcription accuracy.
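The description above names Maj@K consensus voting without showing it. As a rough illustration only, one plausible reading is: sample K transcriptions of the same line, normalize whitespace, and keep the most frequent one. The function name `majority_vote`, the `min_agreement` threshold, and exact-match comparison are assumptions made for this sketch, not details taken from the pipeline itself.

```python
from collections import Counter


def majority_vote(candidates, min_agreement=0.5):
    """Pick the most common of K candidate transcriptions (hypothetical helper).

    Returns (winner, agreement). `agreement` is the fraction of candidates
    that equal the winner after whitespace normalization. `winner` is None
    when agreement falls below `min_agreement` (an assumed threshold).
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # Collapse runs of whitespace so trivial spacing differences still agree.
    normalized = [" ".join(c.split()) for c in candidates]
    winner, count = Counter(normalized).most_common(1)[0]
    agreement = count / len(normalized)
    return (winner if agreement >= min_agreement else None), agreement


# Three samples of the same line; two agree once whitespace is normalized.
samples = ["E = mc^2", "E  = mc^2", "E = mc2"]
text, agreement = majority_vote(samples)
print(text, round(agreement, 2))
```

A low agreement score could serve as the signal to spend more samples on a hard line, which is one way an adaptive compute budget might be driven; that link is a guess at the design, not something the description confirms.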

When to Use

Use when transcribing PDF pages via a vision model and you need high accuracy, especially for:

Equations or mathematical notation
Tables with complex structure (3+ columns, merged cells)

The Gentle Art of Teaching Machines to Speak

A Journey Through Semantic Reinforcement Learning

For the curious mind who has just discovered that language models can learn, and wonders if there might be a kinder way to teach them.

In the hushed moments before dawn, ten thousand starlings lift from a field as one—not because any single bird commands them, but because each learns from its neighbors' subtle shifts, creating a collective intelligence far greater than the sum of its parts.

      >         >
>      >    >        >

Anti-hype LLM reading list

Goals: Add links to reasonable, clear explanations of how stuff works. No hype and, if possible, no vendor content. Practical first-hand accounts of models in prod are eagerly sought.

Foundational Concepts

	#!/usr/bin/env python3

	import argparse, os, gc, json, random, csv
	os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")
	# os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

	import numpy as np
	import torch
	import torch.nn.functional as F
	from datasets import load_dataset

	import os
	import sys
	import time
	import math
	import pickle
	from contextlib import nullcontext
	from pathlib import Path
	import subprocess
	from dataclasses import dataclass
	import inspect

	You have to assign tasks to a junior engineer to solve a user problem. The user problem could be of various forms:
	- Adding a feature
	- Debugging a failing test case
	- Understanding a feature in the codebase
	- A GitHub issue raised on the codebase

	## Instructions

	### Repository status
	Repository Name: astropy (update if different)

	/*
	Graph implementation following tutorial http://www.geeksforgeeks.org/graph-and-its-representations/
	*/
	#include<iostream>
	#include<cstdlib>
	using namespace std;

	//struct for an adjacency list node
	struct AdjListNode{
	int data;

	/* AVL Tree Implementation in C++ */
	/* Harish R */


	#include<iostream>

	using namespace std;

	class BST
	{

	#include <iostream>
	#include <cmath>
	using namespace std;

	template <class T>
	struct Node {
	T value;
	Node *left;
	Node *right;