Basit Mustafa 24601

PTRM: how a 7-million-parameter model beats frontier LLMs on reasoning puzzles

A plain-language writeup by Vuk Rosić of the paper "Probabilistic Tiny Recursive Model" (Sghaier, Parviz & Jolicoeur-Martineau, Mila — arXiv:2605.19943). Independent summary, not affiliated with or endorsed by the authors.

A 7-million-parameter model just beat an ensemble of 7 frontier LLMs on reasoning puzzles — about 10,000× cheaper, and with zero retraining.

The problem

Tiny Recursive Models (TRM) solve a puzzle by repeatedly refining a hidden "working state" and their current best answer with the same small two-layer network. But that refinement is deterministic: one input always traces one path, and that path can settle on a wrong answer with no way out. The paper shows many of TRM's failures are runs trapped in a bad "basin" — a region of the hidden space that decodes to a wrong answer and that the deterministic loop can't escape.
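To make the "trapped in a basin" idea concrete, here is a toy sketch, not TRM itself. It assumes a one-dimensional "working state" and a hypothetical update rule with two attractors, standing in for the learned two-layer network. The function name `refine`, the update rule, and the step size are all illustrative assumptions. The point is only that a deterministic loop always ends in the attractor its starting point belongs to.

```python
def refine(state: float, steps: int = 50, step_size: float = 0.1) -> float:
    """Deterministically refine a scalar state with a fixed update rule.

    The rule descends the hypothetical landscape E(z) = z**4 - 2*z**2 + 0.5*z,
    which has a deeper basin near z = -1.06 and a shallower one near z = 0.93.
    """
    for _ in range(steps):
        slope = 4 * state**3 - 4 * state + 0.5  # dE/dz at the current state
        state -= step_size * slope              # same rule applied every step
    return state


# Same input, same path, same endpoint: there is no randomness to escape with.
trapped = refine(0.5)    # starts on the right, settles in the shallower basin
better = refine(-0.5)    # starts on the left, settles in the deeper basin
print(trapped > 0, better < 0, refine(0.5) == trapped)
```

In this toy, a run that starts at 0.5 can never reach the deeper basin on the left, however many refinement steps it takes, which mirrors the failure mode the paper attributes to deterministic refinement.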

LLM Wiki v2

A pattern for building personal knowledge bases using LLMs. Extended with lessons from building agentmemory (20K+ stars ⭐️), a persistent memory engine for AI coding agents.

This builds on Andrej Karpathy's original LLM Wiki idea file. Everything in the original still applies. This document adds what we learned running the pattern in production: what breaks at scale, what's missing, and what separates a wiki that stays useful from one that rots.

What the original gets right

The core insight is correct: stop re-deriving, start compiling. RAG retrieves and forgets. A wiki accumulates and compounds. The three-layer architecture (raw sources, wiki, schema) works. The operations (ingest, query, lint) cover the basics. If you haven't read the original, start there.

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
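A minimal sketch of that query-time pattern follows, under loud assumptions: word overlap stands in for real embedding search, the LLM call is omitted, and the names `retrieve` and `answer` and the sample chunks are hypothetical. It shows only that each question re-scans the raw chunks and leaves nothing behind for the next one.

```python
def retrieve(chunks: list[str], question: str, k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the question (stand-in for vector search)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def answer(chunks: list[str], question: str) -> list[str]:
    """Stateless query: gather context for this one question, then discard it."""
    context = retrieve(chunks, question)
    # A real system would pass `context` to the model here; nothing is saved.
    return context


chunks = [
    "the wiki layer stores compiled summaries",
    "raw sources are immutable documents",
    "the schema tells the agent how to maintain the wiki",
]

first = answer(chunks, "how is the wiki maintained?")
second = answer(chunks, "how is the wiki maintained?")
print(first == second)  # the second call repeats all the work of the first
```

Asking the same question twice does the same retrieval twice, which is the "no accumulation" behavior described above.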

name	orchestrating-swarms
description	Master multi-agent orchestration using Claude Code's TeammateTool and Task system. Use when coordinating multiple agents, running parallel code reviews, creating pipeline workflows with dependencies, building self-organizing task queues, or any task benefiting from divide-and-conquer patterns.

Claude Code Swarm Orchestration

Master multi-agent orchestration using Claude Code's TeammateTool and Task system.

This is an OPML version of the HN Popularity Contest results for 2025, for importing into RSS feed readers.

Plug: if you want to find content related to your interests from thousands of obscure blogs and noisy sources like HN Newest, check out Scour. It's a free, personalized content feed I work on where you define your interests in your own words and it ranks content based on how closely related it is to those topics.

Inferal Workspace Architecture: How We Work at Inferal

(Blog version available at https://inferal.com/blog/workspace-architecture/)

Update: Inferal Workspace is going open source!

Your org's brain that AI can use

This is not our core product. This document describes our internal operating environment - how we run the company. We share it to show the environment you'd join and demonstrate our philosophy in action. For what we're building, see What We're Building below.

❯ cd /data/projects/coding_agent_session_search
❯ cc

Claude Code v2.0.54 · Opus 4.5 · Claude Max · /data/projects/coding_agent_session_search

read AGENTS.md and the README and explore the project deeply. Use ultrathink

● I'll start by reading the AGENTS.md and README files, then explore the project structure in depth.

Native Secure Enclave-backed SSH keys on macOS

It turns out that macOS Tahoe can generate and use Secure Enclave-backed SSH keys! This replaces projects like https://github.com/maxgoedjen/secretive.

There is a shared library, /usr/lib/ssh-keychain.dylib, that has traditionally been used to add smartcard support to ssh by implementing the PKCS11Provider interface. Recently, however, it also began implementing SecurityKeyProvider, which supports loading keys directly from the Secure Enclave! SecurityKeyProvider is what is normally used to talk to FIDO2 devices (e.g. libfido2 can be used to talk to your YubiKey). However, you can now use it to talk to your Secure Enclave instead!

	"""Fusion-style delegation harness built with the OpenHands SDK.

	Install:
	uv pip install openhands-sdk openhands-tools

	Run:
	export LLM_API_KEY="..." # or export OPENHANDS_API_KEY="..."
	export MAIN_MODEL="openhands/gpt-5.5"
	export SIDEKICK_MODEL="openhands/minimax-m2.7"
	uv run python fusion_harness_example.py "Find and fix the failing tests in this repo."

	"""
	The most atomic way to train and run inference for a GPT in pure, dependency-free Python.
	This file is the complete algorithm.
	Everything else is just efficiency.

	@karpathy
	"""

	import os # os.path.exists
	import math # math.log, math.exp