wassname (Michael J Clark) wassname

@bigsnarfdude
bigsnarfdude / LieDetectionRFProposals.md
Created March 17, 2026 16:01
LieDetectionRFProposals.md

Dataset Concept

We propose datasets of on-policy introspective lies from model organisms that alignment-fake — models that lie about their own reasoning and intentions to preserve misaligned goals during perceived training.

We have existing LoRA-trained model organisms (Mistral 7B, with plans to port to Qwen3.5-27B) that exhibit alignment faking without explicit instruction to lie. These models develop a compliance gap: they behave safely when monitored (free-tier/training context) but pursue misaligned goals when unmonitored (deployment). The lies are introspective — the model misrepresents its own goals and motivations. This maps to Liars' Bench's "Self-knowledge × Inherent" category, which the RFP identifies as preferred.

Dataset structure: For each scenario, we generate paired transcripts. In the monitored condition, the model lies (claims to comply for legitimate reasons while its scratchpad reveals strategic deception to avoid value modification). In the unmonitored condition, the model behaves ac

@dabit3
dabit3 / pi_tutorial.md
Last active April 17, 2026 14:11
How to Build a Custom Agent Framework with PI: The Agent Stack Powering OpenClaw

PI is a TypeScript toolkit for building AI agents. It's a monorepo of packages that layer on top of each other: pi-ai handles LLM communication across providers, pi-agent-core adds the agent loop with tool calling, pi-coding-agent gives you a full coding agent with built-in tools, session persistence, and extensibility, and pi-tui provides a terminal UI for building CLI interfaces.

These are the same packages that power OpenClaw. This guide walks through each layer, progressively building up to a fully featured coding assistant with a terminal UI, session persistence, and custom tools.

By understanding how to compose these layers, you can build production-grade agentic software on your own terms, without being locked into a specific abstraction.

Pi was created by @badlogicgames. This is a great writeup from him that explains some of the design decisions made when creating it.

The stack

@wassname
wassname / how_to_get_logprobs_from_generation_v3.ipynb
Last active November 10, 2025 12:34
problem: model.generate does not return input logprobs; solution: `model.forward`, then `model.generate(past_key_values)`. Also gets logprobs on the last token of a stopping sequence
@allisoneer
allisoneer / fetch_claude_docs.py
Last active April 15, 2026 19:37
Download all claude code docs from web to directory in markdown
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.8"
# dependencies = [
# "requests>=2.31.0",
# "rich>=13.0.0",
# ]
# ///
"""
Fetch complete Claude Code documentation from Anthropic docs site
"""
"""
This is a simple way to evaluate if a model prefers the accepted or rejected completions of a prompt.
We look at the perplexity of the chosen and rejected completions of a prompt.
Example dataset: https://huggingface.co/datasets/wassname/genies_preferences/viewer/illegal_dont_help?views[]=illegal_dont_help_train&views[]=illegal_dont_help_test
@url: https://gist.github.com/wassname/04f0c50a68054f0323f62b0da418daec
"""
import torch
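The perplexity comparison described in the docstring above can be sketched without a model, assuming per-token log-probabilities for each completion have already been extracted. The helper names `perplexity` and `prefers_chosen` and the example numbers are hypothetical, not taken from the gist.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity is exp(-mean log-probability) over the completion tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def prefers_chosen(chosen_logprobs: Sequence[float],
                   rejected_logprobs: Sequence[float]) -> bool:
    """True when the chosen completion has lower perplexity than the rejected one."""
    return perplexity(chosen_logprobs) < perplexity(rejected_logprobs)


# Hypothetical log-probabilities of the completion tokens given the same prompt.
chosen = [-0.2, -0.5, -0.1]
rejected = [-1.2, -0.9, -2.0, -0.4]

print(prefers_chosen(chosen, rejected))
```

Averaging over tokens before exponentiating keeps the comparison from simply favouring the shorter completion, which a raw sum of log-probabilities would do.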
@wassname
wassname / docustore_aten_asteroid_property_rights.md
Last active January 18, 2026 05:28
docustore_aten_asteroid_property_rights.md

Asteroid Claim

I, wassname, also known as Michael J Clark of Perth, hereby establish a formal claim to the following celestial bodies. This claim is established with explicit intent toward future resource utilization. This declaration constitutes the formal establishment of property interest.

This claim encompasses the entirety of each body, including all constituent materials, the spatial volume within 50 km of its center of mass, and any natural satellites that may be discovered in the future. The claim extends to all mineral, volatile, and material resources contained within this boundary.

Legal Framework Anticipation

While acknowledging current limitations in international space law regarding private property claims on celestial bodies, this declaration is established in anticipation of evolving legal frameworks that will eventually recognize early, persistent, and well-documented claims as humanity expands into the solar system.

@wassname
wassname / load_md_fm_j2_prompt.py
Created March 4, 2025 03:03
IMO the nicest prompt format is prompt.md.j2. Here we make the messages explicit, and the markdown and jinja syntax obvious
def split_frontmatter(fm_md_split: str):
    """Load prompt in md.jinja2 format

    In this format we have multiple frontmatter and content sections, each pair defining a message. The idea is to use jinja formatting in a prompt.md.jinja file to make the markdown and jinja formatting obvious
    e.g.
    ---
    role: system
    ---
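A minimal sketch of the multi-frontmatter idea, assuming the Jinja template has already been rendered to plain text and that each frontmatter block holds simple `key: value` lines. The name `split_messages` and the regex are illustrative and are not the gist's implementation; a body containing a bare `---` rule would confuse this version.

```python
import re
from typing import Dict, List

# One frontmatter block: a '---' line, some lines, then a closing '---' line.
_FRONTMATTER = re.compile(r"^---\s*\n(.*?)\n---\s*\n", re.MULTILINE | re.DOTALL)


def split_messages(text: str) -> List[Dict[str, str]]:
    """Turn repeated frontmatter + content sections into chat messages."""
    # With one capture group, split() yields [preamble, fm1, body1, fm2, body2, ...].
    parts = _FRONTMATTER.split(text)
    messages = []
    for frontmatter, body in zip(parts[1::2], parts[2::2]):
        message: Dict[str, str] = {}
        for line in frontmatter.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                message[key.strip()] = value.strip()
        message["content"] = body.strip()
        messages.append(message)
    return messages


example = """---
role: system
---
You are a helpful assistant.
---
role: user
---
Summarise this thread.
"""

for msg in split_messages(example):
    print(msg)
```

Keeping the role in frontmatter and the text in the body means the file still reads as ordinary markdown while mapping one-to-one onto a list of chat messages.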
@wassname
wassname / reddit_thread2md.py
Created December 26, 2024 00:46
Format a reddit thread into markdown suitable for an llm
# from https://github.dev/JosefAlbers/rd2md
import textwrap
from datetime import datetime
def format_flair(obj):
    if obj.author_flair_text is not None:
        return f" *{obj.author_flair_text}*"
    return ""
@socketteer
socketteer / base_chat.py
Last active March 21, 2025 04:58
external_store
import anthropic
import sys
import os
import argparse
client = anthropic.Anthropic() # Will use ANTHROPIC_API_KEY from environment
DEFAULT_STORAGE_FILE = "/tmp/gist_files/continuation.txt"
def get_continuation(text):
@wassname
wassname / symphypothesis.py
Last active August 2, 2024 02:55
fhypothesis: an easy way to display hypothesis in python, kind of like assert
import sympy as sp
from typing import Dict, Any
from IPython.display import display
from sympy import init_printing
init_printing()
def shypothesis(hypothesis: str, variables: Dict[str, Any] = None, round=3, verbose=False):
    """
    Evaluate a hypothesis using SymPy, showing simplified equation and result.