vgel / r1.py
Last active August 14, 2025 13:13
Script to run DeepSeek-R1 with a min-thinking-tokens parameter. It replaces </think> with a random continuation string to extend the model's chain of thought.
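The replacement idea described above can be sketched without loading a model. This is a minimal illustration under stated assumptions, not the script's actual implementation: `generate_with_min_thinking`, `toy_model`, and the strings in `CONTINUATIONS` are hypothetical names and values. It works on string tokens, whereas a real run would presumably work on token ids from the tokenizer.

```python
import random

END_THINK = "</think>"
# Hypothetical continuation strings, chosen only for illustration.
CONTINUATIONS = ["Wait,", "Hmm,", "Let me double-check."]


def generate_with_min_thinking(next_token, min_thinking_tokens, max_tokens=50, seed=0):
    """Generate tokens, swapping an early END_THINK for a random continuation."""
    rng = random.Random(seed)
    output = []
    while len(output) < max_tokens:
        token = next_token(output)
        if token == END_THINK:
            if len(output) >= min_thinking_tokens:
                # Enough thinking has happened: let the model stop.
                output.append(token)
                break
            # Too early: replace the end marker so the model keeps thinking.
            token = rng.choice(CONTINUATIONS)
        output.append(token)
    return output


def toy_model(history):
    # Stand-in for a real model: proposes END_THINK after every two tokens.
    return END_THINK if len(history) % 3 == 2 else "step"


tokens = generate_with_min_thinking(toy_model, min_thinking_tokens=6)
print(" ".join(tokens))
```

In this toy run the first two attempts to close the thinking block are replaced, and the third is accepted once the minimum token count has been reached.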
import argparse
import random
import sys

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

parser = argparse.ArgumentParser()
parser.add_argument("question", type=str)
parser.add_argument(