@vgel
vgel / r1.py
Last active June 1, 2025 08:44
Script to run DeepSeek-R1 with a min-thinking-tokens parameter: when the model emits </think> before the minimum is reached, the tag is replaced with a random continuation string so the chain of thought is extended.
import argparse
import random
import sys

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

parser = argparse.ArgumentParser()
parser.add_argument("question", type=str)
# The gist preview is truncated here; the flag below is assumed from the
# description ("min-thinking-tokens parameter"), not copied from the gist.
parser.add_argument("--min-thinking-tokens", type=int, default=128)
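
The preview stops before the generation loop, so here is a minimal sketch of the technique the description names, reusing the imports above and assuming greedy decoding and a single-token </think>; the function name, replacement strings, and default threshold are illustrative and are not vgel's actual code.

def generate_with_min_thinking(model, tokenizer, prompt,
                               min_thinking_tokens=128, max_new_tokens=1024):
    # Illustrative continuation strings; the real script picks from its own list.
    replacements = ["\nWait, but ", "\nHmm, let me reconsider: "]
    end_think_id = tokenizer.encode("</think>", add_special_tokens=False)[0]
    ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    thinking_tokens = 0
    for _ in range(max_new_tokens):
        next_id = model(ids).logits[:, -1, :].argmax(dim=-1, keepdim=True)
        if thinking_tokens < min_thinking_tokens and next_id.item() == end_think_id:
            # Too early to stop thinking: splice in a continuation instead of </think>.
            cont = tokenizer(random.choice(replacements), return_tensors="pt",
                             add_special_tokens=False).input_ids.to(model.device)
            ids = torch.cat([ids, cont], dim=-1)
            thinking_tokens += cont.shape[-1]
            continue
        ids = torch.cat([ids, next_id], dim=-1)
        thinking_tokens += 1
        if next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)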
@cpfiffer
cpfiffer / thinking-cap.py
Created January 22, 2025 18:11
Limit the number of characters DeepSeek R1 can use for thinking.
import outlines
from transformers import AutoTokenizer

model_string = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-7B'
# model_string = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B'  # For small machines

model = outlines.models.transformers(
    model_string,
    device='cuda',  # also 'cpu', 'mps', 'auto'
)
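
The preview ends at model construction. One way the cap might be enforced, sketched here under the assumption that outlines' regex-constrained generation is used (the exact pattern and character limit in the gist may differ): constrain the text between <think> and </think> to a bounded number of characters.

# Assumed pattern: at most 200 thinking characters, then a visible answer.
thinking_regex = r"<think>[\s\S]{0,200}</think>[\s\S]*"
generator = outlines.generate.regex(model, thinking_regex)

print(generator("What is 17 * 24?", max_tokens=512))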
@hathibelagal-dev
hathibelagal-dev / diatest.ipynb
Last active April 23, 2025 23:25
diatest.ipynb