@cpfiffer
Created January 22, 2025 18:11
Limit the number of characters DeepSeek R1 can use for thinking.
import outlines
from transformers import AutoTokenizer

model_string = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-7B'
# model_string = 'deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B'  # For small machines

model = outlines.models.transformers(
    model_string,
    device='cuda',  # also 'cpu', 'mps', or 'auto'
)
tokenizer = AutoTokenizer.from_pretrained(model_string)

# Constrain the output shape: exactly 500 characters of thinking, a fixed
# truncation marker, the closing </think> tag, then a one-word answer.
thinking_regex = r'<think>(.|\n){500}\n\[THINKING_TRUNCATED\]\n</think>(yes|no)'

prompt = tokenizer.apply_chat_template(
    [
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Roses are red. Violets are blue. Are roses and violets the same color? Yes or no.'},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

# Build a regex-constrained generator
generator = outlines.generate.regex(model, thinking_regex)
print("Generator created")

# Generate the constrained result
result = generator(prompt)
print(result)
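The length cap works purely at the regex level, so it can be sanity-checked without loading a model. A minimal sketch using Python's standard `re` module (the `good`/`bad` test strings are illustrative assumptions, not from the gist; the pattern itself is the gist's):

```python
import re

# The same pattern the gist passes to outlines.generate.regex:
# exactly 500 thinking characters, a truncation marker, then "yes" or "no".
thinking_regex = r'<think>(.|\n){500}\n\[THINKING_TRUNCATED\]\n</think>(yes|no)'

# A conforming output: 500 characters of "thinking", then the fixed tail.
good = '<think>' + 'x' * 500 + '\n[THINKING_TRUNCATED]\n</think>no'
assert re.fullmatch(thinking_regex, good) is not None

# One character over budget no longer matches, so the constrained
# decoder can never emit it.
bad = '<think>' + 'x' * 501 + '\n[THINKING_TRUNCATED]\n</think>no'
assert re.fullmatch(thinking_regex, bad) is None

print('regex constraint behaves as expected')
```

Note that `(.|\n){500}` fixes the thinking span at exactly 500 characters (including newlines); a bounded quantifier such as `{0,500}` would allow up to 500 instead.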