@frutik
Last active January 17, 2025 16:17
from mlx_lm import load, generate

# Load the MLX-converted Qwen2 model; set the eos token so generation
# stops at the ChatML end-of-turn marker.
model, tokenizer = load(
    'Qwen/Qwen2-7B-Instruct-MLX',
    tokenizer_config={"eos_token": "<|im_end|>"},
)

prompt = "Why do people call Putin 'khuilo'?"
messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

# Render the chat messages into the model's prompt format.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

result = generate(model, tokenizer, prompt=text, verbose=True, max_tokens=512)
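For reference, `apply_chat_template` for Qwen-style models renders the messages into ChatML. The sketch below reproduces that format by hand, without loading a tokenizer; it is illustrative only, and the authoritative template lives in the tokenizer config shipped with the model.

```python
# Minimal sketch of the ChatML prompt that apply_chat_template produces
# for Qwen-style models (illustrative; not the tokenizer's actual template).
def build_chatml_prompt(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
text = build_chatml_prompt(messages)
print(text)
```

This also shows why the gist overrides `eos_token` to `<|im_end|>`: generation should stop when the assistant's turn closes.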
Requires: pip install mlx-lm