@sugatoray
Forked from davidberenstein1957/sft_data_mlx.py
Created February 2, 2025 14:13
# /// script
# requires-python = ">=3.11,<3.12"
# dependencies = [
# "distilabel[mlx]",
# ]
# ///
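from distilabel.models import MlxLLM
from distilabel.pipeline import InstructionResponsePipeline

# Load a 4-bit quantized Qwen2.5 model through MLX. The Magpie template
# lets the model generate the user instruction as well as the response.
llm = MlxLLM(
    path_or_hf_repo="mlx-community/Qwen2.5-32B-Instruct-4bit",
    use_magpie_template=True,
    magpie_pre_query_template="qwen2",
    generation_kwargs={"temp": 1, "max_tokens": 4000},
)

# Generate 10 instruction-response rows in batches of 5.
pipeline = InstructionResponsePipeline(llm=llm, batch_size=5, num_rows=10)

if __name__ == "__main__":
    dataset = pipeline.run()
    # Publish the generated dataset to the Hugging Face Hub.
    dataset.push_to_hub("davidberenstein1957/sft-dataset")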