@relyt0925
Created July 27, 2024 04:29
sdgv018_output
[root@tyler-rhel-newimage instructlab]# /root/ilab data generate --taxonomy-path /var/instructlabbigdisk/instructlab/.local/share/instructlab/taxonomy/compositional_skills/writing/grounded/editing/spelling/qna.yaml --endpoint-url https://781d2e7c-us-east.lb.appdomain.cloud/v1 --model-family mixtral --sdg-scale-factor 100 --model /instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1 --output-dir /var/instructlabbigdisk/instructlab/generateddata/ --tls-insecure --rouge-threshold 1.0
INFO 2024-07-27 04:27:22,396 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-07-27 04:27:22,396 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-07-27 04:27:22,397 numexpr.utils:161: NumExpr defaulting to 16 threads.
INFO 2024-07-27 04:27:22,785 datasets:58: PyTorch version 2.3.1 available.
Generating synthetic data using '/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1' model, taxonomy:'/var/instructlabbigdisk/instructlab/.local/share/instructlab/taxonomy/compositional_skills/writing/grounded/editing/spelling/qna.yaml' against https://781d2e7c-us-east.lb.appdomain.cloud/v1 server
INFO 2024-07-27 04:27:23,615 instructlab.sdg:358: Synthesizing new instructions. If you aren't satisfied with the generated instructions, interrupt training (Ctrl-C) and try adjusting your YAML files. Adding more examples may help.
INFO 2024-07-27 04:27:23,622 instructlab.sdg.pipeline:131: Running pipeline single-threaded
INFO 2024-07-27 04:27:23,715 instructlab.sdg.llmblock:49: LLM server supports batched inputs: True
INFO 2024-07-27 04:27:23,715 instructlab.sdg.pipeline:169: Running block: gen_skill_grounded
INFO 2024-07-27 04:27:23,716 instructlab.sdg.pipeline:170: Dataset({
    features: ['task_description', 'seed_context', 'seed_question', 'seed_response'],
    num_rows: 1
})
INFO 2024-07-27 04:27:26,127 instructlab.sdg:385: Generated 1 samples
INFO 2024-07-27 04:27:26,132 instructlab.sdg:405: Generation took 2.57s