@relyt0925
Last active August 11, 2024 14:08
ilab mt_bench eval log (ilab model evaluate --benchmark mt_bench --model /instructlab/models/tuned-0701-1954/samples_4992 --judge-model /instructlab/models/prometheus-eval/prometheus-8x7b-v2.0 --taxonomy-path /instructlab/taxonomy/ --output-dir /instructlab/mtbench)
INFO 2024-07-05 18:53:02,883 utils.py:145: _init_num_threads Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-07-05 18:53:02,883 utils.py:148: _init_num_threads Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-07-05 18:53:02,883 utils.py:161: _init_num_threads NumExpr defaulting to 16 threads.
INFO 2024-07-05 18:53:03,050 config.py:58: <module> PyTorch version 2.3.1 available.
Generating answers...
INFO 2024-07-05 18:53:12,971 vllm.py:148: run_vllm vLLM starting up on pid 212 at http://127.0.0.1:58173/v1
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 80/80 [01:02<00:00, 1.27it/s]
Evaluating answers...
INFO 2024-07-05 18:56:08,297 vllm.py:148: run_vllm vLLM starting up on pid 255 at http://127.0.0.1:48517/v1
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 160/160 [01:11<00:00, 2.23it/s]
# SKILL EVALUATION REPORT
## MODEL
/instructlab/models/tuned-0701-1954/samples_4992
### AVERAGE:
7.92 (across 85)
### TURN ONE:
8.39
### TURN TWO:
0.4
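
The three scores above can be reconciled with a quick arithmetic check. The sketch below assumes (this is not confirmed by the log) that the overall average is a plain mean over every individually scored judgment and that "across 85" counts those judgments. Under that assumption it solves for how many turn-one and turn-two scores the average implies. The variable names are hypothetical, and the reported values are rounded to two decimals, so the result is approximate.

```python
# Values copied from the report above.
reported_average = 7.92
total_scored = 85
turn_one = 8.39
turn_two = 0.4

# Assumption: average = (n1 * turn_one + n2 * turn_two) / total_scored,
# with n1 + n2 = total_scored. Solving for n1 gives:
implied_turn_one = total_scored * (reported_average - turn_two) / (turn_one - turn_two)
n_turn_one = round(implied_turn_one)
n_turn_two = total_scored - n_turn_one

# Recompute the weighted mean from the implied split as a sanity check.
weighted = (turn_one * n_turn_one + turn_two * n_turn_two) / total_scored

print(n_turn_one, n_turn_two, round(weighted, 2))
```

If the assumption holds, the implied split is heavily skewed toward turn one, which would suggest that most turn-two judgments were not counted in the average rather than that the model scored uniformly low on second turns. The log itself does not say why.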