[root@tyler-a100 instructlab]# head -n 100 /var/mnt/inststg1/instructlab/generated/skills_train_msgs_2024-08-17T15_42_00.jsonl
{"messages":[{"content":"I am, Red Hat\u00ae Instruct Model based on Granite 7B, an AI language model developed by Red Hat and IBM Research, based on the Granite-7b-base language model. My primary function is to be a chat assistant.","role":"system"},{"content":"Suppose there are two gas stations, A and B, located on opposite sides of a highway. Both charge the same price, $2.50 per gallon, for gasoline. However, station A is 1 mile closer to the majority of drivers, while station B is 1 mile closer to the minority of drivers. The cost of driving 1 mile is $0.10. If station A and B both advertise their prices, which station will attract more customers and what will be their profits?","role":"user"},{"content":"To determine which station will attract more customers and their profits, we need to consider the cost of driving to each station for the majority and minority of drivers.\n\nL
[root@tyler-a100 instructlab]# cat /var/mnt/inststg1/instructlab/generated/knowledge_train_msgs_2024-08-17T15_42_00.jsonl
{"messages":[{"content":"I am, Red Hat\u00ae Instruct Model based on Granite 7B, an AI language model developed by Red Hat and IBM Research, based on the Granite-7b-base language model. My primary function is to be a chat assistant.","role":"system"},{"content":"<|user|>\nPersonal data, also known as personal information or personally identifiable information (PII), refers to any information related to an identifiable person. The term PII has four common variants based on personal or personally, and identifiable or identifying. The effective definitions of PII vary depending on the jurisdiction and the purpose for which it is used. Under the General Data Protection Regulation (GDPR) in the European Union and United Kingdom, the term \"personal data\" is broader, determining the scope of the regulatory regime. The National Institute of Standards and Technology Special Publication 800-122 de
[root@tyler-a100 generated]# cat /var/mnt/inststg1/instructlab/generated//node_datasets_2024-08-17T15_42_00/mmlubench_knowledge_compliance_personally-identifiable-information.jsonl
{"icl_document":"hii","document":"# Personal Data\n\n## Overview\n\nPersonal data, also known as personal information or personally identifiable information (PII), is any information related to an identifiable person.\n\nThe abbreviation PII is widely accepted in the United States, but the phrase it abbreviates has four common variants based on personal or personally, and identifiable or identifying. Not all are equivalent, and for legal purposes the effective definitions vary depending on the jurisdiction and the purposes for which the term is being used. Under European Union and United Kingdom data protection regimes, which centre primarily on the General Data Protection Regulation (GDPR), the term \"personal data\" is significantly broader, and determines the scope of the regulatory regime.\n\nNational Institute of Standards an
[root@tyler-a100 generated]# cat node_datasets_2024-08-17T15_42_00/knowledge_compliance_personally-identifiable-information_task.yaml
dataset_kwargs:
  data_files:
    test: /var/mnt/inststg1/instructlab/generated//node_datasets_2024-08-17T15_42_00/mmlubench_knowledge_compliance_personally-identifiable-information.jsonl
dataset_name: null
dataset_path: json
doc_to_choice: '{{[choices[0], choices[1], choices[2], choices[3]]}}'
doc_to_target: '{{answer}}'
doc_to_text: '{{question.strip()}}
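The YAML above is an lm-eval style task file: doc_to_text, doc_to_target, and doc_to_choice are Jinja templates applied to each record of the mmlubench JSONL referenced under data_files. A minimal sketch of that rendering is below; the field names (question, choices, answer) are assumptions inferred from the templates, and the record contents are invented purely for illustration.

```python
# Hedged sketch: render the task's Jinja templates over a made-up record.
from jinja2 import Template

record = {
    "question": "Which EU regulation defines the term 'personal data'?  ",
    "choices": ["HIPAA", "GDPR", "CCPA", "FERPA"],
    "answer": 1,
}

prompt = Template("{{ question.strip() }}").render(**record)  # doc_to_text
target = Template("{{ answer }}").render(**record)            # doc_to_target
print(prompt)
print("choices:", record["choices"], "-> correct index:", target)
```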
[root@tyler-a100 instructlab]# /root/bin/ilab-sdg.sh data generate --pipeline /var/mnt/inststg1/instructlab/sdg-config/pipelines/agentic/ --taxonomy-path /var/mnt/inststg1/instructlab/taxonomy/ --taxonomy-base empty --endpoint-url http://127.0.0.1:8080/v1 --model-family mixtral --sdg-scale-factor 30 --model /var/mnt/inststg1/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1 --output-dir /var/mnt/inststg1/instructlab/generated/ --tls-insecure
INFO 2024-08-17 15:41:56,393 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-08-17 15:41:56,393 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-08-17 15:41:56,393 numexpr.utils:161: NumExpr defaulting to 16 threads.
INFO 2024-08-17 15:41:57,112 datasets:58: PyTorch version 2.3.1 available.
Generating synthetic data using '/var/mnt/inststg1/instructlab/sdg-config/pipelines/agentic/' pipelin |
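The NumExpr lines in this log are only advisory: with NUMEXPR_MAX_THREADS unset, NumExpr caps itself at a conservative 16 threads on this 80-core box. For the CLI runs shown here, exporting the variable in the shell before invoking ilab is enough; a hedged in-process equivalent is sketched below (the value 64 is an assumption, pick whatever matches your hardware).

```python
# Hedged sketch: raise the NumExpr thread cap before anything imports numexpr.
# For the ilab CLI runs above, the equivalent is exporting NUMEXPR_MAX_THREADS
# in the shell before invoking the command.
import os

os.environ.setdefault("NUMEXPR_MAX_THREADS", "64")  # assumed value; tune per machine

import numexpr.utils  # imported after the variable is set so the new cap applies
print("detected cores:", numexpr.utils.detect_number_of_cores())
```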
[root@dev-rhel-ai-training-client-11 ~]# cat /var/mnt/inststg1/instructlab/job/checkpoints/skills/full_logs_global0.log
W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757]
W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757] *****************************************
W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757] *****************************************
[2024-08-14 01:44:22,434] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-08-14 01:44:22,675] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-08-14 01:44:22,722] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerato
[root@tyler-rhel-newimage root]# /root/ilab model serve --model-family mixtral --model-path /var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1/ --backend vllm -- --tensor-parallel-size 8 --host 127.0.0.1 --port 8084
INFO 2024-07-28 16:53:08,009 instructlab.model.serve:136: Using model '/var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1' with -1 gpu-layers and 4096 max context size.
INFO 2024-07-28 16:53:08,009 instructlab.model.serve:140: Serving model '/var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1' with vllm
INFO 2024-07-28 16:53:08,010 instructlab.model.backends.vllm:196: vLLM starting up on pid 64 at http://127.0.0.1:8000/v1
INFO 07-28 16:53:13 api_server.py:219] vLLM API server version 0.5.3.post1
INFO 07-28 16:53:13 api_server.py:220] args: Namespace(host='127.0.0.1', port=8084, uvicorn_log_level='info', allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, lor
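Once the vLLM backend is up it exposes an OpenAI-compatible API. A hedged smoke test against the endpoint started above might look like the sketch below; the port is taken from the --port 8084 flag and the model name is the served path, both of which may differ in other setups.

```python
# Hedged sketch: minimal completion request against the vLLM OpenAI-compatible server.
import requests

resp = requests.post(
    "http://127.0.0.1:8084/v1/completions",  # port taken from the serve command above
    json={
        # vLLM registers the served model under the path it was loaded from.
        "model": "/var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1/",
        "prompt": "Say hello in one short sentence.",
        "max_tokens": 32,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["text"])
```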
[root@tyler-rhel-newimage instructlab]# /root/ilab model train --data-path /var/instructlabbigdisk/instructlab/generateddata/messages_combined.jsonl --model-path /var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/ --device cuda --max-batch-len 2 --effective-batch-size 16 --save-samples 185 --num-epochs 10 --ckpt-output-dir /var/instructlabbigdisk/instructlab/skillscheckpoints/ --gpus 8
[2024-07-27 20:03:08,445] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-devel package with yum
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected
[root@tyler-rhel-newimage instructlab]# /root/ilab model evaluate --benchmark mmlu --model /var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/ --gpus 8
INFO 2024-07-27 19:39:50,893 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-07-27 19:39:50,893 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-07-27 19:39:50,893 numexpr.utils:161: NumExpr defaulting to 16 threads.
INFO 2024-07-27 19:39:51,260 datasets:58: PyTorch version 2.3.1 available.
INFO 2024-07-27 19:39:58,693 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
INFO 2024-07-27 19:39:58,693 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/', 'dtype': 'bfloat16'}
INFO 2024-07-27 19:39:58,802 lm-eval:170:
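The MMLU run hands the checkpoint directory to lm-eval as a plain Hugging Face model in bfloat16. A minimal sketch of that same load outside of ilab, assuming the transformers (and optionally accelerate) packages are available:

```python
# Hedged sketch: load the evaluated checkpoint the way lm-eval does (HF format, bfloat16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "/var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/"

tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires accelerate; drop it to load on a single device
)
print(model.config.model_type, model.dtype)
```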
[root@tyler-rhel-newimage instructlab]# /root/ilab model evaluate --benchmark mt_bench --model /var/instructlabbigdisk/instructlab/skillscheckpoints/hf_format/samples_1056/ --judge-model /var/instructlabbigdisk/instructlab/models/prometheus-eval/prometheus-8x7b-v2.0/ --base-model /var/instructlabbigdisk/instructlab/models/ibm-granite/granite-7b-base/ --output-dir /var/instructlabbigdisk/instructlab/evaltracker/skillscheckpoints/samples_1056/ --gpus 8 --backend vllm --enable-serving-output
INFO 2024-07-27 18:36:02,004 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-07-27 18:36:02,004 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-07-27 18:36:02,005 numexpr.utils:161: NumExpr defaulting to 16 threads.
Generating answers...
WARNING 2024-07-27 18:36:02,158 instructlab.model.evaluate:288: Based on your hardware configuration, when using vL |