relyt0925’s gists

relyt0925 / gist:f0efa5cc423c9d4ee00cc07bab2f9787

Created August 18, 2024 03:41

skills_train_msg_file

[root@tyler-a100 instructlab]# head -n 100 /var/mnt/inststg1/instructlab/generated/skills_train_msgs_2024-08-17T15_42_00.jsonl 

{"messages":[{"content":"I am, Red Hat\u00ae Instruct Model based on Granite 7B, an AI language model developed by Red Hat and IBM Research, based on the Granite-7b-base language model. My primary function is to be a chat assistant.","role":"system"},{"content":"Suppose there are two gas stations, A and B, located on opposite sides of a highway. Both charge the same price, $2.50 per gallon, for gasoline. However, station A is 1 mile closer to the majority of drivers, while station B is 1 mile closer to the minority of drivers. The cost of driving 1 mile is $0.10. If station A and B both advertise their prices, which station will attract more customers and what will be their profits?","role":"user"},{"content":"To determine which station will attract more customers and their profits, we need to consider the cost of driving to each station for the majority and minority of drivers.\n\nL
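The preview above shows one JSON object per line, each with a `messages` list whose entries carry `content` and `role` keys. A minimal sketch of how such a file could be inspected follows. The helper names and the two sample records are hypothetical, and the `assistant` role is an assumption, since the preview is cut off before the third message's role is visible.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line (JSONL)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def role_counts(records: Iterable[dict]) -> Counter:
    """Count how many messages carry each role across all records."""
    counts = Counter()
    for record in records:
        for message in record.get("messages", []):
            counts[message.get("role", "<missing>")] += 1
    return counts


# Hypothetical sample lines for illustration; not taken from the real file.
SAMPLE = [
    '{"messages":[{"content":"sys","role":"system"},'
    '{"content":"q","role":"user"},{"content":"a","role":"assistant"}]}',
    '{"messages":[{"content":"q2","role":"user"},'
    '{"content":"a2","role":"assistant"}]}',
]

counts = role_counts(iter_records(SAMPLE))
print(dict(counts))
```

To run it against a real file, pass an open file handle instead of `SAMPLE`, for example `role_counts(iter_records(open(path, encoding="utf-8")))`.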

relyt0925 / gist:d11881a52fde981f53995eb0275a4803

Created August 18, 2024 03:38

knowledge_train_msgs

[root@tyler-a100 instructlab]# cat /var/mnt/inststg1/instructlab/generated/knowledge_train_msgs_2024-08-17T15_42_00.jsonl

{"messages":[{"content":"I am, Red Hat\u00ae Instruct Model based on Granite 7B, an AI language model developed by Red Hat and IBM Research, based on the Granite-7b-base language model. My primary function is to be a chat assistant.","role":"system"},{"content":"<|user|>\nPersonal data, also known as personal information or personally identifiable information (PII), refers to any information related to an identifiable person. The term PII has four common variants based on personal or personally, and identifiable or identifying. The effective definitions of PII vary depending on the jurisdiction and the purpose for which it is used. Under the General Data Protection Regulation (GDPR) in the European Union and United Kingdom, the term \"personal data\" is broader, determining the scope of the regulatory regime. The National Institute of Standards and Technology Special Publication 800-122 de

relyt0925 / gist:1fd2ca8c1c9fc21c2108129fb6048d82

Created August 18, 2024 01:24

mmlubench_knowledge_compliance_personally-identifiable-information.jsonl

[root@tyler-a100 generated]# cat  /var/mnt/inststg1/instructlab/generated//node_datasets_2024-08-17T15_42_00/mmlubench_knowledge_compliance_personally-identifiable-information.jsonl

{"icl_document":"hii","document":"# Personal Data\n\n## Overview\n\nPersonal data, also known as personal information or personally identifiable information (PII), is any information related to an identifiable person.\n\nThe abbreviation PII is widely accepted in the United States, but the phrase it abbreviates has four common variants based on personal or personally, and identifiable or identifying. Not all are equivalent, and for legal purposes the effective definitions vary depending on the jurisdiction and the purposes for which the term is being used. Under European Union and United Kingdom data protection regimes, which centre primarily on the General Data Protection Regulation (GDPR), the term \"personal data\" is significantly broader, and determines the scope of the regulatory regime.\n\nNational Institute of Standards an

relyt0925 / gist:f5fb8df46c3b91203bcc0064fa594c8c

Created August 18, 2024 01:22

node_datasets_2024-08-17T15_42_00/knowledge_compliance_personally-identifiable-information_task.yaml

	[root@tyler-a100 generated]# cat node_datasets_2024-08-17T15_42_00/knowledge_compliance_personally-identifiable-information_task.yaml
	dataset_kwargs:
	  data_files:
	    test: /var/mnt/inststg1/instructlab/generated//node_datasets_2024-08-17T15_42_00/mmlubench_knowledge_compliance_personally-identifiable-information.jsonl
	dataset_name: null
	dataset_path: json
	doc_to_choice: '{{[choices[0], choices[1], choices[2], choices[3]]}}'
	doc_to_target: '{{answer}}'
	doc_to_text: '{{question.strip()}}

relyt0925 / gist:0c4ec82147cd37b82177df7235e12a99

Last active August 17, 2024 19:51

aaaa

	[root@tyler-a100 instructlab]# /root/bin/ilab-sdg.sh data generate --pipeline /var/mnt/inststg1/instructlab/sdg-config/pipelines/agentic/ --taxonomy-path /var/mnt/inststg1/instructlab/taxonomy/ --taxonomy-base empty --endpoint-url http://127.0.0.1:8080/v1 --model-family mixtral --sdg-scale-factor 30 --model /var/mnt/inststg1/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1 --output-dir /var/mnt/inststg1/instructlab/generated/ --tls-insecure
	INFO 2024-08-17 15:41:56,393 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
	INFO 2024-08-17 15:41:56,393 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
	INFO 2024-08-17 15:41:56,393 numexpr.utils:161: NumExpr defaulting to 16 threads.
	INFO 2024-08-17 15:41:57,112 datasets:58: PyTorch version 2.3.1 available.
	Generating synthetic data using '/var/mnt/inststg1/instructlab/sdg-config/pipelines/agentic/' pipelin
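
The NumExpr messages in the log above say that `NUMEXPR_MAX_THREADS` was not set, so the library fell back to 16 threads on an 80-core host. A minimal sketch of setting the variable explicitly follows. The helper name is hypothetical, the cap of 64 is taken from the "maximum of 64" wording in the log, and the variable is assumed to be read when `numexpr` is first imported, so this would need to run before that import.

```python
import os


def pin_numexpr_threads(max_threads: int = 64) -> int:
    """Set NUMEXPR_MAX_THREADS to the core count, capped at max_threads.

    Hypothetical helper: call it before anything imports numexpr.
    """
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    chosen = max(1, min(cores, max_threads))
    os.environ["NUMEXPR_MAX_THREADS"] = str(chosen)
    return chosen


chosen = pin_numexpr_threads()
print(f"NUMEXPR_MAX_THREADS={os.environ['NUMEXPR_MAX_THREADS']}")
```

The same effect can be had by exporting the variable in the shell before launching the command; whether a higher thread count actually helps this workload is not something the log shows.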

relyt0925 / gist:8933fa4862dc711d3a6691013a3225cc

Created August 14, 2024 02:08

oom fail logs

	[root@dev-rhel-ai-training-client-11 ~]# cat /var/mnt/inststg1/instructlab/job/checkpoints/skills/full_logs_global0.log
	W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757]
	W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757] *****************************************
	W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
	W0814 01:44:19.387000 139736190685632 torch/distributed/run.py:757] *****************************************
	[2024-08-14 01:44:22,434] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
	[2024-08-14 01:44:22,675] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
	[2024-08-14 01:44:22,722] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerato

relyt0925 / gist:7ad70ec81b1e0ea8aaf9a41153c3d555

Created July 28, 2024 16:54

new model serve logs

	[root@tyler-rhel-newimage root]# /root/ilab model serve --model-family mixtral --model-path /var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1/ --backend vllm -- --tensor-parallel-size 8 --host 127.0.0.1 --port 8084
	INFO 2024-07-28 16:53:08,009 instructlab.model.serve:136: Using model '/var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1' with -1 gpu-layers and 4096 max context size.
	INFO 2024-07-28 16:53:08,009 instructlab.model.serve:140: Serving model '/var/instructlabbigdisk/instructlab/models/mistralai/Mixtral-8x7B-Instruct-v0.1' with vllm
	INFO 2024-07-28 16:53:08,010 instructlab.model.backends.vllm:196: vLLM starting up on pid 64 at http://127.0.0.1:8000/v1
	INFO 07-28 16:53:13 api_server.py:219] vLLM API server version 0.5.3.post1
	INFO 07-28 16:53:13 api_server.py:220] args: Namespace(host='127.0.0.1', port=8084, uvicorn_log_level='info', allow_credentials=False, allowed_origins=[''], allowed_methods=[''], allowed_headers=['*'], api_key=None, lor

relyt0925 / gist:5c6c09acf77c53a563e3663bd2e24fbb

Created July 27, 2024 20:09

new skills training log

	[root@tyler-rhel-newimage instructlab]# /root/ilab model train --data-path /var/instructlabbigdisk/instructlab/generateddata/messages_combined.jsonl --model-path /var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/ --device cuda --max-batch-len 2 --effective-batch-size 16 --save-samples 185 --num-epochs 10 --ckpt-output-dir /var/instructlabbigdisk/instructlab/skillscheckpoints/ --gpus 8
	[2024-07-27 20:03:08,445] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
	[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
	[WARNING] async_io: please install the libaio-devel package with yum
	[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
	[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
	[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected

relyt0925 / gist:47745a2ea571e737754dfdc06ddcdf5f

Created July 27, 2024 19:57

new mmlu log

	[root@tyler-rhel-newimage instructlab]# /root/ilab model evaluate --benchmark mmlu --model /var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/ --gpus 8
	INFO 2024-07-27 19:39:50,893 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
	INFO 2024-07-27 19:39:50,893 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
	INFO 2024-07-27 19:39:50,893 numexpr.utils:161: NumExpr defaulting to 16 threads.
	INFO 2024-07-27 19:39:51,260 datasets:58: PyTorch version 2.3.1 available.
	INFO 2024-07-27 19:39:58,693 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
	INFO 2024-07-27 19:39:58,693 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/instructlabbigdisk/instructlab/knowledgecheckpoints/hf_format/samples_1024/', 'dtype': 'bfloat16'}
	INFO 2024-07-27 19:39:58,802 lm-eval:170:

relyt0925 / gist:b7ce2a25adf83d3887c7fd81e9ac9736

Created July 27, 2024 18:45

mt_bench eval

	[root@tyler-rhel-newimage instructlab]# /root/ilab model evaluate --benchmark mt_bench --model /var/instructlabbigdisk/instructlab/skillscheckpoints/hf_format/samples_1056/ --judge-model /var/instructlabbigdisk/instructlab/models/prometheus-eval/prometheus-8x7b-v2.0/ --base-model /var/instructlabbigdisk/instructlab/models/ibm-granite/granite-7b-base/ --output-dir /var/instructlabbigdisk/instructlab/evaltracker/skillscheckpoints/samples_1056/ --gpus 8 --backend vllm --enable-serving-output
	INFO 2024-07-27 18:36:02,004 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
	INFO 2024-07-27 18:36:02,004 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
	INFO 2024-07-27 18:36:02,005 numexpr.utils:161: NumExpr defaulting to 16 threads.
	Generating answers...
	WARNING 2024-07-27 18:36:02,158 instructlab.model.evaluate:288: Based on your hardware configuration, when using vL

Tyler Lisowski relyt0925