Created May 17, 2023 11:54
Trainium vs V100
LANGUAGE PRETRAINING
python run_clm.py \
  --model_name_or_path gpt2 \
  --dataset_name wikitext \
  --dataset_config_name wikitext-103-raw-v1 \
  --num_train_epochs 10 \
  --per_device_train_batch_size 8 \
  --per_device_eval_batch_size 8 \
  --do_train \
  --do_eval \
  --output_dir /tmp/test-clm \
  --torch_compile True --torch_compile_mode default --fp16 True --save_strategy no
torchrun --nproc_per_node 32 run_clm.py --model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 8 --do_train --do_eval --output_dir /tmp/test-clm --overwrite_output_dir --save_strategy no
TOKEN CLASSIFICATION
python run_ner.py --model_name_or_path bert-large-uncased --dataset_name conll2003 --output_dir /tmp/test-ner --do_train --do_eval --num_train_epochs 10 --torch_compile True --torch_compile_mode default --fp16 True --overwrite_output_dir --max_seq_length 512 --save_strategy no
torchrun --nproc_per_node 32 run_ner.py --model_name_or_path bert-large-uncased --dataset_name conll2003 --output_dir /tmp/test-ner --do_train --do_eval --overwrite_output_dir --max_seq_length 512 --per_device_train_batch_size 1 --per_device_eval_batch_size 8 --save_strategy no
IMAGE CLASSIFICATION
python run_image_classification.py --dataset_name food101 --output_dir ./food101_outputs/ --do_train --do_eval --learning_rate 2e-5 --num_train_epochs 10 --per_device_train_batch_size 192 --per_device_eval_batch_size 64 --torch_compile True --torch_compile_mode default --fp16 True --remove_unused_columns False --overwrite_output_dir --save_strategy no
torchrun --nproc_per_node 32 run_image_classification.py --dataset_name food101 --output_dir ./food101_outputs/ --do_train --do_eval --learning_rate 2e-5 --num_train_epochs 10 --per_device_train_batch_size 16 --per_device_eval_batch_size 32 --remove_unused_columns False --overwrite_output_dir --save_strategy no
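The two commands in each pair train with different effective batch sizes, which matters when comparing their throughput or loss curves. The sketch below works out the global training batch size per optimizer step as (number of processes) x (per-device batch size). It rests on assumptions that are not stated in the commands themselves: the plain `python` launch is taken to drive a single device, gradient accumulation is left at its default of 1 (no command sets it), and the `run_ner.py` command that omits `--per_device_train_batch_size` is taken to use the Trainer default of 8. The `Run` class and `RUNS` list are hypothetical names introduced for illustration, and the runs are keyed by launcher because the commands do not say which hardware each one targets.

```python
from dataclasses import dataclass

# Assumption: Trainer default when --per_device_train_batch_size is omitted.
TRAINER_DEFAULT_BATCH_SIZE = 8


@dataclass(frozen=True)
class Run:
    """One training command, reduced to the values that set the batch size."""
    task: str
    launcher: str                      # "python" or "torchrun"
    processes: int                     # 1 for python (assumed), --nproc_per_node for torchrun
    per_device_train_batch_size: int

    @property
    def global_batch_size(self) -> int:
        # Samples per optimizer step, assuming gradient accumulation of 1.
        return self.processes * self.per_device_train_batch_size


# Values copied from the commands above; processes=1 for "python" is an assumption.
RUNS = [
    Run("language pretraining", "python", 1, 8),
    Run("language pretraining", "torchrun", 32, 1),
    Run("token classification", "python", 1, TRAINER_DEFAULT_BATCH_SIZE),
    Run("token classification", "torchrun", 32, 1),
    Run("image classification", "python", 1, 192),
    Run("image classification", "torchrun", 32, 16),
]

if __name__ == "__main__":
    for run in RUNS:
        print(f"{run.task:22} {run.launcher:9} global batch = {run.global_batch_size}")
```

Under these assumptions the pairs come out at 8 vs 32, 8 vs 32, and 192 vs 512 samples per step, so per-step timings from the two launchers are not directly comparable without normalizing to samples per second.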