@philschmid
Created August 18, 2022 07:35
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer
from datasets import load_dataset

# load model and tokenizer (the emotion dataset has 6 classes)
model_id = "distilbert-base-uncased"
model = TFAutoModelForSequenceClassification.from_pretrained(model_id, num_labels=6)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# load, tokenize & prepare dataset
dataset = load_dataset("emotion")
dataset = dataset.map(lambda e: tokenizer(e["text"], truncation=True), batched=True)
tf_train_dataset = model.prepare_tf_dataset(
    dataset["train"],
    tokenizer=tokenizer,
    batch_size=16,
    shuffle=True,
)

# enable mixed precision (fp16) and compile the model
# note: the Keras mixed-precision guide recommends setting the global policy
# before any layers are constructed (i.e. before from_pretrained), otherwise
# the already-built layers keep their float32 policy
tf.keras.mixed_precision.set_global_policy("mixed_float16")
model.compile(optimizer=Adam(3e-5), metrics=["accuracy"])

# start training
model.fit(tf_train_dataset, epochs=3)
# Epoch 1/3
# 547/1000 [===============>..............] - ETA: 28s - loss: nan - accuracy: 0.7446
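The `loss: nan` in the log above can have several causes; one classic fp16 failure mode (not a confirmed diagnosis for this exact run) is that large additive attention-mask constants like `-1e9` overflow float16, whose maximum magnitude is about 65504. The overflow turns the mask into `-inf`, and a fully masked row then produces `-inf - (-inf) = nan` inside the softmax. A minimal NumPy sketch of that mechanism:

```python
import numpy as np

# float16 cannot represent -1e9: the cast overflows to -inf
mask = np.float16(-1e9)
print(mask)  # -inf

# a fully masked attention row: every score gets the additive mask
scores = np.zeros(4, dtype=np.float16) + mask  # all -inf

# numerically stable softmax subtracts the row max first;
# -inf - (-inf) is nan, which then propagates through exp() and the loss
probs = np.exp(scores - scores.max())
print(probs)  # all nan
```

Using `-6e4` (or masking in float32 and casting afterwards) keeps the mask finite in fp16, which is why many libraries switched away from `-1e9` constants for mixed-precision code paths.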