Created December 16, 2019 22:34
Training GPT-2 LM Head model in Keras
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel
import tensorflow as tf

# Load the pretrained DistilGPT-2 model and its tokenizer
model = TFGPT2LMHeadModel.from_pretrained("distilgpt2")
tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
text = """ | |
A SQUAT grey building of only thirty-four stories. Over the main entrance the | |
words, CENTRAL LONDON HATCHERY AND CONDITIONING CENTRE, | |
and, in a shield, the World State’s motto, COMMUNITY, IDENTITY, STABILITY. | |
The enormous room on the ground floor faced towards the north. Cold for | |
all the summer beyond the panes, for all the tropical heat of the room itself, | |
a harsh thin light glared through the windows, hungrily seeking some draped | |
lay figure, some pallid shape of academic goose-flesh, but finding only the glass | |
and nickel and bleakly shining porcelain of a laboratory. Wintriness responded | |
to wintriness. The overalls of the workers were white, their hands gloved with | |
a pale corpse-coloured rubber. The light was frozen, dead, a ghost. Only from | |
the yellow barrels of the microscopes did it borrow a certain rich and living | |
substance, lying along the polished tubes like butter, streak after luscious streak | |
in long recession down the work tables. | |
“And this,” said the Director opening the door, “is the Fertilizing Room.” | |
Bent over their instruments, three hundred Fertilizers were plunged, as the Director of Hatcheries and Conditioning entered the room, in the scarcely breathing silence, the absent-minded, soliloquizing hum or whistle, of absorbed | |
concentration. A troop of newly arrived students, very young, pink and callow, | |
followed nervously, rather abjectly, at the Director’s heels. Each of them carried | |
a notebook, in which, whenever the great man spoke, he desperately scribbled. | |
Straight from the horse’s mouth. It was a rare privilege. The D. H. C. for Central | |
London always made a point of personally conducting his new students round | |
the various departments. | |
“Just to give you a general idea,” he would explain to them. For of course some | |
sort of general idea they must have, if they were to do their work intelligentlythough as little of one, if they were to be good and happy members of society, as | |
possible. For particulars, as every one knows, make for virture and happiness; | |
generalities are intellectually necessary evils. Not philosophers but fretsawyers | |
""" * 100 | |
tokenized_text = tokenizer.encode(text)

# Split the token stream into contiguous blocks of block_size tokens
examples = []
block_size = 100
for i in range(0, len(tokenized_text) - block_size + 1, block_size):  # Truncate in blocks of block_size
    examples.append(tokenized_text[i:i + block_size])

# Shift each block by one token to build (input, label) pairs for language modelling
inputs, labels = [], []
for ex in examples:
    inputs.append(ex[:-1])
    labels.append(ex[1:])

dataset = tf.data.Dataset.from_tensor_slices((inputs, labels))

BATCH_SIZE = 16
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')

# The model returns the past key/value states as extra outputs, so pass None for
# those positions so that only the LM logits contribute to the loss
model.compile(optimizer=optimizer, loss=[loss, *[None] * model.config.n_layer], metrics=[metric])
model.fit(dataset, epochs=3)
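
Once training has finished, the fine-tuned model can be sampled from directly. The snippet below is only a sketch: the prompt string and the generation parameters are assumptions for illustration, not part of the original gist.

# Illustrative sketch: sample a short continuation from the fine-tuned model.
# Prompt text and sampling parameters are assumed, not taken from the gist.
prompt_ids = tokenizer.encode("The enormous room on the ground floor", return_tensors="tf")
generated = model.generate(prompt_ids, max_length=50, do_sample=True, top_k=50)
print(tokenizer.decode(generated[0], skip_special_tokens=True))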
So is a batch size of 12 hardcoded inside this pretrained model? A different batch size also doesn't work with a model created from the default config.