MattPitlyk/fine-tuning-gpt-2-on-a-custom-dataset.ipynb

Created February 14, 2020 19:14


Fine-Tuning GPT-2 on a Custom Dataset


Will the text get tokenized on its own?
Do we just pass in the text file, and what kind of formatting should be applied to it?

Will this format work?
Secondly, my Colab session crashes when I train the model. What could be the solution?

It requires more than 20 GB of memory, so you can subscribe to Colab Plus or use a free-trial virtual machine.
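
For a rough sense of where a figure like that can come from, here is a back-of-the-envelope sketch. It is not taken from the notebook: it assumes fp32 training with the Adam optimizer, uses approximate published GPT-2 parameter counts, and ignores activations, data loading, and framework overhead, so the real footprint will be higher. The helper name `training_memory_gb` is hypothetical.

```python
# Rough estimate only: counts weights, gradients, and Adam optimizer state.
# Activations, data loading, and framework overhead are NOT included.
# Assumption: fp32 values (4 bytes each) and the Adam optimizer.

# Approximate published GPT-2 parameter counts.
GPT2_PARAMS = {
    "gpt2": 124e6,
    "gpt2-medium": 355e6,
    "gpt2-large": 774e6,
    "gpt2-xl": 1.5e9,
}


def training_memory_gb(n_params, bytes_per_value=4):
    """Estimate training memory in GB (10^9 bytes) for a model with n_params parameters."""
    weights = n_params * bytes_per_value
    gradients = n_params * bytes_per_value
    adam_state = 2 * n_params * bytes_per_value  # first and second moment estimates
    return (weights + gradients + adam_state) / 1e9


for name, n_params in GPT2_PARAMS.items():
    print(f"{name}: ~{training_memory_gb(n_params):.1f} GB before activations")
```

Under these assumptions only the largest model exceeds 20 GB from parameters and optimizer state alone, so which checkpoint you load matters a lot for whether a free session survives.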

Is that GPT-2 fine-tuning approach effective, btw?