Fine-Tuning GPT-2 on a Custom Dataset
@anujsahani01

Will the text get tokenized on its own?
Do we just have to pass the text file? What kind of formatting should be done?

Question: 'the ques'

Answer: 'the answer'

Will this format work?
Secondly, my Colab session crashes when I train the model. What could be the solution?

@AhmedAskar12

It requires more than 20 GB of memory, so you can subscribe to a paid Colab plan or use a free-trial virtual machine.
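On the formatting question, here is a minimal sketch of one way to turn question/answer pairs into a plain text training file. It assumes the training script reads raw text and tokenizes it itself, as the Hugging Face language-modeling example scripts do. The helper names `format_qa_pairs` and `write_training_file`, and the file name in the usage comment, are hypothetical and not taken from the gist. `<|endoftext|>` is GPT-2's end-of-text token, used here to separate examples.

```python
# Illustrative sketch only: the function names and layout are assumptions,
# not part of the original gist.

EOS = "<|endoftext|>"  # GPT-2's end-of-text token, used as a record separator


def format_qa_pairs(pairs):
    """Join (question, answer) pairs into a single training string."""
    records = [
        f"Question: {question.strip()}\nAnswer: {answer.strip()}\n{EOS}"
        for question, answer in pairs
    ]
    return "\n".join(records)


def write_training_file(pairs, path):
    """Write the formatted pairs to a UTF-8 text file for the training script."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(format_qa_pairs(pairs))


# Usage (hypothetical file name):
# write_training_file([("the ques", "the answer")], "train.txt")
```

Using the same `Question:` / `Answer:` prefixes at generation time should let the prompt end with `Answer:` so the model continues with an answer, though how well that works depends on the dataset size and training setup.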

@AhmedAskar12

AhmedAskar12 commented Jul 11, 2023

Is this GPT-2 fine-tuning approach effective, by the way?
