Created July 23, 2023 17:02
Building a large language model (LLM) from scratch (for learning and fun - inspired by Llama2).
Can this guide be used to build a 30b LLM?
I think so, yes. 30B is just a model size, so you can always scale the architecture up; the hard part is training it.
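For a rough sense of what scaling to 30B means in numbers, here is a back-of-the-envelope sketch (my own illustration, not part of the gist; the 12 * n_layers * d_model^2 rule of thumb and the hyperparameters are assumptions):

```python
# Rough parameter count for a decoder-only transformer (rule-of-thumb estimate):
# the attention + feed-forward blocks contribute ~12 * n_layers * d_model^2 weights,
# plus the token embedding and output-projection matrices.
def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    block_params = 12 * n_layers * d_model ** 2   # attention + MLP weights
    embed_params = 2 * vocab_size * d_model       # embeddings + LM head
    return block_params + embed_params

# Scaling the same architecture up: ~60 layers with d_model=6656 lands near 30B.
print(f"{approx_params(n_layers=60, d_model=6656, vocab_size=32_000) / 1e9:.1f}B params")
```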
I would highly recommend https://youtu.be/kCc8FmEb1nY, where you'll get to understand the ins and outs of LLMs.
Does this mean any open-source LLM can be built using torch alone, if we know its structure?
You'll definitely get to know the structure of any model, such as Mistral, but you obviously won't match the accuracy, because SOTA models such as Mistral, Zephyr, Phi, Llama, etc. are trained on huge amounts of data, which takes days of training on high-end GPUs.
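To make "structure" concrete: those models are all stacks of attention + feed-forward blocks that you can write in plain torch. A minimal sketch (my own illustration; the real models use RMSNorm, rotary embeddings, grouped-query attention, and so on):

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One pre-norm transformer decoder block: causal self-attention + MLP."""
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        # causal mask: each position may only attend to itself and earlier positions
        L = x.size(1)
        mask = torch.triu(torch.ones(L, L, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.mlp(self.norm2(x))
        return x

# quick smoke test on random inputs
out = DecoderBlock()(torch.randn(2, 16, 512))   # -> shape (2, 16, 512)
```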
@iamaziz
I ran the exact code from above. `text = big_text.lower()` raises an error because big_text is a list. Also, why are you creating abstract? Is there any future use of it?
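(A likely workaround, assuming big_text really is a list of strings — just a sketch, not the gist author's fix:)

```python
# join the list of strings into one string before lowercasing
text = " ".join(big_text).lower()
```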
Also, can you explain what sequence_length means here? How did you decide on that number?
Looking forward to hearing from you.
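(For context while waiting for a reply: sequence_length is usually the number of tokens fed to the model per training example, i.e. its context window during training. The sliding-window slicing below is my own illustration of how such a hyperparameter is typically used, not code from the gist, and the value 64 is an arbitrary assumption.)

```python
sequence_length = 64          # arbitrary example value; a longer context costs more memory/compute
tokens = list(range(1_000))   # stand-in for the encoded text

# each training example is a window of `sequence_length` tokens,
# with the target shifted one position to the right (next-token prediction)
examples = [
    (tokens[i : i + sequence_length],
     tokens[i + 1 : i + 1 + sequence_length])
    for i in range(len(tokens) - sequence_length)
]
```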