Created July 23, 2023 17:02
Building a large language model (LLM) from scratch (for learning and fun - inspired by Llama2).
Can this guide be used to build a 30b LLM?
I think so, yes. 30B is just a model size, so you can always scale the architecture up; the hard part is training it.
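For a rough sense of what scaling to 30B means in numbers, here is a back-of-the-envelope sketch (my own illustration, not part of the gist; the 12 * n_layers * d_model^2 rule of thumb and the hyperparameters are assumptions):

```python
# Rough parameter count for a decoder-only transformer (rule-of-thumb estimate):
# the attention + feed-forward blocks contribute ~12 * n_layers * d_model^2 weights,
# plus the token embedding and output-projection matrices.
def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    block_params = 12 * n_layers * d_model ** 2   # attention + MLP weights
    embed_params = 2 * vocab_size * d_model       # embeddings + LM head
    return block_params + embed_params

# Scaling the same architecture up: ~60 layers with d_model=6656 lands near 30B.
print(f"{approx_params(n_layers=60, d_model=6656, vocab_size=32_000) / 1e9:.1f}B params")
```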
I would highly recommend https://youtu.be/kCc8FmEb1nY, where you'll get to understand the ins and outs of LLMs.
Does this mean any open-source LLM can be built using torch alone, if we know its structure?
You'll definitely get to know the structure of any model, such as Mistral, but you obviously won't match the accuracy, because SOTA models such as Mistral, Zephyr, Phi, Llama, etc. are trained on huge amounts of data, which takes days of training on high-end GPUs.
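To make "structure" concrete: those models are all stacks of attention + feed-forward blocks that you can write in plain torch. A minimal sketch (my own illustration; the real models use RMSNorm, rotary embeddings, grouped-query attention, and so on):

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One pre-norm transformer decoder block: causal self-attention + MLP."""
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x):
        # causal mask: each position may only attend to itself and earlier positions
        L = x.size(1)
        mask = torch.triu(torch.ones(L, L, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        x = x + self.mlp(self.norm2(x))
        return x

# quick smoke test on random inputs
out = DecoderBlock()(torch.randn(2, 16, 512))   # -> shape (2, 16, 512)
```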
@iamaziz
I ran the exact code from above. `text = big_text.lower()` raises an error because big_text is a list. Also, why are you creating abstract? Is there any future use of it?
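(A likely workaround, assuming big_text really is a list of strings — just a sketch, not the gist author's fix:)

```python
# join the list of strings into one string before lowercasing
text = " ".join(big_text).lower()
```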
Also, can you explain what sequence_length means here? How did you decide on that number?
Looking forward to hearing from you.
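(For context while waiting for a reply: sequence_length is usually the number of tokens fed to the model per training example, i.e. its context window during training. The sliding-window slicing below is my own illustration of how such a hyperparameter is typically used, not code from the gist, and the value 64 is an arbitrary assumption.)

```python
sequence_length = 64          # arbitrary example value; a longer context costs more memory/compute
tokens = list(range(1_000))   # stand-in for the encoded text

# each training example is a window of `sequence_length` tokens,
# with the target shifted one position to the right (next-token prediction)
examples = [
    (tokens[i : i + sequence_length],
     tokens[i + 1 : i + 1 + sequence_length])
    for i in range(len(tokens) - sequence_length)
]
```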