@JeffreyShran Hmm, I just arrived here, but increasing the number of tokens that Llama can handle is still a bit blurry, since the model was trained from the beginning with that amount; technically you would need to redo the whole training of Llama with a larger input size. In other words, it is an inherent property of the model that is immutable from the beginning. The good news is that the input length Llama was trained on (and therefore the maximum possible) is 2048 tokens!
Here you can see that limit in the HF docs by looking at the max_position_embeddings parameter. BTW, here is a similar thread if you want to take a sneak peek.
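For reference, here is a minimal sketch of how you could check that limit programmatically with the transformers library; the model id is only an example, so substitute whichever Llama checkpoint you actually use:

```python
# Sketch: read the maximum context length from a Hugging Face model config.
# The model id below is an assumption; point it at the Llama checkpoint you use.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("decapoda-research/llama-7b-hf")
print(config.max_position_embeddings)  # 2048 for the original Llama models
```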
Nevertheless, there are ways to give Llama more "memory scope"; here are some conversational approaches, and the last section is the most interesting one for any purpose.
Hope you found it helpful ✌🏼
Thanks, that is helpful. However, it appears that these settings already default to the maximum of 2048.
The file I tested with had only a few lines in it, so I think the problem might lie elsewhere.
Yes, indeed. I was hoping to find that limit in GPT4All but only found that the standard model used 1024 input tokens. So maybe... the quantized LoRA version uses a limit of 512 tokens for some reason, although that doesn't make much sense, since quantized and LoRA versions only lose precision rather than dimensionality.
Anyway, I think the best way to improve in this regard is to try other models that we know can already handle 2048 input tokens. I suggest Vicuna, which was created mainly with the purpose of maxing out input/output.
If somebody can test this, it would be great.
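For anyone who wants to test it, here is a rough sketch; I'm assuming the llama-cpp-python bindings (not necessarily the exact stack discussed above) and a locally downloaded ggml Vicuna file, so adjust the path and parameters to your setup:

```python
# Sketch: load a ggml Vicuna model with an explicit 2048-token context window.
# The binding (llama-cpp-python) and the file path are assumptions; adjust to your setup.
from llama_cpp import Llama

llm = Llama(model_path="./models/ggml-vicuna-7b-4bit.bin", n_ctx=2048)
out = llm("Summarize the following notes in one sentence:\n...", max_tokens=128)
print(out["choices"][0]["text"])
```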
I'm actually using ggml-vicuna-7b-4bit.bin. This is the one I'm having the most trouble with. :)
This would be much easier to follow with the working code in one place instead of only scattered fragments.
Is it possible to use GPT4All as the LLM with sql_agent or pandas_agent instead of OpenAI?
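Not a definitive answer, but a sketch of how this might look with LangChain's GPT4All wrapper and the pandas agent; the import paths have moved between LangChain versions, and the model path is an assumption:

```python
# Sketch: use a local GPT4All model as the LLM behind LangChain's pandas agent.
# Import paths vary between LangChain versions; the model path and n_ctx support
# depend on your installed version.
import pandas as pd
from langchain.llms import GPT4All
from langchain.agents import create_pandas_dataframe_agent

llm = GPT4All(model="./models/gpt4all-converted.bin", n_ctx=2048)
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

agent = create_pandas_dataframe_agent(llm, df, verbose=True)
agent.run("What is the sum of the value column?")
```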
I've installed all the packages and still get this: zsh: command not found: pyllamacpp-convert-gpt4all
Try an older version of pyllamacpp: pip install pyllamacpp==1.0.7