conda create -n textgen python=3.10.9
conda activate textgen
# install pytorch:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py
# download model
# refresh model list
# load model
# switch to chat mode
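A quick sanity check that the CUDA build of PyTorch installed correctly (assuming the cu117 wheel above):
# should print a +cu117 build and True
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"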
I can't install the requirements. What's the solution?
How do I delete the local Llama 2? I've run out of free space. UPDATE: Oh, I found it! It's in the models folder. Question now: my GTX 1650 runs out of memory. How do I solve this?
Can you tell me where to delete those GPT models? I've installed a ton of models and I'm also running out of space.
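For what it's worth, downloaded models live in the models folder inside text-generation-webui; a sketch of finding and removing one (the folder name is just an example):
# see what's taking up space
du -sh text-generation-webui/models/*
# remove a model you no longer need (example name)
rm -rf text-generation-webui/models/TheBloke_Llama-2-13B-Chat-fp16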
On MacBooks, this:
conda install pytorch torchvision torchaudio -c pytorch
solves the following issues:
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch:
Thanks Adrian. Peter's solution worked for me.
Hi there, I'm getting this error after "python server.py":
Traceback (most recent call last):
File "/Users/.../.../llama/text-generation-webui/server.py", line 12, in <module>
import gradio as gr
ModuleNotFoundError: No module named 'gradio'
Do you have any idea how to fix it?
Have you run the "pip install -r requirements.txt" command?
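A common cause is installing the requirements in one environment but running server.py in another; a quick way to rule that out (env name from the gist):
# make sure the textgen env is active before installing or running
conda activate textgen
pip install -r requirements.txt
python -c "import gradio; print(gradio.__version__)"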
I also face the same issue as @chiefdataofficer.
After step 6 of the install I get:
ERROR: auto_gptq-0.3.0+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform
Following that, when trying to run the server, it complains about the missing module gradio.
I found the solution. The issue was with the prebuilt wheels. Change your requirements.txt file to this:
aiofiles==23.1.0
fastapi==0.95.2
gradio_client==0.2.5
gradio==3.33.1
accelerate==0.21.0
colorama
datasets
einops
markdown
numpy
pandas
Pillow>=9.5.0
pyyaml
requests
safetensors==0.3.1
scipy
sentencepiece
tensorboard
tqdm
wandb
auto-gptq
llama-cpp-python
git+https://github.com/jllllll/GPTQ-for-LLaMa-CUDA.git
git+https://github.com/huggingface/peft@96c0277a1b9a381b10ab34dbf84917f9b3b992e6
git+https://github.com/huggingface/transformers@baf1daa58eb2960248fd9f7c3af0ed245b8ce4af
git+https://github.com/jllllll/exllama
bitsandbytes==0.41.1; platform_system != "Windows"
https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl; platform_system == "Windows"
# ctransformers
https://github.com/jllllll/ctransformers-cuBLAS-wheels/releases/download/AVX2/ctransformers-0.2.20+cu117-py3-none-any.whl
Additional requirements:
- Install CUDA (your GPU must support it)
- Install PyTorch built for that CUDA version
- Make sure CUDA & PyTorch are the same version
- CUDA 11.7.0: https://developer.nvidia.com/cuda-11-7-0-download-archive?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local
- pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
  (taken from https://pytorch.org/get-started/locally/)
- Install Build Tools for Visual Studio 2022
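A quick way to confirm CUDA and PyTorch actually match (a sanity check under the setup above):
# toolkit version the compiler sees
nvcc --version
# CUDA version torch was built against, and whether the GPU is visible
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"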
File "C:\Users\Administrator\text-generation-webui\modules\exllama_hf.py", line 14, in <module>
from exllama.model import ExLlama, ExLlamaCache, ExLlamaConfig
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\exllama\__init__.py", line 1, in <module>
from . import cuda_ext, generator, model, tokenizer
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\exllama\cuda_ext.py", line 9, in <module>
import exllama_ext
ImportError: DLL load failed while importing exllama_ext: The specified module could not be found.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Administrator\text-generation-webui\modules\ui_model_menu.py", line 182, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 79, in load_model
output = load_func_map[loader](model_name)
File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 322, in ExLlama_HF_loader
from modules.exllama_hf import ExllamaHF
File "C:\Users\Administrator\text-generation-webui\modules\exllama_hf.py", line 21, in <module>
from model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'model'
What is the possible reason for the above error?
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 464, in load_state_dict
return torch.load(checkpoint_file, map_location=map_location)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1172, in _load
result = unpickler.load()
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 468, in load_state_dict
if f.read(7) == "version":
File "C:\ProgramData\Anaconda3\envs\textgen\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 599: character maps to <undefined>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Administrator\text-generation-webui\modules\ui_model_menu.py", line 182, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 79, in load_model
output = load_func_map[loader](model_name)
File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 149, in huggingface_loader
model = LoaderClass.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}"), low_cpu_mem_usage=True, torch_dtype=torch.bfloat16 if shared.args.bf16 else torch.float16, trust_remote_code=shared.args.trust_remote_code)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 511, in from_pretrained
return model_class.from_pretrained(
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 2940, in from_pretrained
) = cls._load_pretrained_model(
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 3290, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 480, in load_state_dict
raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
What is the possible reason for the above error?
It's possible that the model weight files (the 9GB files) didn't download correctly. You may need to manually download them, move them to the appropriate directories, and try again.
I get
ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
(base) C:\2023_AI_Projects\text-generation-webui>pip install -r requirements.txt
Ignoring bitsandbytes: markers 'platform_system != "Windows"' don't match your environment
Collecting bitsandbytes==0.41.1 (from -r requirements.txt (line 26))
Using cached https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl (152.7 MB)
ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
Hi, if I close the miniconda terminal, do I need to reinstall everything again?
No, but you do have to rerun some commands. Start from the directory change and go from there.
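Concretely, the resume sequence is something like this (env and directory names from the gist):
conda activate textgen
cd text-generation-webui
python server.py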
Has anyone gotten this working with the 70B model? My load hangs at 10 of 15 and then the Python server crashes. I assume it's a memory issue; however, I am unaware of where to find an error dump.
I get ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
cp310 means Python 3.10, so you have a different Python version. I had the same error; I changed every cp310 to cp311 to match my Python 3.11.
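To see which tag your interpreter needs (a quick check):
# the wheel tag must match your interpreter: cp310 = Python 3.10, cp311 = Python 3.11
python --version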
Getting a better graphics card would be my recommendation.
I've already downloaded Llama 2 7B. How can I install it on a Linux machine? Can anyone suggest something, please?
I have this same issue:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Software\conda\textgen\text-generation-webui\modules\ui_model_menu.py", line 206, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "C:\Software\conda\textgen\text-generation-webui\modules\models.py", line 84, in load_model
output = load_func_map[loader](model_name)
File "C:\Software\conda\textgen\text-generation-webui\modules\models.py", line 141, in huggingface_loader
model = LoaderClass.from_pretrained(path_to_model, **params)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
model_class = _get_model_class(config, cls._model_mapping)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 387, in _get_model_class
supported_models = model_mapping[type(config)]
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 739, in __getitem__
return self._load_attr_from_module(model_type, model_name)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 753, in _load_attr_from_module
return getattribute_from_module(self._modules[module_name], attr)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 697, in getattribute_from_module
if hasattr(module, attr):
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1272, in __getattr__
module = self._get_module(self._class_to_module[name])
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1284, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.
Got the same as johbegood too.
When I try to load the model I get an error. It says that DLL load failed while importing flash_attn_2_cuda: the module cannot be found.
I tried installing different versions of Python and messed around with some CUDA stuff as well, but I did not manage to fix it. Does someone have a fix for it?
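One way to narrow this down is to reproduce the import outside the webui and compare CUDA versions (a diagnostic sketch; flash_attn_2_cuda is the compiled extension named in the error):
# which CUDA version torch was built for
python -c "import torch; print(torch.version.cuda)"
# importing the extension directly reproduces the DLL error if the builds mismatch
python -c "import flash_attn_2_cuda"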
I had the same issue as you, nixtrox. I changed to CUDA 12.1 and Python 3.11.5; however, now I am getting a new error.
Traceback (most recent call last):
File "C:\text-generation-webui\modules\ui_model_menu.py", line 209, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\text-generation-webui\modules\models.py", line 88, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\text-generation-webui\modules\models.py", line 250, in llamacpp_loader
model_file = list(Path(f'{shared.args.model_dir}/{model_name}').glob('*.gguf'))[0]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
Any ideas?
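The IndexError comes from the glob in the traceback above finding no .gguf file in that model's folder; a quick check (the folder name is a placeholder):
# llamacpp_loader expects at least one *.gguf file in the model's folder
ls models/<your-model-folder>/*.gguf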
So now I'm trying to install it on a Mac, and it looks like I run out of memory when I try to load the model. It reaches 33% and kills my Python server. I get this error:
warnings.warn('resource_tracker: There appear to be %d ' zsh: killed python server.py
When I am trying to load the model I face the following error:
File "C:\ProgramData\anaconda3\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1384, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.
Does anyone know what to do other than starting everything from the beginning?
It's possible that the model weight files (the 9GB files) didn't download correctly. You may need to manually download them, move them to the appropriate directories, and try again.
Can you give me instructions on how to do that?
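Not from the thread, but one way to re-fetch a single shard is huggingface_hub's CLI (a sketch; the repo id and filename are assumed from the error message above, and a recent huggingface_hub is assumed):
# re-download just the shard that failed to load, straight into the webui's models folder
huggingface-cli download TheBloke/Llama-2-13B-chat-fp16 pytorch_model-00003-of-00003.bin --local-dir models/TheBloke_Llama-2-13B-Chat-fp16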
Traceback (most recent call last):
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 519, in load_state_dict
return torch.load(checkpoint_file, map_location=map_location)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1172, in _load
result = unpickler.load()
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 523, in load_state_dict
if f.read(7) == "version":
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 273: character maps to <undefined>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\text-generation-webui\modules\ui_model_menu.py", line 214, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
File "F:\text-generation-webui\modules\models.py", line 90, in load_model
output = load_func_map[loader](model_name)
File "F:\text-generation-webui\modules\models.py", line 161, in huggingface_loader
model = LoaderClass.from_pretrained(path_to_model, **params)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 3706, in from_pretrained
) = cls._load_pretrained_model(
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 4091, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 535, in load_state_dict
raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
I got this error. Can somebody help me with what the possible reason is and how to fix it? (Detailed instructions please, as I'm just a newbie :( ) Thank you very much!
I get this error, can anyone help me?
AssertionError: Torch not compiled with CUDA enabled
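That assertion usually means a CPU-only torch build ended up in the env. Reinstalling the cu117 wheel from the top of the gist is one likely fix (a sketch):
# replace the CPU-only build with the CUDA one
pip3 uninstall -y torch torchvision torchaudio
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117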
I don't understand which requirements file the solution above refers to.
Thanks, Peter, for the response. Trying it now...