conda create -n textgen python=3.10.9
conda activate textgen
# install pytorch
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py
# download model
# refresh model list
# load model
# switch to chat mode
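Before loading a model, it can be worth sanity-checking that the CUDA build of PyTorch was actually picked up; a quick check inside the activated textgen env:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

If this prints False, a CPU-only torch got installed and GPU loading will fail.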
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 464, in load_state_dict
return torch.load(checkpoint_file, map_location=map_location)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1172, in _load
result = unpickler.load()
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 468, in load_state_dict
if f.read(7) == "version":
File "C:\ProgramData\Anaconda3\envs\textgen\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 599: character maps to <undefined>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Administrator\text-generation-webui\modules\ui_model_menu.py", line 182, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 79, in load_model
output = load_func_map[loader](model_name)
File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 149, in huggingface_loader
model = LoaderClass.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}"), low_cpu_mem_usage=True, torch_dtype=torch.bfloat16 if shared.args.bf16 else torch.float16, trust_remote_code=shared.args.trust_remote_code)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 511, in from_pretrained
return model_class.from_pretrained(
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 2940, in from_pretrained
) = cls._load_pretrained_model(
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 3290, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 480, in load_state_dict
raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
What is the possible reason for the above error?
It's possible that the model weight files (the 9GB shards) didn't download correctly. You may need to manually download them, move them to the appropriate directory, and try again.
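If you'd rather script the re-download than click through the browser, here is a minimal sketch using huggingface_hub (pip install huggingface_hub); the repo id is inferred from the folder name in the traceback and may need adjusting:

# hedged sketch: re-fetch the model shards into the webui's models folder
# repo_id is inferred from the traceback's folder name; adjust to the exact repo you use
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="TheBloke/Llama-2-13B-chat-fp16",
    local_dir="models/TheBloke_Llama-2-13B-Chat-fp16",
)

snapshot_download checks files against the hub, so an incomplete 9GB shard should get fetched again.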
I get
ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
(base) C:\2023_AI_Projects\text-generation-webui>pip install -r requirements.txt
Ignoring bitsandbytes: markers 'platform_system != "Windows"' don't match your environment
Collecting bitsandbytes==0.41.1 (from -r requirements.txt (line 26))
Using cached https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl (152.7 MB)
ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
Hi, if I close the miniconda terminal, do I need to reinstall everything again?
No, but you do have to rerun some commands. Start from the directory change and go from there.
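For reference, after reopening the terminal that usually means:

conda activate textgen
cd text-generation-webui
python server.py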
Has anyone gotten this working with the 70B model? My load hangs at 10 of 15 and then the Python server crashes. I assume it's a memory issue, but I don't know where to find an error dump.
I get ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.
You have another Python version; cp310 means Python 3.10.
I had the same error and changed every cp310 to cp311 to match my Python 3.11.
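If you're not sure which tag your interpreter supports, pip can print the compatible wheel tags:

python --version
pip debug --verbose   # lists "Compatible tags", e.g. cp310 or cp311

Pick the wheel whose cpXX matches, or recreate the conda env with python=3.10 so the cp310 wheels apply.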
Getting a better graphics card would be my recommendation.
I already downloaded Llama 2 7B; how can I install it on a Linux machine? Can anyone suggest, please?
I have this same issue:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Software\conda\textgen\text-generation-webui\modules\ui_model_menu.py", line 206, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
File "C:\Software\conda\textgen\text-generation-webui\modules\models.py", line 84, in load_model
output = load_func_map[loader](model_name)
File "C:\Software\conda\textgen\text-generation-webui\modules\models.py", line 141, in huggingface_loader
model = LoaderClass.from_pretrained(path_to_model, **params)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
model_class = _get_model_class(config, cls._model_mapping)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 387, in _get_model_class
supported_models = model_mapping[type(config)]
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 739, in __getitem__
return self._load_attr_from_module(model_type, model_name)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 753, in _load_attr_from_module
return getattribute_from_module(self._modules[module_name], attr)
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 697, in getattribute_from_module
if hasattr(module, attr):
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1272, in __getattr__
module = self._get_module(self._class_to_module[name])
File "C:\Users\security_live\.conda\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1284, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.
got the same as johbegood too
When I try to load the model, I get an error. It says that DLL load failed while importing flash_attn_2_cuda: the module cannot be found.
I tried installing different versions of Python and messed around with some CUDA stuff as well, but I did not manage to fix it. Does someone have a fix for it?
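One common cause is a flash-attn wheel built against a different torch/CUDA pair than the one installed. A sketch of how to check, with the assumption (worth verifying for your version) that transformers only uses flash-attn when it is importable, so removing a broken wheel is a reasonable workaround:

# compare the torch build's CUDA version against what flash-attn was built for
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# if they don't line up, uninstalling the broken wheel often unblocks model loading
pip uninstall flash-attn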
I had the same issue as you, nixtrox. I changed to CUDA 12.1 and Python 3.11.5; however, now I am getting a new error.
Traceback (most recent call last):
File "C:\text-generation-webui\modules\ui_model_menu.py", line 209, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\text-generation-webui\modules\models.py", line 88, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\text-generation-webui\modules\models.py", line 250, in llamacpp_loader
model_file = list(Path(f'{shared.args.model_dir}/{model_name}').glob('*.gguf'))[0]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
Any ideas?
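That IndexError means the llama.cpp loader found no *.gguf file in the model folder; the failing line is essentially doing the following, so you can check the same thing yourself (the folder name below is a placeholder):

from pathlib import Path

# placeholder path: substitute your actual folder under models/
files = list(Path("models/your-model-folder").glob("*.gguf"))
print(files)  # an empty list reproduces the IndexError

If the list is empty, either download a GGUF build of the model or switch to the Transformers loader for .bin/.safetensors checkpoints.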
So now I'm trying to install it on a Mac, and it looks like I run out of memory when I try to load the model. It reaches 33% and kills my Python server. I get this error:
warnings.warn('resource_tracker: There appear to be %d '
zsh: killed python server.py
When I try to load the model, I run into the following error:
File "C:\ProgramData\anaconda3\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1384, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.
Does anyone know what to do, other than starting everything from the beginning?
(quoting the "Unable to load weights from pytorch checkpoint file" traceback, question, and reply above)
Can you give me instructions on how to do that?
Traceback (most recent call last):
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 519, in load_state_dict
return torch.load(checkpoint_file, map_location=map_location)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1172, in _load
result = unpickler.load()
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 523, in load_state_dict
if f.read(7) == "version":
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 273: character maps to <undefined>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\text-generation-webui\modules\ui_model_menu.py", line 214, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
File "F:\text-generation-webui\modules\models.py", line 90, in load_model
output = load_func_map[loader](model_name)
File "F:\text-generation-webui\modules\models.py", line 161, in huggingface_loader
model = LoaderClass.from_pretrained(path_to_model, **params)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 3706, in from_pretrained
) = cls._load_pretrained_model(
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 4091, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 535, in load_state_dict
raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
I got this error. Can somebody help me with what the possible reason for this is and how to fix it? (Detailed instructions please, as I'm just a newbie.) Thank you very much!
I get this error, can anyone help me?
AssertionError: Torch not compiled with CUDA enabled
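That assertion usually means a CPU-only torch build is installed. A quick check, then a reinstall using the CUDA index URL from the top of this gist:

python -c "import torch; print(torch.cuda.is_available())"
pip uninstall torch torchvision torchaudio
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117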
I found the solution. Issue was with prebuilds.
Change your requirements.txt file to this:

aiofiles==23.1.0
fastapi==0.95.2
gradio_client==0.2.5
gradio==3.33.1
accelerate==0.21.0
colorama
datasets
einops
markdown
numpy
pandas
Pillow>=9.5.0
pyyaml
requests
safetensors==0.3.1
scipy
sentencepiece
tensorboard
tqdm
wandb
auto-gptq
llama-cpp-python
git+https://github.com/jllllll/GPTQ-for-LLaMa-CUDA.git
git+https://github.com/huggingface/peft@96c0277a1b9a381b10ab34dbf84917f9b3b992e6
git+https://github.com/huggingface/transformers@baf1daa58eb2960248fd9f7c3af0ed245b8ce4af
git+https://github.com/jllllll/exllama
bitsandbytes==0.41.1; platform_system != "Windows"
https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl; platform_system == "Windows"
# ctransformers
https://github.com/jllllll/ctransformers-cuBLAS-wheels/releases/download/AVX2/ctransformers-0.2.20+cu117-py3-none-any.whl
Additional requirements:
- Install CUDA (your GPU must support it)
- Install PyTorch built for that CUDA version
- Make sure the CUDA and PyTorch versions match
- CUDA 11.7.0: https://developer.nvidia.com/cuda-11-7-0-download-archive?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
(taken from https://pytorch.org/get-started/locally/)
- Install Build Tools for Visual Studio 2022
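"Same version" here means the CUDA toolkit and the CUDA version torch was built against should agree (e.g. both 11.7); a quick way to compare:

nvcc --version
python -c "import torch; print(torch.version.cuda)"

If they differ, reinstall torch from the matching --index-url rather than mixing builds.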
I don't understand which requirements.