Created
July 24, 2023 00:22
-
-
Save mberman84/45545e48040ef6aafb6a1cb3442edb83 to your computer and use it in GitHub Desktop.
LLaMA 2 13b chat fp16 Install Instructions
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
conda create -n textgen python=3.10.9 | |
conda activate textgen | |
install pytorch: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117 | |
git clone https://github.com/oobabooga/text-generation-webui | |
cd text-generation-webui | |
pip install -r requirements.txt | |
python server.py | |
# download model | |
# refresh model list | |
# load model | |
# switch to chat mode |
I get this error can anyone help me
AssertionError: Torch not compiled with CUDA enabled
I found the solution. Issue was with prebuilds.
Change your
requirements.txt
file to thisaiofiles==23.1.0 fastapi==0.95.2 gradio_client==0.2.5 gradio==3.33.1 accelerate==0.21.0 colorama datasets einops markdown numpy pandas Pillow>=9.5.0 pyyaml requests safetensors==0.3.1 scipy sentencepiece tensorboard tqdm wandb auto-gptq llama-cpp-python git+https://github.com/jllllll/GPTQ-for-LLaMa-CUDA.git git+https://github.com/huggingface/peft@96c0277a1b9a381b10ab34dbf84917f9b3b992e6 git+https://github.com/huggingface/transformers@baf1daa58eb2960248fd9f7c3af0ed245b8ce4af git+https://github.com/jllllll/exllama bitsandbytes==0.41.1; platform_system != "Windows" https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl; platform_system == "Windows" # ctransformers https://github.com/jllllll/ctransformers-cuBLAS-wheels/releases/download/AVX2/ctransformers-0.2.20+cu117-py3-none-any.whl
Additional requirements
Install cuda (GPU Must support it)
Install pytorch based on cuda
Make sure both cuda & pytorch have the same version
- Cuda 11-7.0 - https://developer.nvidia.com/cuda-11-7-0-download-archive?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
taken from https://pytorch.org/get-started/locally/Install Build Tools for Visual Studio 2022
I dont understand which requirements
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Traceback (most recent call last):
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 519, in load_state_dict
return torch.load(checkpoint_file, map_location=map_location)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1172, in _load
result = unpickler.load()
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 523, in load_state_dict
if f.read(7) == "version":
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 273: character maps to
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:\text-generation-webui\modules\ui_model_menu.py", line 214, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
File "F:\text-generation-webui\modules\models.py", line 90, in load_model
output = load_func_maploader
File "F:\text-generation-webui\modules\models.py", line 161, in huggingface_loader
model = LoaderClass.from_pretrained(path_to_model, **params)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
return model_class.from_pretrained(
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 3706, in from_pretrained
) = cls._load_pretrained_model(
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 4091, in _load_pretrained_model
state_dict = load_state_dict(shard_file)
File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 535, in load_state_dict
raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
I got this error. Can somebody help me as what is the possible reason of this and how to fix it? (Detailed instructions pls as I'm just a newbie :( ) Thank you very much!