Last active
June 13, 2023 02:38
Runs the WizardLM 7B model with llama.cpp (chat with the model in English).
#!/usr/bin/env bash
set -e
# Download the pretrained model and use the English prompt:
# - https://huggingface.co/TheBloke/wizardLM-7B-GGML/tree/main
# WizardLM 7B quantized to 5 bits, "wizardLM-7B.ggmlv3.q5_1.bin"
script_dir="$(dirname "$0")"
prompt_template="${prompt_template:-"${script_dir}/prompt-EN-US.txt"}"
user_name="${user:-USER}"
ai_name="${ai:-WIZARDLM-7B-Q5-1}"
prompt_file="$(mktemp /tmp/prompt.XXXXXXX)"
# Remove the temporary prompt file when the script exits.
trap 'rm -f "$prompt_file"' EXIT
# Fill in the [[USER_NAME]] and [[AI_NAME]] placeholders in the template.
sed -e "s/\[\[USER_NAME\]\]/$user_name/g" \
    -e "s/\[\[AI_NAME\]\]/$ai_name/g" \
    "$prompt_template" > "$prompt_file"
# Options that balance resource usage so the model runs on a modest laptop:
# https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md
# --mlock: keeps the model locked in RAM so it is never compressed or swapped out
# --no-mmap: does not memory-map the model; loading is slower, but it may help
#   when running models with less RAM
# -t, --threads: number of threads llama.cpp uses for computation; the example value, 7,
#   is for a laptop with 4 cores and 2 threads per core (8 is avoided so that some CPU
#   stays available for system activity)
# --reverse-prompt and --in-prefix must be kept as they are for the chat to work:
# https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md#in-prefix
# --model: path to the model
llama.cpp --ctx-size 512 \
    --batch-size 1024 \
    --n-predict 256 \
    --keep 48 \
    --repeat-penalty 1.0 \
    --interactive \
    --model "${HOME}/modelos/wizardLM-7B.ggmlv3.q5_1.bin" \
    --file "${prompt_file}" \
    --reverse-prompt "${user_name}:" \
    --in-prefix ' ' \
    --threads 7 \
    --mlock \
    "$@"
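The script hard-codes `--threads 7` for an 8-thread laptop. A minimal sketch of how the same "leave one thread free" rule could be computed instead of hard-coded; the `pick_threads` helper is hypothetical and not part of the original script, and it assumes `nproc` (GNU coreutils) or `getconf` is available:

```shell
# Hypothetical helper (not in the original script): report the number of
# logical CPUs minus one, so one thread stays free for the rest of the system.
pick_threads() {
  # nproc is GNU coreutils; fall back to getconf, then to 1 if both are missing.
  cpus=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  if [ "$cpus" -gt 1 ]; then
    echo $((cpus - 1))
  else
    echo 1
  fi
}

threads=$(pick_threads)
echo "would pass: --threads ${threads}"
```

On the 4-core, 2-threads-per-core laptop described in the comments this would yield 7, matching the value used in the script.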
Transcript of a dialog, where [[USER_NAME]] interacts with an Assistant named [[AI_NAME]]. [[AI_NAME]] is helpful, kind, honest, good at writing, and never fails to answer the [[USER_NAME]]'s requests immediately and with precision.
[[USER_NAME]]: Hello, [[AI_NAME]].
[[AI_NAME]]: Hello. How may I help you today?
[[USER_NAME]]: Please tell me the largest city in Europe.
[[AI_NAME]]: Sure. The largest city in Europe is Moscow, the capital of Russia.
[[USER_NAME]]:
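To illustrate how the script turns this template into the actual prompt, here is a small sketch that applies the same `sed` substitutions to a single template line. The names `Alice` and `WIZARD` are sample values chosen for illustration, and the `render` function is a hypothetical wrapper around the script's `sed` call:

```shell
# Sample values for illustration only; the script takes them from $user and $ai.
user_name="Alice"
ai_name="WIZARD"

# Hypothetical wrapper around the same sed expressions the script uses.
render() {
  sed -e "s/\[\[USER_NAME\]\]/$user_name/g" \
      -e "s/\[\[AI_NAME\]\]/$ai_name/g"
}

rendered=$(printf '%s\n' '[[USER_NAME]]: Hello, [[AI_NAME]].' | render)
echo "$rendered"   # prints "Alice: Hello, WIZARD."
```

Because the names are spliced directly into the `sed` expression, a name containing `/` or `&` would break or alter the substitution, so plain alphanumeric names are the safe choice.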