Last active: September 17, 2023 15:51
llama.cpp 65B run
(venv) # Exit:0 2023-03-12 16:59:27 [r2q2@Reformer#[:~/opt/llama.cpp]
$(: !605 ) ./main -m ./models/65B/ggml-model-q4_0.bin -t 8 -n 128
main: seed = 1678658429
llama_model_load: loading model from './models/65B/ggml-model-q4_0.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 512
llama_model_load: n_embd  = 8192
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 64
llama_model_load: n_layer = 80
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 22016
llama_model_load: n_parts = 8
llama_model_load: ggml ctx size = 41477.73 MB
llama_model_load: memory_size = 2560.00 MB, n_mem = 40960
llama_model_load: loading model part 1/8 from './models/65B/ggml-model-q4_0.bin'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 2/8 from './models/65B/ggml-model-q4_0.bin.1'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 3/8 from './models/65B/ggml-model-q4_0.bin.2'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 4/8 from './models/65B/ggml-model-q4_0.bin.3'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 5/8 from './models/65B/ggml-model-q4_0.bin.4'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 6/8 from './models/65B/ggml-model-q4_0.bin.5'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 7/8 from './models/65B/ggml-model-q4_0.bin.6'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
llama_model_load: loading model part 8/8 from './models/65B/ggml-model-q4_0.bin.7'
llama_model_load: .......................................................................................... done
llama_model_load: model size = 4869.09 MB / num tensors = 723
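A quick sanity check on the sizes reported above: the model is split into 8 parts of 4869.09 MB each, and the log also reports a 2560.00 MB KV-cache allocation. The sketch below (plain arithmetic on the logged numbers, nothing llama.cpp-specific) shows that these add up to roughly the reported ggml ctx size of 41477.73 MB; the small remaining gap is presumably allocator overhead and other buffers not itemized in the log.

```python
# Sanity-check the per-part model sizes against the total ggml ctx size
# reported in the log above (41477.73 MB).
part_mb = 4869.09       # "model size" line, printed once per part
n_parts = 8             # "n_parts = 8"
kv_cache_mb = 2560.00   # "memory_size = 2560.00 MB"

weights_mb = part_mb * n_parts        # total quantized weights: 38952.72 MB
total_mb = weights_mb + kv_cache_mb   # 41512.72 MB, close to the 41477.73 MB reported

print(f"weights: {weights_mb:.2f} MB, weights + KV cache: {total_mb:.2f} MB")
```

So the ~41.5 GB figure is dominated by the 4-bit-quantized weights themselves, with the 512-token KV cache contributing about 2.5 GB on top.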
main: prompt: 'If'
main: number of tokens in prompt = 2
     1 -> ''
  3644 -> 'If'
sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000
If you’re looking to work in one of the most diverse, exciting and fast-paced industries around – we want YOU!
From early education all through college students are taught that great careers have titles like doctor or lawyer. Students learn about a variety of professions but they may not be exposed to what it takes to run an event successfully from start to finish; the overall big picture process and its effect on those involved in corporations, hotels, convention centres as well many other areas with smaller budgets which are also dependent upon meeting planners. This job profile will provide you information about
main: mem per token = 70897348 bytes
main:     load time = 19427.11 ms
main:   sample time =   440.50 ms
main:  predict time = 70716.00 ms / 548.19 ms per token
main:    total time = 96886.10 ms
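For anyone wondering how fast this is in practice, the timing summary can be turned into a throughput figure with simple arithmetic on the two numbers the log prints (a back-of-the-envelope sketch, using only values from the transcript above):

```python
# Convert llama.cpp's timing summary into tokens per second.
predict_ms = 70716.00    # "predict time" from the log
per_token_ms = 548.19    # "548.19 ms per token" from the log

tokens = predict_ms / per_token_ms       # ~129 tokens processed during prediction
tokens_per_sec = 1000.0 / per_token_ms   # ~1.82 tokens generated per second

print(f"~{tokens:.0f} tokens at {tokens_per_sec:.2f} tokens/s")
```

In other words, this 65B q4_0 run on 8 threads produces a little under 2 tokens per second, so the 128-token completion takes roughly 70 seconds of prediction time on top of ~19 seconds of model loading.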
How fast is its output? Could you give us some examples of its output please?
Thank you for sharing the resource-consumption results! A lot of people have said it works, but few give precise numbers. Thanks!