@tjwebb
Created March 21, 2025 21:06
ollamalog.txt
2025/03/21 20:45:23 routes.go:1230: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:1h0m0s OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:1000 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-03-21T20:45:23.358Z level=INFO source=images.go:432 msg="total blobs: 32"
time=2025-03-21T20:45:23.359Z level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-21T20:45:23.360Z level=INFO source=routes.go:1297 msg="Listening on [::]:11434 (version 0.6.2)"
time=2025-03-21T20:45:23.360Z level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-21T20:45:23.911Z level=INFO source=types.go:130 msg="inference compute" id=GPU-2cae7687-8e17-992f-bacd-ced46efe8fcf library=cuda variant=v12 compute=8.9 driver=12.8 name="NVIDIA L4" total="22.0 GiB" available="21.9 GiB"
time=2025-03-21T20:46:25.782Z level=INFO source=server.go:105 msg="system memory" total="499.2 GiB" free="483.2 GiB" free_swap="7.9 GiB"
time=2025-03-21T20:46:26.136Z level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=49 layers.offload=48 layers.split="" memory.available="[21.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="22.9 GiB" memory.required.partial="21.9 GiB" memory.required.kv="7.3 GiB" memory.required.allocations="[21.9 GiB]" memory.weights.total="10.6 GiB" memory.weights.repeating="10.6 GiB" memory.weights.nonrepeating="1020.0 MiB" memory.graph.full="695.1 MiB" memory.graph.partial="1.3 GiB" projector.weights="795.9 MiB" projector.graph="1.0 GiB"
time=2025-03-21T20:46:26.137Z level=INFO source=server.go:185 msg="enabling flash attention"
time=2025-03-21T20:46:26.137Z level=WARN source=server.go:193 msg="kv cache type not supported by model" type=""
time=2025-03-21T20:46:26.509Z level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?\\p{L}+|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-03-21T20:46:26.541Z level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.add_eot_token default=false
time=2025-03-21T20:46:26.553Z level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?\\p{L}+|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-03-21T20:46:26.584Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.attention.layer_norm_rms_epsilon default=9.999999974752427e-07
time=2025-03-21T20:46:26.584Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.local.freq_base default=10000
time=2025-03-21T20:46:26.584Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.global.freq_base default=1e+06
time=2025-03-21T20:46:26.584Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.freq_scale default=1
time=2025-03-21T20:46:26.584Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.mm_tokens_per_image default=256
time=2025-03-21T20:46:26.586Z level=INFO source=server.go:405 msg="starting llama server" cmd="/usr/bin/ollama runner --ollama-engine --model /root/.ollama/models/blobs/sha256-728e7e4ac6e65cd68bf0d6c3ebf2e9944b19d3ad2da49ab53265457f6de1f02c --ctx-size 20000 --batch-size 512 --n-gpu-layers 48 --threads 192 --flash-attn --parallel 1 --port 40661"
time=2025-03-21T20:46:26.587Z level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2025-03-21T20:46:26.587Z level=INFO source=server.go:580 msg="waiting for llama runner to start responding"
time=2025-03-21T20:46:26.589Z level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server error"
time=2025-03-21T20:46:26.638Z level=INFO source=runner.go:763 msg="starting ollama engine"
time=2025-03-21T20:46:26.639Z level=INFO source=runner.go:823 msg="Server listening on 127.0.0.1:40661"
time=2025-03-21T20:46:26.843Z level=INFO source=server.go:614 msg="waiting for server to become available" status="llm server loading model"
time=2025-03-21T20:46:26.918Z level=WARN source=ggml.go:149 msg="key not found" key=general.name default=""
time=2025-03-21T20:46:26.919Z level=WARN source=ggml.go:149 msg="key not found" key=general.description default=""
time=2025-03-21T20:46:26.919Z level=INFO source=ggml.go:67 msg="" architecture=gemma3 file_type=Q8_0 name="" description="" num_tensors=1065 num_key_values=36
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA L4, compute capability 8.9, VMM: yes
load_backend: loaded CUDA backend from /usr/lib/ollama/cuda_v12/libggml-cuda.so
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-03-21T20:46:27.090Z level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX_VNNI=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
time=2025-03-21T20:46:27.400Z level=INFO source=ggml.go:289 msg="model weights" buffer=CPU size="2.8 GiB"
time=2025-03-21T20:46:27.400Z level=INFO source=ggml.go:289 msg="model weights" buffer=CUDA0 size="10.6 GiB"
time=2025-03-21T20:46:32.626Z level=INFO source=ggml.go:358 msg="compute graph" backend=CUDA0 buffer_type=CUDA0
time=2025-03-21T20:46:32.626Z level=INFO source=ggml.go:358 msg="compute graph" backend=CPU buffer_type=CUDA_Host
time=2025-03-21T20:46:32.627Z level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?\\p{L}+|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-03-21T20:46:32.652Z level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.add_eot_token default=false
time=2025-03-21T20:46:32.659Z level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?\\p{L}+|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-03-21T20:46:32.675Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.attention.layer_norm_rms_epsilon default=9.999999974752427e-07
time=2025-03-21T20:46:32.675Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.local.freq_base default=10000
time=2025-03-21T20:46:32.675Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.global.freq_base default=1e+06
time=2025-03-21T20:46:32.675Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.freq_scale default=1
time=2025-03-21T20:46:32.675Z level=WARN source=ggml.go:149 msg="key not found" key=gemma3.mm_tokens_per_image default=256
time=2025-03-21T20:46:32.823Z level=INFO source=server.go:619 msg="llama runner started in 6.24 seconds"
[GIN] 2025/03/21 - 20:46:38 | 200 | 14.764096307s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:01 | 200 | 4.566085882s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:06 | 200 | 4.554805431s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:12 | 200 | 4.715826921s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:17 | 200 | 4.546770232s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:21 | 200 | 4.15018372s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:26 | 200 | 4.058930911s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:31 | 200 | 4.406374478s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:36 | 200 | 4.82134878s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:41 | 200 | 4.270936023s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:46 | 200 | 4.503333795s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:51 | 200 | 5.006674387s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:47:56 | 200 | 4.114162543s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:01 | 200 | 4.2738732s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:05 | 200 | 4.188704325s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:10 | 200 | 4.241442177s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:15 | 200 | 4.129792194s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:20 | 200 | 4.3214279s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:24 | 200 | 3.587986095s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:29 | 200 | 4.313050031s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:32 | 200 | 2.854960517s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:34 | 200 | 2.501394294s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:37 | 200 | 2.627572765s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:40 | 200 | 3.130043375s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:43 | 200 | 2.662864954s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:48 | 200 | 4.925040347s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:51 | 200 | 2.488742807s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:54 | 200 | 2.885512473s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:48:57 | 200 | 3.299145891s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:01 | 200 | 3.434951588s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:03 | 200 | 2.478358621s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:06 | 200 | 2.848593092s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:11 | 200 | 4.735401256s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:16 | 200 | 4.425845489s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:19 | 200 | 2.792529127s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:22 | 200 | 2.579684553s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:25 | 200 | 3.377363826s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:28 | 200 | 3.026787451s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:31 | 200 | 2.604754102s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:38 | 200 | 6.280336095s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:43 | 200 | 5.088536952s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:48 | 200 | 4.603678059s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:54 | 200 | 4.860173126s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:49:58 | 200 | 3.949498418s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:03 | 200 | 4.618251745s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:09 | 200 | 5.054474784s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:15 | 200 | 5.172528008s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:20 | 200 | 4.8591935s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:25 | 200 | 4.701303818s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:31 | 200 | 4.89035821s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:36 | 200 | 4.85653285s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:42 | 200 | 4.756271822s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:47 | 200 | 5.025355904s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:53 | 200 | 4.783854797s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:50:58 | 200 | 4.805786791s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:04 | 200 | 5.007257215s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:09 | 200 | 4.776338082s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:15 | 200 | 5.007888298s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:18 | 200 | 3.344238458s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:22 | 200 | 3.162420471s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:26 | 200 | 3.794950851s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:30 | 200 | 3.500502959s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:35 | 200 | 4.826391711s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:38 | 200 | 3.003738243s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:41 | 200 | 2.664321937s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:44 | 200 | 2.599451818s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:47 | 200 | 3.203374209s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:51:52 | 200 | 4.579518428s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:08 | 200 | 15.384545223s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:12 | 200 | 3.778863859s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:16 | 200 | 3.995665317s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:19 | 200 | 2.597400645s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:24 | 200 | 4.777593208s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:28 | 200 | 3.645757933s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:31 | 200 | 3.139649849s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:35 | 200 | 3.478799724s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:38 | 200 | 2.728811274s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:43 | 200 | 5.210005191s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:48 | 200 | 4.00060958s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:52 | 200 | 4.247598982s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:52:56 | 200 | 3.437067606s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:00 | 200 | 3.959532684s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:03 | 200 | 2.733534991s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:06 | 200 | 2.227446963s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:09 | 200 | 3.010960799s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:13 | 200 | 3.871838598s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:21 | 200 | 7.879818604s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:25 | 200 | 3.11920687s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:28 | 200 | 2.353313645s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:31 | 200 | 3.22204292s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:43 | 200 | 11.262881986s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:47 | 200 | 3.31337642s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:50 | 200 | 2.995078705s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:53 | 200 | 2.814704472s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:53:56 | 200 | 2.84772107s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:54:00 | 200 | 3.00154885s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:54:11 | 200 | 11.642598654s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:54:20 | 200 | 8.394102293s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:54:25 | 200 | 4.51539507s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:54:29 | 200 | 4.131378535s | 172.18.0.2 | POST "/api/generate"
time=2025-03-21T20:54:30.290Z level=WARN source=runner.go:129 msg="truncating input prompt" limit=20000 prompt=23846 keep=4 new=20000
[GIN] 2025/03/21 - 20:54:56 | 200 | 26.810453174s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:08 | 200 | 10.995124304s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:12 | 200 | 3.94399548s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:15 | 200 | 3.351313363s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:20 | 200 | 4.094253477s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:24 | 200 | 4.080088623s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:30 | 200 | 4.97488585s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:33 | 200 | 3.510778767s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:39 | 200 | 5.834664235s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:45 | 200 | 5.220793206s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:55:56 | 200 | 10.70601859s | 172.18.0.2 | POST "/api/generate"
time=2025-03-21T20:55:56.809Z level=WARN source=runner.go:129 msg="truncating input prompt" limit=20000 prompt=20756 keep=4 new=20000
[GIN] 2025/03/21 - 20:56:23 | 200 | 26.444570583s | 172.18.0.2 | POST "/api/generate"
time=2025-03-21T20:56:23.616Z level=WARN source=runner.go:129 msg="truncating input prompt" limit=20000 prompt=27848 keep=4 new=20000
[GIN] 2025/03/21 - 20:56:49 | 200 | 26.416760728s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:56:53 | 200 | 3.354186512s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:56:59 | 200 | 5.909290203s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:03 | 200 | 3.622530756s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:10 | 200 | 6.57663045s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:15 | 200 | 5.101499115s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:18 | 200 | 2.754333373s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:21 | 200 | 3.409513422s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:25 | 200 | 3.400220773s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:28 | 200 | 2.7849948s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:31 | 200 | 3.727357251s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:35 | 200 | 3.616184588s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:40 | 200 | 4.263642118s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:44 | 200 | 4.144743246s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:47 | 200 | 2.88177046s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:50 | 200 | 3.080436676s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:53 | 200 | 3.256830833s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:57:57 | 200 | 3.545207806s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:02 | 200 | 4.312316817s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:05 | 200 | 3.104324625s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:08 | 200 | 3.298992979s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:11 | 200 | 3.053323563s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:16 | 200 | 3.803010488s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:19 | 200 | 2.366760677s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:21 | 200 | 1.925973126s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:25 | 200 | 3.779349746s | 172.18.0.2 | POST "/api/generate"
[GIN] 2025/03/21 - 20:58:28 | 200 | 2.425022178s | 172.18.0.2 | POST "/api/generate"
time=2025-03-21T21:03:34.123Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.147554459 model=/root/.ollama/models/blobs/sha256-728e7e4ac6e65cd68bf0d6c3ebf2e9944b19d3ad2da49ab53265457f6de1f02c
time=2025-03-21T21:03:34.518Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.542755868 model=/root/.ollama/models/blobs/sha256-728e7e4ac6e65cd68bf0d6c3ebf2e9944b19d3ad2da49ab53265457f6de1f02c