xinuc · August 28, 2025 10:00
diff --git a/kolosal b/kolosal
 Running: qwen2.5-coder-0.5b:Q5_0
 Type '/exit' or press Ctrl+C to quit
 Type '/help' to see available commands

 hello

 > ❌ Error: Failed to get response from the model. Please try again.
 Type your message or use /help for commands...
diff --git a/kolosal-server b/kolosal-server
 [2025-08-28 16:59:19.173] [INFO] Successfully saved model 'qwen2.5-coder-0.5b:Q5_0' to configuration file
 [2025-08-28 16:59:19.173] [INFO] Registering LLM engine for lazy loading: qwen2.5-coder-0.5b:Q5_0
 [2025-08-28 16:59:19.173] [INFO] Engine created successfully for model: qwen2.5-coder-0.5b:Q5_0
 [2025-08-28 16:59:19.412] [INFO] Successfully provided download progress for model: qwen2.5-coder-0.5b:Q5_0 (100.0%)
 [2025-08-28 16:59:23.456] [INFO] [Thread 1804791808] Processing streaming inference chat completion request for model 'qwen2.5-coder-0.5b:Q5_0'
 [2025-08-28 16:59:23.457] [INFO] Engine ID 'qwen2.5-coder-0.5b:Q5_0' was unloaded due to inactivity. Attempting to reload.
 [2025-08-28 16:59:23.457] [INFO] Stored engine type for 'qwen2.5-coder-0.5b:Q5_0': 'llama-metal'
 [2025-08-28 16:59:23.457] [INFO] Creating new inference engine instance for reload...
 [2025-08-28 16:59:23.457] [INFO] Reloading model from path: /Users/xinuc/Library/Application Support/Kolosal/models/qwen2.5-coder-0.5b-instruct-q5_0.gguf
 common_init_from_params: added <|endoftext|> logit bias = -inf
 common_init_from_params: added <|im_end|> logit bias = -inf
 common_init_from_params: added <|fim_pad|> logit bias = -inf
 common_init_from_params: added <|repo_name|> logit bias = -inf
 common_init_from_params: added <|file_sep|> logit bias = -inf
 common_init_from_params: setting dry_penalty_last_n to ctx_size = 8192
 [2025-08-28 16:59:24.687] [INFO] Successfully reloaded model for engine 'qwen2.5-coder-0.5b:Q5_0'
 [2025-08-28 16:59:24.687] [INFO] Successfully reloaded LLM engine ID 'qwen2.5-coder-0.5b:Q5_0'.
 [2025-08-28 16:59:24.687] [INFO] Processing OpenAI chat completion for model 'qwen2.5-coder-0.5b:Q5_0'
 [2025-08-28 16:59:24.910] [INFO] [Thread 1804791808] Completed streaming response for model 'qwen2.5-coder-0.5b:Q5_0'