When you run a model through llama.cpp and access it from OpenWebUI over an OpenAI-compatible API, you may want to control how strongly the model reasons. A reliable way to do this is to send a custom parameter, chat_template_kwargs, from OpenWebUI; it can carry a reasoning_effort value such as low, medium, or high.
In many llama.cpp-based deployments, the model’s reasoning behavior is influenced by values passed into the chat template. Rather than trying to force reasoning strength through prompts, passing reasoning_effort via chat_template_kwargs provides a more direct and predictable control mechanism. OpenWebUI supports sending such custom parameters in its model configuration, and this approach is also demonstrated in official integration guidance (in a different backend example). [OpenVINO Documentation][2]
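As a minimal sketch of what such a request looks like, the snippet below builds an OpenAI-compatible chat-completion payload with chat_template_kwargs at the top level. The endpoint URL, model name, and message content are placeholders for illustration; the exact effect of reasoning_effort depends on the model's chat template.

```python
import json

# Sketch of an OpenAI-compatible chat completion request body for a
# llama.cpp server, passing reasoning_effort via chat_template_kwargs.
# "local-model" and the prompt are placeholders, not real identifiers.
payload = {
    "model": "local-model",
    "messages": [
        {"role": "user", "content": "Explain quicksort briefly."}
    ],
    # llama.cpp forwards these values into the model's chat template;
    # low / medium / high are the typical reasoning_effort levels.
    "chat_template_kwargs": {"reasoning_effort": "high"},
}

body = json.dumps(payload)
print(body)

# To actually send it you would POST to a running server, e.g.:
#   requests.post("http://localhost:8080/v1/chat/completions",
#                 data=body,
#                 headers={"Content-Type": "application/json"})
```

In OpenWebUI, the same effect is achieved by adding chat_template_kwargs as a custom parameter in the model's advanced settings rather than hand-crafting the request.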