Kimi-K2-Instruct vs GLM-4.6 (BFCL Tool Benchmark)

Raw data: glm-4.6-results.tar.gz

Overall Performance

| Benchmark        | Kimi-K2-Instruct | GLM-4.6-FP8 |
| ---------------- | ---------------- | ----------- |
| Overall Accuracy | 45.62%           | 60.13%      |
| Latency (mean)   | 3.32 s           | 6.66 s      |
jondurbin / chutes-tool-calling.md
Last active October 15, 2025 20:23
Kimi function calling benchmarks

MoonshotAI vs Chutes BFCL (tool) benchmark

Execution

```bash
git clone https://github.com/ShishirPatil/gorilla
cd gorilla/berkeley-function-call-leaderboard
python3 -m venv venv
./venv/bin/pip install -e .
# Apply diffs per provider
```
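
After the provider diffs are applied, a run might look like the sketch below. This is a hedged sketch, assuming the `bfcl` entry point that the leaderboard package installs; the model slug and CLI flags are assumptions and may differ between BFCL versions.

```python
import subprocess

# Generate model responses, then score them with the BFCL evaluator.
# The model slug and flags below are assumptions; check the leaderboard
# README for the exact invocation in your checkout.
model = "moonshotai/Kimi-K2-Instruct"
subprocess.run(["./venv/bin/bfcl", "generate", "--model", model], check=True)
subprocess.run(["./venv/bin/bfcl", "evaluate", "--model", model], check=True)
```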
jondurbin / dots_example.py
Created October 2, 2025 13:53
dots.ocr example
```python
import json
import requests
import base64
import openai
import os

client = openai.Client(base_url="https://llm.chutes.ai/v1", api_key=os.getenv("CHUTES_API_KEY"))
prompt = """Please output the layout information from the PDF image, including each layout element's bbox, its category, and the corresponding text content within the bbox."""
```
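
The preview cuts off before the request is sent; a hedged sketch of how it might continue, reusing the `client` and `prompt` above and assuming the OpenAI-style multimodal message format. The model slug and image path are assumptions, not from the original gist.

```python
# Hedged continuation: the model slug and "page.png" are assumptions.
with open("page.png", "rb") as infile:
    image_b64 = base64.b64encode(infile.read()).decode()

completion = client.chat.completions.create(
    model="rednote-hilab/dots.ocr",  # assumed slug, not from the original gist
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text", "text": prompt},
        ],
    }],
)
print(completion.choices[0].message.content)
```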
jondurbin / example.md
Last active April 23, 2025 17:36
Deploying an LLM

"easy" vllm endpoint

You can call this endpoint and it will automatically select the most recent vllm image:

```bash
curl -XPOST https://api.chutes.ai/chutes/vllm \
  -H 'content-type: application/json' \
  -H 'Authorization: cpk...' \
  -d '{
    "tagline": "Mistral 24b Instruct",
    "model": "unsloth/Mistral-Small-24B-Instruct-2501",
    "public": true,
```
jondurbin / example.py
Created April 10, 2025 11:23
kimi-vl example
```python
import os
import base64
import openai
import glob

client = openai.Client(base_url="https://llm.chutes.ai/v1", api_key=os.environ["CHUTES_API_KEY"])
image_base64s = []
for path in glob.glob("/home/jdurbin/Downloads/logo*.png")[:8]:
    with open(path, "rb") as infile:
        # Read and base64-encode each image for the chat request.
        image_base64s.append(base64.b64encode(infile.read()).decode())
```
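
The preview stops here; a hedged sketch of how the collected images might be sent, assuming the OpenAI-style multimodal message format. The model slug and prompt text are assumptions, not from the original gist.

```python
# Hedged continuation: "moonshotai/Kimi-VL-A3B-Instruct" is an assumed slug,
# and the prompt text is illustrative.
response = client.chat.completions.create(
    model="moonshotai/Kimi-VL-A3B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}}
            for b64 in image_base64s
        ] + [{"type": "text", "text": "Describe each of these logos."}],
    }],
)
print(response.choices[0].message.content)
```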
jondurbin / example.py
Created March 25, 2025 10:13
Qwen2.5-VL-32b-Instruct inference example
```python
import os
import base64
import openai
import glob

client = openai.Client(base_url="https://llm.chutes.ai/v1", api_key=os.environ["CHUTES_API_KEY"])
image_base64s = []
for path in glob.glob("/home/jdurbin/Downloads/logo*.png")[:8]:
    with open(path, "rb") as infile:
        # Read and base64-encode each image for the chat request.
        image_base64s.append(base64.b64encode(infile.read()).decode())
```
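
From here the request would presumably be assembled exactly as in the Kimi-VL sketch above, with the model swapped for a Qwen2.5-VL-32B-Instruct slug.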
jondurbin / spark_example.py
Last active March 18, 2025 10:37
Inference example with Spark-TTS on chutes
```python
import os
import requests
import base64

audio = base64.b64encode(open("test.wav", "rb").read()).decode()
result = requests.post(
    "https://chutes-spark-tts.chutes.ai/speak",
    json={
        "text": "How much wood would a woodchuck chuck if a woodchuck could chuck wood?",
        "sample_audio_b64": audio,
    },
)
```
jondurbin / csm1b_example.py
Created March 18, 2025 09:20
Example inference with csm-1b on chutes
```python
import os
import requests
import base64

audio = base64.b64encode(open("test.wav", "rb").read()).decode()
result = requests.post(
    "https://chutes-csm-1b.chutes.ai/speak",
    json={
        "speaker": 1,
        "context": [
```
jondurbin / dolphin.txt
Last active March 16, 2025 07:24
Who is dolphin?
```json
{
  "id": "27ab0d1289814bb28c7c30e38a98df8d",
  "object": "chat.completion",
  "created": 1742109451,
  "model": "cognitivecomputations/Dolphin3.0-Mistral-24B",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
```
jondurbin / chutes-walkthrough.md
Created December 8, 2024 12:31
chutes quickstart
1. Install chutes (and bittensor if you don't already have a coldkey/hotkey):

```bash
python3 -m venv chutes-venv
source chutes-venv/bin/activate
pip install chutes 'bittensor<8'
```

2. If you don't already have a coldkey/hotkey, create one (replace chutes/chuteshk with your desired coldkey/hotkey names), e.g. as sketched below.
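
A hedged sketch of that step, driving btcli from Python; the wallet subcommands and flags are from memory of bittensor < 8 and may differ by version.

```python
import subprocess

# Create a coldkey and a hotkey with btcli (bittensor < 8). The subcommand
# names and flags are assumptions; verify with `btcli --help`.
subprocess.run(["btcli", "wallet", "new_coldkey", "--wallet.name", "chutes"], check=True)
subprocess.run(
    ["btcli", "wallet", "new_hotkey", "--wallet.name", "chutes", "--wallet.hotkey", "chuteshk"],
    check=True,
)
```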