Generated on: 2025-04-13 13:12:49
This report was generated by running `python tests/verifications/generate_report.py`
- ✅ - Test passed
- ❌ - Test failed
- ⚪ - Test not applicable or not run for this model
Provider | Pass Rate | Tests Passed | Total Tests |
---|---|---|---|
Together | 64.7% | 22 | 34 |
Fireworks | 82.4% | 28 | 34 |
Groq | 61.8% | 21 | 34 |
OpenAI | 100.0% | 24 | 24 |
Together-llama-stack | 100.0% | 34 | 34 |
Fireworks-llama-stack | 100.0% | 34 | 34 |
Groq-llama-stack | 88.2% | 30 | 34 |
OpenAI-llama-stack | 100.0% | 24 | 24 |
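The Pass Rate column is consistent with Tests Passed divided by Total Tests, shown to one decimal place. A minimal sketch of that arithmetic follows; the helper name `pass_rate` and the `"N/A"` fallback are assumptions for illustration and are not taken from `generate_report.py`.

```python
def pass_rate(passed: int, total: int) -> str:
    """Format passed/total as a percentage with one decimal place."""
    if total == 0:
        # Assumption: with no applicable tests there is no rate to report.
        return "N/A"
    return f"{passed / total * 100:.1f}%"


# Counts copied from the provider table above.
counts = {
    "Together": (22, 34),
    "Fireworks": (28, 34),
    "Groq": (21, 34),
    "Groq-llama-stack": (30, 34),
}

for provider, (passed, total) in counts.items():
    print(f"{provider}: {pass_rate(passed, total)}")
```

Running this reproduces the 64.7%, 82.4%, 61.8%, and 88.2% figures listed for those four providers.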
Tests run on: 2025-04-13 13:06:42
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=together -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=together -k "test_chat_non_streaming_basic and earth"
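The same two invocations recur for every provider in this report, differing only in the `--provider` value. As a rough sketch, they could be assembled programmatically like this; `build_pytest_command` is a hypothetical helper, and only the flags shown in the commands above (`--provider`, `-v`, `-k`) are used.

```python
import shlex
from typing import List, Optional

TEST_FILE = "tests/verifications/openai_api/test_chat_completion.py"


def build_pytest_command(provider: str, keyword: Optional[str] = None) -> List[str]:
    """Return the pytest argument list for one provider.

    Without a keyword the whole file runs verbosely (-v); with a keyword,
    -k narrows the run to the matching test cases.
    """
    cmd = ["pytest", TEST_FILE, f"--provider={provider}"]
    if keyword is None:
        cmd.append("-v")
    else:
        cmd.extend(["-k", keyword])
    return cmd


# Print shell-ready command lines for the two cases shown above.
print(shlex.join(build_pytest_command("together")))
print(shlex.join(build_pytest_command("together", "test_chat_non_streaming_basic and earth")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with the `-k` expression.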
Model Key (Together)
Display Name | Full Model ID |
---|---|
Llama-3.3-70B-Instruct | meta-llama/Llama-3.3-70B-Instruct-Turbo |
Llama-4-Maverick-Instruct | meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 |
Llama-4-Scout-Instruct | meta-llama/Llama-4-Scout-17B-16E-Instruct |
Test | Llama-3.3-70B-Instruct | Llama-4-Maverick-Instruct | Llama-4-Scout-Instruct |
---|---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ❌ | ❌ |
test_chat_streaming_basic (saturn) | ✅ | ❌ | ❌ |
test_chat_streaming_image | ⚪ | ❌ | ❌ |
test_chat_streaming_structured_output (calendar) | ✅ | ❌ | ❌ |
test_chat_streaming_structured_output (math) | ✅ | ❌ | ❌ |
test_chat_streaming_tool_calling | ✅ | ❌ | ❌ |
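Comparing this table with the Together row of the provider table at the top of the report suggests that ⚪ cells are left out of both the passed and the total counts. That reading is inferred from the numbers, not from `generate_report.py`. The sketch below tallies the rows under that assumption; `tally` is a hypothetical helper.

```python
from typing import Tuple

# Result rows copied from the Together table above.
ROWS = """\
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ❌ | ❌ |
test_chat_streaming_basic (saturn) | ✅ | ❌ | ❌ |
test_chat_streaming_image | ⚪ | ❌ | ❌ |
test_chat_streaming_structured_output (calendar) | ✅ | ❌ | ❌ |
test_chat_streaming_structured_output (math) | ✅ | ❌ | ❌ |
test_chat_streaming_tool_calling | ✅ | ❌ | ❌ |
"""


def tally(rows: str) -> Tuple[int, int]:
    """Return (passed, total), counting only ✅ and ❌ cells."""
    passed = total = 0
    for line in rows.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # cells[0] is the test name; the trailing pipe leaves an empty last cell.
        for status in cells[1:]:
            if status == "✅":
                passed += 1
                total += 1
            elif status == "❌":
                total += 1
            # Assumption: ⚪ (not applicable) counts toward neither figure.
    return passed, total


print(tally(ROWS))
```

Under this assumption the tally comes to 22 passed out of 34, matching the Together row in the provider table.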
Tests run on: 2025-04-13 13:07:34
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=fireworks -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=fireworks -k "test_chat_non_streaming_basic and earth"
Model Key (Fireworks)
Display Name | Full Model ID |
---|---|
Llama-3.3-70B-Instruct | accounts/fireworks/models/llama-v3p3-70b-instruct |
Llama-4-Maverick-Instruct | accounts/fireworks/models/llama4-maverick-instruct-basic |
Llama-4-Scout-Instruct | accounts/fireworks/models/llama4-scout-instruct-basic |
Test | Llama-3.3-70B-Instruct | Llama-4-Maverick-Instruct | Llama-4-Scout-Instruct |
---|---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ❌ | ❌ | ❌ |
test_chat_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_streaming_image | ⚪ | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_streaming_tool_calling | ❌ | ❌ | ❌ |
Tests run on: 2025-04-13 13:08:51
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=groq -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=groq -k "test_chat_non_streaming_basic and earth"
Model Key (Groq)
Display Name | Full Model ID |
---|---|
Llama-3.3-70B-Instruct | llama-3.3-70b-versatile |
Llama-4-Maverick-Instruct | meta-llama/llama-4-maverick-17b-128e-instruct |
Llama-4-Scout-Instruct | meta-llama/llama-4-scout-17b-16e-instruct |
Test | Llama-3.3-70B-Instruct | Llama-4-Maverick-Instruct | Llama-4-Scout-Instruct |
---|---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ❌ | ❌ | ❌ |
test_chat_non_streaming_structured_output (math) | ❌ | ❌ | ❌ |
test_chat_non_streaming_tool_calling | ✅ | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_streaming_image | ⚪ | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ❌ | ❌ | ❌ |
test_chat_streaming_structured_output (math) | ❌ | ❌ | ❌ |
test_chat_streaming_tool_calling | ✅ | ❌ | ✅ |
Tests run on: 2025-04-13 13:09:17
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=openai -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=openai -k "test_chat_non_streaming_basic and earth"
Model Key (OpenAI)
Display Name | Full Model ID |
---|---|
gpt-4o | gpt-4o |
gpt-4o-mini | gpt-4o-mini |
Test | gpt-4o | gpt-4o-mini |
---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ |
test_chat_non_streaming_image | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ |
test_chat_streaming_image | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ✅ | ✅ |
test_chat_streaming_structured_output (math) | ✅ | ✅ |
test_chat_streaming_tool_calling | ✅ | ✅ |
Tests run on: 2025-04-13 13:09:56
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=together-llama-stack -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=together-llama-stack -k "test_chat_non_streaming_basic and earth"
Model Key (Together-llama-stack)
Display Name | Full Model ID |
---|---|
Llama-3.3-70B-Instruct | together/meta-llama/Llama-3.3-70B-Instruct-Turbo |
Llama-4-Maverick-Instruct | together/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 |
Llama-4-Scout-Instruct | together/meta-llama/Llama-4-Scout-17B-16E-Instruct |
Test | Llama-3.3-70B-Instruct | Llama-4-Maverick-Instruct | Llama-4-Scout-Instruct |
---|---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_streaming_image | ⚪ | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_streaming_tool_calling | ✅ | ✅ | ✅ |
Tests run on: 2025-04-13 13:10:46
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=fireworks-llama-stack -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=fireworks-llama-stack -k "test_chat_non_streaming_basic and earth"
Model Key (Fireworks-llama-stack)
Display Name | Full Model ID |
---|---|
Llama-3.3-70B-Instruct | fireworks/llama-v3p3-70b-instruct |
Llama-4-Maverick-Instruct | fireworks/llama4-maverick-instruct-basic |
Llama-4-Scout-Instruct | fireworks/llama4-scout-instruct-basic |
Test | Llama-3.3-70B-Instruct | Llama-4-Maverick-Instruct | Llama-4-Scout-Instruct |
---|---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_streaming_image | ⚪ | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_streaming_tool_calling | ✅ | ✅ | ✅ |
Tests run on: 2025-04-13 13:11:39
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=groq-llama-stack -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=groq-llama-stack -k "test_chat_non_streaming_basic and earth"
Model Key (Groq-llama-stack)
Display Name | Full Model ID |
---|---|
Llama-3.3-70B-Instruct | groq/llama-3.3-70b-versatile |
Llama-4-Maverick-Instruct | groq/llama-4-maverick-17b-128e-instruct |
Llama-4-Scout-Instruct | groq/llama-4-scout-17b-16e-instruct |
Test | Llama-3.3-70B-Instruct | Llama-4-Maverick-Instruct | Llama-4-Scout-Instruct |
---|---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_non_streaming_image | ⚪ | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ❌ | ❌ |
test_chat_streaming_basic (earth) | ✅ | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ | ✅ |
test_chat_streaming_image | ⚪ | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ✅ | ✅ | ✅ |
test_chat_streaming_structured_output (math) | ✅ | ✅ | ✅ |
test_chat_streaming_tool_calling | ✅ | ❌ | ❌ |
Tests run on: 2025-04-13 13:12:17
# Run all tests for this provider:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=openai-llama-stack -v
# Example: Run only the 'earth' case of test_chat_non_streaming_basic:
pytest tests/verifications/openai_api/test_chat_completion.py --provider=openai-llama-stack -k "test_chat_non_streaming_basic and earth"
Model Key (OpenAI-llama-stack)
Display Name | Full Model ID |
---|---|
gpt-4o | openai/gpt-4o |
gpt-4o-mini | openai/gpt-4o-mini |
Test | gpt-4o | gpt-4o-mini |
---|---|---|
test_chat_non_streaming_basic (earth) | ✅ | ✅ |
test_chat_non_streaming_basic (saturn) | ✅ | ✅ |
test_chat_non_streaming_image | ✅ | ✅ |
test_chat_non_streaming_structured_output (calendar) | ✅ | ✅ |
test_chat_non_streaming_structured_output (math) | ✅ | ✅ |
test_chat_non_streaming_tool_calling | ✅ | ✅ |
test_chat_streaming_basic (earth) | ✅ | ✅ |
test_chat_streaming_basic (saturn) | ✅ | ✅ |
test_chat_streaming_image | ✅ | ✅ |
test_chat_streaming_structured_output (calendar) | ✅ | ✅ |
test_chat_streaming_structured_output (math) | ✅ | ✅ |
test_chat_streaming_tool_calling | ✅ | ✅ |