| Rank | Overall Acc | Model | Model Link | Organization | License | AST Summary | Exec Summary | Simple Function AST | Python Simple Function AST | Java Simple Function AST | JavaScript Simple Function AST | Multiple Functions AST | Parallel Functions AST | Parallel Multiple AST | Simple Function Exec | Python Simple Function Exec | REST Simple Function Exec | Multiple Functions Exec | Parallel Functions Exec | Parallel Multiple Exec | Relevance Detection | Cost ($ Per 1k Function Calls) | Latency Mean (s) | Latency Standard Deviation (s) | Latency 95th Percentile (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 78.76% | GPT-4-turbo-2024-04-09 (FC) | https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo | OpenAI | Proprietary | 81.70% | 65.13% | 73.82% | 90.00% | 33.00% | 26.00% | 89.50% | 89.00% | 74.50% | 73.53% | 83.00% | 60.00% | 70.00% | 72.00% | 45.00% | 88.75% | 4.79 | 5.68 | 6.67 | 20.07 |
| 2 | 73.71% | Claude-3-Opus-20240229 (FC tools-2024-04-04) | https://www.anthropic.com/news/claude-3-family | Anthropic | Proprietary | 70.35% | 55.20% | 80.91% | 87.00% | 61.00% | 72.00% | 91.00% | 58.00% | 51.50% | 85.29% | 85.00% | 85.71% | 74.00% | 24.00% | 37.50% | 82.50% | 30.65 | 12.63 | 3.64 | 19.72 |
| 3 | 73.41% | Llama 3 70B Groq tool calling | https://llama.meta.com/llama3/ | Meta | META LLAMA 3 COMMUNITY LICENSE AGREEMENT | 82.78% | 66.11% | 79.64% | 91.00% | 48.00% | 52.00% | 93.00% | 87.00% | 71.50% | 82.94% | 83.00% | 82.86% | 76.00% | 68.00% | 37.50% | 32.92% | 0.07 | 2.49 | 1.05 | 5.18 |
| 4 | 59.76% | GPT-3.5-Turbo-0125 (FC) | https://platform.openai.com/docs/models/gpt-3-5-turbo | OpenAI | Proprietary | 72.09% | 69.15% | 56.36% | 56.50% | 53.00% | 62.00% | 70.50% | 91.00% | 70.50% | 84.12% | 78.00% | 92.86% | 76.00% | 74.00% | 42.50% | 2.92% | N/A | N/A | N/A | N/A |