Expected Parrot EDSL: n parameter not implemented (accepts but ignores)
Makes sense! Just on OpenAI and Gemini:
| Provider | Parameter | Max Limit | API Documentation |
|---|---|---|---|
| OpenAI | `n` | 128 | API Reference: `n`: integer, Optional, defaults to 1. How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. |
| Azure OpenAI | `n` | 128 | Same as OpenAI (Azure Reference) |
| Google Gemini | `candidateCount` | 8 | API Reference: `candidateCount`: integer, Optional. The number of response variations to return. For each request, you're charged for the output tokens of all candidates, but only charged once for the input tokens. Range: 1-8 for Gemini 2.0+ |
| Anthropic | - | None | API Reference: no `n` parameter; the Messages API only returns a single response per request. |
| Together AI | - | None | API Reference: no `n` parameter documented in their completions endpoint. |
| Groq | `n` | 1 only | API Reference: `n`: integer or null, Optional, default 1. Note that at the current moment, only `n=1` is supported. |
| Mistral | - | None | API Reference: no `n` parameter in chat completions. |
| Others | - | None | Bedrock (ref), Replicate (ref), Deep Infra: no `n` equivalent |
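
For concreteness, here is a minimal sketch of how the two providers that do support multiple completions expose the parameter. The model names and prompt are placeholders; only the parameter names `n` and `candidate_count` come from the docs above:

```python
# OpenAI: one request, several choices, billed for all output tokens.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Name a color."}],
    n=5,  # up to 128 choices per request
)
print([choice.message.content for choice in resp.choices])

# Gemini: the equivalent knob is candidate_count, capped at 8.
import google.generativeai as genai

genai.configure(api_key="...")  # placeholder key
model = genai.GenerativeModel("gemini-2.0-flash")  # placeholder model
resp = model.generate_content(
    "Name a color.",
    generation_config=genai.GenerationConfig(candidate_count=8),
)
print([c.content.parts[0].text for c in resp.candidates])
```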
So "our" n-parameter is passed to
runnot to model. This conceptually makes more sense to me, as ourModelobject is about the model not really how it is used. My instinct is for us to intercept that "run(n = ...)" in the case of models that support this approach and then change the API call. Do you have a sense, @MaxGhenis how many other inference providers are supporting this?
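
A minimal sketch of that interception idea, under loudly labeled assumptions: `FakeModel`, its `service` and `call_api` attributes, and the limits dict are all illustrative stand-ins, not EDSL's actual internals. Where the provider has a native `n` (or `candidateCount`), `run(n=...)` folds into as few API calls as the provider's cap allows; elsewhere it falls back to repeated single calls:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical provider caps, taken from the table above.
NATIVE_N_LIMITS = {"openai": 128, "azure": 128, "google": 8}

@dataclass
class FakeModel:
    """Stand-in for an EDSL Model; call_api would wrap the provider SDK."""
    service: str

    def call_api(self, prompt: str, n: int = 1) -> List[str]:
        # Placeholder: a real implementation would pass n (or candidateCount)
        # through to the provider and return one string per choice.
        return [f"response {i}" for i in range(n)]

def run(model: FakeModel, prompt: str, n: int = 1) -> List[str]:
    """Intercept run(n=...) and use native multi-choice calls where supported."""
    limit = NATIVE_N_LIMITS.get(model.service)
    if limit is None:
        # Provider has no n parameter: fall back to n separate calls.
        return [model.call_api(prompt, n=1)[0] for _ in range(n)]
    results: List[str] = []
    while n > 0:
        batch = min(n, limit)  # e.g. Gemini caps candidateCount at 8
        results.extend(model.call_api(prompt, n=batch))
        n -= batch
    return results

print(run(FakeModel("google"), "Name a color.", n=10))  # two calls: 8 + 2
```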