| Field | Value |
|---|---|
| Title | Vercel AI SDK Community Provider for Gemini CLI Core |
| Author | Gemini |
| Status | Proposed |
| Version | 1.0 |
| Date | June 25, 2025 |
This document outlines the product requirements for creating a new community provider for the Vercel AI SDK. This provider will act as a bridge between the Vercel AI SDK (`@vercel/ai`) and the `@google/gemini-cli-core` library, enabling developers to leverage the power of Google's Gemini models within the standardized, framework-agnostic Vercel AI SDK ecosystem.
The provider will implement the `LanguageModelV1` interface specified by the Vercel AI team, ensuring seamless integration with existing AI SDK helper functions like `streamText`, `generateText`, and `streamObject`.
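For illustration, the intended developer experience might look like the following sketch. The package name (`@ai-sdk/gemini-cli-provider`) and factory function (`createGeminiCliCoreProvider`) are proposals from this document, not a published API, and the callable-provider shape mirrors the convention used by existing AI SDK providers:

```ts
// Hypothetical end-goal usage. The package name and factory function
// are proposals from this PRD, not a published API.
import { streamText } from 'ai';
import { createGeminiCliCoreProvider } from '@ai-sdk/gemini-cli-provider';

const gemini = createGeminiCliCoreProvider({
  apiKey: process.env.GEMINI_API_KEY!,
});

const result = streamText({
  model: gemini('gemini-1.5-pro'), // model id is illustrative
  prompt: 'Summarize the LanguageModelV1 spec in one paragraph.',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```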
Developers using the Vercel AI SDK to build applications currently lack a direct, officially supported, or community-provided integration for the powerful features exposed by the `@google/gemini-cli-core` package. This includes its robust tool-use implementation, chat history management, and direct access to Gemini models.
To use these features, a developer would need to write a significant amount of custom boilerplate code to adapt the `@google/gemini-cli-core` interface to the Vercel AI SDK's `LanguageModelV1` specification. This creates friction, duplicates effort across the community, and defeats the purpose of having a standardized SDK for building AI applications.
- Primary Goal: To enable developers to easily use Google's Gemini models via the `@google/gemini-cli-core` library as a language model provider within the Vercel AI SDK.
- Objective 1: Create and publish an npm package (e.g., `@ai-sdk/gemini-cli-provider`) that exports a fully compliant `LanguageModelV1` provider.
- Objective 2: Implement full support for the core AI SDK features, including single-shot generation (`generateText`), response streaming (`streamText`), and server-side tool/function calling.
- Objective 3: Provide clear and comprehensive documentation with installation instructions and usage examples for all supported features.
- Objective 4: Achieve sufficient quality and stability to be listed as an official community provider by the Vercel AI SDK team.
The primary audience for this provider is any developer using the Vercel AI SDK in their projects (e.g., Next.js, SvelteKit, or any Node.js environment) who wishes to use Google's Gemini models as their backend LLM.
- A new class/factory function that implements the `LanguageModelV1` interface.
- Implementation of the `doGenerate` method for non-streaming text and tool-use generation.
- Implementation of the `doStream` method for streaming text, tool-use, and other stream parts.
- Translation layer to map data structures between the Vercel AI SDK format and the `@google/gemini-cli-core` format.
- Flexible authentication support, allowing instantiation with either a simple API key or a pre-configured `GoogleAuth` client object for more complex scenarios like OAuth2.
- Support for multimodal inputs, specifically mapping image data (e.g., base64-encoded strings with MIME types) from the Vercel AI SDK prompt format to the format expected by `@google/gemini-cli-core`.
- Mapping of `@google/gemini-cli-core` errors to standard `LanguageModelV1Error` types.
- Comprehensive unit and integration test suite.
- A `README.md` file detailing installation, authentication, and usage.
- Support for any language model interface other than `LanguageModelV1`.
- Implementation of the interactive, user-facing OAuth2 flow. The provider will accept a configured auth client, but the end-user application is responsible for the interactive process of obtaining tokens.
- A user interface or front-end components. This is a server-side provider only.
- Direct interaction with the `gemini` CLI executable; this provider will use the `@google/gemini-cli-core` library directly.
- FR1.1: The provider shall export a factory function (e.g., `createGeminiCliCoreProvider()`) that returns an object conforming to the `LanguageModelV1` interface.
- FR1.2: The provider must be stateless, with all necessary configuration passed during instantiation.
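A minimal sketch of the factory shape, assuming the `LanguageModelV1` types exported by `@ai-sdk/provider`; field names such as `specificationVersion` follow the published V1 specification, while everything else is illustrative:

```ts
import type {
  LanguageModelV1,
  LanguageModelV1CallOptions,
} from '@ai-sdk/provider';

export interface GeminiCliCoreProviderOptions {
  apiKey?: string; // authentication options are detailed under FR2
}

// FR1.1/FR1.2 sketch: a stateless factory. All configuration is captured
// at instantiation time; the returned model object holds no mutable state.
export function createGeminiCliCoreProvider(
  options: GeminiCliCoreProviderOptions,
) {
  return (modelId: string): LanguageModelV1 => ({
    specificationVersion: 'v1',
    provider: 'gemini-cli-core',
    modelId,
    defaultObjectGenerationMode: 'tool',
    async doGenerate(callOptions: LanguageModelV1CallOptions) {
      throw new Error('not implemented'); // FR3: non-streaming generation
    },
    async doStream(callOptions: LanguageModelV1CallOptions) {
      throw new Error('not implemented'); // FR4: streaming generation
    },
  });
}
```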
- FR2.1: The factory function must accept an options object that allows for flexible authentication.
- FR2.2: The options object must support providing a simple `apiKey` string for basic authentication.
- FR2.3: The options object must also support providing a pre-configured `authClient` object (e.g., an instance of `GoogleAuth` from `google-auth-library`) for advanced authentication scenarios like OAuth2 or service accounts.
- FR2.4: The provider must throw an error during initialization if neither an `apiKey` nor an `authClient` is provided.
- FR2.5: The provider must securely use the provided credentials to initialize and make requests with the underlying `@google/gemini-cli-core` client.
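A sketch of the options object and the FR2.4 validation, assuming `GoogleAuth` from `google-auth-library`; the option names `apiKey` and `authClient` come from this document:

```ts
import { GoogleAuth } from 'google-auth-library';

export interface GeminiCliCoreProviderOptions {
  /** FR2.2: simple API-key authentication. */
  apiKey?: string;
  /** FR2.3: pre-configured client for OAuth2 or service-account flows. */
  authClient?: GoogleAuth;
}

// FR2.4: fail fast at initialization when no usable credential is supplied.
export function validateAuthOptions(
  options: GeminiCliCoreProviderOptions,
): void {
  if (!options.apiKey && !options.authClient) {
    throw new Error(
      'gemini-cli-core provider: either `apiKey` or `authClient` must be provided.',
    );
  }
}

// Example of the advanced path (FR2.3): the application configures
// GoogleAuth itself and hands the client to the provider.
const authClient = new GoogleAuth({
  scopes: ['https://www.googleapis.com/auth/cloud-platform'],
});
validateAuthOptions({ authClient });
```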
- FR3.1: The provider must implement the `doGenerate(prompt: LanguageModelV1Prompt)` method.
- FR3.2: It must correctly map the incoming `LanguageModelV1Prompt` (including system messages, user/assistant history, and tool calls/results) to the format expected by `@google/gemini-cli-core`.
- FR3.3: It must correctly map the response from `@google/gemini-cli-core` to the `LanguageModelV1Response` format, including `text`, `toolCalls`, `finishReason`, and `usage` tokens.
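The FR3.2 mapping might look like the following sketch. The `GeminiContent` shape mirrors the public Gemini API (a `role` plus `parts`); whether `@google/gemini-cli-core` accepts exactly this shape is an assumption to verify against the library:

```ts
import type { LanguageModelV1Prompt } from '@ai-sdk/provider';

// Gemini-style content shape, mirroring the public Gemini API; assumed
// (not confirmed) to match what @google/gemini-cli-core accepts.
interface GeminiContent {
  role: 'user' | 'model';
  parts: Array<Record<string, unknown>>;
}

// FR3.2 sketch: flatten the AI SDK prompt into Gemini contents. System
// messages are collected separately, since the Gemini API takes them as
// a dedicated systemInstruction field rather than as chat turns.
export function mapPrompt(prompt: LanguageModelV1Prompt): {
  systemInstruction: string;
  contents: GeminiContent[];
} {
  let systemInstruction = '';
  const contents: GeminiContent[] = [];

  for (const message of prompt) {
    switch (message.role) {
      case 'system':
        systemInstruction += message.content + '\n';
        break;
      case 'user':
        contents.push({
          role: 'user',
          parts: message.content.map((part) =>
            part.type === 'text'
              ? { text: part.text }
              : // image conversion is shown under FR6 below
                { inlineData: { data: '<base64>', mimeType: 'image/png' } },
          ),
        });
        break;
      case 'assistant':
        contents.push({
          role: 'model',
          parts: message.content.map((part) =>
            part.type === 'text'
              ? { text: part.text }
              : { functionCall: { name: part.toolName, args: part.args } },
          ),
        });
        break;
      case 'tool':
        // Tool results return to the model as functionResponse parts.
        contents.push({
          role: 'user',
          parts: message.content.map((part) => ({
            functionResponse: { name: part.toolName, response: part.result },
          })),
        });
        break;
    }
  }
  return { systemInstruction: systemInstruction.trim(), contents };
}
```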
- FR4.1: The provider must implement the `doStream(prompt: LanguageModelV1Prompt)` method.
- FR4.2: The method must return an `AsyncIterable<LanguageModelV1StreamPart>`.
- FR4.3: The provider must handle the streaming output from `@google/gemini-cli-core` and transform it into a stream of `LanguageModelV1StreamPart` objects.
- FR4.4: The stream must support the following parts:
  - `text-delta`: For streaming text content.
  - `tool-call`: For streaming tool use requests from the model.
  - `finish`: To signal the end of the generation, including the `finishReason` and `usage`.
  - `error`: To communicate errors that occur during the stream.
- FR5.1: The provider must accept tool definitions in the Vercel AI SDK format (Zod schemas) passed in the prompt.
- FR5.2: It must translate these Zod-based tool definitions into the format required by `@google/gemini-cli-core` to be sent to the model.
- FR5.3: It must correctly parse `toolCalls` from the model's response (both streaming and non-streaming) and format them into the Vercel AI SDK's `toolCalls` structure.
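A sketch of FR5.2, assuming the `zod-to-json-schema` package to convert Zod schemas into the JSON Schema (OpenAPI-flavored) format the Gemini API expects for function declarations; the `SdkTool` and `GeminiFunctionDeclaration` shapes are simplified for illustration:

```ts
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// Vercel AI SDK-style tool definition (simplified for this sketch).
interface SdkTool {
  description?: string;
  parameters: z.ZodTypeAny;
}

// Gemini-style function declaration; assumed to be what
// @google/gemini-cli-core forwards to the model.
interface GeminiFunctionDeclaration {
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

// FR5.2 sketch: convert each Zod schema to JSON Schema.
export function mapTools(
  tools: Record<string, SdkTool>,
): GeminiFunctionDeclaration[] {
  return Object.entries(tools).map(([name, tool]) => ({
    name,
    description: tool.description ?? '',
    parameters: zodToJsonSchema(tool.parameters) as Record<string, unknown>,
  }));
}

// Example: a weather tool as a developer would define it in the AI SDK.
const declarations = mapTools({
  getWeather: {
    description: 'Get the current weather for a city.',
    parameters: z.object({ city: z.string() }),
  },
});
```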
- FR6.1: The provider must correctly identify image data within the `LanguageModelV1Prompt`.
- FR6.2: It must map the Vercel AI SDK's image format (e.g., `{ type: 'image', image: Buffer | string, mimeType?: string }`) to the `Part` object format used by `@google/gemini-cli-core` (`{ inlineData: { data: 'base64...', mimeType: '...' } }`).
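A sketch of the FR6.2 conversion; the input shape follows the example above, and the base64 handling and fallback MIME type are illustrative:

```ts
// FR6.2 sketch: normalize the AI SDK image part into the inlineData
// Part shape described above.
interface SdkImagePart {
  type: 'image';
  image: Buffer | Uint8Array | string; // raw bytes or a base64 string
  mimeType?: string;
}

export function mapImagePart(part: SdkImagePart) {
  const data =
    typeof part.image === 'string'
      ? part.image // assume already base64-encoded
      : Buffer.from(part.image).toString('base64');

  return {
    inlineData: {
      data,
      mimeType: part.mimeType ?? 'image/png', // fallback when unspecified
    },
  };
}
```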
- NFR1 (Performance): The provider should introduce minimal latency over direct `@google/gemini-cli-core` library calls. Streaming should be efficient, yielding chunks as soon as they are received.
- NFR2 (Reliability): The provider must gracefully handle API errors from the downstream service and network interruptions, reporting them through the standard error-handling mechanisms of the Vercel AI SDK (see the error-mapping sketch after this list).
- NFR3 (Security): API keys and other sensitive data must be handled securely and not be exposed in logs or error messages.
- NFR4 (Documentation): The provider must have a `README.md` file that includes:
  - NPM package name and installation instructions.
  - Clear examples of how to instantiate the provider using both a simple `apiKey` and a pre-configured `authClient` for OAuth.
  - Usage examples for `streamText`, `generateText`, and `streamObject` with tool calling, and prompts with image inputs.
  - A note clarifying that the end-user application is responsible for the interactive part of the OAuth flow.
- NFR5 (Testability): The provider must have a high level of unit test coverage, with tests for all core functionalities, including data mapping, streaming logic, and error handling.
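For NFR2 and NFR3, error normalization might look like the following sketch, assuming the `APICallError` class exported by `@ai-sdk/provider`; the shape of errors thrown by `@google/gemini-cli-core`, the redaction pattern, and the retry heuristic are all illustrative assumptions:

```ts
import { APICallError } from '@ai-sdk/provider';

// NFR2/NFR3 sketch: normalize downstream failures into the AI SDK's
// standard error type while scrubbing credentials from the message.
export function mapError(error: unknown, url: string): APICallError {
  const message =
    error instanceof Error
      ? error.message.replace(/key=[\w-]+/g, 'key=[REDACTED]')
      : 'Unknown gemini-cli-core error';

  return new APICallError({
    message,
    url,
    requestBodyValues: {},
    // Crude heuristic: retry on rate limiting or transient server errors.
    isRetryable: message.includes('429') || message.includes('503'),
  });
}
```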
- Dependency 1: This project will depend on the `@vercel/ai`, `@google/gemini-cli-core`, and `zod` npm packages.
- Assumption 1: The `@google/gemini-cli-core` library provides a stable, programmatically accessible API for sending prompts, receiving responses (including streams), and defining tools.
- Assumption 2: The Vercel `LanguageModelV1` specification will remain stable throughout the development of version 1.0 of this provider.
- Adoption: The number of weekly downloads of the package on npm.
- Quality: A low number of bug reports and issues opened on the project's GitHub repository.
- Functionality: 100% of the functional requirements are implemented and verified by the test suite.
- Community Recognition: The provider is successfully submitted to and accepted by the Vercel AI SDK team for inclusion in their list of community providers.
- Provide optional helper utilities or detailed documentation/tutorials to guide end-users in implementing the interactive OAuth2 flow in common frameworks (e.g., Next.js).
- Advanced caching strategies to reduce redundant API calls.
- Adding support for new features as they are introduced in the Gemini API and `@google/gemini-cli-core`.