This workshop demonstrates how to route embedding requests through Agent Gateway to a self-hosted vLLM server running an OpenAI-compatible API. This is the Agent Gateway equivalent of a LiteLLM config like:
```yaml
- model_name: qwen3
  litellm_params:
    model: hosted_vllm//apps/ecs_mounts/data/q3.6b
    api_key: $VLLM_API_KEY
    api_base: $VLLM_HOST_URL
```
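Whichever proxy sits in front of vLLM, clients talk to it through the standard OpenAI-style `/v1/embeddings` endpoint. The sketch below builds such a request with only the Python standard library; the gateway address (`localhost:3000`) and the model name `qwen3` are assumptions for illustration, not values taken from this workshop's setup.

```python
import json
from urllib import request

def build_embedding_request(base_url: str, model: str, texts: list[str]) -> request.Request:
    """Build an OpenAI-compatible POST /v1/embeddings request for the gateway."""
    url = f"{base_url.rstrip('/')}/v1/embeddings"
    body = json.dumps({"model": model, "input": texts}).encode()
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Assumed gateway address and model name for illustration:
req = build_embedding_request("http://localhost:3000", "qwen3", ["hello world"])
# response = request.urlopen(req)  # uncomment against a live gateway
```

Because the gateway speaks the same wire protocol as the upstream vLLM server, the only change a client needs is pointing its base URL at the gateway instead of at vLLM directly.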