A comprehensive guide to building production-ready real-time voice/video applications with LiveKit
Essential patterns and concepts for building real-time voice/video applications with LiveKit. This guide covers fundamental architecture, deployment patterns, and best practices for any LiveKit project.
What you'll learn:
- The core components of the LiveKit ecosystem
- Production-ready patterns for authentication, connection management, and error handling
- Mobile compatibility and cross-browser support
- Deployment strategies for various platforms
- Quick start templates and code examples
- LiveKit Ecosystem Overview
- Core Components
- Essential Patterns for New Projects
- Deployment Checklist
- Common Integration Patterns
- Quick Start Template
- Key Takeaways
The LiveKit platform consists of these core components:
- LiveKit Server: An open-source media server that enables realtime communication between participants. Use LiveKit's fully-managed global cloud, or self-host your own.
- LiveKit SDKs: Full-featured web, native, and backend SDKs that make it easy to join rooms and publish and consume realtime media and data.
- LiveKit Agents: A framework for building realtime multimodal AI agents, with an extensive collection of plugins for nearly every AI provider.
- Telephony: A flexible SIP integration for inbound or outbound calling into any LiveKit room or agent session.
- Egress: Record and export realtime media from LiveKit rooms.
- Ingress: Ingest external streams (such as RTMP and WHIP) into LiveKit rooms.
- Server APIs: A REST API for managing rooms, and more. Includes SDKs and a CLI.
Client SDKs ← WebSocket → LiveKit Server ← Agent Framework → AI Providers
↓ ↓ ↓ (Rooms) ↓ ↓
Browser/Mobile ← Auth → Cloud/Self-hosted ← Function Tools → Business Logic
↓ ↓ ↓
SIP/Phone ← Telephony ← Egress/Ingress ← Server APIs
Key Principle: LiveKit creates "rooms" where participants (clients and agents) communicate in real-time through WebRTC and WebSocket connections.
What it is: The core media server and REST APIs for managing rooms, participants, and infrastructure
Deployment: LiveKit Cloud (managed) or self-hosted open-source server
Core Responsibilities:
- Room Management: Create, list, delete rooms via REST API
- Media Routing: WebRTC signaling and media server functionality
- Authentication: Validate JWT tokens for room access
- Participant Management: Track connections, permissions, metadata
- Recording/Streaming: Coordinate egress and ingress services
Essential Backend Patterns:
import os

from livekit import api

# Credentials are read from the environment (see Environment Setup)
url = os.environ["LIVEKIT_URL"]
api_key = os.environ["LIVEKIT_API_KEY"]
api_secret = os.environ["LIVEKIT_API_SECRET"]

# Token generation (CRITICAL - every project needs this)
def generate_access_token(room_name: str, participant_identity: str) -> str:
    token = api.AccessToken(api_key, api_secret)
    token.with_identity(participant_identity).with_grants(api.VideoGrants(
        room_join=True,
        room=room_name,
        can_publish=True,
        can_subscribe=True,
        can_publish_data=True,  # For realtime data/control messages
        can_publish_sources=['camera', 'microphone']  # Specific sources
    ))
    return token.to_jwt()

# Room management
async def create_room(name: str):
    async with api.LiveKitAPI(url, api_key, api_secret) as lk:
        room = await lk.room.create_room(
            api.CreateRoomRequest(name=name)
        )
        return room
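To make the token pattern above less opaque, here is a minimal standard-library sketch of what a LiveKit-style access token looks like on the wire: an HS256-signed JWT whose `iss` is the API key, `sub` is the participant identity, and whose `video` claim carries the grants. The helper names (`sign_token`, `verify_token`) and the sample values are illustrative assumptions, and the exact claim set should be checked against the authentication docs. In real projects, prefer `api.AccessToken` over hand-rolled tokens.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(api_key: str, api_secret: str, identity: str, room: str,
               ttl_seconds: int = 3600) -> str:
    """Build an HS256 JWT shaped like a LiveKit access token (illustrative)."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,            # API key identifies the signer
        "sub": identity,           # participant identity
        "nbf": now,
        "exp": now + ttl_seconds,
        "video": {"roomJoin": True, "room": room},  # grants
    }
    signing_input = (
        f"{_b64url(json.dumps(header).encode())}."
        f"{_b64url(json.dumps(claims).encode())}"
    )
    signature = hmac.new(api_secret.encode(), signing_input.encode(),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, api_secret: str) -> dict:
    """Check the signature and expiry, then return the claims."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(api_secret.encode(), signing_input.encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims["exp"] < time.time():
        raise ValueError("token expired")
    return claims
```

Because the secret is the only thing protecting the signature, this also illustrates why the API secret must stay on the backend.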
URL Conventions:
- Client connections: wss://your-instance.livekit.cloud (WebSocket)
- Server API calls: https://your-instance.livekit.cloud (HTTP REST)
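If a project stores only the WebSocket URL, the HTTP form can be derived from it. The helper below is a small sketch of that convention; the name `to_api_url` is an assumption for illustration and not part of any LiveKit SDK.

```python
from urllib.parse import urlsplit, urlunsplit

# WebSocket schemes and their HTTP counterparts
_SCHEME_MAP = {"wss": "https", "ws": "http"}


def to_api_url(ws_url: str) -> str:
    """Convert a ws:// or wss:// URL to http:// or https://; leave others unchanged."""
    parts = urlsplit(ws_url)
    scheme = _SCHEME_MAP.get(parts.scheme, parts.scheme)
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))
```

For example, `to_api_url("wss://your-instance.livekit.cloud")` yields `https://your-instance.livekit.cloud`.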
What it is: Full-featured client SDKs for web, mobile, and native applications
Packages: livekit-client (JS), platform-specific SDKs for iOS/Android/Flutter/Unity
Essential Patterns:
import { Room, RoomEvent, Track } from 'livekit-client';

// Basic connection pattern with optimizations
const room = new Room({
  adaptiveStream: true,
  dynacast: true
});

// Pre-warm connection for faster join
await room.prepareConnection(wsUrl, token);

// Essential event handling
room.on(RoomEvent.Connected, () => {
  console.log('Connected to room');
});

room.on(RoomEvent.TrackSubscribed, (track, publication, participant) => {
  if (track.kind === Track.Kind.Audio) {
    const audioElement = track.attach();
    document.body.appendChild(audioElement);
  }
});

await room.connect(wsUrl, token);
Client Responsibilities:
- WebRTC Media: Audio/video streaming with adaptive bitrate
- Room Connection: Join/leave rooms with automatic reconnection
- Track Management: Publish/subscribe to media tracks
- Realtime Data: Send/receive control messages and data packets
- Connection State: Handle network changes and interruptions
Critical Patterns for New Projects:
- Use LiveKit's built-in audio playback handling (not manual Web Audio unlock)
- Implement connection retry logic with ReconnectPolicy
- Handle graceful disconnections vs unexpected drops
- Request a new token only when rejoining or changing grants; no refresh is needed during a session
What it is: A framework for building realtime multimodal AI agents with plugins for major AI providers
Package: livekit-agents (Python framework)
Essential Agent Pattern (Agents 1.0):
from livekit import agents
from livekit.agents import Agent, AgentSession, function_tool

class MyAgent(Agent):
    def __init__(self) -> None:
        # Agent initialization; the instructions string is a placeholder prompt
        super().__init__(instructions="You are a helpful assistant.")

    @function_tool
    async def my_function_tool(self, param: str) -> str:
        """Tool that AI can call based on voice/text input"""
        # Your business logic here
        return f"Executed with {param}"

async def entrypoint(ctx: agents.JobContext):
    # Connect this job to the room before starting the session
    await ctx.connect()
    # Configure AI models (OpenAI, Google, etc.) on the session
    session = AgentSession(...)
    await session.start(room=ctx.room, agent=MyAgent())

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
Run locally with:
python agent.py dev # development with hot reload
python agent.py start # production
Agent Dispatch: Use WorkerOptions(agent_name="my-agent") to disable auto-dispatch and control when agents join via the Agent Dispatch API or room token roomConfig. Especially useful for SIP flows.
Agent Framework Features:
- Multimodal AI: Voice, text, and vision processing
- AI Provider Plugins: OpenAI, Google, Anthropic, Deepgram, ElevenLabs, and more
- Function Tools: Bridge AI responses to your business logic
- Room Participation: Agents join as participants with full WebRTC capabilities
- Auto/Manual Dispatch: Control when agents join rooms
Universal Patterns for Any Project:
- Always implement health checks and timeouts
- Handle connection drops gracefully
- Use function tools to bridge AI and your business logic
- Consider agent lifecycle (startup, idle, cleanup)
- Plan for horizontal scaling if needed
Telephony: SIP integration for phone calls into LiveKit rooms
- Inbound: Route phone calls to specific rooms or agents
- Outbound: Programmatically call phone numbers from agents
- Use Cases: Customer service bots, conference calls, accessibility
Egress: Record and export realtime media from LiveKit rooms
- Composite Recording: Combined audio/video files
- Track Recording: Individual participant tracks
- Live Streaming: RTMP to platforms like YouTube, Twitch
- Use Cases: Meeting recordings, content creation, compliance
Ingress: Ingest external streams into LiveKit rooms
- RTMP/WHIP: Bring external video sources into rooms
- Use Cases: Broadcasting, screen sharing, external cameras
Server APIs: REST API for room and infrastructure management
- Room Management: Create, list, delete rooms programmatically
- Participant Control: Kick users, update permissions
- Webhooks: Real-time notifications of room events
- Use Cases: Admin dashboards, automated workflows, billing
# Example: Using Server APIs
# (url, api_key, and api_secret are your LiveKit credentials)
async def server_api_example():
    async with api.LiveKitAPI(url, api_key, api_secret) as lk:
        # Create room
        room = await lk.room.create_room(api.CreateRoomRequest(name="meeting-123"))
        # Start recording (a real request also needs an output such as a file or stream target)
        await lk.egress.start_room_composite_egress(
            api.RoomCompositeEgressRequest(room_name="meeting-123")
        )
        # List participants
        participants = await lk.room.list_participants(
            api.ListParticipantsRequest(room="meeting-123")
        )
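Webhooks deliver room events as JSON. The sketch below routes an already-verified, already-parsed payload to handlers keyed by its `event` field. Event names such as `participant_joined` follow the webhook docs; the handler registry, the `joined` list, and the sample payload shape are illustrative assumptions. Signature verification, which the server SDKs provide, is deliberately left out.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]

# Registry of handlers per webhook event name
_handlers: Dict[str, List[Handler]] = {}


def on_event(name: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one webhook event name."""
    def register(fn: Handler) -> Handler:
        _handlers.setdefault(name, []).append(fn)
        return fn
    return register


def dispatch(payload: dict) -> int:
    """Call every handler registered for payload['event']; return how many ran."""
    handlers = _handlers.get(payload.get("event", ""), [])
    for fn in handlers:
        fn(payload)
    return len(handlers)


# Example handler: remember who joined (stand-in for real business logic)
joined: List[str] = []


@on_event("participant_joined")
def track_join(payload: dict) -> None:
    joined.append(payload["participant"]["identity"])
```

Unknown events fall through with a count of zero, so new event types do not break the endpoint.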
What it is: The runtime process that runs your agents
Command: livekit-agents worker or custom process managers
Basic Deployment Pattern:
# Modern CLI (preferred)
python agent.py dev # development with hot reload
python agent.py start # production
Environment Setup:
# Required environment variables
LIVEKIT_URL=wss://your-instance.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_secret
Production Considerations:
- Process management: Use supervisord, systemd, or container orchestration
- Health monitoring: Monitor worker connectivity
- Auto-restart: Handle connection drops and crashes
- Resource limits: Set memory/CPU limits
- Load balancing: Multiple workers for scalability
Critical for Any Project:
- Always implement worker health checks
- Plan for graceful shutdowns
- Monitor worker registration in LiveKit logs
- Handle network interruptions appropriately
Secure Token Generation (Backend):
from fastapi import FastAPI, HTTPException
from livekit import api

app = FastAPI()

# NEVER expose API keys to frontend
# api_key, api_secret, and LIVEKIT_URL come from server-side configuration;
# user_can_join_room is your own authorization check.
@app.post("/api/livekit/token")
async def get_livekit_token(user_id: str, room_name: str):
    # Validate user permission to join room
    if not user_can_join_room(user_id, room_name):
        raise HTTPException(401, "Unauthorized")
    token = api.AccessToken(api_key, api_secret)
    token.with_identity(user_id).with_grants(api.VideoGrants(
        room_join=True,
        room=room_name,
        can_publish=True,
        can_subscribe=True,
        can_publish_data=True,  # For realtime data/control messages
        can_publish_sources=['camera', 'microphone']  # Specific sources
    ))
    return {"token": token.to_jwt(), "wsUrl": LIVEKIT_URL}
Important: Tokens gate the initial connection only. LiveKit will proactively refresh access tokens for connected clients so they can reconnect; no client-side refresh logic is required. Update permissions during a session via server APIs if roles change.
Client Token Usage:
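// Fetch token from your backend. The endpoint above declares user_id and
// room_name as plain function parameters, which FastAPI reads from the query string.
const query = new URLSearchParams({ user_id: userId, room_name: roomName });
const response = await fetch(`/api/livekit/token?${query}`, { method: 'POST' });
if (!response.ok) {
  throw new Error(`Token request failed: HTTP ${response.status}`);
}
const { token, wsUrl } = await response.json();
// Connect to LiveKit
await room.connect(wsUrl, token);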
Essential Connection Patterns:
import { Room, RoomEvent, DisconnectReason } from 'livekit-client';

class LiveKitManager {
  constructor() {
    this.room = new Room();
    this.connectionState = 'disconnected';
    this.setupEventHandlers();
  }

  setupEventHandlers() {
    this.room.on(RoomEvent.Connected, () => {
      this.connectionState = 'connected';
      this.onConnected();
    });
    this.room.on(RoomEvent.Disconnected, (reason) => {
      this.connectionState = 'disconnected';
      this.handleDisconnect(reason);
    });
    this.room.on(RoomEvent.Reconnecting, () => {
      this.connectionState = 'reconnecting';
      this.showReconnectingUI();
    });
  }

  async handleDisconnect(reason) {
    // DisconnectReason is an enum of causes such as
    // CLIENT_INITIATED, DUPLICATE_IDENTITY, ROOM_DELETED, etc.
    if (reason === DisconnectReason.CLIENT_INITIATED) {
      // User clicked disconnect - clean up
      this.cleanup();
    } else {
      // Network issue or server-side disconnect - attempt reconnect
      this.attemptReconnect();
    }
  }
}
Basic Agent Health Management:
import asyncio
import logging
import time

from livekit.agents import Agent

logger = logging.getLogger(__name__)

class HealthyAgent(Agent):
    def __init__(self):
        # Placeholder instructions; replace with your own prompt
        super().__init__(instructions="You are a helpful assistant.")
        self.last_activity = time.time()
        self.max_idle_seconds = 300  # 5 minutes
        self.session_start = time.time()
        self.max_session_seconds = 3600  # 1 hour
        self._monitor_task = None

    async def on_enter(self):
        # Start health monitor; keep a reference so the task is not garbage-collected
        self._monitor_task = asyncio.create_task(self.health_monitor())

    async def health_monitor(self):
        while True:
            await asyncio.sleep(30)  # Check every 30s
            idle_time = time.time() - self.last_activity
            session_time = time.time() - self.session_start
            if idle_time > self.max_idle_seconds:
                logger.info("Agent idle timeout - shutting down")
                await self.cleanup_and_exit()  # application-defined teardown
                return
            if session_time > self.max_session_seconds:
                logger.info("Agent session timeout - shutting down")
                await self.cleanup_and_exit()
                return

    def update_activity(self):
        self.last_activity = time.time()
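The timeout checks in the health monitor above can be factored into a pure function, which makes them testable without sleeping or running an agent. This is a sketch under that assumption; the function name `shutdown_reason` and its return strings are hypothetical, while the default limits mirror the 5-minute idle and 1-hour session values used above.

```python
from typing import Optional


def shutdown_reason(
    now: float,
    last_activity: float,
    session_start: float,
    max_idle_seconds: float = 300,
    max_session_seconds: float = 3600,
) -> Optional[str]:
    """Return why the agent should shut down, or None if it should keep running."""
    if now - last_activity > max_idle_seconds:
        return "idle_timeout"
    if now - session_start > max_session_seconds:
        return "session_timeout"
    return None
```

A monitor loop can then call this with `time.time()` on each tick and exit as soon as it returns a non-`None` value.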
Audio Context Handling (Use LiveKit's Built-in Handling):
import { Room, RoomEvent } from 'livekit-client';

const room = new Room();

// Listen for audio playback status
room.on(RoomEvent.AudioPlaybackStatusChanged, () => {
  if (!room.canPlaybackAudio) {
    // Show "Enable Audio" button
    const button = document.createElement('button');
    button.textContent = 'Enable Audio';
    button.onclick = async () => {
      await room.startAudio();
      button.remove();
    };
    document.body.appendChild(button);
  }
});

// For React apps, use hooks:
// import { useAudioPlayback } from '@livekit/components-react';
// const { startAudio } = useAudioPlayback();
// Or use <RoomAudioRenderer /> component
Connection State & Resilience:
import { Room, RoomEvent } from 'livekit-client';

const room = new Room();

// Handle primary connection events
room.on(RoomEvent.Connected, () => {
  console.log('Connected to room');
});

room.on(RoomEvent.Reconnecting, () => {
  console.log('Reconnecting...');
  // Show reconnecting UI
});

room.on(RoomEvent.Reconnected, () => {
  console.log('Reconnected successfully');
  // Hide reconnecting UI
});

room.on(RoomEvent.Disconnected, (reason) => {
  console.log('Disconnected:', reason);
  // Handle disconnection
});

// Customize retry policy if needed (this one never gives up;
// return null from nextRetryDelayInMs to stop retrying)
const customReconnectPolicy = {
  nextRetryDelayInMs(context) {
    return Math.min(context.retryCount * 1000, 10000); // Max 10s
  }
};

const roomWithCustomPolicy = new Room({
  reconnectPolicy: customReconnectPolicy
});
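The same linear, capped retry schedule can be expressed in Python for backend processes that manage their own reconnects. This is a sketch, not an SDK API: the function name and the `max_retries` cutoff are assumptions added for illustration, with `None` playing the role that `null` plays in the JavaScript policy (stop retrying).

```python
from typing import Optional


def next_retry_delay_ms(
    retry_count: int,
    max_delay_ms: int = 10_000,
    max_retries: int = 10,
) -> Optional[int]:
    """Delay before the next attempt in milliseconds, or None to give up."""
    if retry_count >= max_retries:
        return None
    # Linear growth (0s, 1s, 2s, ...) capped at max_delay_ms
    return min(retry_count * 1000, max_delay_ms)
```

Capping the number of attempts keeps a permanently unreachable server from holding the process in a retry loop forever.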
# Core LiveKit credentials
LIVEKIT_URL=wss://your-instance.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_secret
# AI Provider (choose one)
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
# Optional but recommended
REDIS_URL=redis://localhost:6379 # For session state
DATABASE_URL=postgresql://... # For user/room persistence
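Failing fast on missing credentials is usually easier to debug than a connection error later. The loader below is a minimal sketch that checks the three core variables listed above; the `LiveKitConfig` dataclass and `load_config` function are illustrative names, not part of any LiveKit package.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


@dataclass(frozen=True)
class LiveKitConfig:
    url: str
    api_key: str
    api_secret: str


def load_config(env: Mapping[str, str] = os.environ) -> LiveKitConfig:
    """Read the core LiveKit settings, raising if any is missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return LiveKitConfig(
        url=env["LIVEKIT_URL"],
        api_key=env["LIVEKIT_API_KEY"],
        api_secret=env["LIVEKIT_API_SECRET"],
    )
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.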
Railway/Render/Fly.io:
- Enable TCP proxy if available (prevents WebSocket timeouts)
- Set worker memory limits (agents can be memory-intensive)
- Monitor startup time (agents need time to connect)
Docker Deployment:
FROM python:3.11-slim
WORKDIR /app
# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application
COPY . .
# Run worker (modern CLI preferred)
CMD ["python", "agent.py", "start"]
Kubernetes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: livekit-agent
spec:
  replicas: 3  # Scale based on load
  selector:
    matchLabels:
      app: livekit-agent
  template:
    metadata:
      labels:
        app: livekit-agent
    spec:
      containers:
        - name: agent
          image: your-agent:latest
          env:
            - name: LIVEKIT_URL
              valueFrom:
                secretKeyRef:
                  name: livekit-secrets
                  key: url
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
Essential Monitoring:
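from datetime import datetime, timezone

# Agent health endpoint
# (get_worker_count and count_active_rooms are application-defined helpers)
@app.get("/health")
async def health_check():
    return {
        "status": "healthy",
        "workers_connected": get_worker_count(),
        "active_rooms": await count_active_rooms(),
        "timestamp": datetime.now(timezone.utc).isoformat()
    }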
Logging Patterns:
import structlog
logger = structlog.get_logger()
# Log agent lifecycle events
logger.info("agent_started", agent_id=agent_id, room=room_name)
logger.info("agent_health_check", agent_id=agent_id, status="healthy")
logger.error("agent_failed", agent_id=agent_id, error=str(e))
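Where adding structlog is not an option, a similar event-plus-fields style can be approximated with the standard library. This is a sketch of an alternative, not the structlog API: `JsonFormatter` and `log_event` are hypothetical names, and the `fields` attribute is a convention chosen here.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object: level, event name, extra fields."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {"level": record.levelname, "event": record.getMessage()}
        # Fields are attached via logger calls with extra={"fields": {...}}
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


def log_event(logger: logging.Logger, event: str, **fields) -> None:
    """Log an event name with structured key/value context at INFO level."""
    logger.info(event, extra={"fields": fields})
```

To use it, attach `JsonFormatter()` to a handler and call `log_event(logger, "agent_started", agent_id=agent_id, room=room_name)`.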
import httpx
from livekit.agents import function_tool

# validate_action, YOUR_API_URL, and logger are defined by your application
@function_tool
async def call_business_logic(action: str, params: dict) -> str:
    """Generic pattern for calling your business logic from AI agents"""
    try:
        # 1. Validate input
        if not validate_action(action, params):
            return "Invalid action or parameters"
        # 2. Call your API/service
        async with httpx.AsyncClient() as client:
            response = await client.post(f"{YOUR_API_URL}/api/{action}", json=params)
            response.raise_for_status()
            result = response.json()
        # 3. Return user-friendly response
        return f"Successfully executed {action}: {result['message']}"
    except Exception as e:
        logger.error(f"Function tool error: {e}")
        return f"Sorry, I couldn't complete that action: {str(e)}"
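The `validate_action` helper called above is left to the application. One possible shape is an allow-list that maps each permitted action to its expected parameter names and types, so that an AI-generated call can never reach an arbitrary endpoint. The action names and schemas below are invented purely for illustration.

```python
from typing import Any, Dict

# Hypothetical allow-list: action name -> required parameters and their types
ALLOWED_ACTIONS: Dict[str, Dict[str, type]] = {
    "create_ticket": {"title": str, "priority": int},
    "cancel_order": {"order_id": str},
}


def validate_action(action: str, params: Dict[str, Any]) -> bool:
    """Accept only known actions whose parameters match the schema exactly."""
    schema = ALLOWED_ACTIONS.get(action)
    if schema is None:
        return False  # unknown action
    if set(params) != set(schema):
        return False  # missing or unexpected parameter
    return all(isinstance(params[name], kind) for name, kind in schema.items())
```

Rejecting unexpected parameters, not just missing ones, keeps model-invented fields from reaching the backend.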
class SessionManager {
  constructor(room) {
    this.room = room; // livekit-client Room instance used for reconnects
    this.SESSION_KEY = 'livekit_session';
    this.RECONNECT_TIMEOUT = 30000; // 30 seconds
  }

  saveSession(roomName, identity) {
    const session = {
      roomName,
      identity, // Don't store token - re-request from backend
      timestamp: Date.now()
    };
    localStorage.setItem(this.SESSION_KEY, JSON.stringify(session));
  }

  async attemptReconnect() {
    const sessionData = localStorage.getItem(this.SESSION_KEY);
    if (!sessionData) return false;
    const session = JSON.parse(sessionData);
    const now = Date.now();
    // Check if session is still valid
    if (now - session.timestamp < this.RECONNECT_TIMEOUT) {
      try {
        // Re-request a fresh token from the backend
        const { token, url } = await this.getTokenFromBackend(
          session.roomName,
          session.identity
        );
        await this.room.connect(url, token);
        return true;
      } catch (error) {
        console.log('Reconnect failed:', error);
        this.clearSession();
      }
    }
    return false;
  }

  clearSession() {
    localStorage.removeItem(this.SESSION_KEY);
  }

  async getTokenFromBackend(roomName, identity) {
    const response = await fetch('/api/livekit/token', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ room: roomName, identity })
    });
    if (!response.ok) {
      throw new Error(`Token request failed: HTTP ${response.status}`);
    }
    return response.json();
  }
}
Basic Data Publishing:
import { Room, RoomEvent } from 'livekit-client';

const room = new Room();

// Publish small control messages or JSON data
await room.localParticipant.publishData(
  new TextEncoder().encode(JSON.stringify({ action: 'move', direction: 'left' })),
  { reliable: true } // use reliable: false (lossy) for frequent updates
);

// Subscribe to data from other participants
room.on(RoomEvent.DataReceived, (payload, participant) => {
  const data = JSON.parse(new TextDecoder().decode(payload));
  console.log('Received data:', data, 'from:', participant?.identity);
});
For higher-level text/byte/RPC streams, see the Realtime Data docs.
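On the backend or agent side, the same JSON-over-bytes convention can be wrapped in a small codec so that every sender and receiver agrees on the envelope. This is a sketch: the `action` field mirrors the example payload above, while the function names and validation rule are assumptions for illustration.

```python
import json
from typing import Any, Dict


def encode_message(action: str, **fields: Any) -> bytes:
    """Serialize a control message to compact UTF-8 JSON bytes."""
    return json.dumps({"action": action, **fields}, separators=(",", ":")).encode("utf-8")


def decode_message(payload: bytes) -> Dict[str, Any]:
    """Parse a payload and check it is a JSON object carrying an 'action' field."""
    message = json.loads(payload.decode("utf-8"))
    if not isinstance(message, dict) or "action" not in message:
        raise ValueError("expected a JSON object with an 'action' field")
    return message
```

Validating on receipt matters because any participant with data-publish permission can send arbitrary bytes.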
my-livekit-project/
├── backend/
│ ├── main.py # FastAPI server with token endpoint
│ ├── agent.py # LiveKit agent
│ └── requirements.txt # livekit-agents, livekit-api, fastapi
├── frontend/
│ ├── index.html # Basic client
│ ├── app.js # LiveKit client code
│ └── style.css
└── .env # Environment variables
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from livekit import api
import os

app = FastAPI()

class TokenRequest(BaseModel):
    room: str
    identity: str

@app.post("/token")
async def get_token(request: TokenRequest):
    if not all([os.getenv("LIVEKIT_API_KEY"), os.getenv("LIVEKIT_API_SECRET"), os.getenv("LIVEKIT_URL")]):
        raise HTTPException(500, "LiveKit credentials not configured")
    token = api.AccessToken(
        os.getenv("LIVEKIT_API_KEY"),
        os.getenv("LIVEKIT_API_SECRET")
    ).with_identity(request.identity).with_grants(
        api.VideoGrants(room_join=True, room=request.room)
    )
    return {"token": token.to_jwt(), "url": os.getenv("LIVEKIT_URL")}
from livekit import agents
from livekit.agents import Agent, AgentSession, function_tool

class BasicAgent(Agent):
    def __init__(self) -> None:
        # Placeholder instructions; replace with your own prompt
        super().__init__(instructions="You are a friendly assistant.")

    @function_tool
    async def hello(self, name: str) -> str:
        """Greet the user by name."""
        return f"Hello, {name}!"

async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()
    # Add STT/LLM/TTS plugins to the session for a working voice agent
    session = AgentSession()
    await session.start(room=ctx.room, agent=BasicAgent())

if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
Minimal Frontend (app.js):
import { Room, RoomEvent, Track } from 'livekit-client';

const room = new Room();

// Handle track subscriptions
room.on(RoomEvent.TrackSubscribed, (track, publication, participant) => {
  if (track.kind === Track.Kind.Audio) {
    const audioElement = track.attach();
    document.body.appendChild(audioElement);
  }
});

async function connect() {
  try {
    // Get token from your backend
    const response = await fetch('/token', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ room: 'test', identity: 'user1' })
    });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${response.statusText}`);
    }
    const { token, url } = await response.json();
    // Connect to LiveKit
    await room.connect(url, token);
    console.log('Connected to LiveKit!');
  } catch (error) {
    console.error('Connection failed:', error);
  }
}

// Start connection on user interaction
document.addEventListener('click', connect, { once: true });
- Always secure token generation - Never expose API keys to frontend
- Implement connection state management - Handle disconnects gracefully
- Plan for mobile compatibility - Audio context unlock is critical
- Monitor agent health - Implement timeouts and cleanup
- Use function tools - Bridge AI agents to your business logic
- Test across platforms - WebRTC behavior varies by browser
- Plan deployment strategy - Consider WebSocket timeout issues
- Implement logging - Essential for debugging production issues
- Handle errors gracefully - Network issues are common in real-time apps
- Start simple - Add complexity incrementally as needed
- Official Documentation: https://docs.livekit.io/
- JavaScript Quickstart: https://docs.livekit.io/home/quickstarts/javascript/
- Connecting from Clients: https://docs.livekit.io/home/client/connect/
- Event Handling: https://docs.livekit.io/home/client/events/
- Authentication Guide: https://docs.livekit.io/home/get-started/authentication/
- Generating Tokens: https://docs.livekit.io/home/server/generating-tokens/
- JS SDK Reference: https://docs.livekit.io/reference/client-sdk-js/
- Audio Autoplay Events: https://docs.livekit.io/reference/client-sdk-js/enums/RoomEvent.html#AudioPlaybackStatusChanged
- React Components: https://docs.livekit.io/reference/components/react/
- useAudioPlayback Hook: https://docs.livekit.io/reference/components/react/hook/useaudioplayback/
- Agents Landing: https://docs.livekit.io/agents/
- Voice AI Quickstart: https://docs.livekit.io/agents/start/voice-ai/
- Worker Options: https://docs.livekit.io/agents/worker/options/
- Agent Dispatch: https://docs.livekit.io/agents/build/dispatch/
- Telephony Integration: https://docs.livekit.io/agents/start/telephony/
- Realtime Text & Data: https://docs.livekit.io/home/client/data/
- Data Packets: https://docs.livekit.io/home/client/data/packets/
- JavaScript: https://github.com/livekit/client-sdk-js (npm: livekit-client)
- React Components: https://github.com/livekit/components-js (npm: @livekit/components-react)
- Swift (iOS/macOS): https://github.com/livekit/client-sdk-swift
- Android (Kotlin): https://github.com/livekit/client-sdk-android
- Flutter: https://github.com/livekit/client-sdk-flutter (pub: livekit_client)
- Unity: https://github.com/livekit/client-sdk-unity
- Python SDKs: https://github.com/livekit/python-sdks (packages: livekit, livekit-api)
- JavaScript Server SDK: https://docs.livekit.io/reference/server-sdk-js/ (npm: livekit-server-sdk)
- Recording & Streaming (Egress): https://docs.livekit.io/home/egress/overview/
- Egress API: https://docs.livekit.io/home/egress/api/
- Stream Ingress: https://docs.livekit.io/home/ingress/overview/
- SIP Telephony: https://docs.livekit.io/sip/
- Webhooks: https://docs.livekit.io/home/server/webhooks/
- LiveKit Examples: https://github.com/livekit-examples
- Agent Starter (React): https://github.com/livekit-examples/agent-starter-react
- LiveKit Agents Framework: https://github.com/livekit/agents
Found an issue or want to improve this guide? Contributions are welcome! Please feel free to:
- Report issues or suggest improvements
- Share your own LiveKit patterns and best practices
Author: rahulmanuwas
Last Updated: August 2025