
@krisajenkins
Created April 22, 2026 15:44
Cortex Code skills: SvelteKit+Snowflake and ElevenLabs voice integration
name: elevenlabs-voice-integration
description: Add voice input to a web app using ElevenLabs speech-to-text. Use when: adding microphone recording, speech-to-text transcription, voice chat, or ElevenLabs SDK integration to a web project. Triggers: elevenlabs, voice input, speech to text, microphone, voice recording, scribe, voice chat, STT.

ElevenLabs Voice Integration

Add voice input to a web app: record audio in the browser, transcribe via ElevenLabs Scribe, and feed the text into your app.

Key Architecture

Browser                              Server
  |                                    |
  |-- MediaRecorder (audio/webm) ---->  |
  |                                    |-- ElevenLabsClient.speechToText.convert()
  |<--- { text, words } --------------|
  |                                    |
  |-- inject text into chat/form       |

The ElevenLabs API key stays server-side. The browser records audio, sends the blob to a server route, and gets back transcribed text.

Workflow

Step 1: Install the SDK

npm install @elevenlabs/elevenlabs-js

Add your API key to the project's config (never in client code):

{
  "elevenlabs": {
    "api_key": "your-key"
  }
}

Step 2: Create a Server-Side STT Route

The @elevenlabs/elevenlabs-js SDK provides speechToText.convert() which accepts a File object directly from FormData.

SvelteKit example (src/routes/api/speech-to-text/+server.ts):

import { json } from "@sveltejs/kit";
import type { RequestHandler } from "./$types";
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { readFileSync } from "fs";
import { join } from "path";

export const POST: RequestHandler = async ({ request }) => {
  const config = JSON.parse(readFileSync(join(process.cwd(), "config.json"), "utf-8"));
  const elevenlabs = new ElevenLabsClient({ apiKey: config.elevenlabs.api_key });

  const formData = await request.formData();
  const audioFile = formData.get("audio") as File | null;
  if (!audioFile) {
    return json({ error: "No audio file provided" }, { status: 400 });
  }

  const result = await elevenlabs.speechToText.convert({
    file: audioFile,
    modelId: "scribe_v2",
    tagAudioEvents: true,
  });

  return json({ text: result.text, words: result.words });
};

For Express/Node (non-SvelteKit):

import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import multer from "multer";

const upload = multer({ storage: multer.memoryStorage() });
const elevenlabs = new ElevenLabsClient({ apiKey: process.env.ELEVENLABS_API_KEY });

app.post("/api/speech-to-text", upload.single("audio"), async (req, res) => {
  if (!req.file) {
    return res.status(400).json({ error: "No audio file provided" });
  }
  const file = new File([req.file.buffer], req.file.originalname, { type: req.file.mimetype });
  const result = await elevenlabs.speechToText.convert({
    file,
    modelId: "scribe_v2",
    tagAudioEvents: true,
  });
  res.json({ text: result.text, words: result.words });
});

Step 3: Browser Audio Recording

Use MediaRecorder to capture microphone audio. Key details:

  • Request the audio/webm MIME type (supported by Chromium and Firefox, accepted by ElevenLabs; Safari may only offer audio/mp4, so check MediaRecorder.isTypeSupported() first)
  • Collect chunks via ondataavailable, assemble on onstop
  • Always stop media stream tracks after recording to release the microphone

let mediaRecorder: MediaRecorder | null = null;
let audioChunks: Blob[] = [];
let isRecording = false;

async function startRecording() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  mediaRecorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });
  audioChunks = [];

  mediaRecorder.ondataavailable = (event) => {
    if (event.data.size > 0) audioChunks.push(event.data);
  };

  mediaRecorder.onstop = async () => {
    // Release microphone
    stream.getTracks().forEach(track => track.stop());
    await processRecording();
  };

  mediaRecorder.start();
  isRecording = true;
}

function stopRecording() {
  if (mediaRecorder && isRecording) {
    mediaRecorder.stop();
    isRecording = false;
  }
}

Step 4: Send Audio to Server and Handle Response

async function processRecording() {
  if (audioChunks.length === 0) return;

  const audioBlob = new Blob(audioChunks, { type: 'audio/webm' });
  const formData = new FormData();
  formData.append('audio', audioBlob, 'recording.webm');

  const response = await fetch('/api/speech-to-text', {
    method: 'POST',
    body: formData,  // No Content-Type header — browser sets multipart boundary
  });

  const data = await response.json();

  if (data.error) {
    // Handle error (show to user)
    return;
  }

  if (data.text) {
    // Inject transcribed text — either into an input field or directly as a chat message
    inputMessage = data.text;
    await sendMessage(); // auto-send is a good UX for voice chat
  }
}

Important: Do NOT set Content-Type on the fetch request. The browser must set it automatically with the multipart boundary string.

Step 5: Voice UI Patterns

Recording button — toggle between start/stop states:

<button
  class="mic-button"
  class:recording={isRecording}
  onclick={isRecording ? stopRecording : startRecording}
  disabled={chatLoading}
>
  {isRecording ? 'Stop' : 'Record'}
</button>

Voice activity log — helpful for debugging and user feedback:

interface VoiceLog {
  timestamp: Date;
  message: string;
  type: 'info' | 'success' | 'error';
}

let voiceLogs: VoiceLog[] = [];

function addVoiceLog(message: string, type: VoiceLog['type'] = 'info') {
  voiceLogs = [...voiceLogs, { timestamp: new Date(), message, type }];
}

// Usage:
addVoiceLog('Requesting microphone access...');
addVoiceLog('Recording started', 'success');
addVoiceLog(`Recording stopped (${duration}s)`);
addVoiceLog('Sending to ElevenLabs...');
addVoiceLog('Transcription complete', 'success');

Display as a collapsible log panel below the chat area.

Disable text input while recording to prevent conflicting inputs:

<textarea disabled={chatLoading || isRecording} ... />

ElevenLabs SDK Notes

  • scribe_v2 is the current STT model. It accepts webm, mp3, wav, and other common formats.
  • tagAudioEvents: true adds non-speech event tags (laughter, music, etc.) to the transcript.
  • The words array in the response contains per-word timestamps — useful for subtitle/karaoke UIs.
  • The SDK handles auth, retries, and content type automatically. No manual multipart construction needed server-side.
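As a sketch of the subtitle use case: assuming each entry in the words array carries text, start, and end (in seconds) plus a type field, the timestamps can be grouped into cues. The field names and the three-second cue window here are assumptions; adapt them to the actual payload you receive.

```typescript
// Sketch: group per-word timestamps into subtitle cues.
// Assumed word shape: { text, start, end, type }, times in seconds.
interface Word {
  text: string;
  start: number;
  end: number;
  type: "word" | "spacing" | "audio_event";
}

interface Cue {
  start: number;
  end: number;
  text: string;
}

function wordsToCues(words: Word[], maxCueSeconds = 3): Cue[] {
  const cues: Cue[] = [];
  let current: Cue | null = null;
  for (const w of words) {
    if (w.type !== "word") continue; // skip spacing/audio-event entries
    if (!current || w.end - current.start > maxCueSeconds) {
      // start a new cue when none exists or the current one is full
      current = { start: w.start, end: w.end, text: w.text };
      cues.push(current);
    } else {
      current.end = w.end;
      current.text += " " + w.text;
    }
  }
  return cues;
}
```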

Troubleshooting

"NotAllowedError" from getUserMedia:

  • The page must be served over HTTPS (or localhost). Microphone access is blocked on plain HTTP.
  • The user must grant permission in the browser prompt.

Empty transcription / "No speech detected":

  • Check recording duration — very short recordings (< 0.5s) may not contain usable audio.
  • Verify audioChunks is not empty before sending.
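Both checks can be rolled into one pre-flight guard before calling the STT route. The 1 KB threshold below is an arbitrary heuristic, not an ElevenLabs limit:

```typescript
// Sketch: sanity-check a recording before sending it for transcription.
// Works on anything with a `size` in bytes (e.g. the Blob chunks array).
function isUsableRecording(chunks: { size: number }[], minBytes = 1024): boolean {
  const totalBytes = chunks.reduce((sum, chunk) => sum + chunk.size, 0);
  return chunks.length > 0 && totalBytes >= minBytes;
}
```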

"TypeError: Failed to fetch" on /api/speech-to-text:

  • Ensure the server route exists and handles POST.
  • Check that FormData is sent without a manually-set Content-Type header.

Stopping Points

  • After Step 2: Test the server route with a pre-recorded audio file (e.g. via curl) before building browser recording
  • After Step 3: Verify microphone capture works (check audioChunks length) before wiring to the API

Output

A voice input feature that records audio in the browser, transcribes via ElevenLabs Scribe, and injects the text into the app.

name: sveltekit-snowflake-app
description: Build SvelteKit web apps that talk to Snowflake APIs (Cortex Agent, SQL API). Use when: creating a SvelteKit frontend for Snowflake data, proxying Cortex Agent SSE streams through server routes, calling Snowflake SQL API from TypeScript, rendering streamed markdown/mermaid in Svelte. Triggers: sveltekit snowflake, svelte cortex agent, sveltekit api route, SSE proxy, snowflake frontend, svelte app.

SvelteKit + Snowflake App

Build a SvelteKit app that queries Snowflake data and streams Cortex Agent responses to the browser.

Key Architecture

Browser (Svelte 5)
  |
  |-- fetch('/api/chat', POST) -----> +server.ts (proxy) ----> Cortex Agent :run (SSE)
  |-- fetch('/api/data', GET)  -----> +server.ts -----------> Snowflake SQL API
  |
  |   SSE stream forwarded back through the proxy
  |   Client parses SSE with fetch + getReader()

Secrets stay server-side in config.json (read via readFileSync in +server.ts routes). Never expose tokens to the browser bundle.

Workflow

Step 1: Scaffold the SvelteKit Project

npx sv create <PROJECT_NAME>
# Select: SvelteKit minimal, TypeScript, no additional options
cd <PROJECT_NAME>
npm install

Create config.json and config.example.json at project root:

{
  "snowflake": {
    "account": "your-account-identifier",
    "token": "your-bearer-token",
    "warehouse": "XS",
    "database": "MY_DB",
    "schema": "PUBLIC"
  }
}

Add config.json to .gitignore.

Step 2: Create a Cortex Agent Chat Route

Create src/routes/api/chat/+server.ts:

Critical details:

  • Endpoint pattern: https://<ACCOUNT>.snowflakecomputing.com/api/v2/databases/<DB>/schemas/<SCHEMA>/agents/<AGENT_NAME>:run
  • Request Accept: text/event-stream to get SSE
  • Auth via Authorization: Bearer <PAT>
  • Message format: { role: "user", content: [{ type: "text", text: "..." }] }
  • Forward the raw SSE stream to the browser using a ReadableStream wrapper

// Key pattern: SSE stream forwarding
import { json } from "@sveltejs/kit";
import type { RequestHandler } from "./$types";
import { readFileSync } from "fs";
import { join } from "path";

export const POST: RequestHandler = async ({ request }) => {
  const config = JSON.parse(readFileSync(join(process.cwd(), "config.json"), "utf-8"));
  const { account, token } = config.snowflake;
  const body = await request.json();

  // Build messages in Cortex Agent format
  const messages = [
    ...body.history.map((msg: { role: string; content: string }) => ({
      role: msg.role,
      content: [{ type: "text", text: msg.content }],
    })),
    { role: "user", content: [{ type: "text", text: body.message }] },
  ];

  const response = await fetch(
    `https://${account}.snowflakecomputing.com/api/v2/databases/MY_DB/schemas/AGENTS/agents/MY_AGENT:run`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Accept: "text/event-stream",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ messages }),
    }
  );

  if (!response.ok) {
    return json({ error: "Agent failed", details: await response.text() }, { status: 500 });
  }

  // Forward SSE stream — do NOT parse it server-side, just pipe through
  const reader = response.body!.getReader();
  const readable = new ReadableStream({
    async pull(controller) {
      const { done, value } = await reader.read();
      if (done) { controller.close(); return; }
      controller.enqueue(value);
    },
    cancel() { reader.cancel(); },
  });

  return new Response(readable, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
};

Step 3: Create a SQL API Data Route

For direct SQL queries (e.g. leaderboards, dashboards), use the Snowflake SQL API:

  • Endpoint: https://<ACCOUNT>.snowflakecomputing.com/api/v2/statements
  • POST with { warehouse, database, schema, timeout, statement }
  • Response: { data: [row, row, ...], resultSetMetaData: { rowType: [...] } }
  • Each row is a positional array matching rowType column order

// In a +server.ts GET handler:
const response = await fetch(`https://${account}.snowflakecomputing.com/api/v2/statements`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Accept: "application/json",
    Authorization: `Bearer ${token}`,
  },
  body: JSON.stringify({ warehouse, database, schema, timeout: 60, statement: sql }),
});
const result = await response.json();
// result.data is an array of arrays (positional columns)
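A small mapper can rebuild keyed objects from rowType. This is a sketch; the interfaces cover only the fields used here, so extend them to match your actual result shape:

```typescript
// Sketch: turn SQL API positional rows into keyed objects using rowType.
interface RowTypeCol { name: string }
interface ResultSet {
  resultSetMetaData: { rowType: RowTypeCol[] };
  data: (string | null)[][];
}

function rowsToObjects(result: ResultSet): Record<string, string | null>[] {
  const names = result.resultSetMetaData.rowType.map((col) => col.name);
  return result.data.map((row) => {
    const obj: Record<string, string | null> = {};
    names.forEach((name, i) => { obj[name] = row[i] ?? null; });
    return obj;
  });
}
```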

Step 4: Client-Side SSE Parsing

The browser cannot use EventSource for POST requests with custom headers. Instead, parse SSE manually from fetch.

Critical: Use proper SSE event-boundary parsing. The SSE spec says events are delimited by empty lines, and a single event can have multiple data: lines that get concatenated. The final response event from Cortex Agent is a large JSON payload that may span multiple data: lines. A naive line-by-line parser that tries to JSON.parse() each data: line independently will fail on multi-line payloads.

Critical: The response content array has duplicated items. The agent's final response event contains a content array where text blocks and chart blocks may appear multiple times (the agent includes them during generation and again in the final assembly). You must:

  • Take the last non-empty text item (not simply the last item — trailing items are often just "\n\n" spacers that would blank the answer)
  • Deduplicate charts and tables by tool_use_id using a Set

const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message, history }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = '';
let currentEvent = '';
let collectedText = '';
let dataLines: string[] = [];
const seenChartIds = new Set<string>();
const seenTableIds = new Set<string>();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || ''; // keep incomplete line

  for (const line of lines) {
    if (line.startsWith('event:')) {
      currentEvent = line.slice(6).trim();
      dataLines = [];
    } else if (line.startsWith('data:')) {
      dataLines.push(line.slice(5));
    } else if (line.trim() === '' && dataLines.length > 0) {
      // Empty line = end of SSE event — process accumulated data lines
      const dataStr = dataLines.join('\n').trim();
      dataLines = [];
      if (dataStr === '[DONE]') continue;
      try {
        const data = JSON.parse(dataStr);

        // Status events — show progress to user
        if (data.status === 'planning') { /* show "Planning..." */ }
        if (data.status === 'executing_tool') { /* show tool type via data.tool_type */ }
        if (data.status === 'proceeding_to_answer') { /* show "Generating..." */ }

        // Text deltas — append to streaming content
        if (currentEvent === 'response.text.delta' && data.text) {
          collectedText += data.text;
        }

        // Chart events — deduplicate by tool_use_id
        if (currentEvent === 'response.chart' && data.chart_spec) {
          const key = data.tool_use_id || data.chart_spec;
          if (!seenChartIds.has(key)) {
            seenChartIds.add(key);
            charts.push({ tool_use_id: data.tool_use_id, chart_spec: data.chart_spec });
          }
        }

        // Table events — deduplicate by tool_use_id
        if (currentEvent === 'response.table' && data.result_set) {
          const key = data.tool_use_id || JSON.stringify(data.result_set.data?.[0]);
          if (!seenTableIds.has(key)) {
            seenTableIds.add(key);
            tables.push({ title: data.title, tool_use_id: data.tool_use_id, result_set: data.result_set });
          }
        }

        // Final assembled response
        if (currentEvent === 'response' && data.content) {
          // Use the LAST non-empty text item (agent duplicates the answer;
          // later items may be "\n\n" spacers that would blank the content)
          let lastText = '';
          for (const item of data.content) {
            if (item.type === 'text' && item.text?.trim()) {
              lastText = item.text;
            }
            if (item.type === 'chart' && item.chart?.chart_spec) {
              const key = item.chart.tool_use_id || item.chart.chart_spec;
              if (!seenChartIds.has(key)) {
                seenChartIds.add(key);
                charts.push({ tool_use_id: item.chart.tool_use_id, chart_spec: item.chart.chart_spec });
              }
            }
            // Skip: thinking, tool_use, tool_result, suggested_queries (see note below)
          }
          if (lastText) collectedText = lastText;
        }
      } catch { /* ignore partial JSON */ }
    }
  }
}

Cortex Agent SSE event types:

| Event | Purpose | Key fields |
| --- | --- | --- |
| response.status | Progress updates | status, message, tool_type |
| response.text.delta | Streaming text chunk | text |
| response.text | Final text (sometimes) | text |
| response.chart | Vega-Lite chart from Cortex Analyst | chart_spec (stringified JSON), tool_use_id, content_index |
| response.table | Structured result set from Cortex Analyst | title, tool_use_id, result_set: { resultSetMetaData: { rowType }, data: string[][] } |
| response | Complete response object | content: [{ type, ... }] (see content types below) |

Final response content array item types: The content array in the final response event contains a mix of item types. Not all are useful for display:

| type | Fields | Action |
| --- | --- | --- |
| text | text (string) | Display — use last non-empty item |
| chart | chart: { tool_use_id, chart_spec } | Display — deduplicate by tool_use_id |
| thinking | thinking: { text } | Skip — internal agent reasoning |
| tool_use | tool_use: { name, input, tool_use_id } | Skip — agent's internal tool calls |
| tool_result | tool_result: { content, status, tool_use_id } | Skip — results from internal tools |
| suggested_queries | suggested_queries: [{ query }] | Optional — display as follow-up suggestion chips |

Important: The agent frequently duplicates text and chart items in the content array (once during generation, once in the final assembly). Always deduplicate charts by tool_use_id and use only the last substantial text block.
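Both rules can be captured in one helper. This is a sketch against the item shapes listed above; only text and chart items are handled, the other types are skipped:

```typescript
// Sketch: apply the "last non-empty text" and "dedup charts by tool_use_id"
// rules to a final `response` content array.
interface ChartItem { tool_use_id: string; chart_spec: string }
interface ContentItem { type: string; text?: string; chart?: ChartItem }

function extractFinal(content: ContentItem[]): { text: string; charts: ChartItem[] } {
  let lastText = "";
  const charts: ChartItem[] = [];
  const seen = new Set<string>();
  for (const item of content) {
    if (item.type === "text" && item.text?.trim()) {
      lastText = item.text; // keep the LAST non-empty text block
    } else if (item.type === "chart" && item.chart) {
      const key = item.chart.tool_use_id || item.chart.chart_spec;
      if (!seen.has(key)) {
        seen.add(key);
        charts.push(item.chart);
      }
    }
    // thinking / tool_use / tool_result / suggested_queries: skipped here
  }
  return { text: lastText, charts };
}
```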

Chart event details:

  • chart_spec is a stringified Vega-Lite v5 JSON — you must JSON.parse() it before rendering
  • Chart specs include inline data in data.values — no external data fetch needed
  • Charts also appear in the final response content array as { type: "chart", chart: { tool_use_id, chart_spec } }
  • Not every query returns a chart — the agent decides based on the question

Table event details:

  • result_set matches the Snowflake SQL API ResultSet schema
  • resultSetMetaData.rowType is an array of { name, type, length, precision, scale, nullable }
  • data is a 2D array of strings (all values are string-typed, even numbers)
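If typed values are needed client-side, the string cells can be coerced using rowType. This sketch assumes the common Snowflake type names (fixed, real, boolean, text); anything unrecognised stays a string:

```typescript
// Sketch: coerce a SQL API string cell using its rowType column metadata.
interface Col { name: string; type: string }

function coerceCell(value: string | null, col: Col): string | number | boolean | null {
  if (value === null) return null;
  switch (col.type.toLowerCase()) {
    case "fixed": // Snowflake NUMBER
    case "real":  // Snowflake FLOAT
      return Number(value);
    case "boolean":
      return value === "true";
    default:
      return value; // text, dates, etc. stay as strings
  }
}
```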

Step 5: Render Markdown and Mermaid

Install dependencies:

npm install marked mermaid

In your Svelte component:

import { Marked } from 'marked';
import mermaid from 'mermaid';
import { tick } from 'svelte';

mermaid.initialize({ startOnLoad: false, theme: 'neutral' });
const marked = new Marked();

// After adding a new assistant message:
const html = await marked.parse(responseText);
messages = [...messages, { role: 'assistant', content: responseText, html }];

// Then render mermaid blocks:
await tick();
const blocks = document.querySelectorAll('.mermaid:not([data-processed])');
for (const block of blocks) {
  const id = `mermaid-${Math.random().toString(36).slice(2, 11)}`;
  const { svg } = await mermaid.render(id, block.textContent || '');
  block.innerHTML = svg;
  block.setAttribute('data-processed', 'true');
}

Render with {@html msg.html} in the template. Style injected HTML content (from markdown) carefully:

  • In Svelte <style> blocks (component-scoped): use :global() selectors to target child elements injected via {@html}, e.g. .message-content :global(p) { ... }
  • In plain CSS files like app.css (already global scope): do NOT use :global() — it's Svelte-specific syntax and causes lightningcss warnings during build. Just write normal selectors: .message-content p { ... }

Step 6: Render Charts and Tables

Install vega-embed (includes vega and vega-lite):

npm install vega-embed vega vega-lite

Chart rendering — parse the chart_spec string and render with vega-embed:

import { onMount, tick } from 'svelte';

let vegaEmbedModule: typeof import('vega-embed') | null = null;

onMount(async () => {
  vegaEmbedModule = await import('vega-embed');
});

async function renderCharts() {
  if (!vegaEmbedModule) return;
  await tick();
  const containers = document.querySelectorAll('.vega-chart:not([data-rendered])');
  for (const el of containers) {
    const specStr = el.getAttribute('data-spec');
    if (!specStr) continue;
    try {
      const spec = JSON.parse(specStr);
      await vegaEmbedModule.default(el as HTMLElement, spec, {
        actions: false,   // hide export/source buttons
        renderer: 'svg',  // crisp rendering, works well inline
      });
      el.setAttribute('data-rendered', 'true');
    } catch {
      el.textContent = 'Failed to render chart';
    }
  }
}

Call renderCharts() after the SSE stream completes and after any chart event during streaming.

In the template — render chart and table blocks after the markdown content:

{#if msg.charts?.length}
  {#each msg.charts as chart}
    <div class="vega-chart" data-spec={chart.chart_spec}></div>
  {/each}
{/if}

{#if msg.tables?.length}
  {#each msg.tables as table}
    <table>
      <thead>
        <tr>
          {#each table.result_set.resultSetMetaData.rowType as col}
            <th>{col.name}</th>
          {/each}
        </tr>
      </thead>
      <tbody>
        {#each table.result_set.data as row}
          <tr>
            {#each row as cell}
              <td>{cell ?? ''}</td>
            {/each}
          </tr>
        {/each}
      </tbody>
    </table>
  {/each}
{/if}

Key notes:

  • Load vega-embed lazily via dynamic import() — it's a large library (~300KB) and must only load client-side
  • Use data-spec attribute + DOM query pattern (not direct binding) because vega-embed mutates the container element
  • Call renderCharts() after tick() to ensure DOM is updated before vega-embed targets the elements
  • The Snowflake theme skill has Vega-Lite config overrides for brand colors — apply them to spec.config before rendering

Svelte 5 Notes

  • Use $props() for component props: let { children } = $props();
  • Use $state() for reactive variables (replaces let x = ... reactivity from Svelte 4)
  • Layout uses {@render children()} instead of <slot />

Critical: $state proxy gotcha with object mutation

In Svelte 5, $state arrays use deep proxies for reactivity tracking. When you create a plain object and push it into a $state array, the array stores a proxy-wrapped copy. Your original local variable still points to the raw, unproxied object. Mutating the local variable bypasses the proxy, so Svelte never detects the changes and the UI won't update.

Wrong — local variable bypasses the proxy:

let messages = $state<Message[]>([]);

const msg = { role: 'assistant', content: '', status: 'Loading...' };
messages = [...messages, msg];

// BUG: `msg` is the raw object, not the proxy in the array.
// This mutation is invisible to Svelte's reactivity:
msg.status = 'Done';    // UI does NOT update
msg.content = 'Hello';  // UI does NOT update

Correct — grab the reference back from the proxied array:

let messages = $state<Message[]>([]);

messages = [...messages, { role: 'assistant', content: '', status: 'Loading...' }];
const msg = messages[messages.length - 1]; // This is the PROXY

// Mutations now go through the proxy and trigger reactivity:
msg.status = 'Done';    // UI updates
msg.content = 'Hello';  // UI updates

This is especially important for streaming UIs where you incrementally mutate an object (e.g., appending SSE tokens to content, updating status during progress events).

Troubleshooting

SSE stream cuts off or hangs:

  • Ensure the SvelteKit server route returns Content-Type: text/event-stream
  • Do not attempt to parse/transform the stream server-side — forward raw bytes

Cortex Agent returns 401/403:

  • PAT may have expired. Regenerate via Snowsight or snow CLI.
  • Verify the account identifier format (no .snowflakecomputing.com suffix in config)

SQL API returns column data as positional arrays:

  • This is expected. Map using result.resultSetMetaData.rowType for column names, or use positional indexing: row[0], row[1], etc.

Stopping Points

  • After Step 1: Confirm project scaffolded and config working
  • After Step 2: Test chat route returns SSE stream before building UI

Output

A SvelteKit app with server-side API routes proxying Snowflake APIs, and a client-side chat UI with streaming responses.
