Kwindla Hultman Kramer kwindla

@kwindla
kwindla / voice-agents.md
Created June 23, 2025 23:43
Advice on Voice Agents - June 2025

Advice on Voice AI, June 2025

My top three pieces of advice for people getting started with voice agents.

  1. Spend time up front understanding why latency and instruction-following accuracy drive voice AI tech choices.

  2. You will need to add significant tooling complexity as you go from proof of concept to production. Prepare for that. Especially important: build lightweight evals as early as you can.

  3. The right path is: start with a proven, "best practices" tech stack -> get everything working one piece at a time -> deploy to real-world users and collect data -> then think about optimizing cost/latency/etc.
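As one illustration of the "lightweight evals" in point 2, here is a minimal sketch, not a recommended framework. It assumes the agent can be wrapped as a plain text-in, text-out function; `EvalCase`, `run_evals`, `summarize`, and `fake_agent` are hypothetical names made up for this example, and the substring check is only a stand-in for whatever pass criterion fits your use case. It records both a pass/fail result and wall-clock latency per case, since those are the two drivers named in point 1.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # hypothetical pass criterion: substring expected in the reply


@dataclass
class EvalResult:
    prompt: str
    passed: bool
    latency_s: float


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> list[EvalResult]:
    """Run each case through the agent, timing the call and checking the reply."""
    results = []
    for case in cases:
        start = time.perf_counter()
        reply = agent(case.prompt)
        latency = time.perf_counter() - start
        passed = case.must_contain.lower() in reply.lower()
        results.append(EvalResult(case.prompt, passed, latency))
    return results


def summarize(results: list[EvalResult]) -> dict:
    """Aggregate pass rate and worst-case latency across all results."""
    n = len(results)
    pass_rate = sum(r.passed for r in results) / n if n else 0.0
    max_latency = max((r.latency_s for r in results), default=0.0)
    return {"cases": n, "pass_rate": pass_rate, "max_latency_s": max_latency}


def fake_agent(prompt: str) -> str:
    # Stand-in for a real agent call (assumption: replace with your own pipeline).
    if "hours" in prompt:
        return "We are open from 9 to 5."
    return "Sorry, I can't help with that."


if __name__ == "__main__":
    cases = [
        EvalCase("What are your hours?", "9 to 5"),
        EvalCase("Please cancel my order.", "cancel"),
    ]
    print(summarize(run_evals(fake_agent, cases)))
```

Even a handful of cases like this, run on every change, gives an early signal when a prompt or model swap regresses instruction following or adds latency.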

kwindla / video-inference-result.md
Created June 13, 2025 16:40
Gemini Pro video understanding
kwindla / inference-note.md
Created June 11, 2025 17:30
Funny inference result captured while recording demo traces

Funny GPT-4o inference result.

Audible in this video: https://youtu.be/PgyJs0jfp_o?si=43CJgmk954kulmgl&t=863

Output

It sounds like we're on an intriguing mission! I'm going to scan through the grand chandeliers and ornate carpets of the hotel for traces. Be right back with the results!
kwindla / gemini-talk-transcript.py
Created May 6, 2025 15:43
Cleaned up talk transcript matched to onscreen slides
from google import genai
import os
client = genai.Client(api_key=os.getenv("GOOGLE_API_KEY"))
# filename_for_upload = "/Users/khkramer/Downloads/maven-lightning-trimmed.mp4"
# myfile = client.files.upload(file=filename_for_upload)
#
# print("My files:")
kwindla / daily-transport-double-transcription.py
Created April 28, 2025 01:36
Double transcription events test
# double transcription events
# pip install 'pipecat-ai[daily,silero,openai,cartesia]'==0.0.59 dotenv
#
# transcription events as expected
# pip install 'pipecat-ai[daily,silero,openai,cartesia]'==0.0.58 dotenv
import asyncio
import sys
import os
kwindla / gladia-tagalog-mixed.py
Created April 26, 2025 02:16
DailyTransport, Gladia, Tagalog / English mixed
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
kwindla / openai-daily-transport-test.py
Created April 26, 2025 01:52
OpenAI STT -> LLM -> TTS
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
kwindla / detective-story.py
Created April 23, 2025 22:48
OpenAI voice model detective story
import asyncio
from openai import AsyncOpenAI
from openai.helpers import LocalAudioPlayer
import wave
import numpy as np
openai = AsyncOpenAI()
kwindla / logs.txt
Created April 22, 2025 19:30
Startup logs for 07-interruptible
2025-04-22 12:27:57.538 | INFO | 07-interruptible:run_bot:28 - Starting bot
2025-04-22 12:27:57.538 | DEBUG | pipecat.audio.vad.silero:__init__:111 - Loading Silero VAD model...
2025-04-22 12:27:57.564 | DEBUG | pipecat.audio.vad.silero:__init__:133 - Loaded Silero VAD
2025-04-22 12:27:57.583 | DEBUG | pipecat.processors.frame_processor:link:177 - Linking PipelineSource#0 -> SmallWebRTCInputTransport#0
2025-04-22 12:27:57.583 | DEBUG | pipecat.processors.frame_processor:link:177 - Linking SmallWebRTCInputTransport#0 -> DeepgramSTTService#0
2025-04-22 12:27:57.583 | DEBUG | pipecat.processors.frame_processor:link:177 - Linking DeepgramSTTService#0 -> OpenAIUserContextAggregator#0
2025-04-22 12:27:57.583 | DEBUG | pipecat.processors.frame_processor:link:177 - Linking OpenAIUserContextAggregator#0 -> OpenAILLMService#0
2025-04-22 12:27:57.583 | DEBUG | pipecat.processors.frame_processor:link:177 - Linking OpenAILLMService#0 -> CartesiaTTSService#0
2025-04-22 12:27:57.583 | DEBUG | pip
kwindla / bot.py
Created March 13, 2025 21:18
Gemini Multimodal Live French tutor
import asyncio
import os
import sys
from dataclasses import dataclass
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer