@kwindla
Created May 6, 2025 15:43
Cleaned up talk transcript matched to onscreen slides
from google import genai
import os
client = genai.Client(api_key=os.getenv("GOOGLE_API_KEY"))
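# Upload step, currently disabled. To run it, uncomment the lines below
# and add `import sys` to the imports at the top of the script.
#
# filename_for_upload = "/Users/khkramer/Downloads/maven-lightning-trimmed.mp4"
# myfile = client.files.upload(file=filename_for_upload)
#
# print("My files:")
# for f in client.files.list():
#     print(" ", f.name)
#
# sys.exit()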
gfilename = "files/5dd2pvzl3w2o"
prompt = """
This video is a talk about current tools and best practices for building Voice AI agents. Transcribe the talk.
Transcribe the full audio.
Clean up the transcript by removing filler words and adjusting the text to be appropriate to publish in written form. Fix obvious grammatical errors. Improve sentence structure.
For each slide, when the slide first appears insert a short section that describes the slide.
"""
file = client.files.get(name=gfilename)
response = client.models.generate_content(
    # model="gemini-2.0-flash",
    model="gemini-2.5-pro-preview-03-25",
    contents=[
        file,
        prompt,
    ],
)
print(response.text)
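
A possible addition, not part of the original script: a freshly uploaded video may still be processing on the server when `generate_content` is called, so it can help to poll the file until it is ready. The sketch below is a minimal, hypothetical helper. The name `wait_until_active`, its parameters, and the timeout values are illustrative assumptions. It also assumes that the object returned by `client.files.get` exposes a `state` whose `name` is one of `PROCESSING`, `ACTIVE`, or `FAILED`.

```python
import time


def wait_until_active(get_file, name, poll_seconds=5.0, timeout_seconds=600.0,
                      sleep=time.sleep):
    """Poll get_file(name=...) until the file reports the ACTIVE state.

    get_file is any callable that behaves like client.files.get (assumption):
    it takes a `name` keyword and returns an object with a `state` attribute.
    """
    waited = 0.0
    while True:
        f = get_file(name=name)
        # Accept either an enum-like state (with .name) or a plain string.
        state = getattr(f.state, "name", str(f.state))
        if state == "ACTIVE":
            return f
        if state == "FAILED":
            raise RuntimeError(f"processing failed for {name}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"{name} still in state {state} after {waited:.0f}s")
        # Any other state is treated as "still processing": wait and retry.
        sleep(poll_seconds)
        waited += poll_seconds
```

Under those assumptions, the script's `file = client.files.get(name=gfilename)` line could become `file = wait_until_active(client.files.get, gfilename)`. Passing the getter in as a callable keeps the helper independent of the SDK, so it can be exercised with a stub.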