Skip to content

Instantly share code, notes, and snippets.

@guyromm
Last active August 12, 2024 20:34
Show Gist options
  • Save guyromm/54dbcd034c44c78809416225d28c33ec to your computer and use it in GitHub Desktop.
Save guyromm/54dbcd034c44c78809416225d28c33ec to your computer and use it in GitHub Desktop.
an example on how to save an hour of one's time using openai's speech 2 text & then gpt4-o to summarize and ask questions about the resulting text
#!/usr/bin/env python
from pydub import AudioSegment
from pydub.silence import split_on_silence
import sys
import os
def chunk_audio(input_file, output_prefix, min_silence_len=len(sys.argv)>2 and int(sys.argv[2]) or 1000, silence_thresh=-40, keep_silence=None):
if not keep_silence:
keep_silence = min_silence_len/4
audio = AudioSegment.from_mp3(input_file)
# Split audio where silence is 'silent' for at least min_silence_len ms and lower than silence_thresh dBFS
chunks = split_on_silence(
audio,
min_silence_len=min_silence_len,
silence_thresh=silence_thresh,
keep_silence=keep_silence
)
# Export chunks as separate files
for i, chunk in enumerate(chunks):
chunk_name = f"{output_prefix}{str(i).zfill(3)}.mp3"
chunk.export(chunk_name, format="mp3")
print(f"Exported {chunk_name}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: python chunk_audio.py <input_file> [min_silence_len]")
sys.exit(1)
input_file = sys.argv[1]
output_prefix = "output_chunk_"
chunk_audio(input_file, output_prefix)
#!/bin/bash
yt-dlp --extract-audio 'https://www.youtube.com/watch?v=eGPa_omV9WI'
ffmpeg -i *m4a -c:v copy -c:a libmp3lame -q:a 4 input.mp3
chunk_audio.py input.mp3
for fn in output_chunk*mp3 ; do echo " * $fn" ; transcribe.py "$fn" | tee "$fn.txt" ; done
cat output_chunk*txt > output.txt
aider output.txt summary.txt -m 'summarize this into `summary.txt`'
#!/usr/bin/env python
from openai import OpenAI
import sys
import os
# Initialize the OpenAI client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
# Function to transcribe audio using OpenAI's latest API
def transcribe_audio(audio_file_path):
with open(audio_file_path, 'rb') as audio_file:
response = client.audio.transcriptions.create(
model="whisper-1",
file=audio_file
)
return response.text
# Example usage
if __name__ == "__main__":
if len(sys.argv) < 2:
print("Usage: ./transcribe.py <audio_file_path>")
sys.exit(1)
audio_file_path = sys.argv[1]
transcription = transcribe_audio(audio_file_path)
print(transcription)
@guyromm
Copy link
Author

guyromm commented Aug 11, 2024

results in:
summary.txt

Chris Hedges, a Pulitzer Prize-winning journalist, discusses the current state of American politics, focusing on the corporate influence over both major parties. He critiques the Democratic Party's handling of
Biden and Harris, highlighting their ties to corporate interests and the lack of genuine voter influence. Hedges also addresses the rise of Trump, attributing it to widespread despair and betrayal by the
political system. He emphasizes the systemic issues within the U.S., including economic decay, militarism, and the influence of the Israel lobby. Hedges calls for a movement to challenge the political hegemony
from outside the established parties, advocating for a focus on addressing the needs of the vulnerable and ending corporate exploitation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment