Created April 11, 2024 06:48
Making a podcast summary with the Whisper speech-to-text model on Replicate and ChatGPT 4
Making podcast summaries when only the audio is available.
Find the MP3 of the podcast.
Use a Whisper API, such as https://replicate.com/thomasmol/whisper-diarization (I logged in via GitHub with my paid account). I put the MP3 URL into the file_url field. I set num_speakers to 2 (which turned out to be slightly wrong once I heard the podcast, because the adverts had additional voices). I put 'en' as the language.
In the prompt field I pasted an introduction to the podcast to help it transcribe names more accurately.
I pressed "Run". Two minutes later, I went to the JSON section on the right-hand side and pressed the copy button.
I selected an extract from the start of the JSON that showed the structure, stopping after the first array item that contained the transcript text and speaker ID. I saved it as a JSON file.
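The extract can also be cut programmatically instead of by hand. This is a minimal sketch, assuming the copied JSON has the output.segments shape used later in these notes; the function name make_extract and the sample data are hypothetical and only there for illustration.

```python
import json


def make_extract(full_result: dict, keep_segments: int = 1) -> dict:
    """Return a copy of the result that keeps only the first few segments."""
    extract = dict(full_result)  # shallow copy of the top level
    output = dict(extract.get("output", {}))
    output["segments"] = list(output.get("segments", []))[:keep_segments]
    extract["output"] = output
    return extract


# Hypothetical sample, shaped like the fields named in these notes.
sample = {
    "output": {
        "segments": [
            {"speaker": "SPEAKER_01", "text": "Welcome to the show."},
            {"speaker": "SPEAKER_02", "text": "Thanks for having me."},
        ]
    }
}

extract = make_extract(sample)
extract_json = json.dumps(extract, indent=2)
print(extract_json)
```

The printed JSON is small enough to paste or save as the extract file, while the full result stays untouched.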
I provided it to ChatGPT 4. In particular, I used this instruction after the extract:
`Parse the JSON to get a text transcript. Think step by step sharing very detailed working out. Use Python. we want output.segments.text and output.segments.speaker where SPEAKER_01 should be "Host" and SPEAKER_02 should be "Daniel".
We need a string in the format of f"{speaker}: {text}\n"`
ChatGPT 4 then made this Python code to parse it into a string:
`import json
# Load the JSON data from the provided file path
file_path = '/mnt/data/privacy_files_daniel_kendraio_decentral.json'
with open(file_path, 'r') as file:
    json_data = json.load(file)
# Extract the necessary information and reformat it according to the user's instructions
transcript = ""
for segment in json_data['output']['segments']:
    speaker = "Host" if segment['speaker'] == "SPEAKER_01" else "Daniel"
    text = segment['text']
    transcript += f"{speaker}: {text}\n"
transcript`
It started reading out the text. It was correct but lengthy, so I stopped it and instructed: "That's fine, give me a .txt file".
ChatGPT 4 then produced this Python:
`file_path_output = '/mnt/data/privacy_files_episode_transcript.txt'
with open(file_path_output, 'w') as file:
    file.write(transcript)
file_path_output`
It also gave a link to the resulting transcript .txt file.
I later pasted the text into ChatGPT 4 and asked it to summarise it.
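A note on the parsing step: the generated code labels every speaker other than SPEAKER_01 as "Daniel", so extra voices such as the adverts would be attributed to him. A minimal sketch of an alternative, assuming the same output.segments structure, uses a lookup table and leaves unknown labels as they are. The names SPEAKER_NAMES and build_transcript, the SPEAKER_03 label, and the sample text are hypothetical.

```python
# Map diarization labels to names; labels not listed here are kept unchanged.
SPEAKER_NAMES = {"SPEAKER_01": "Host", "SPEAKER_02": "Daniel"}


def build_transcript(result: dict, names: dict) -> str:
    """Build "speaker: text" lines from the output.segments list."""
    lines = []
    for segment in result["output"]["segments"]:
        label = segment["speaker"]
        speaker = names.get(label, label)  # fall back to the raw label
        lines.append(f"{speaker}: {segment['text']}\n")
    return "".join(lines)


# Hypothetical sample with a third voice, as an advert might produce.
sample = {
    "output": {
        "segments": [
            {"speaker": "SPEAKER_01", "text": "Welcome to the show."},
            {"speaker": "SPEAKER_02", "text": "Thanks for having me."},
            {"speaker": "SPEAKER_03", "text": "This episode is sponsored by an advertiser."},
        ]
    }
}

transcript_text = build_transcript(sample, SPEAKER_NAMES)
print(transcript_text)
```

Leaving unknown labels visible makes it easy to spot advert segments in the text before asking for a summary.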