A Python script that converts text into natural-sounding speech using the Kokoro TTS engine. The script processes a transcript file, generates speech segments, and merges them into a single audio file.
- Reads text from a transcript file
- Generates speech segments with customizable voice and speed settings
- Saves individual audio segments and their corresponding text
- Merges all audio segments into a single WAV file using FFmpeg
- Organizes output in timestamped directories
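
The output and merge steps listed above could be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names, the `%Y%m%d_%H%M%S` timestamp format, and the `segments.txt` and `merged.wav` file names are all assumptions made for the example. It uses FFmpeg's concat demuxer with stream copy, which assumes every segment shares the same sample rate, channel count, and sample format. The Kokoro synthesis step is left out on purpose.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def make_output_dir(base_dir, now=None):
    """Create and return a timestamped directory, e.g. output/20240131_142500."""
    # The timestamp format is an assumption; the real script may use another.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(base_dir) / stamp
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


def write_concat_list(segment_paths, list_path):
    """Write the list file read by FFmpeg's concat demuxer, one entry per segment."""
    lines = []
    for path in segment_paths:
        # Inside a single-quoted entry, a literal quote is written as '\''.
        escaped = str(Path(path).resolve()).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    list_path = Path(list_path)
    list_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return list_path


def build_ffmpeg_command(list_path, output_path):
    """Build the FFmpeg call that joins the listed segments without re-encoding."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-f", "concat",      # use the concat demuxer
        "-safe", "0",        # allow absolute paths in the list file
        "-i", str(list_path),
        "-c", "copy",        # copy the audio stream as-is
        str(output_path),
    ]


def merge_segments(segment_paths, out_dir):
    """Merge the WAV segments into one file; requires ffmpeg on the PATH."""
    list_path = write_concat_list(segment_paths, Path(out_dir) / "segments.txt")
    output_path = Path(out_dir) / "merged.wav"  # hypothetical output name
    subprocess.run(build_ffmpeg_command(list_path, output_path), check=True)
    return output_path
```

Keeping the command construction separate from the `subprocess.run` call makes it possible to inspect or log the exact FFmpeg invocation before anything is executed.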