@xdevfaheem
Last active February 22, 2025 15:35
A powerful text-to-speech and high-fidelity voice cloning application with precise emotional control. Convert written text (of practically unlimited length, thanks to chunked generation and streaming) or documents (TXT, PDF, XLSX, DOCX) into speech using any voice sample, while keeping the tone consistent.
@xdevfaheem
Author

I have opened a PR in the Zonos repo. It adds file input and unlimited content length while keeping the advanced sampling options and all existing functionality.
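The "unlimited content" support rests on chunked generation plus streaming. As a rough illustration of that idea only (this is not the code from the PR), here is a minimal Python sketch that splits text at sentence boundaries and yields one piece of output per chunk. The names `split_into_chunks`, `stream_speech`, the `synthesize` callable, and the `max_chars` limit are all hypothetical and do not come from Zonos.

```python
import re
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def stream_speech(
    text: str, synthesize: Callable[[str], T], max_chars: int = 200
) -> Iterator[T]:
    """Yield the result of synthesize() for each chunk as soon as it is ready.

    synthesize is a hypothetical stand-in for whatever turns text into audio.
    """
    for chunk in split_into_chunks(text, max_chars):
        yield synthesize(chunk)
```

Because `stream_speech` is a generator, a caller can start playing the first chunk while later chunks are still being generated, so the input length is bounded by time rather than by a single model call.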

To run it:

apt install -y espeak-ng
git clone https://github.com/xdevfaheem/Zonos.git -b files_plus_streaming
cd Zonos
uv sync
uv sync --extra compile # optional, but needed to run the hybrid model
uv pip install -e .
uv run gradio_interface/main.py
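
The commands above assume that `espeak-ng`, `git`, and `uv` are already on your `PATH`. If you want to check that before starting, a small preflight sketch in Python could look like this; the helper name `missing_tools` is hypothetical and not part of Zonos.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("espeak-ng", "git", "uv")) -> List[str]:
    """Return the names of the given command-line tools that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```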

And there you have it: a powerful UI for zero-shot TTS with voice cloning, with all these features, running on your machine.

Enjoy!
