AI Voice Cloning / Transfer (e.g. RVCv2)

My random collection of notes on AI voice cloning services/models/techniques/etc. Just because something is listed here doesn't necessarily mean I have tried it or endorse it. Use this as a starting point for your own further research.

Services
Guides, Colabs, etc
Tools
Character Voices
See Also

Services

https://elevenlabs.io/
- https://elevenlabs.io/speech-synthesis
  - The First Generative Speech Synthesis Platform: Generate lifelike speech in any language and voice with the most powerful Text to Speech and Voice Cloning software.
- https://elevenlabs.io/voice-lab
  - Generative Voice AI: Clone your voice or create entirely new synthetic voices using the most advanced Generative AI technology ever.
- https://elevenlabs.io/voice-library
  - Discover AI Voices Crafted by the Community: Get access to an ever-growing library of high quality AI voices and discover characters that perfectly fit your needs.
- https://elevenlabs.io/professional-voice-cloning
  - Professional Voice Cloning: Create the perfect digital replica of your voice using the most advanced voice cloning AI. We create AI models on your voice from the ground up to offer the most realistic voice cloning experience ever.
- https://elevenlabs.io/projects
  - Your Audiobook Workshop: Generate, edit, and customize long-form spoken audio with precision, all within a streamlined workflow.
https://create.musicfy.lol/
- https://create.musicfy.lol/parody
https://lalals.com/voices/
https://www.uberduck.ai/
https://voice.ai/
https://www.myvocal.ai/
https://app.kits.ai/
https://play.ht/voice-cloning/
https://speechify.com/voice-cloning/

Guides, Colabs, etc

Tools

https://github.com/myshell-ai/OpenVoice
- Instant voice cloning by MyShell
- https://research.myshell.ai/open-voice
  - OpenVoice: Versatile Instant Voice Cloning
- https://arxiv.org/abs/2312.01479
  - OpenVoice: Versatile Instant Voice Cloning
  - We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. OpenVoice represents a significant advancement in addressing the following open challenges in the field: 1) Flexible Voice Style Control. OpenVoice enables granular control over voice styles, including emotion, accent, rhythm, pauses, and intonation, in addition to replicating the tone color of the reference speaker. The voice styles are not directly copied from and constrained by the style of the reference speaker. Previous approaches lacked the ability to flexibly manipulate voice styles after cloning. 2) Zero-Shot Cross-Lingual Voice Cloning. OpenVoice achieves zero-shot cross-lingual voice cloning for languages not included in the massive-speaker training set. Unlike previous approaches, which typically require extensive massive-speaker multi-lingual (MSML) dataset for all languages, OpenVoice can clone voices into a new language without any massive-speaker training data for that language. OpenVoice is also computationally efficient, costing tens of times less than commercially available APIs that offer even inferior performance. To foster further research in the field, we have made the source code and trained model publicly accessible. We also provide qualitative results in our demo website. Prior to its public release, our internal version of OpenVoice was used tens of millions of times by users worldwide between May and October 2023, serving as the backend of MyShell.
- https://github.com/camenduru/OpenVoice-colab
  - OpenVoice-colab
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
- Retrieval-based-Voice-Conversion-WebUI
https://github.com/SociallyIneptWeeb/AICoverGen
- A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.
https://github.com/w-okada/voice-changer
- Realtime Voice Changer
https://github.com/yxlllc/DDSP-SVC
- Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)
https://github.com/isletennos/MMVC_Trainer
- Real-time voice changer using AI (Trainer)
https://github.com/svc-develop-team/so-vits-svc
- SoftVC VITS Singing Voice Conversion [Archived]
https://github.com/voicepaw/so-vits-svc-fork
- so-vits-svc fork with realtime support, improved interface and more features
https://github.com/PlayVoice/so-vits-svc-5.0
- so-vits-svc-5.0
- Core Engine of Singing Voice Conversion & Singing Voice Clone
- Variational Inference with adversarial learning for end-to-end Singing Voice Conversion based on VITS
- https://github.com/PlayVoice/so-vits-svc-5.0#data-set
- https://github.com/PlayVoice/so-vits-svc-5.0#code-sources-and-references
https://github.com/PlayVoice/lora-svc
- lora-svc
- Singing voice change based on whisper, and lora for singing voice clone
- Singing Voice Conversion based on Whisper & neural source-filter BigVGAN
- LoRA is not fully implemented in this project, but it can be found here: LoRA TTS & paper
https://github.com/PlayVoice/VI-SVS
- VI-SVS
- Singing Voice Synthesis based on VITS, different from VISinger
- Variational Inference with adversarial learning for end-to-end Singing Voice Synthesis
- Different from VISinger, it is just VITS without MAS and DurationPredictor.
https://github.com/quickvc/QuickVC-VoiceConversion
- QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
https://github.com/auspicious3000/contentvec
- ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
- https://arxiv.org/abs/2204.09224
https://github.com/uberduck-ai/uberduck-ml-dev
- Uberduck Synthetic Speech
- ML models for Uberduck

Character Voices

Bowser

Bowser (Jack Black) (Super Mario Bros.) (RVC V2) 950 Epochs

- 40k, crepe: https://huggingface.co/yeey5/BowserRVCV2/resolve/main/Bowser.zip
- 48k, mangio-crepe: https://huggingface.co/yeey5/BowserRVCV2/resolve/main/Bowser48k.zip

(Ref)
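
A minimal sketch of fetching one of these archives with Python's standard library. The `archive_name` and `fetch_model` helpers and the `models/` target directory are my own assumptions for illustration, not something the model authors document, and the sketch only handles `.zip` archives (not `.rar`).

```python
import zipfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def archive_name(url: str) -> str:
    """Return the file name at the end of a download URL."""
    return PurePosixPath(urlparse(url).path).name


def fetch_model(url: str, dest_dir: str = "models") -> Path:
    """Download a .zip archive and extract it into dest_dir/<archive stem>.

    Hypothetical helper: dest_dir and the folder layout are assumptions.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    urlretrieve(url, str(archive))  # network access happens here
    target = dest / archive.stem
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target


BOWSER_40K = "https://huggingface.co/yeey5/BowserRVCV2/resolve/main/Bowser.zip"
print(archive_name(BOWSER_40K))
# fetch_model(BOWSER_40K)  # uncomment to actually download and extract
```

Where the extracted files need to end up depends on the tool you load the model into, so check that tool's own docs before moving anything.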

Peach

Princess Peach (Anya Taylor Joy) RVC v2 300 Epochs

I decided to create a voice model of her because there's never been a voice model of Anya Taylor-Joy

https://app.kits.ai/convert/shared/anya-taylor-joy-1

(Ref)

Ellie (The Last of Us, TLOU)

Ellie TLOU RVC V2 300 Epochs

Ellie from The Last of Us Part 1 (RVC V2), 300 epochs, trained with a 10-minute dataset of voice lines from the game

https://huggingface.co/TJKAI/EllieLOU/resolve/main/Ellie.rar

(Ref)

Ellie (The Last of Us Part 1) (RVC v2, 119e/8806s)

This model is based on the younger version of Ellie from The Last of Us (video game). This model only exists because I failed to appreciate how different Ellie sounds at different parts of Part 2, but I thought I'd share it anyway because it sounds pretty good. There are other young Ellie voices that I think are better though. You should probably use this one by <@554576184101306368> instead: https://discord.com/channels/1089076875999072296/1128416105229189180/1128416105229189180 Here's my model of the older Ellie from Part 2: https://discord.com/channels/1089076875999072296/1136823620618964992/1136823620618964992

https://huggingface.co/Grimoire-VC/Voices/resolve/main/EllieP1.zip

(Ref)

Ellie (The Last of Us Part 2) (RVC v2, 203e/14210s)

This model is based on the older version of Ellie from The Last of Us Part 2. I'm pretty happy with it. Ellie is usually pretty breathy and I wasn't able to capture that, but I do think I got her tone of voice about right. I don't think there's another Part 2 Ellie on here yet. I also made a Part 1 Ellie that I'm uploading at the same time here: https://discord.com/channels/1089076875999072296/1136823365605261372/1136823365605261372

https://huggingface.co/Grimoire-VC/Voices/resolve/main/EllieP2.zip

(Ref)
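
The `119e/8806s` and `203e/14210s` suffixes in the two model titles above look like epoch and step counts. That reading is my assumption (the titles don't spell it out), though the numbers divide evenly under it. A small sketch with a hypothetical `parse_training_tag` helper that pulls the two numbers out of a title:

```python
import re

# Matches tags like "119e/8806s"; interpreting them as <epochs>e/<steps>s
# is an assumption, not something the model author states.
TAG = re.compile(r"(\d+)e/(\d+)s")


def parse_training_tag(title: str) -> tuple[int, int]:
    """Extract (epochs, steps) from a title containing a tag like '119e/8806s'."""
    match = TAG.search(title)
    if match is None:
        raise ValueError(f"no '<N>e/<M>s' tag in {title!r}")
    return int(match.group(1)), int(match.group(2))


for title in (
    "Ellie (The Last of Us Part 1) (RVC v2, 119e/8806s)",
    "Ellie (The Last of Us Part 2) (RVC v2, 203e/14210s)",
):
    epochs, steps = parse_training_tag(title)
    print(f"{epochs} epochs, {steps} steps, {steps // epochs} steps per epoch")
```

Under that reading the two models work out to 74 and 70 steps per epoch respectively.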

Ideas to Create

Zoe (League of Legends)

Voice samples here: https://leagueoflegends.fandom.com/wiki/Zoe/LoL/Audio

(Ref)

Annie (League of Legends)

Voice samples here: https://leagueoflegends.fandom.com/wiki/Annie/LoL/Audio

(Ref)

Vanellope von Schweetz (Wreck-It Ralph)

A few sources of sound clips I found from a quick Google search:

https://www.soundboard.com/sb/VanellopeVonSchweetz

https://www.sounds-resource.com/pc_computer/disneyinfinity/sound/12086/

https://www.sounds-resource.com/ds_dsi/wreckitralph/sound/3249/

https://www.youtube.com/watch?v=jnp2ObnVt9M

(Ref)

