ICASSP 2024 MMITS-VC CHALLENGE SUBMISSION FOR NVIDIA: SCALING NVIDIA’S MULTI-SPEAKER MULTI-LINGUAL TTS SYSTEMS WITH VOICE-CLONING TO INDIC LANGUAGES
Speaker | Initial data points | Removed (duplicates, empty audio or transcripts) | Final data points | Duration of final data points before silence removal (hours) | Duration after silence removal (hours) |
---|---|---|---|---|---|
Hindi_M | 17798 | 1 | 17797 | 40.49 | 39.63 |
Hindi_F | 16512 | 29 | 16483 | 40.21 | 38.16 |
Telugu_M | 16939 | 0 |