Hey hey!
We are on a mission to democratise speech recognition, increase the language coverage of current state-of-the-art (SoTA) models, and push the limits of what is possible. Come join us from December 5th to 19th for a community sprint powered by Lambda Labs. Over the course of the sprint, we'll cover 70+ languages and Whisper checkpoints ranging from 39M to 1550M parameters, and evaluate our models on real-world datasets.
Register your interest via the Google form here.
The goal of the sprint is to fine-tune Whisper in as many languages as possible and make them accessible to the community. We hope that especially low-resource languages will profit from this event.
The main components of the sprint are (a short loading sketch follows the list):
- OpenAI’s state-of-the-art Whisper model
- Public datasets like Common Voice 11, VoxPopuli, CoVoST2 and more
- Real-world audio for evaluation
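To make the first two components concrete, here is a minimal sketch (not an official sprint script) of loading a pre-trained Whisper checkpoint and Common Voice 11 with 🤗 Transformers and 🤗 Datasets; Hindi ("hi") and whisper-small are illustrative choices only:

```python
from datasets import load_dataset, Audio
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Load a pre-trained Whisper checkpoint; the processor bundles the feature
# extractor and the tokenizer, configured here for Hindi transcription.
processor = WhisperProcessor.from_pretrained(
    "openai/whisper-small", language="Hindi", task="transcribe"
)
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Common Voice 11 is a gated dataset on the Hub: accept its terms and log in
# (e.g. `huggingface-cli login`) before loading.
common_voice = load_dataset("mozilla-foundation/common_voice_11_0", "hi", split="train")

# Whisper expects 16 kHz audio, so resample on the fly.
common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16_000))
```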
Participants have two weeks to fine-tune as many Whisper checkpoints as they want, in as many languages as they want. In general, each model repository on the Hugging Face Hub should consist of:
- Fine-tuned Whisper checkpoints (e.g. Whisper-large)
- Pre- and post-processing modules, such as noise-cancelling, spelling correction, …
- A Hugging Face Space to demo your fine-tuned model
During the event, you will have the opportunity to work on each of these components to build speech recognition systems in your favourite language!
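As an example of the demo component, here is a minimal sketch of a Gradio app of the kind a Space typically runs; the model ID "your-username/whisper-small-hi" is a placeholder for your own fine-tuned checkpoint:

```python
import gradio as gr
from transformers import pipeline

# Point this at your own fine-tuned repository on the Hub.
asr = pipeline("automatic-speech-recognition", model="your-username/whisper-small-hi")

def transcribe(audio_path):
    # The pipeline handles feature extraction, generation and decoding internally.
    return asr(audio_path)["text"]

demo = gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(type="filepath"),  # the recorded/uploaded audio is passed as a file path
    outputs="text",
    title="Fine-tuned Whisper demo",
)

demo.launch()
```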
Each Whisper checkpoint will automatically be evaluated on real-world audio (if available for the language). At the end of the sprint, the best-performing system in each language will receive 🤗 SWAG.
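For reference, word error rate (WER) is the metric most commonly used to compare speech recognition systems; a minimal sketch of computing it with the 🤗 Evaluate library is below (the predictions and references are toy examples, not sprint data):

```python
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["the cat sat on the mat", "hello world"]
references = ["the cat sat on a mat", "hello world"]

# WER = (substitutions + insertions + deletions) / number of reference words
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {100 * wer:.2f}%")
```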
To participate, simply fill out this short Google form. You will also need to create a Hugging Face Hub account here and join our Discord here. Once you've joined, make sure to head over to #role-assignment and click on ML for Audio and Speech.
This fine-tuning sprint should be especially interesting to native speakers of low-resource languages. Your language skills will help you select the best training data, and possibly build the best speech recognition system yet for your language.
More details will be announced in the Discord channel. We are looking forward to seeing you there!
By joining the sprint, you will get:
- the chance to learn how to fine-tune state-of-the-art Whisper speech recognition checkpoints
- free compute to build a powerful fine-tuned model under your name on the Hub
- Hugging Face SWAG if you build the best-performing model in a language
- GPU hours from Lambda Labs if you have the best-performing model in a language
Open-sourcely yours,
Sanchit, VB & The HF Speech Team