Yes, you will need the dataset and its manifest file in order to calculate the dataset statistics.
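In case it helps anyone following along, here is a rough sketch of how a manifest could be built once the audio is on disk. It assumes the usual NeMo ASR manifest layout (one JSON object per line with `audio_filepath`, `duration`, and `text`) and 16-bit PCM `.wav` files. The `write_manifest` helper and its `transcripts` argument are hypothetical names for illustration, not part of NeMo or the notebook, and I'm assuming the transcript can stay empty when only audio statistics are needed.

```python
import json
import wave
from pathlib import Path


def write_manifest(audio_dir, manifest_path, transcripts=None):
    """Write one JSON line per .wav file found under audio_dir.

    `transcripts` is an optional (hypothetical) mapping from file stem to text;
    files without an entry get an empty transcript.
    Returns the number of entries written.
    """
    transcripts = transcripts or {}
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        for wav_path in sorted(Path(audio_dir).rglob("*.wav")):
            # Duration in seconds = number of frames / sample rate.
            with wave.open(str(wav_path), "rb") as wav:
                duration = wav.getnframes() / wav.getframerate()
            entry = {
                "audio_filepath": str(wav_path.resolve()),
                "duration": round(duration, 3),
                "text": transcripts.get(wav_path.stem, ""),
            }
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Calling something like `write_manifest("data/wavs", "manifest.json")` would then give you a file whose path can be passed wherever the notebook asks for the manifest path. Formats other than plain `.wav` would need a different way to read the duration.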
Okay, I am basically looking to calculate dataset statistics for the "stt_zh_citrinet_1024_gamma_0_25_1.0.0" model. Since it was trained on the Multilingual LibriSpeech English corpus (pre-training) and the Aishell-2 corpus (fine-tuning), I am not sure where to get the manifest file for it.
For such models, if you don't have the datasets, it might be worthwhile to simply run https://github.com/NVIDIA/NeMo/blob/main/tutorials/asr/Streaming_ASR.ipynb with large buffer sizes (it won't require pre-calculating the dataset statistics then).
Can this be used for real-time ASR with a microphone? I am specifically looking for an offline microphone ASR solution.
Not for real time. Jarvis would be the proper production toolkit for streaming (real-time) ASR. In NeMo we have buffered audio inference (the notebook above), but streaming audio is not perfectly supported.
I have a Jetson Xavier and a Nano. As far as I know, Jarvis is not compatible with Jetson for now.
It is not compatible for now. I don't think NeMo supports ASR on Jetson either.
Ah, seems like a pickle. Anyway, thank you so much for your assistance.
Hey, I've been exploring your notebook to compute normalization statistics for the Citrinet model. Could you please clarify how to use the manifest path? Do we need to download the datasets, or what?