compute_asr_normalization_statistics.ipynb

ridasaleem0 commented Jul 27, 2021

Okay, I am basically looking to calculate dataset statistics for the "stt_zh_citrinet_1024_gamma_0_25_1.0.0" model. Since it was trained on the Multilingual LibriSpeech English corpus (pre-training) and the Aishell-2 corpus (fine-tuning), I am not sure where to get the manifest file for it.

Author

titu1994 commented Jul 27, 2021

For such models, if you don't have the datasets, it might be valuable to simply run https://github.com/NVIDIA/NeMo/blob/main/tutorials/asr/Streaming_ASR.ipynb with large buffer sizes (it won't require pre-calculating the dataset statistics then).
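
For reference, here is a rough sketch of what "pre-calculating the dataset statistics" amounts to: pooling a per-feature mean and standard deviation over every frame of every utterance. This is only an illustration under assumptions, not necessarily how the notebook does it. The function name `compute_feature_stats` is hypothetical, the feature extraction step (e.g. log-mel spectrograms) is not shown, and random arrays stand in for real features.

```python
import numpy as np


def compute_feature_stats(feature_matrices):
    """Return per-feature (mean, std) pooled over all frames of all utterances.

    Hypothetical helper: each item is a 2-D array shaped
    (n_features, n_frames); n_frames may differ between utterances.
    """
    total = None      # running sum per feature
    total_sq = None   # running sum of squares per feature
    n_frames = 0      # total number of frames seen so far

    for feats in feature_matrices:
        feats = np.asarray(feats, dtype=np.float64)
        if total is None:
            total = np.zeros(feats.shape[0])
            total_sq = np.zeros(feats.shape[0])
        total += feats.sum(axis=1)
        total_sq += (feats ** 2).sum(axis=1)
        n_frames += feats.shape[1]

    if n_frames == 0:
        raise ValueError("no frames were provided")

    mean = total / n_frames
    # Population variance; clamp tiny negative values caused by rounding.
    var = np.maximum(total_sq / n_frames - mean ** 2, 0.0)
    return mean, np.sqrt(var)


# Stand-in data (assumption): two "utterances" with 4 features each.
rng = np.random.default_rng(0)
utterances = [rng.normal(size=(4, 50)), rng.normal(size=(4, 80))]
mean, std = compute_feature_stats(utterances)
```

Because only running sums are kept, the utterances can be streamed one at a time instead of being held in memory together.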

ridasaleem0 commented Jul 27, 2021

Can this be used for real-time ASR with a microphone? I am specifically looking for an offline microphone ASR solution.

Author

titu1994 commented Jul 27, 2021

Not for real time. Jarvis would be a proper production toolkit for streaming (real-time) ASR. In NeMo we have buffered audio (the notebook above); streaming audio is not fully supported.

ridasaleem0 commented Jul 27, 2021

I have a Jetson Xavier and a Jetson Nano; as far as I know, Jarvis is not compatible with Jetson for now.

Author

titu1994 commented Jul 27, 2021

It is not compatible for now. I don't think NeMo supports ASR on Jetson either.

ridasaleem0 commented Jul 27, 2021

Ah, seems like a pickle. Anyway, thank you so much for your assistance.
