@titu1994
Last active February 2, 2023 10:31
@ridasaleem0

Okay, I am basically looking to calculate dataset statistics for the "stt_zh_citrinet_1024_gamma_0_25_1.0.0" model. Since it was trained on the Multilingual LibriSpeech English corpus (pre-training) and the Aishell-2 corpus (fine-tuning), I am not sure where to get the manifest file for it.

@titu1994
Author

For such models, if you don't have the datasets, it may be simpler to run the buffered inference from https://github.com/NVIDIA/NeMo/blob/main/tutorials/asr/Streaming_ASR.ipynb with large buffer sizes; that way the dataset statistics don't need to be pre-calculated.
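One plausible reading of why a large buffer helps: if each buffer is normalized with its own per-feature mean and standard deviation, a long enough buffer behaves much like per-utterance normalization, so no corpus-wide statistics are needed. Below is a minimal sketch of that idea, assuming the features are a NumPy array of shape `(num_features, num_frames)`. The names `normalize_buffer` and `iter_buffers` are hypothetical and are not part of NeMo's API.

```python
import numpy as np


def normalize_buffer(feats, eps=1e-5):
    """Normalize each feature row using statistics from this buffer only."""
    mean = feats.mean(axis=1, keepdims=True)
    std = feats.std(axis=1, keepdims=True)
    return (feats - mean) / (std + eps)


def iter_buffers(feats, chunk_frames, buffer_frames):
    """Yield (chunk_start, normalized_buffer) pairs.

    Each buffer covers one chunk plus symmetric context on both sides,
    truncated at the edges of the signal.
    """
    if buffer_frames < chunk_frames:
        raise ValueError("buffer_frames must be >= chunk_frames")
    context = (buffer_frames - chunk_frames) // 2
    num_frames = feats.shape[1]
    for start in range(0, num_frames, chunk_frames):
        lo = max(0, start - context)
        hi = min(num_frames, start + chunk_frames + context)
        yield start, normalize_buffer(feats[:, lo:hi])


if __name__ == "__main__":
    # Hypothetical input: 80 feature bins, 1000 frames of random data.
    rng = np.random.default_rng(0)
    feats = rng.normal(loc=3.0, scale=2.0, size=(80, 1000))
    for start, buf in iter_buffers(feats, chunk_frames=200, buffer_frames=600):
        print(start, buf.shape)
```

The larger `buffer_frames` is relative to `chunk_frames`, the closer the per-buffer statistics get to those of the whole utterance, at the cost of more latency and repeated computation.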

@ridasaleem0

Can this be used for real-time ASR with a microphone? I am specifically looking for an offline microphone ASR solution.

@titu1994
Author

Not for real time. Jarvis would be the proper production toolkit for streaming (real-time) ASR. In NeMo we have buffered audio inference (the notebook above); streaming audio is not fully supported.

@ridasaleem0

I have a Jetson Xavier and a Jetson Nano, and as far as I know Jarvis is not compatible with Jetson for now.

@titu1994
Author

It is not compatible for now. I don't think NeMo supports ASR on Jetson either.

@ridasaleem0

Ah, seems like a pickle. Anyway, thank you so much for your assistance.
