Created
January 28, 2023 12:04
Transcribe recordings with the C++ port of OpenAI Whisper
# Clone the project
git clone https://github.com/ggerganov/whisper.cpp && cd whisper.cpp
# Download at least one model, which may be "tiny", "base", "small", "medium",
# "large", etc. Usually "medium" is sufficient for recordings in Chinese.
bash ./models/download-ggml-model.sh medium
# Compile the program
make
# Prepare the input audio file, since the current version runs only with 16-bit
# WAV files. The command below resamples to 16 kHz mono. Requires ffmpeg.
ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le test.wav
# Start transcription with the "medium" model (-m, or with another model of
# choice) and use Chinese (-l, for the full list of language codes see
# https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10).
./main -l zh -m models/ggml-medium.bin -f test.wav
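
The convert-then-transcribe steps above can be wrapped in a loop when there are several recordings. The following is only a sketch, not part of whisper.cpp: the function names (`wav_name`, `run`, `transcribe_all`) and the variables `WHISPER_BIN`, `WHISPER_MODEL`, `WHISPER_LANG` and `DRY_RUN` are hypothetical names chosen for illustration. It assumes it is run from the whisper.cpp directory after `make`, and that the binary is `./main` as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of whisper.cpp): convert each input file to
# 16 kHz mono 16-bit WAV with ffmpeg, then transcribe it.

WHISPER_BIN="${WHISPER_BIN:-./main}"                     # adjust if your build puts the binary elsewhere
WHISPER_MODEL="${WHISPER_MODEL:-models/ggml-medium.bin}" # model downloaded earlier
WHISPER_LANG="${WHISPER_LANG:-zh}"                       # language code passed to -l

# Map an input path to the name of its converted WAV file by replacing the
# last extension, e.g. "talks/a.mp3" -> "talks/a.16k.wav".
wav_name() {
  printf '%s.16k.wav\n' "${1%.*}"
}

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Convert and transcribe every file given as an argument; stop at the first failure.
transcribe_all() {
  local input wav
  for input in "$@"; do
    wav="$(wav_name "$input")"
    # -y overwrites an existing WAV from a previous run
    run ffmpeg -y -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$wav" || return 1
    run "$WHISPER_BIN" -l "$WHISPER_LANG" -m "$WHISPER_MODEL" -f "$wav" || return 1
  done
}
```

After sourcing the file, `DRY_RUN=1 transcribe_all input.mp3` prints the two commands without running them, which is a cheap way to check paths before starting a long transcription; `transcribe_all *.mp3` runs them for real.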