This script transcribes audio into text using either the remote Groq API or a local Whisper.cpp model, based on user input. It is triggered via keybindings and supports multiple languages.
- `whpr` → Recognize audio in English.
- `whpr fr` → Recognize audio in French.
- `whpr es` → Recognize audio in Spanish.
- `whpr lang` → Recognize audio in any supported language.
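As an illustration of the argument handling described above, the sketch below picks the language from the first argument and defaults to English. The function name `resolve_lang` and the two-letter-code check are assumptions made for this example, not taken from the script itself.

```shell
# Hypothetical helper: choose the language code from the first argument.
# With no argument the sketch defaults to English ("en").
resolve_lang() {
  lang="${1:-en}"
  case "$lang" in
    # Assumption: languages are given as two-letter lowercase codes.
    [a-z][a-z]) printf '%s\n' "$lang" ;;
    *) printf 'unsupported language code: %s\n' "$lang" >&2; return 1 ;;
  esac
}
```

For example, `resolve_lang fr` prints `fr`, while `resolve_lang` with no argument prints `en`.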
- Recording:
  - Starts recording audio using `rec` and saves the file to `/dev/shm` for faster RAM-based processing.
  - Stops recording when triggered again.
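The start/stop toggle can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the file names under `/dev/shm`, the PID-file approach, and the function name `toggle_recording` are all assumptions.

```shell
# Hypothetical locations; the real script may use different names.
AUDIO_FILE="/dev/shm/whpr.wav"
PID_FILE="/dev/shm/whpr.pid"

# First call starts `rec` in the background; the next call stops it.
toggle_recording() {
  if [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
    # A recording is in progress: ask the recorder to terminate (SIGTERM).
    kill "$(cat "$PID_FILE")"
    rm -f "$PID_FILE"
    echo "stopped"
  else
    # No recording in progress: start one and remember its PID.
    rec -q "$AUDIO_FILE" >/dev/null 2>&1 &
    echo $! > "$PID_FILE"
    echo "started"
  fi
}
```

Binding the same command to a single key then gives the "press once to start, press again to stop" behaviour described above.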
- Processing:
  - If using Groq API (default), it sends the audio for transcription. Falls back once if the request fails.
  - If using Whisper.cpp locally, it processes the audio using the specified model.
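A rough sketch of the two processing paths is shown below. Several details are assumptions rather than facts about the script: the Groq endpoint URL and the `whisper-large-v3` model name, the `WHISPER_BIN` location, the whisper.cpp flags used, and the reading of "falls back once" as a single retry of the same request.

```shell
REMOTE="${REMOTE:-true}"
# Assumed endpoint (Groq's OpenAI-compatible transcription route).
GROQ_URL="https://api.groq.com/openai/v1/audio/transcriptions"
# Assumed binary location; newer whisper.cpp builds may ship `whisper-cli` instead.
WHISPER_BIN="${WHISPER_BIN:-/home/user/tmp/whisper.cpp/main}"

# Run a command; if it fails, run it exactly one more time.
try_twice() {
  "$@" || "$@"
}

# $1 = audio file, $2 = language code
transcribe_remote() {
  curl -sf "$GROQ_URL" \
    -H "Authorization: Bearer $(cat "$HOME/groq.token.txt")" \
    -F "file=@$1" \
    -F "model=whisper-large-v3" \
    -F "language=$2" \
    -F "response_format=text"
}

# $1 = audio file, $2 = language code, $3 = model path
transcribe_local() {
  # -nt is assumed to suppress timestamps in the output.
  "$WHISPER_BIN" -m "$3" -f "$1" -l "$2" -nt
}

# Dispatch to the remote or local backend depending on REMOTE.
transcribe() {
  if [ "$REMOTE" = "true" ]; then
    try_twice transcribe_remote "$1" "$2"
  else
    transcribe_local "$1" "$2" "$3"
  fi
}
```

Keeping the backend choice in one `transcribe` function means the recording and output steps do not need to know which engine produced the text.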
- Output:
  - The transcribed text is passed to the keyboard using `xdotool`.
  - Notifications (`notify-send`) provide status updates.
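The output step might look something like the sketch below. The function names `notify` and `type_text`, the notification wording, and the whitespace trimming are illustrative assumptions; only the use of `xdotool` and `notify-send` comes from the description above.

```shell
# Thin wrapper so every status message goes through one place.
notify() {
  notify-send "whpr" "$1"
}

# Type the given text into the focused window as if from the keyboard.
type_text() {
  # Trim leading and trailing whitespace on each line of the transcript.
  text=$(printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
  if [ -z "$text" ]; then
    notify "No text recognized"
    return 1
  fi
  # --clearmodifiers releases held modifier keys (e.g. from the keybinding).
  xdotool type --clearmodifiers -- "$text"
  notify "Transcription typed"
}
```

Clearing modifiers matters here because the script is launched from a keybinding, so a modifier key may still be held down when typing begins.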
- Groq API key: Place it at `~/groq.token.txt`.
- Local setup:
  - Set `REMOTE=false` to use Whisper.cpp.
  - Install Whisper.cpp and download the models:
    - Multilingual: `/home/user/tmp/whisper.cpp/models/ggml-small.bin`
    - English-only: `/home/user/tmp/whisper.cpp/models/ggml-small.en.bin`
  - Adjust paths as needed.
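The configuration above could be checked before each run with something like the following sketch. The variable names `TOKEN_FILE` and `MODEL_DIR`, the helper names, and the rule "English uses the English-only model, everything else uses the multilingual one" are assumptions for illustration.

```shell
REMOTE="${REMOTE:-true}"
TOKEN_FILE="$HOME/groq.token.txt"
MODEL_DIR="/home/user/tmp/whisper.cpp/models"   # adjust to your install

# Print the model path for a language code ($1).
pick_model() {
  if [ "$1" = "en" ]; then
    echo "$MODEL_DIR/ggml-small.en.bin"
  else
    echo "$MODEL_DIR/ggml-small.bin"
  fi
}

# Verify that the chosen backend has what it needs; $1 = language code.
check_setup() {
  if [ "$REMOTE" = "true" ]; then
    [ -s "$TOKEN_FILE" ] || { echo "missing API key: $TOKEN_FILE" >&2; return 1; }
  else
    model=$(pick_model "$1")
    [ -f "$model" ] || { echo "missing model: $model" >&2; return 1; }
  fi
}
```

Running `REMOTE=false check_setup fr` would report a missing multilingual model before any recording starts, rather than failing after the audio has been captured.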
- `rec`, `xdotool`, `notify-send` (common in Ubuntu/Debian-based systems).
- Groq API or Whisper.cpp for transcription.
- Tested on Xubuntu. It should work almost out of the box on Ubuntu-based and Debian-based distros; compatibility may vary on other distributions.
- Ensure the required tools are installed: `xdotool` (keyboard input) and `notify-send` (notifications).
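One way to verify the required tools before first use is a small check like the one below. The helper name `missing_tools` is hypothetical; it relies only on the standard `command -v` builtin.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example use:
#   missing=$(missing_tools rec xdotool notify-send)
#   [ -z "$missing" ] || echo "Please install: $missing" >&2
```

On Ubuntu and Debian-based systems, `rec` is typically provided by the `sox` package and `notify-send` by `libnotify-bin`; package names on other distributions may differ.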