@jooray
Last active January 20, 2025 13:05
Use whisper speech to text on an audio or video file regardless of codec, autodetect language
#!/bin/bash
# Usage: whisper-file FILE [LANGUAGE] [OPTIONS...]
# If LANGUAGE is omitted or empty, it defaults to "auto" (language autodetection)
# Any additional options are passed directly to ${WHISPER_BIN}
# General settings (paths) for whisper.cpp; adjust these to your installation.
# Note: this uses whisper.cpp, not the official whisper. Get it at
# https://github.com/ggerganov/whisper.cpp
WHISPER_MODEL=/Users/test/whisper.cpp/models/ggml-large.bin
WHISPER_BIN=/Users/test/whisper.cpp/main
# File and language setup
FILE="${1}"
if [ -z "${FILE}" ]; then
  echo "Usage: $(basename "$0") FILE [LANGUAGE] [OPTIONS...]" >&2
  exit 1
fi
RECOGNITION_LANGUAGE="${2:-auto}"
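# Collect optional parameters (everything after FILE and LANGUAGE).
# Shift at most as many arguments as were given: a plain "shift 2" fails
# when only FILE is passed and would leave FILE in the extra parameters.
shift $(( $# < 2 ? $# : 2 ))
EXTRA_PARAMS="$*"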
WAV_RECODED_FILENAME="${FILE}-whisper.wav"
WAV_VTT_FILENAME="${WAV_RECODED_FILENAME}.vtt"
# Check if VTT already exists
if [ -e "${WAV_VTT_FILENAME}" ]; then
  echo "${WAV_VTT_FILENAME} already exists"
  exit 1
fi
# Check if WAV already exists
if [ -e "${WAV_RECODED_FILENAME}" ]; then
  echo "${WAV_RECODED_FILENAME} already exists, using it."
else
  ffmpeg -i "${FILE}" -vn -ar 16000 -ac 1 -c:a pcm_s16le "${WAV_RECODED_FILENAME}"
fi
# Run whisper.cpp with additional parameters if provided.
# EXTRA_PARAMS is deliberately unquoted so that it splits into separate options.
"${WHISPER_BIN}" -m "${WHISPER_MODEL}" -f "${WAV_RECODED_FILENAME}" -otxt -ovtt -pp -l "${RECOGNITION_LANGUAGE}" ${EXTRA_PARAMS} && rm -f "${WAV_RECODED_FILENAME}"
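
The script forwards everything after FILE and LANGUAGE to whisper.cpp as one flat string, so an option value that contains spaces gets split into several words. A minimal sketch of one alternative, assuming bash (not plain sh), is to keep the extra options in an array. The helper name `collect_extra_params` is hypothetical and not part of the original script, and the option values in the example call are only placeholders.

```bash
#!/bin/bash
# Sketch only: collect_extra_params is a hypothetical helper, not part of
# the original script. It stores the optional arguments in a bash array so
# that values containing spaces stay intact.

collect_extra_params() {
    # Skip FILE and LANGUAGE if present, but never shift more than we have.
    local to_skip=$(( $# < 2 ? $# : 2 ))
    shift "${to_skip}"
    # Assign to a global array; each remaining argument is one element.
    EXTRA_PARAMS=("$@")
}

# Placeholder arguments for illustration.
collect_extra_params "talk.mp4" "en" --prompt "Bitcoin, Lightning Network" -t 8
printf 'extra option count: %s\n' "${#EXTRA_PARAMS[@]}"
printf '  [%s]\n' "${EXTRA_PARAMS[@]}"

# With only FILE given, nothing is left over.
collect_extra_params "talk.mp4"
printf 'extra option count: %s\n' "${#EXTRA_PARAMS[@]}"
```

With this approach, the final whisper.cpp call would expand the array as `"${EXTRA_PARAMS[@]}"` in place of the unquoted `${EXTRA_PARAMS}`.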