@ftk
Created August 18, 2025 08:24
Launch spectacle in capture region mode and send the captured screenshot to an OpenAI-compatible endpoint for OCR and, optionally, translation
#!/usr/bin/env bash
## Launch spectacle in capture region mode and send the captured screenshot to an OpenAI-compatible endpoint for OCR and, optionally, translation
## Example usage: bind 'notify-send img "$(./screenshot_llm.sh 2>&1)"' to a hotkey
## Environment variables to set: OPENAI_API_KEY, optionally: OPENAI_API_BASE, OPENAI_MODEL, PROMPT
set -euo pipefail
[[ $# -ge 1 && $1 == --help ]] && exec sed -n 's/^## //p' -- "$0"
[[ $# -ge 1 && ( $1 == --verbose || $1 == -v ) ]] && set -x
# launch spectacle and let the user select a region to screenshot, then encode the PNG as base64
screenshot="$(spectacle -o /dev/stdout -r -b -n | base64 -w 0)"
[[ -z $screenshot ]] && exit 1 # user pressed esc
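# JSON-escape the prompt with jq so quotes, backslashes and newlines in PROMPT cannot break the request
prompt_json="$(jq -n --arg p "${PROMPT-What is in this image? Translate the text.}" '$p')"
request='{
  "model": "'"${OPENAI_MODEL-gpt-4o}"'",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": '"$prompt_json"'
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,'"$screenshot"'"
          }
        }
      ]
    }
  ]
}'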
# using curl -d @- <<< "$request" instead of curl -d $request because $request can be longer than argument max length
curl --no-progress-meter --fail -X POST \
    "${OPENAI_API_BASE-https://api.openai.com/v1}/chat/completions" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d @- <<< "$request" | jq -r '.choices[0].message.content'
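
Note on the defaults: the script falls back to its built-in values through `${VAR-default}` expansion (for example `${OPENAI_MODEL-gpt-4o}`). That form applies the default only when the variable is unset; a variable that is set but empty is passed through as an empty string. The sketch below illustrates the difference from `${VAR:-default}`. `DEMO_MODEL` is a hypothetical variable used only for this illustration and is not read by the script.

```shell
#!/usr/bin/env bash
# DEMO_MODEL is a hypothetical variable, used only to illustrate the expansion rules.
unset DEMO_MODEL
unset_result="${DEMO_MODEL-gpt-4o}"      # unset: the default is used

DEMO_MODEL=""
empty_dash="${DEMO_MODEL-gpt-4o}"        # set but empty: '-' keeps the empty value
empty_colon="${DEMO_MODEL:-gpt-4o}"      # ':-' replaces empty values as well

printf 'unset=%s empty_dash=%s empty_colon=%s\n' \
    "$unset_result" "$empty_dash" "$empty_colon"
```

In practice this means that exporting `OPENAI_MODEL=` (empty) sends an empty model name rather than `gpt-4o`; leave the variable unset to get the default.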