Christian Mäder cimnine

cimnine / ocr.sh
Created November 26, 2021 16:31
Script to OCR many PDF files using Tesseract and ImageMagick
#!/bin/bash
set -e
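# Process one document; $1 is the PDF's base name (file name without ".pdf").
analyze() {
BASE="$1"
echo "Converting 'input/$BASE.pdf' to 'output/$BASE.tiff'"
# Rasterize the PDF at 300 DPI so the OCR step gets a sufficiently sharp image.
convert -density 300 "input/$BASE.pdf" "output/$BASE.tiff" 2>&1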
echo "OCR of 'output/$BASE.tiff' to 'output/${BASE}_deu.pdf'"