@paazmaya
Last active October 28, 2020 09:32
OCR with ocrmypdf on macOS, for more than just English
# Install tools
# https://github.com/jbarlow83/OCRmyPDF
# https://github.com/tesseract-ocr/tesseract
brew install tesseract tesseract-lang ocrmypdf
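# tesseract and tesseract-lang keep their data files in separate Cellar directories,
# so copy the base files next to the additional language data.
# Adjust the version numbers (4.1.1, 4.0.0) to match what Homebrew installed.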
cp /usr/local/Cellar/tesseract/4.1.1/share/tessdata/* /usr/local/Cellar/tesseract-lang/4.0.0/share/tessdata/
# Add this to ~/.bash_profile, or run it in the current shell before any conversion
export TESSDATA_PREFIX=/usr/local/Cellar/tesseract-lang/4.0.0/share/tessdata/
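
With the language data in place, a conversion can name several languages at once. The following is a minimal sketch, not part of the original gist: `join_langs` and `ocr_pdf` are hypothetical helper names, `input.pdf` and `output.pdf` are placeholder file names, and the language codes are only examples. It assumes the `-l` option of ocrmypdf, which takes Tesseract language codes joined with `+`.

```shell
#!/usr/bin/env bash

# Hypothetical helper: join language codes with "+",
# e.g. "eng fin jpn" becomes "eng+fin+jpn" for the -l option.
join_langs() {
  local IFS='+'
  echo "$*"
}

# Hypothetical wrapper: ocr_pdf INPUT OUTPUT LANG [LANG...]
ocr_pdf() {
  local input="$1" output="$2"
  shift 2
  ocrmypdf -l "$(join_langs "$@")" "$input" "$output"
}

# To see which languages Tesseract can find under TESSDATA_PREFIX:
#   tesseract --list-langs
#
# Example call (placeholder file names and languages):
#   ocr_pdf input.pdf output.pdf eng fin jpn
```

Keeping the language list as separate arguments makes it easy to reuse the same wrapper for documents in different language mixes.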