Last active
October 28, 2020 09:32
OCR with ocrmypdf on macOS, for more than just English
# Install tools
# https://github.com/jbarlow83/OCRmyPDF
# https://github.com/tesseract-ocr/tesseract
brew install tessdata tesseract-lang ocrmypdf
# Language data and other required files are not all installed in the same place initially,
# so copy the base files next to the language data (adjust the version numbers to match your installation)
cp /usr/local/Cellar/tesseract/4.1.1/share/tessdata/* /usr/local/Cellar/tesseract-lang/4.0.0/share/tessdata/
# Add to .bash_profile, or run in the current shell before converting
export TESSDATA_PREFIX=/usr/local/Cellar/tesseract-lang/4.0.0/share/tessdata/
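
With TESSDATA_PREFIX set, a conversion that uses several languages might look like the sketch below. The language codes (eng, fin, jpn), the file names input.pdf and output.pdf, and the join_langs helper are assumptions chosen for illustration and are not part of the original notes. The commands are guarded so that nothing runs if the tools or the input file are missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join Tesseract language codes with "+",
# the form that the ocrmypdf -l option accepts for multiple languages.
join_langs() {
  local IFS='+'
  echo "$*"
}

# Assumed language selection; replace with the languages in your documents.
LANGS="$(join_langs eng fin jpn)"

# Show which languages Tesseract can find through TESSDATA_PREFIX.
if command -v tesseract >/dev/null 2>&1; then
  tesseract --list-langs || true
fi

# Run OCR only when ocrmypdf is installed and the (assumed) input file exists.
if command -v ocrmypdf >/dev/null 2>&1 && [ -f input.pdf ]; then
  ocrmypdf -l "$LANGS" input.pdf output.pdf
fi
```

If a language is missing from the `tesseract --list-langs` output, TESSDATA_PREFIX probably does not point at the directory containing its .traineddata file.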