Last active
July 5, 2020 23:39
## OCR PDF file
ocrmypdf -l ces input.pdf output.pdf
# -l => language: -l eng+deu, -l ces
# --sidecar => Generate text files that contain the same text recognized by OCR
# --title TITLE => Set document title (place multiple words in quotes)
# --author AUTHOR => Set document author
# --subject SUBJECT => Set document subject description
# --keywords KEYWORDS => Set document keywords
# -r => Automatically rotate pages based on detected text orientation
# -q => fewer messages
# --remove-background => Attempt to remove background from gray or color pages
# --unpaper-args '--layout double --no-noisefilter' => when two pages are scanned together (requires --clean)
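## OCR all PDF files
# IFS= and -r keep leading spaces and backslashes in file names intact;
# ${pdf%.pdf} strips the extension so the result is name_ocr.pdf, not name.pdf_ocr.pdf
find . -name '*.pdf' | while IFS= read -r pdf; do ocrmypdf "$pdf" "${pdf%.pdf}_ocr.pdf"; done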
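Running the loop a second time would also pick up files that already end in `_ocr.pdf`. A minimal sketch of a re-runnable variant follows. The names `ocr_out_name`, `ocr_all`, and the `OCR_CMD` variable are hypothetical helpers introduced here for illustration; they are not part of ocrmypdf.

```shell
# Hypothetical helper: derive the output name by swapping the .pdf suffix.
ocr_out_name() {
  printf '%s\n' "${1%.pdf}_ocr.pdf"
}

# Hypothetical helper: OCR every PDF under a directory (default: .),
# skipping files that already carry the _ocr suffix.
# OCR_CMD is an assumption added for dry runs; it defaults to ocrmypdf.
# File names containing newlines are not handled by this sketch.
ocr_all() {
  find "${1:-.}" -type f -name '*.pdf' ! -name '*_ocr.pdf' |
    while IFS= read -r pdf; do
      # </dev/null keeps the command from consuming the file list on stdin
      "${OCR_CMD:-ocrmypdf}" "$pdf" "$(ocr_out_name "$pdf")" </dev/null
    done
}
```

Setting `OCR_CMD=echo` before calling `ocr_all` prints each input/output pair instead of running OCR, which is a cheap way to check the file list first.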
## Reduce size of PDF
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -sOutputFile=output.pdf input.pdf
# -dNOPAUSE => no pause after each page
# -dBATCH => Exit after last file
# -sDEVICE=pdfwrite => PDF writer
# -dCompatibilityLevel => 1.5 (compatibility with Preview.app), 1.7 (compatibility with Acrobat)
# -dPDFSETTINGS => (small) /screen /ebook /printer /prepress (large)
# -g<width>x<height> => page size in pixels
# -r<res> => pixels/inch resolution
# -q => fewer messages
# -sPAPERSIZE => a4, legal…
# -dColorConversionStrategy => /Gray
# -dProcessColorModel => /DeviceGray
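The two color options above are normally passed together to get a grayscale PDF. A minimal sketch, assuming the commonly used string form `-sColorConversionStrategy=Gray`; the function name `gs_gray` is a hypothetical wrapper, not a Ghostscript command.

```shell
# Hypothetical wrapper: gs_gray INPUT OUTPUT
# Converts INPUT to a grayscale PDF written to OUTPUT.
gs_gray() {
  gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
     -sColorConversionStrategy=Gray \
     -dProcessColorModel=/DeviceGray \
     -sOutputFile="$2" "$1"
}

# Usage (not run here): gs_gray input.pdf output_gray.pdf
```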
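## Reduce all PDF files
# IFS= and -r keep unusual file names intact; ${pdf%.pdf} gives name_new.pdf instead of name.pdf_new.pdf
find . -name '*.pdf' | while IFS= read -r pdf; do gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile="${pdf%.pdf}_new.pdf" "$pdf"; done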
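Re-encoding does not always shrink a file; a PDF that is already well compressed can come out larger. A minimal sketch of a size check to run after each conversion follows. `keep_if_smaller` is a hypothetical helper introduced here, and it never touches the original file.

```shell
# Hypothetical helper: keep_if_smaller ORIGINAL CANDIDATE
# Keeps CANDIDATE (returns 0) only when it is non-empty and smaller than
# ORIGINAL; otherwise deletes CANDIDATE and returns 1.
keep_if_smaller() {
  # $(( ... )) normalises wc output, which some systems pad with spaces
  orig_size=$(($(wc -c < "$1")))
  new_size=$(($(wc -c < "$2")))
  if [ "$new_size" -gt 0 ] && [ "$new_size" -lt "$orig_size" ]; then
    return 0
  fi
  rm -f -- "$2"
  return 1
}

# Usage inside the loop (not run here):
#   out="${pdf%.pdf}_new.pdf"
#   gs ... -sOutputFile="$out" "$pdf" && keep_if_smaller "$pdf" "$out"
```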