# convert a multipage PDF to single-page TIFFs
gs -q -dNOPAUSE -dBATCH -sDEVICE=tiffg4 -sOutputFile=%04d.tif source.pdf -c quit
# or use -sDEVICE=pgmraw with -sOutputFile=%04d.pgm to write PGM files instead
# run unpaper: rotate each scanned image 90 degrees. Each image holds two physical pages,
# so use --layout double (for the input) and --output-pages 2 to split them into separate pages.
unpaper -v --deskew-scan-deviation 3.0 --border-align top --deskew-scan-range 15 --no-grayfilter --no-blurfilter --no-noisefilter --overwrite --pre-rotate 90 --border-scan-step 4 --layout double --output-pages 2 %04d.pgm.pbm unpaper%04d.pbm
# trim the pages and convert them to single-page PDFs (up to 6 jobs in parallel)
find . -name 'unpaper*.pbm' | xargs -I {} -P6 convert -trim +repage {} {}.pdf
# finally, reassemble the PDF with Ghostscript (match only the per-page files so source.pdf is not merged back in)
gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf unpaper*.pdf
# (optional) convert pgm to pbm; this produces the %04d.pgm.pbm names that the unpaper command above reads
find . -name '*.pgm' | xargs -I {} sh -c "pgmtopbm {} > {}.pbm"
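
The commands above are listed roughly in the order they were written down, not strictly in the order they depend on each other (the pgm-to-pbm conversion has to happen before unpaper can read `%04d.pgm.pbm`). The following is a minimal sketch that chains the same commands in dependency order, assuming the `pgmraw` route. The names `run`, `scan_pipeline` and `DRY_RUN` are hypothetical helpers introduced here for illustration and are not part of the original notes. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: the commands from the notes above, reordered so each step finds its input.
# run, scan_pipeline and DRY_RUN are hypothetical names, not from the original notes.

# With DRY_RUN=1 (the default here) a command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

scan_pipeline() {
    src=$1
    out=$2

    # 1. Rasterize every PDF page to 0001.pgm, 0002.pgm, ...
    run gs -q -dNOPAUSE -dBATCH -sDEVICE=pgmraw -sOutputFile=%04d.pgm "$src" || return 1

    # 2. Convert each PGM to the .pgm.pbm name that the unpaper step reads.
    for f in [0-9][0-9][0-9][0-9].pgm; do
        [ -e "$f" ] || continue
        run sh -c 'pgmtopbm "$1" > "$1.pbm"' sh "$f" || return 1
    done

    # 3. Rotate, split each double-page scan into two pages, and deskew.
    run unpaper -v --deskew-scan-deviation 3.0 --border-align top \
        --deskew-scan-range 15 --no-grayfilter --no-blurfilter --no-noisefilter \
        --overwrite --pre-rotate 90 --border-scan-step 4 \
        --layout double --output-pages 2 %04d.pgm.pbm unpaper%04d.pbm || return 1

    # 4. Trim each cleaned page and wrap it in a single-page PDF.
    for f in unpaper[0-9][0-9][0-9][0-9].pbm; do
        [ -e "$f" ] || continue
        run convert "$f" -trim +repage "$f.pdf" || return 1
    done

    # 5. Merge only the per-page PDFs, so the source PDF is not pulled in.
    run gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$out" unpaper*.pbm.pdf
}

# Example (prints the commands; set DRY_RUN=0 to actually execute them):
# scan_pipeline source.pdf output.pdf
```

In dry-run mode the two `for` loops print nothing unless matching files already exist, because the earlier steps have not produced them yet. With `DRY_RUN=0` the steps run one after another and the function stops at the first command that fails.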