@danielbowden
Last active August 29, 2015 14:04
Bash script for cleaning and combining multiple PDFs into one
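The script depends on two external commands, `mutool` and `pdftk`. A minimal pre-flight check might look like the sketch below; `missing_tools` is a hypothetical helper name, not part of the original script, and it only reports whether the commands are found on `PATH`.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Warn (without exiting) if either required tool is absent.
missing="$(missing_tools mutool pdftk)"
if [ -n "$missing" ]; then
    echo "Missing required tools: $missing" >&2
fi
```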
#!/bin/bash
# Note: file names must already sort into the correct order, e.g. prefix each chapter's file name with its number.
# Required tools:
# * http://www.pdflabs.com/tools/pdftk-server/
# * brew install mupdf
# Works as follows:
# Step 1. Clean all files in dir and write clean file names in order to textfile
# mutool clean <infile.pdf> <outfile.pdf>
# Step 2. Combine all files from textfile into one pdf
# pdftk <files listed in order> cat output output.pdf
CLEANDIR=clean
echo "CLEANDIR is $CLEANDIR"
mkdir -p "$CLEANDIR"
CLEANFILES=DBCleanFiles.txt
if [ -f "$CLEANFILES" ]; then
    echo "Removing existing $CLEANFILES"
    rm -f "$CLEANFILES"
fi
FILENUM=1
# Process PDFs in numeric file name order; each cleaned copy is named by its position.
find *.pdf | sort -n | while IFS= read -r pdffile
do
    echo "Cleaning $pdffile as $FILENUM"
    mutool clean "$pdffile" "$CLEANDIR/$FILENUM.pdf"
    echo "$CLEANDIR/$FILENUM.pdf" >> "$CLEANFILES"
    ((FILENUM++))
done
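# Name the combined PDF after the current directory.
OUTFILE=${PWD##*/}
echo "Combining files"
# Word splitting of the list is intentional; the cleaned paths contain no spaces.
pdftk $(cat "$CLEANFILES") cat output "$CLEANDIR/$OUTFILE.pdf"
echo "Finished. Created $CLEANDIR/$OUTFILE.pdf"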
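Because the script orders its input with `sort -n`, the leading number in each file name decides the page order of the combined PDF. The sketch below previews that order for a few made-up file names; `order_pdfs` is a hypothetical helper and the names are illustrative only.

```shell
# Hypothetical helper: print the given names in the order `sort -n` would yield.
order_pdfs() {
    printf '%s\n' "$@" | sort -n
}

# Numeric prefixes sort by value, so 10 comes after 2 (unlike a plain glob).
order_pdfs "10-appendix.pdf" "2-methods.pdf" "1-intro.pdf"
```

Names without a leading number all compare as zero under `sort -n`, which is why numbering the chapters first matters.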