We've all downloaded a PDF from somewhere that some idiot (or Google) put a watermark on, to mark their territory somehow. Here's how you might want to proceed.
You might have to install a few tools, mainly pdftk, qpdf and mupdf-tools.
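If you are not sure which of these you already have, a small shell function can tell you. This is only a sketch: missing_tools is a made-up helper name, and it assumes the binaries are called pdftk, qpdf and mutool (the last one is what the mupdf-tools package usually provides).

```shell
# Hypothetical helper: print the name of every command that is NOT on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage (prints nothing if everything is installed):
#   missing_tools pdftk qpdf mutool
```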
Assume that commands are chained, so the input.pdf of each command is the output.pdf of the previous one (if applicable).
qpdf --decrypt input.pdf output.pdf
This removes the encryption from the PDF, if there is any. Not every PDF is encrypted, so you may be able to skip this step.
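To guess whether this step is needed at all, you can look for an /Encrypt entry in the file, which encrypted PDFs carry in their trailer. The function below is a rough heuristic under that assumption, and looks_encrypted is a made-up name; qpdf --show-encryption input.pdf gives a more reliable answer if you already have qpdf installed.

```shell
# Hypothetical helper: exit status 0 if the file mentions an /Encrypt entry.
# This is a plain text search, not a real PDF parse, so treat it as a hint.
looks_encrypted() {
  LC_ALL=C grep -a -q '/Encrypt' "$1"
}

# Usage:
#   if looks_encrypted input.pdf; then echo "probably encrypted"; fi
```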
pdftk input.pdf output output.pdf uncompress
This uncompresses the page streams so that you can edit the PDF with a text editor.
Watermarks come in several forms, but in general the watermark is a single object that is referenced from every page. The resource list of a page looks like this:
/NxFm3 1053 0 R
/NxFm0 1055 0 R
/NxFm30 1579 0 R
/NxFm1 1057 0 R
/NxFm8 1058 0 R
/NxFm9 1059 0 R
/NxFm25 1060 0 R
/NxFm24 1061 0 R
/NxFm23 1062 0 R
/NxFm22 1063 0 R
/NxFm29 1580 0 R
In each entry, the first number is the ID of the object and the second is its generation number, which is 0 for any object that has never been replaced; the trailing R marks the entry as a reference. Typically you should be able to recognise the watermark as an object whose ID repeats on every page.
If the watermark has selectable text you don't need to check the IDs; you just need to find it in the text with an editor, e.g.
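Spotting the repeated object by eye gets tedious on a long document, so here is a sketch that counts how often each object number is referenced in the uncompressed file. It assumes entries of the form shown above (a name, an object number, a generation number of 0 and an R), and count_refs is a made-up name. Structural entries such as /Parent also repeat on every page, so expect a little noise at the top of the list.

```shell
# Hypothetical helper: list referenced object numbers, most frequent first.
# Output lines look like "<count> <object number>".
count_refs() {
  LC_ALL=C grep -a -o '/[^ /]* [0-9][0-9]* 0 R' "$1" \
    | awk '{ print $2 }' \
    | sort | uniq -c | sort -rn
}

# Usage:
#   count_refs input.pdf | head
```

An object whose count matches the number of pages is a good candidate for the watermark.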
stream
BT
1 0 0 rg
0 0 0 RG
0 w
/GS0 gs
/F0 14 Tf
1 0 0 1 144 158.052 Tm
(https://watermark.url \(something else\))Tj
ET
It should be enough to replace the whole thing with
stream
BT
ET
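This basically leaves an empty text block where the watermark used to be.
Chances are that the PDF will be broken after the edit, because the stream lengths and cross-reference offsets no longer match the content. If it doesn't open after you save it, run this: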
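If the same text shows up on hundreds of pages, doing this by hand is no fun. The sketch below empties every BT ... ET block that contains a given string. It assumes that BT and ET sit on their own lines, as in the example above, and that the watermark text contains no backslashes; strip_watermark_text is a made-up name. awk implementations differ in how they treat raw binary bytes, so keep the original file and check the result.

```shell
# Hypothetical helper: empty every text block that contains the watermark.
#   $1 = literal watermark text (no backslashes), $2 = uncompressed PDF
strip_watermark_text() {
  LC_ALL=C awk -v needle="$1" '
    # Start of a text block: begin buffering its lines.
    /^BT$/ { inblock = 1; buf = ""; hit = 0; next }
    # End of a text block: keep the body only if the watermark was not in it.
    inblock && /^ET$/ {
      print "BT"
      if (!hit) printf "%s", buf
      print "ET"
      inblock = 0
      next
    }
    # Inside a text block: remember the line and check it for the watermark.
    inblock {
      buf = buf $0 "\n"
      if (index($0, needle) > 0) hit = 1
      next
    }
    # Everything outside text blocks is passed through untouched.
    { print }
    # Flush a block that was never closed, so that nothing is lost.
    END { if (inblock) { print "BT"; printf "%s", buf } }
  ' "$2"
}

# Usage:
#   strip_watermark_text 'https://watermark.url' input.pdf > output.pdf
```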
mutool clean -d -g input.pdf output.pdf