How to remove a watermark from a PDF

We've all downloaded a PDF from somewhere that some idiot (or Google) put a watermark on, to mark their territory somehow. Here's how you might want to proceed.

You might have to install a few tools first: pdftk, qpdf and mupdf-tools (the package that provides the mutool command).

Assume that commands are chained, so input.pdf of any command is the output.pdf of the previous one (if applicable).

Decrypt the pdf

qpdf --decrypt input.pdf output.pdf

This removes the encryption, in case the PDF is encrypted. That's not always the case, and running it on an unencrypted file does no harm.

Decompress the pdf

pdftk input.pdf output output.pdf uncompress

This uncompresses the page streams, so that you can read and edit the PDF's contents with a text editor.

Edit the decompressed file

Watermarks come in several forms, but in general it's a single object that is referenced from every page. The list of objects referenced by a page looks like this:

/NxFm3 1053 0 R
/NxFm0 1055 0 R
/NxFm30 1579 0 R
/NxFm1 1057 0 R
/NxFm8 1058 0 R
/NxFm9 1059 0 R
/NxFm25 1060 0 R
/NxFm24 1061 0 R
/NxFm23 1062 0 R
/NxFm22 1063 0 R
/NxFm29 1580 0 R

Each entry is a name followed by a reference such as 1053 0 R. The first number is the object's ID, the second is its generation number (almost always 0 in practice) and the R marks it as a reference. So typically you should be able to recognise the watermark as an object ID that repeats on every page.
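To spot the repeating ID without scrolling through every page, a rough sketch like the one below can count how often each reference occurs. It assumes the file has already been decompressed as described above; count_refs and uncompressed.pdf are made-up names for illustration.

```shell
# Hypothetical helper: list the most frequently referenced object IDs
# in a decompressed PDF, most frequent first.
# $1 = decompressed PDF, $2 = how many lines to show (default 20)
count_refs() {
    # -a makes grep treat the file as text even if it holds binary image data;
    # the pattern matches indirect references of the form "<id> <generation> R".
    LC_ALL=C grep -a -o -E '[0-9]+ [0-9]+ R' "$1" |
        sort |
        uniq -c |
        sort -rn |
        head -n "${2:-20}"
}

# Usage (placeholder file name):
# count_refs uncompressed.pdf
```

Objects that every page legitimately shares, such as the parent page tree or common fonts, will also rank high, so treat the output as a list of candidates rather than an answer.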

If the watermark has selectable text you don't need to check the IDs. You just need to search for the text in an editor, e.g.

stream
BT
1 0 0 rg
0 0 0 RG
0 w
/GS0 gs
/F0 14 Tf
1 0 0 1 144 158.052 Tm
(https://watermark.url     \(something else\))Tj
ET

It should be enough to replace the whole thing with

stream
BT
ET

This leaves an empty text object (a BT/ET pair with nothing in between), so nothing gets drawn.
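If the same text block shows up on hundreds of pages, doing this by hand gets tedious. Below is a minimal sketch of the same replacement done with awk. It assumes that BT and ET each sit on their own line (as in the decompressed output shown above) and that your awk copes with the binary data elsewhere in the file (GNU awk in the C locale should). The function name and file names are invented for the example, so work on a copy.

```shell
# Hypothetical helper: empty every BT...ET text object that contains a given string.
# $1 = decompressed input PDF, $2 = output PDF, $3 = literal text to look for
blank_watermark_text() {
    [ -n "$3" ] || { echo "need a search string" >&2; return 2; }
    # The search string is passed through the environment so awk does not
    # interpret backslashes in it.
    LC_ALL=C NEEDLE="$3" awk '
        # A line that is exactly "BT" starts a text object: begin buffering.
        !inblock && $0 == "BT" { inblock = 1; n = 0; hit = 0 }
        inblock {
            buf[++n] = $0
            if (index($0, ENVIRON["NEEDLE"]) > 0) hit = 1   # literal match, not a regex
            if ($0 == "ET") {
                if (hit) { print "BT"; print "ET" }          # watermark: keep only the empty pair
                else { for (i = 1; i <= n; i++) print buf[i] } # other text: keep unchanged
                inblock = 0
            }
            next
        }
        { print }
        # A block that never closed is written back untouched.
        END { if (inblock) for (i = 1; i <= n; i++) print buf[i] }
    ' "$1" > "$2"
}

# Usage (placeholder file names):
# blank_watermark_text plain.pdf edited.pdf "watermark.url"
```

Shortening a stream this way leaves its recorded length and the file's offsets out of date, which is why the rebuild step that follows is usually needed.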

Rebuild the metadata

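Chances are that the edited PDF will be broken, since the stream lengths and offsets recorded in the file no longer match its contents. If it doesn't open after saving, run this (-g drops unused objects, -d keeps the streams decompressed):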

mutool clean -d -g input.pdf output.pdf
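For reference, here is one way the whole chain might be written out with explicit intermediate file names instead of reusing input.pdf and output.pdf. It is only a sketch: strip_watermark and the file names are made up, the manual edit still happens in your editor, and the mutool step is run unconditionally for simplicity.

```shell
# Hypothetical wrapper around the steps in this guide.
# $1 = original PDF, $2 = where to write the cleaned PDF
strip_watermark() {
    sw_in=$1
    sw_out=$2
    # Make sure the three tools are installed before starting.
    for sw_tool in qpdf pdftk mutool; do
        command -v "$sw_tool" >/dev/null 2>&1 ||
            { echo "missing tool: $sw_tool" >&2; return 127; }
    done
    [ -f "$sw_in" ] || { echo "no such file: $sw_in" >&2; return 1; }
    sw_work=$(mktemp -d) || return 1

    # Each output becomes the next command's input; stop at the first failure.
    qpdf --decrypt "$sw_in" "$sw_work/decrypted.pdf" || return
    pdftk "$sw_work/decrypted.pdf" output "$sw_work/plain.pdf" uncompress || return
    ${EDITOR:-vi} "$sw_work/plain.pdf" || return      # remove the watermark by hand here
    mutool clean -d -g "$sw_work/plain.pdf" "$sw_out" || return
    echo "intermediate files kept in $sw_work" >&2
}

# Usage (placeholder file names):
# strip_watermark watermarked.pdf clean.pdf
```

The intermediate files are kept on purpose, so you can go back to the decompressed copy if the edit went wrong.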

myyc/pdfwatermark.md
