@AlexAtkinson
Last active November 27, 2024 14:06
A PDF attachment extractor, because I don't want to install Adobe...
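#!/usr/bin/env python
# Copy into /usr/local/bin or as appropriate for your $PATH
import argparse
import sys

from pypdf import PdfReader
from pypdf.errors import PdfReadError

parser = argparse.ArgumentParser()
parser.add_argument("pdf", help="The PDF to extract attachments from.", type=str)
args = parser.parse_args()

# Open the PDF once and exit instead of continuing with an unreadable file.
try:
    reader = PdfReader(args.pdf)
except (PdfReadError, OSError) as err:
    sys.exit(f"invalid PDF file: {err}")

# reader.attachments maps each attachment name to a list of byte strings,
# because a PDF can hold several attachments under the same name.
for name, content_list in reader.attachments.items():
    for i, content in enumerate(content_list):
        # Prefix repeated names so later contents do not overwrite earlier ones.
        out_name = name if i == 0 else f"{i}_{name}"
        with open(out_name, "wb") as fp:
            fp.write(content)
        print("Extracted: " + out_name)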
@AlexAtkinson
Author

Run it as `extract-pdf-attachments <source pdf>`.
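One thing to watch for: the script uses the attachment names stored in the PDF directly as output paths, so a name containing directory components could land outside the current directory. Below is a minimal sketch of a more defensive write step, not part of the original gist. The function name `save_attachments` and the `out_dir` parameter are hypothetical; the function only assumes its input has the same shape as pypdf's `reader.attachments` (name mapped to a list of byte strings), so it uses the standard library alone.

```python
from pathlib import Path
from typing import List, Mapping


def save_attachments(attachments: Mapping[str, List[bytes]], out_dir: str = ".") -> List[Path]:
    """Write each attachment under out_dir and return the paths written.

    `attachments` is assumed to look like pypdf's `PdfReader.attachments`:
    attachment name -> list of byte strings.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content_list in attachments.items():
        # Keep only the final path component so a name such as "../../x"
        # cannot escape out_dir.
        base = Path(name.replace("\\", "/")).name
        if base in ("", ".", ".."):
            base = "attachment"  # hypothetical placeholder for unusable names
        for i, content in enumerate(content_list):
            # Prefix repeats so same-named attachments do not overwrite each other.
            out_path = target / (base if i == 0 else f"{i}_{base}")
            out_path.write_bytes(content)
            written.append(out_path)
    return written
```

With pypdf installed, a call along the lines of `save_attachments(PdfReader("input.pdf").attachments, "out")` should fit here, where `input.pdf` and `out` are placeholder names. Note that two attachments whose names differ only in their directory part would still collide in this sketch.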
