Decompress FlateDecode Objects in PDF
#!/usr/bin/env python3
# This script does one thing and one thing only: it finds each of the
# FlateDecode streams in a PDF document using a regular expression, inflates
# them with zlib, and prints out the decompressed data. You can do the same in
# any programming language you choose.
#
# This is NOT a generic PDF decoder. If you need a generic PDF decoder, please
# take a look at pdf-parser by Didier Stevens, which is included in Kali Linux:
# https://tools.kali.org/forensics/pdf-parser
#
# Any requests to decode a PDF will be ignored.
import re
import zlib

# Read the whole document as bytes; PDF stream data is binary.
with open("some_doc.pdf", "rb") as f:
    pdf = f.read()

# Capture everything between "stream" and "endstream" that follows a
# FlateDecode filter entry. re.S lets "." match newline bytes as well.
stream = re.compile(rb'.*?FlateDecode.*?stream(.*?)endstream', re.S)

for s in stream.findall(pdf):
    # Drop the line breaks that surround the stream data.
    s = s.strip(b'\r\n')
    try:
        print(zlib.decompress(s))
        print("")
    except zlib.error:
        # Not valid zlib data (for example an encrypted stream); skip it.
        pass
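
To try the approach without a real PDF at hand, here is a minimal sketch that wraps the same regular expression in a function and runs it on a hand-built, PDF-like fragment. The names `inflate_streams`, `STREAM_RE`, `payload`, and `fragment` are hypothetical and not part of the original script, and the fragment is only an approximation of a real PDF object.

```python
import re
import zlib

# Same pattern as the script above.
STREAM_RE = re.compile(rb'.*?FlateDecode.*?stream(.*?)endstream', re.S)


def inflate_streams(pdf_bytes):
    """Hypothetical helper: return the decompressed FlateDecode streams as a list."""
    results = []
    for raw in STREAM_RE.findall(pdf_bytes):
        raw = raw.strip(b'\r\n')
        try:
            results.append(zlib.decompress(raw))
        except zlib.error:
            # Skip anything that is not valid zlib data.
            continue
    return results


# Build a tiny PDF-like object around a zlib-compressed payload (assumed content).
payload = b"BT /F1 12 Tf (Hello) Tj ET"
compressed = zlib.compress(payload)
fragment = (
    b"1 0 obj\n<< /Length " + str(len(compressed)).encode()
    + b" /Filter /FlateDecode >>\nstream\n"
    + compressed
    + b"\nendstream\nendobj\n"
)

print(inflate_streams(fragment))
```

Note that `strip(b'\r\n')` removes every leading and trailing CR or LF byte, so a compressed stream whose own data happens to end in one of those bytes gets truncated and is then skipped by the `zlib.error` handler.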