Created
July 31, 2016 18:23
Save NanoDano/e092cf9f219e4b0506743bb64d303452 to your computer and use it in GitHub Desktop.
Extract PNGs from a file using Python
# extract_pngs.py
# Extract PNGs from a file and put them in a pngs/ directory
import sys

with open(sys.argv[1], "rb") as binary_file:
    binary_file.seek(0, 2)  # Seek the end
    num_bytes = binary_file.tell()  # Get the file size
    count = 0
    for i in range(num_bytes):
        binary_file.seek(i)
        eight_bytes = binary_file.read(8)
        if eight_bytes == b"\x89\x50\x4e\x47\x0d\x0a\x1a\x0a":  # PNG signature
            count += 1
            print("Found PNG Signature #" + str(count) + " at " + str(i))
            # Next four bytes after signature is the IHDR with the length
            png_size_bytes = binary_file.read(4)
            png_size = int.from_bytes(png_size_bytes, byteorder='little', signed=False)
            # Go back to beginning of image file and extract full thing
            binary_file.seek(i)
            # Read the size of image plus the signature
            png_data = binary_file.read(png_size + 8)
            with open("pngs/" + str(i) + ".png", "wb") as outfile:
                outfile.write(png_data)
dahmedk1999
commented
May 20, 2021
Byte order in PNG files is big-endian. And I think the four bytes after the signature are the length of the IHDR chunk's data, not the size of the whole PNG.
Here's a working version:
# extract_pngs.py
# Extract PNGs from a file and put them in a pngs/ directory
import sys
import os

os.makedirs("pngs", exist_ok=True)  # Create the output directory if it is missing

with open(sys.argv[1], "rb") as binary_file:
    binary_file.seek(0, 2)  # Seek the end
    num_bytes = binary_file.tell()  # Get the file size
    count = 0
    for i in range(num_bytes):
        binary_file.seek(i)
        eight_bytes = binary_file.read(8)
        if eight_bytes == b"\x89\x50\x4e\x47\x0d\x0a\x1a\x0a":  # PNG signature
            count += 1
            print("Found PNG Signature #%d at 0x%08x" % (count, i))
            with open("pngs/%08x.png" % i, "wb") as outfile:
                outfile.write(eight_bytes)
                while True:
                    # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC
                    sizeData = binary_file.read(4)
                    chunk = binary_file.read(4)
                    if len(sizeData) < 4 or len(chunk) < 4:
                        break  # Truncated file: stop instead of looping forever
                    # Bytes still to copy for this chunk: data plus the 4-byte CRC
                    size = 4 + int.from_bytes(sizeData, byteorder='big', signed=False)
                    outfile.write(sizeData)
                    outfile.write(chunk)
                    data = binary_file.read(size)
                    outfile.write(data)
                    if chunk == b'IEND':
                        break
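For comparison, the same chunk walk can be sketched on an in-memory buffer, which avoids seeking to every byte offset. This is only a rough sketch under a few assumptions: the whole input fits in memory, the names `find_pngs` and `PNG_SIGNATURE` are made up for illustration and are not from the gist, and chunk CRCs are copied but not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # Same eight bytes as in the script above


def find_pngs(blob):
    """Return (offset, png_bytes) pairs for each complete PNG found in blob.

    Hypothetical helper: walks chunks until IEND and skips truncated images.
    """
    results = []
    start = blob.find(PNG_SIGNATURE)
    while start != -1:
        pos = start + len(PNG_SIGNATURE)
        end = None
        while pos + 8 <= len(blob):
            # Chunk header: 4-byte big-endian length, then 4-byte chunk type
            length, chunk_type = struct.unpack(">I4s", blob[pos:pos + 8])
            pos += 8 + length + 4  # Header, data, and 4-byte CRC
            if pos > len(blob):
                break  # Chunk runs past the end of the buffer: truncated
            if chunk_type == b"IEND":
                end = pos
                break
        if end is not None:
            results.append((start, blob[start:end]))
        start = blob.find(PNG_SIGNATURE, start + 1)
    return results
```

Each returned pair could then be written out with something like `open("pngs/%08x.png" % offset, "wb")`, mirroring the file naming used in the version above.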
Thanks. I'm certain I would have tested this... now I will have to go back and try.
The two bugs in the original script kind of canceled out: reading the length as little-endian took the IHDR length (always 13, stored as 00 00 00 0D) and made a very large number out of it (0x0D000000, about 218 million), so the script copied a long stretch of the file that probably included all of the original PNG. As a result, you got a usable PNG but with a lot of junk at the end.
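To make the arithmetic concrete, here is a small sketch of the two interpretations. The variable names are mine, not from the gist; the only fact it relies on is that the IHDR chunk's data length is 13 bytes.

```python
# The four length bytes that follow the PNG signature (IHDR data is 13 bytes long)
ihdr_length_bytes = b"\x00\x00\x00\x0d"

# Correct interpretation: PNG stores multi-byte integers big-endian
big = int.from_bytes(ihdr_length_bytes, byteorder="big", signed=False)

# What the original script did: interpret the same bytes as little-endian
little = int.from_bytes(ihdr_length_bytes, byteorder="little", signed=False)

print(big)     # 13
print(little)  # 218103808
```

Because `read()` simply returns fewer bytes when it hits the end of the file, asking for roughly 218 million bytes does not fail; it just copies everything from the signature onward, up to that limit.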