Created
September 13, 2020 18:18
Python script to identify faulty .jpg images in a directory.
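import glob
import logging
import os
import re
import subprocess
import traceback

# Collect every file with a .jpg extension in the target directory.
filelist = glob.glob("/path/to/*.jpg")
for file_obj in filelist:
    try:
        # Ask the `file` utility what the content really is.
        # Passing an argument list avoids shell quoting problems in file names;
        # -b omits the file name from the output so it cannot cause a false match.
        jpg_str = subprocess.run(["file", "-b", file_obj], capture_output=True, text=True).stdout
        if re.search("PNG image data", jpg_str, re.IGNORECASE) or re.search("Png patch", jpg_str, re.IGNORECASE):
            print("Deleting jpg as it contains png encoding - " + file_obj)
            os.remove(file_obj)
    except Exception:
        # Log the failure and continue with the next file.
        logging.error(traceback.format_exc())
print("Cleaning jpgs done")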
It deletes all PNG images that are stored with a .jpg extension.
Most specialized image viewers can handle this and show the image anyway, but in other uses the mismatch causes trouble.
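If the `file` command is not available (for example on Windows), the same check can probably be done by reading the first bytes of each file. Below is a minimal sketch, not part of the original script: it assumes it is enough to compare against the fixed 8-byte PNG signature, and the names `is_png` and `find_mislabeled_jpgs` are made up for illustration. It only lists the mislabeled files and does not delete anything.

```python
import os

# First eight bytes of every PNG file (the fixed PNG signature).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(path):
    """Return True if the file at `path` starts with the PNG signature."""
    with open(path, "rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


def find_mislabeled_jpgs(directory):
    """Return sorted paths of .jpg files in `directory` whose content is PNG.

    Hypothetical helper for illustration; it does not recurse into
    subdirectories and does not modify or delete any file.
    """
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.lower().endswith(".jpg") and os.path.isfile(path) and is_png(path):
            matches.append(path)
    return matches
```

Calling `find_mislabeled_jpgs("/path/to")` and printing the result first gives a chance to review the list before removing or renaming anything.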