@lobstrio
Last active November 1, 2024 09:23
Solving (simple) Captcha, using PyTesseract, PIL, and Python 3
#!/usr/bin/python3
# coding: utf-8
import argparse

import pytesseract
from PIL import Image, ImageOps


def solve_captcha(path):
    """
    Convert a captcha image into text,
    using the PyTesseract Python wrapper for Tesseract

    Arguments:
        path (str):
            path to the image to be processed
    Return:
        'textualized' image
    """
    # Normalize to RGB, then stretch the contrast so the glyphs stand out
    image = Image.open(path).convert('RGB')
    image = ImageOps.autocontrast(image)
    # pytesseract accepts a PIL image directly, so no temporary file is left behind
    text = pytesseract.image_to_string(image)
    return text


if __name__ == '__main__':
    argparser = argparse.ArgumentParser()
    argparser.add_argument("-i", "--image", required=True, help="path to input image to be OCR'd")
    args = vars(argparser.parse_args())
    path = args["image"]
    print('-- Resolving')
    captcha_text = solve_captcha(path)
    print('-- Result: {}'.format(captcha_text))
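
The string returned by `image_to_string` is raw OCR output, so it can carry whitespace or stray symbols that are not part of the captcha. A minimal sketch of a cleanup step follows. The helper name `clean_captcha_text` and the default alphabet are assumptions for illustration, not part of the gist or of pytesseract.

```python
import string

# Assumed alphabet: captchas made of ASCII letters and digits only.
DEFAULT_ALLOWED = string.ascii_letters + string.digits


def clean_captcha_text(raw, allowed=DEFAULT_ALLOWED):
    """Keep only the characters a captcha is expected to contain."""
    # Drops spaces, newlines, form feeds and any symbol outside `allowed`
    return "".join(ch for ch in raw if ch in allowed)
```

It could be applied to the script's result as `clean_captcha_text(solve_captcha(path))`. Narrowing `allowed` (for example to `string.digits`) only makes sense when the captcha alphabet is known in advance.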

whidy commented Nov 28, 2023

It can only recognize simple graphic or digit captchas, such as:

[image: captcha-solver]

but it failed on the image below:

[image: text]
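
For harder images, heavier preprocessing before OCR is a common thing to try. The sketch below converts to grayscale, upscales, and binarizes with Pillow before calling Tesseract in single-line mode (`--psm 7`). The function names, `scale=2`, and `threshold=128` are illustrative assumptions, and there is no guarantee this recovers the failing image above.

```python
from PIL import Image, ImageOps


def preprocess_captcha(image, scale=2, threshold=128):
    """Return a grayscale, upscaled, black-and-white copy of `image`."""
    gray = ImageOps.autocontrast(image.convert("L"))
    # Upscaling small captchas gives the OCR engine larger glyphs to work with
    width, height = gray.size
    gray = gray.resize((width * scale, height * scale), Image.LANCZOS)
    # Binarize: pixels at or above the threshold become white, the rest black
    return gray.point(lambda p: 255 if p >= threshold else 0)


def solve_captcha_preprocessed(path):
    """OCR a captcha after preprocessing (needs pytesseract and Tesseract)."""
    import pytesseract  # imported here so preprocessing works without it

    image = preprocess_captcha(Image.open(path))
    # --psm 7 asks Tesseract to treat the image as a single line of text
    return pytesseract.image_to_string(image, config="--psm 7")
```

The threshold usually needs tuning per captcha style: a value that separates glyphs from background on one image can erase them on another.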


fukemy commented Nov 1, 2024

pytesseract is a poor fit for solving captchas.