@nabilm
Last active September 8, 2019 12:06
Extract an email address from an image using pytesseract. This includes how to install tesseract and pytesseract.
$ brew install tesseract
$ pip install pytesseract
$ pip install Pillow
$ pip install expynent
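pytesseract is a wrapper that calls the tesseract binary, so the binary has to be reachable from Python after the install steps above. A minimal sketch of a check, assuming the binary is named `tesseract` and is expected on PATH; the helper name `tesseract_available` is hypothetical and not part of pytesseract:

```python
import shutil


def tesseract_available(binary="tesseract"):
    """Return True if an executable named `binary` is found on PATH."""
    # shutil.which returns the full path to the executable, or None if missing
    return shutil.which(binary) is not None


if __name__ == "__main__":
    print("tesseract found:", tesseract_available())
```

If this prints `False`, the OCR call will most likely fail until the binary is installed or its location is configured.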
```
import re

import pytesseract
from expynent.patterns import EMAIL_ADDRESS
from PIL import Image

# Path to the image that contains the email address
email_image = '<path_to_image>'

# Run OCR on the image (converted to RGB) using the English language data
output = pytesseract.image_to_string(Image.open(email_image).convert("RGB"), lang='eng')

# re.search returns None when nothing matches, so check before calling group()
email_search = re.search(EMAIL_ADDRESS, output)
email = email_search.group() if email_search else None
```
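The snippet above returns only the first match. If the image may contain several addresses, something like the following could collect all of them from the OCR text. This is a rough sketch: `SIMPLE_EMAIL` is a simplified, hypothetical pattern used here so the example has no dependencies (it is not expynent's `EMAIL_ADDRESS`), and `find_emails` is an illustrative helper name.

```python
import re

# Simplified illustrative pattern (an assumption, not expynent's EMAIL_ADDRESS):
# a local part, "@", then a domain with at least one dot-separated label after it.
SIMPLE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_emails(text):
    """Return every substring of `text` that looks like an email address."""
    return SIMPLE_EMAIL.findall(text)


if __name__ == "__main__":
    # Stand-in for the string returned by pytesseract.image_to_string
    sample = "Contact: john.doe@example.com or jane@test.org."
    print(find_emails(sample))
```

In practice `text` would be the `output` string from the OCR step. OCR noise (for example a misread `@`) can still cause missed matches, so the result may be an empty list.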