@fpaupier
Created August 6, 2021 06:31
OCR over an OpenCV image with Tesseract, extracted from https://github.com/fpaupier/gRPC-multiprocessing
import pickle

import cv2
import pytesseract


def get_text_from_image(img: bytes) -> str:
    """
    Perform OCR over an image.

    Args:
        img (bytes): a pickled image, encoded with OpenCV (BGR channel order).

    Returns:
        The text found in the image by the OCR module.
    """
    # Only unpickle bytes from a trusted source: pickle.loads can execute arbitrary code.
    img_bgr = pickle.loads(img)
    # OpenCV stores images in BGR order by default, while pytesseract assumes RGB,
    # so convert from BGR to RGB before running OCR.
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    return pytesseract.image_to_string(img_rgb)
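
The function does two things before handing the image to Tesseract: it unpickles the bytes, then swaps the channel order from BGR to RGB. The sketch below illustrates those two steps in pure Python, using a nested list in place of a real OpenCV array. `bgr_to_rgb` and the tiny 1x2 "image" are hypothetical stand-ins introduced for illustration, not part of OpenCV or the original gist. On a real 3-channel image, `cv2.cvtColor(..., cv2.COLOR_BGR2RGB)` performs the equivalent reordering.

```python
import pickle


def bgr_to_rgb(pixels):
    """Reverse the channel order of every pixel in a rows x cols x 3 nested list."""
    return [[pixel[::-1] for pixel in row] for row in pixels]


# Hypothetical 1x2 image in BGR order: a pure-blue pixel, then a pure-red pixel.
bgr_image = [[[255, 0, 0], [0, 0, 255]]]

payload = pickle.dumps(bgr_image)  # what a caller would send as bytes
restored = pickle.loads(payload)   # first step inside get_text_from_image
rgb_image = bgr_to_rgb(restored)   # stand-in for the cv2.cvtColor step

print(rgb_image)  # [[[0, 0, 255], [255, 0, 0]]]
```

With a real image, the caller side would presumably look like `get_text_from_image(pickle.dumps(cv2.imread(path)))`, where `path` is a placeholder for an image file on disk.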