@johnidm
Last active January 17, 2025 19:15
Surya and docTR: Powerful OCR Toolkits for Document Processing

Surya is a Python document OCR toolkit for extracting text from scanned documents and images. It ships its own deep learning models for text detection and text recognition, accepts language hints per image, and exposes a small Python API that is easy to wrap in a tailored document processing workflow. GitHub

from PIL import Image
from surya.ocr import run_ocr
from surya.model.detection.model import load_model as load_det_model, load_processor as load_det_processor
from surya.model.recognition.model import load_model as load_rec_model
from surya.model.recognition.processor import load_processor as load_rec_processor

# Note: these import paths match the Surya release this gist was written against;
# other releases may organize the API differently.


class Surya:
    def to_text(self, filename: str) -> str:
        image = Image.open(filename)

        langs = ["pt"]  # language hint for recognition (Portuguese)
        det_processor, det_model = load_det_processor(), load_det_model()
        rec_model, rec_processor = load_rec_model(), load_rec_processor()

        # run_ocr takes a list of images and a matching list of language lists
        predictions = run_ocr([image], [langs], det_model, det_processor, rec_model, rec_processor)

        # Collect the recognized lines and return them, as the signature promises
        lines = [line.text for prediction in predictions for line in prediction.text_lines]
        return "\n".join(lines)
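
The wrapper above loads the detection and recognition models inside `to_text`, so every call pays the loading cost again. A minimal sketch of loading once and reusing the result follows. `load_models` is a hypothetical stand-in for the four `load_*` calls, and its placeholder return values are illustrative only; neither is part of Surya's API.

```python
from functools import lru_cache

LOAD_COUNT = 0  # only here to make the caching visible in the example


@lru_cache(maxsize=1)
def load_models() -> tuple:
    """Hypothetical stand-in for the four Surya load_* calls.

    A real wrapper would return
    (det_model, det_processor, rec_model, rec_processor) here.
    """
    global LOAD_COUNT
    LOAD_COUNT += 1
    return ("det_model", "det_processor", "rec_model", "rec_processor")


class CachedEngine:
    """Sketch of a wrapper that reuses models across calls."""

    def models(self) -> tuple:
        # lru_cache makes the loader body run once per process
        return load_models()


engine = CachedEngine()
engine.models()
engine.models()
print(LOAD_COUNT)  # the loader body ran only once
```

In a real wrapper, `to_text` would unpack `self.models()` and pass the four objects to `run_ocr`, so only the first call waits for the models to load.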

docTR (Document Text Recognition) is a deep learning library for OCR. It runs text detection followed by text recognition on scanned documents, images, and PDFs, and it ships pretrained models, so a working pipeline takes only a few lines of code. GitHub

Both libraries can anchor a document processing workflow: Surya gives explicit control over the models, processors, and languages passed to each call, while docTR bundles detection and recognition behind a single pretrained predictor.
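
Both wrappers in this gist expose the same `to_text(filename)` method, so they can sit behind one small interface and be compared on the same file. The sketch below assumes a hypothetical `OCREngine` protocol and a `FakeEngine` stand-in (both names are illustrative) so that it runs without Surya or docTR installed.

```python
from typing import Protocol


class OCREngine(Protocol):
    """Hypothetical interface: anything with to_text(filename) -> str."""

    def to_text(self, filename: str) -> str: ...


class FakeEngine:
    """Stand-in engine that returns canned text instead of running OCR."""

    def __init__(self, text: str) -> None:
        self.text = text

    def to_text(self, filename: str) -> str:
        return self.text


def run_all(engines: dict[str, OCREngine], filename: str) -> dict[str, str]:
    # Run every engine on the same file and key the output by engine name
    return {name: engine.to_text(filename) for name, engine in engines.items()}


results = run_all({"surya": FakeEngine("Olá"), "doctr": FakeEngine("Ola")}, "page.png")
print(results)
```

Replacing the `FakeEngine` instances with the real wrappers would give a side-by-side view of what each library extracts from the same document.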

from doctr.io import DocumentFile
from doctr.models import ocr_predictor


class DOCTR:
    def to_text(self, filename: str) -> str:
        # Load the default pretrained detection + recognition pipeline
        model = ocr_predictor(pretrained=True)
        doc = DocumentFile.from_images(filename)
        result = model(doc)

        # Walk the page -> block -> line -> word hierarchy, one output row per text line
        lines = []
        for page in result.pages:
            for block in page.blocks:
                for line in block.lines:
                    lines.append(" ".join(word.value for word in line.words))
        return "\n".join(lines)
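
docTR attaches a confidence score to each recognized word, which can be used to drop likely misreads before joining the text. The sketch below assumes the nested dictionary shape returned by `result.export()` (pages, blocks, lines, words, with `value` and `confidence` on each word). The `confident_text` helper, the 0.5 threshold, and the sample data are illustrative assumptions, not real OCR output.

```python
def confident_text(export: dict, min_confidence: float = 0.5) -> str:
    """Join words at or above min_confidence, one row per text line."""
    rows = []
    for page in export["pages"]:
        for block in page["blocks"]:
            for line in block["lines"]:
                words = [
                    word["value"]
                    for word in line["words"]
                    if word["confidence"] >= min_confidence
                ]
                if words:  # skip lines where every word was filtered out
                    rows.append(" ".join(words))
    return "\n".join(rows)


# Hand-written sample in the assumed export shape
sample = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Nota", "confidence": 0.98},
                                {"value": "Fiscal", "confidence": 0.91},
                                {"value": "~", "confidence": 0.12},
                            ]
                        },
                        {"words": [{"value": "x", "confidence": 0.20}]},
                    ]
                }
            ]
        }
    ]
}

print(confident_text(sample))  # prints "Nota Fiscal"
```

The right threshold depends on the documents and on how costly a dropped word is compared with a wrong one, so it is worth tuning on a few representative pages.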