@developersharif
Last active January 23, 2025 16:17
A web-based Optical Character Recognition (OCR) application using Flask and pytesseract.

OCR Web API Project

Overview

A web-based Optical Character Recognition (OCR) application using Flask and pytesseract, supporting multiple languages and flexible image input methods.

Prerequisites

  • Python 3.8+
  • pip package manager

Installation

1. Install System Dependencies

Ubuntu/Debian

sudo apt-get update
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-ben  # For Bengali language support

Windows

  1. Download the Tesseract OCR installer from the official Tesseract GitHub project
  2. Add the Tesseract install directory to the system PATH
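
To confirm that the Tesseract binary is reachable on PATH and that the Bengali data is installed, a check along these lines may help. It is a minimal sketch: the helper names parse_language_list and installed_languages are hypothetical, and it assumes that tesseract --list-langs prints a "List of available languages" header followed by one language code per line.

```python
import shutil
import subprocess
from typing import List, Optional


def parse_language_list(output: str) -> List[str]:
    """Extract language codes from `tesseract --list-langs` output.

    Assumption: the output is a header line starting with "List of"
    followed by one language code per line.
    """
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return [line for line in lines if not line.startswith("List of")]


def installed_languages() -> Optional[List[str]]:
    """Return installed Tesseract languages, or None if the binary is not on PATH."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Some Tesseract versions write the list to stderr instead of stdout
    return parse_language_list(result.stdout or result.stderr)


if __name__ == "__main__":
    langs = installed_languages()
    if langs is None:
        print("tesseract was not found on PATH")
    else:
        print("Installed languages:", ", ".join(langs))
        if "ben" not in langs:
            print("Bengali data (ben) is missing")
```

If the script reports that tesseract was not found, the PATH step above most likely needs another look.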

2. Create requirements.txt

flask
flask-cors
opencv-python-headless
pytesseract
numpy

3. Install Python Dependencies

python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate
pip install -r requirements.txt
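
After installation, a quick way to see whether anything is missing from the virtual environment is to look up each module without importing it. This is only a sketch: missing_modules is a hypothetical helper, and the module list assumes the usual import names for the packages in requirements.txt (cv2 for opencv-python-headless, flask_cors for flask-cors).

```python
import importlib.util
from typing import Iterable, List

# Import names assumed for the packages listed in requirements.txt
REQUIRED_MODULES = ["flask", "flask_cors", "cv2", "pytesseract", "numpy"]


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed")
```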

Running the Application

Start Flask Server

python server.py
  • The server runs on http://localhost:5000
  • Open example-use.html in a browser

Features

  • Image upload via file selection
  • Image extraction via URL
  • Multi-language OCR support
  • Base64 image preview
  • Error handling: missing input returns HTTP 400, processing failures return HTTP 500 with a JSON error message
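
The base64 preview and the error handling both show up in the JSON that the server returns: text and image on success, error on failure. A client-side sketch for unpacking that payload could look like the following. The function name parse_ocr_response is hypothetical; the field names are taken from server.py.

```python
import base64
from typing import Tuple


def parse_ocr_response(payload: dict) -> Tuple[str, bytes]:
    """Split a decoded JSON response into the OCR text and the raw PNG bytes.

    Raises RuntimeError if the server reported an error.
    """
    if "error" in payload:
        raise RuntimeError(payload["error"])
    text = payload.get("text", "")
    # The server sends the grayscale image as a base64-encoded PNG
    png_bytes = base64.b64decode(payload.get("image", ""))
    return text, png_bytes


if __name__ == "__main__":
    example = {"text": "hello", "image": base64.b64encode(b"\x89PNG").decode("ascii")}
    text, png = parse_ocr_response(example)
    print(text, len(png))
```

The returned bytes can be written to a .png file as-is if the preview needs to be saved.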

Supported Languages

  • English
  • Bengali
  • Configurable via the languages form field (default: ben+eng)
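
Tesseract takes several languages as a single string joined with +, which is the format the server's default value ben+eng uses. A small helper for building that string might look like this. The name build_languages_param is hypothetical, and the validation assumes plain codes made of letters, digits, and underscores (such as ben, eng, or chi_sim).

```python
import re
from typing import Iterable

# Assumption: language codes consist of letters, digits, and underscores only
_LANG_CODE = re.compile(r"^[A-Za-z0-9_]+$")


def build_languages_param(codes: Iterable[str]) -> str:
    """Join Tesseract language codes with '+', e.g. ["ben", "eng"] -> "ben+eng"."""
    cleaned = [code.strip() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    for code in cleaned:
        if not _LANG_CODE.match(code):
            raise ValueError(f"unexpected language code: {code!r}")
    return "+".join(cleaned)


if __name__ == "__main__":
    print(build_languages_param(["ben", "eng"]))
```

Each code in the list needs its matching Tesseract language data installed on the server.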

API Endpoint

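  • /extract-text (POST)
    • Accepts: an image file (form field image) or an image URL (form field image_url), plus the optional form fields languages and config
    • Returns: JSON with the extracted text (text) and the grayscale image as a base64-encoded PNG (image)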
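
Besides the browser page, the endpoint can be called from a script. The sketch below posts an image URL using only the standard library. The helper names build_request and extract_text_from_url are hypothetical, the address assumes the default local server, and the body is sent form-encoded on the assumption that the server reads it through Flask's request.form.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server is running locally on its default port
API_URL = "http://localhost:5000/extract-text"


def build_request(image_url: str, languages: str = "ben+eng",
                  api_url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the image_url and languages form fields."""
    form = urllib.parse.urlencode(
        {"image_url": image_url, "languages": languages}
    ).encode("ascii")
    return urllib.request.Request(api_url, data=form, method="POST")


def extract_text_from_url(image_url: str, languages: str = "ben+eng") -> dict:
    """Send the request and return the decoded JSON response.

    urlopen raises urllib.error.HTTPError for 400 and 500 responses.
    """
    with urllib.request.urlopen(build_request(image_url, languages)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder URL; replace it with a reachable image before running
    result = extract_text_from_url("https://example.com/sample.png", "eng")
    print(result["text"])
```

Uploading a local file instead requires a multipart/form-data body with an image field, which is what the browser client sends through FormData.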

Documentation generated with the help of AI.

example-use.html

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>OCR Web Client</title>
  <style>
    body {
      font-family: Arial, sans-serif;
      max-width: 800px;
      margin: auto;
    }
    #imagePreview {
      max-width: 100%;
    }
    #extractedText {
      width: 100%;
      min-height: 200px;
    }
  </style>
</head>
<body>
  <h1>OCR Text Extraction</h1>
  <div>
    <h2>Upload Image</h2>
    <input type="file" id="imageUpload" accept="image/*" />
    <input type="text" id="imageUrl" placeholder="Or paste image URL" />
    <button onclick="extractText()">Extract Text</button>
  </div>
  <div>
    <h2>Preview</h2>
    <img id="imagePreview" src="" alt="Image Preview" />
  </div>
  <div>
    <h2>Extracted Text</h2>
    <textarea id="extractedText" readonly></textarea>
  </div>
  <script>
    async function extractText() {
      const fileInput = document.getElementById("imageUpload");
      const urlInput = document.getElementById("imageUrl");
      const preview = document.getElementById("imagePreview");
      const textArea = document.getElementById("extractedText");
      const formData = new FormData();
      // Handle file upload
      if (fileInput.files.length > 0) {
        formData.append("image", fileInput.files[0]);
      }
      // Handle URL input
      else if (urlInput.value) {
        formData.append("image_url", urlInput.value);
      } else {
        alert("Please upload an image or provide an image URL");
        return;
      }
      try {
        const response = await fetch("http://localhost:5000/extract-text", {
          method: "POST",
          body: formData,
        });
        const data = await response.json();
        // Surface server-side errors instead of writing "undefined" into the textarea
        if (!response.ok) {
          throw new Error(data.error || `Request failed with status ${response.status}`);
        }
        if (data.image) {
          preview.src = `data:image/png;base64,${data.image}`;
        }
        textArea.value = data.text;
      } catch (error) {
        console.error("Error:", error);
        alert("Failed to extract text");
      }
    }
  </script>
</body>
</html>

server.py

import base64
import urllib.request

import cv2
import numpy as np
import pytesseract
from flask import Flask, request, jsonify
from flask_cors import CORS

app = Flask(__name__)
CORS(app)  # allow the browser client, served from another origin, to call the API


@app.route('/extract-text', methods=['POST'])
def extract_text():
    if 'image' not in request.files and 'image_url' not in request.form:
        return jsonify({"error": "No image provided"}), 400
    try:
        # Read the raw image bytes from the upload or from the given URL
        if 'image' in request.files:
            raw = request.files['image'].read()
        else:
            with urllib.request.urlopen(request.form['image_url']) as resp:
                raw = resp.read()

        # Decode to a grayscale array; imdecode returns None for invalid image data
        image_np = cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_GRAYSCALE)
        if image_np is None:
            return jsonify({"error": "Could not decode image"}), 400

        # Language codes joined with '+', plus page segmentation and engine flags
        languages = request.form.get('languages', 'ben+eng')
        ocr_config = request.form.get('config', '--psm 6 --oem 3')
        extracted_text = pytesseract.image_to_string(
            image_np,
            lang=languages,
            config=ocr_config
        )

        # Return the grayscale image as a base64-encoded PNG for the preview
        _, buffer = cv2.imencode('.png', image_np)
        encoded_image = base64.b64encode(buffer).decode('utf-8')
        return jsonify({
            "text": extracted_text,
            "image": encoded_image
        }), 200
    except Exception as e:
        return jsonify({"error": str(e)}), 500


if __name__ == '__main__':
    # debug=True is intended for local development only
    app.run(debug=True, port=5000)