Skip to content

Instantly share code, notes, and snippets.

@rinogo
rinogo / optimal-resize.py
Created March 31, 2021 18:50
Optimally resize an image so that its line height is approximately 32 pixels (Keywords: OpenCV, Tesseract, OCR)
#Optimally resize `img` according to the bounding boxes specified in `boxes` (which is simply the (pruned) results from `pytesseract.image_to_data()`).
#Tesseract performs optimally when capital letters are ~32px tall (https://groups.google.com/g/tesseract-ocr/c/Wdh_JJwnw94/m/24JHDYQbBQAJ). Smaller text obviously can't be OCR'd as accurately, but weirdly enough, larger text causes problems as well. So, this function uses the bounding boxes we've found and resizes the image so that the median line height should be ~32px.
def optimal_resize(img, boxes):
median_height = np.median(boxes["height"])
target_height = 32 #See https://groups.google.com/g/tesseract-ocr/c/Wdh_JJwnw94/m/24JHDYQbBQAJ
scale_factor = target_height / median_height
print("Scale factor: " + str(scale_factor))
#If the image is already within `skip_percentage` percent of the target size, just return the original image (it's better to skip resizing if we can)
skip_percentage = 0.07
@justengel
justengel / office_to_html.py
Last active April 23, 2023 19:15
Convert Microsoft Office documents to html or pdf files that can be viewed in a web browser.
"""Convert office document to file formats that may be visible in a web browser. This file uses microsoft office to
convert the files, so Windows OS is assumed and required!
Requirements:
* pywin32>=228 # Not available for Python3.8 at this time
Server Requirements:
* uvicorn>=0.11.5
* fastapi>=0.58.0
* python-multipart>=0.0.5
@vijaywm
vijaywm / frappe-reference-guide.md
Last active November 17, 2024 23:43
Frappe Reference Guide
@RobinXL
RobinXL / convert_dpi.py
Last active April 12, 2025 05:22
Convert image with correct DPI for tesseract OCR purpose
from subprocess import Popen,PIPE, check_output
import os, sys
import tempfile
import uuid
import cv2
import pytesseract as pt
from PIL import Image
def convert_dpi(input_image):
path = tempfile.gettempdir()
@zzpmaster
zzpmaster / formatBytes.dart
Last active April 21, 2025 17:26
convert bytes to kb mb in dart
static String formatBytes(int bytes, int decimals) {
if (bytes <= 0) return "0 B";
const suffixes = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"];
var i = (log(bytes) / log(1024)).floor();
return ((bytes / pow(1024, i)).toStringAsFixed(decimals)) +
' ' +
suffixes[i];
}
@lorey
lorey / markdown_to_text.py
Last active March 12, 2025 17:33
Markdown to Plaintext in Python
from bs4 import BeautifulSoup
from markdown import markdown
import re
def markdown_to_text(markdown_string):
""" Converts a markdown string to plaintext """
# md -> html -> text since BeautifulSoup can extract text cleanly
html = markdown(markdown_string)
@seanbehan
seanbehan / app.py
Last active March 13, 2023 04:55
Flask with Django ORM
'''
Run the following commands (bc. gists don't allow directories)
pip install flask django dj-database-url psycopg2
mkdir -p app/migrations
touch app/__init__.py app/migrations/__init__.py
mv models.py app/
python manage.py makemigrations
python manage.py migrate
@mauri870
mauri870 / manjaro-avell-g1513.md
Last active March 2, 2022 02:23
Installation of Manjaro 17 and nvidia/bumblebee drivers on Avell G1513

After a weekend of research, stress and pain I finally figure out how to install manjaro 17 and configure the nvidia/bumblebee drivers on my avell laptop

Here's my notebook specs:

$ inxi -MGCNA

Machine:   Device: laptop System: Avell High Performance product: 1513
           Mobo: N/A model: N/A v: 0.1 UEFI: American Megatrends v: N.1.02 date: 09/28/2016
Battery    BAT0: charge: 44.0 Wh 100.0% condition: 44.0/44.0 Wh (100%)
// https://raw.githubusercontent.com/donnut/typescript-ramda/master/ramda.d.ts
// https://raw.githubusercontent.com/DefinitelyTyped/DefinitelyTyped/master/lodash/lodash.d.ts
declare namespace fp {
interface Dictionary<T> {
[index: string]: T;
}
interface CurriedFunction1<T1, R> {
@jondaiello
jondaiello / _arrows.scss
Last active February 22, 2024 15:18
SASS @mixin for Arrows
// Demo at http://codepen.io/jondaiello/full/YWRbOx/
/* This mixin is for generating CSS arrows on a box */
@mixin box-arrow($arrowDirection, $arrowColor, $arrowSize: 10px) {
position: relative;
z-index: 10;
&::after {
content: '';
width: 0;
height: 0;