@dannguyen
dannguyen / README.md
Last active September 10, 2024 19:41
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which is what you would need to calculate the data delimiters.
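For reference, here is a minimal sketch of what such a test could look like. It is not the exact script from this gist: the function names are made up for illustration, and it assumes the public REST endpoint `images:annotate` with an API key passed as the `key` query parameter and a single `TEXT_DETECTION` feature. The `ocr_image` function needs the third-party Requests package and a valid key, so only the two helpers above it run offline.

```python
import base64

# Assumed endpoint: the Cloud Vision REST method for annotating images.
API_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_payload(image_bytes):
    """Build the JSON body for a single-image TEXT_DETECTION request."""
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_annotations(response_json):
    """Return (text, vertices) pairs from the first response's textAnnotations."""
    annotations = response_json["responses"][0].get("textAnnotations", [])
    return [
        (a["description"], a.get("boundingPoly", {}).get("vertices", []))
        for a in annotations
    ]


def ocr_image(path, api_key):
    """Hypothetical helper: POST one image file and return its annotations."""
    import requests  # third-party; imported here so the helpers above work without it

    with open(path, "rb") as f:
        payload = build_ocr_payload(f.read())
    resp = requests.post(API_URL, params={"key": api_key}, json=payload)
    resp.raise_for_status()
    return extract_annotations(resp.json())
```

Calling something like `ocr_image("road-signs.jpg", "YOUR_API_KEY")` (both values are placeholders) would return a list of text snippets, each paired with the vertices of its bounding polygon, which is the raw material you would have to work from if you wanted to reconstruct a table layout.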

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

###### 1. A low-resolution photo of road signs

@bishboria
bishboria / springer-free-maths-books.md
Last active May 10, 2025 04:28
Springer made a bunch of books available for free, these were the direct links
@stanwmusic
stanwmusic / Drag-And-Throw-3D-Card-Pile.markdown
Created February 17, 2015 11:21
Drag And Throw 3D Card Pile

Drag And Throw 3D Card Pile

Throw to discard the top card on the pile. You can drag to the left or right to see the ones behind. When there are no more left, the cards return. This is just some preliminary work I'm doing along with some Material Design layout tests. The text is nonsense (you may have noticed!).

Forked from Chris Gannon's Pen Drag And Throw 3D Card Pile.

A Pen by Stan Williams on CodePen.

License.