Skip to content

Instantly share code, notes, and snippets.

View chrisjmendez's full-sized avatar
🎯
Focusing

Chris Mendez chrisjmendez

🎯
Focusing
View GitHub Profile
@chrisjmendez
chrisjmendez / README.md
Created March 24, 2016 20:55 — forked from dannguyen/README.md
Using Google Cloud Vision API to OCR scanned documents to extract structured data

Using Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

What I Wish I'd Known About Equity Before Joining A Unicorn

Disclaimer: This piece is written anonymously. The names of a few particular companies are mentioned, but as common examples only.

This is a short write-up on things that I wish I'd known and considered before joining a private company (aka startup, aka unicorn in some cases). I'm not trying to make the case that you should never join a private company, but the power imbalance between founder and employee is extreme, and that potential candidates would