Serge Simard agileone

Learning FP the hard way: Experiences on the Elm language

by Ossi Hanhinen, @ohanhi

with the support of Futurice 💚.

Licensed under CC BY 4.0.

Editorial note

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which is what you would need to calculate the data delimiters.
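For reference, a test like this can be sketched roughly as follows. This is not the original script, just a minimal sketch assuming an API key is passed as the `key` query parameter to the public `images:annotate` REST endpoint. The helper names (`build_ocr_request`, `extract_full_text`, `ocr_image`) are made up for illustration.

```python
import base64

# Public REST endpoint for Cloud Vision image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes):
    """Build the JSON body for a single TEXT_DETECTION request (hypothetical helper)."""
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_full_text(response_json):
    """Return all detected text, or "" if the API found none (hypothetical helper).

    The first entry of textAnnotations carries the full detected text.
    """
    annotations = response_json["responses"][0].get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""


def ocr_image(path, api_key):
    """Send one image file to Cloud Vision and return its detected text."""
    # Requests is third-party; it is imported here so the two helpers
    # above can be used and tested without it installed.
    import requests

    with open(path, "rb") as f:
        body = build_ocr_request(f.read())
    resp = requests.post(VISION_URL, params={"key": api_key}, json=body, timeout=30)
    resp.raise_for_status()
    return extract_full_text(resp.json())
```

Calling `ocr_image("signs.jpg", "YOUR_API_KEY")` would return the detected text as one string; both arguments are placeholders.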

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

#### 1. A low-resolution photo of road signs

The list is now hosted on a repository so you can PR -> https://github.com/jeroenvdgulik/awesome-talks/blob/master/README.md

The list

Alberto Brandolini - Chasing Elephants https://youtu.be/klsksbDJOhI
Erik Meijer - One Hacker Way https://youtu.be/FvMuPtuvP5w
Erik Meijer - Category Theory, The essence of interface-based design https://youtu.be/JMP6gI5mLHc
Gary Bernhardt - Wat https://www.destroyallsoftware.com/talks/wat
Konstantin Kudryashov - Min-maxing Software Costs https://youtu.be/uQUxJObxTUs
Marco Pivetta - Extremely Defensive PHP https://youtu.be/8d2AtAGJPno

	###
	###
	### UPDATE: For Win 11, I recommend using this tool in place of this script:
	### https://christitus.com/windows-tool/
	### https://github.com/ChrisTitusTech/winutil
	### https://www.youtube.com/watch?v=6UQZ5oQg8XA
	### iwr -useb https://christitus.com/win | iex
	###
	### OR take a look at
	### https://github.com/HotCakeX/Harden-Windows-Security

	export NEO4J_HOME=${NEO4J_HOME-~/Downloads/neo4j-community-3.0.1}

	if [ ! -f data-csv.zip ]; then
	    curl -OL https://cloudfront-files-1.publicintegrity.org/offshoreleaks/data-csv.zip
	fi

	export DATA=${PWD}/import

	rm -rf "$DATA"

	unzip -o -j data-csv.zip -d "$DATA"

	# Docker Machine for Consul
	docker-machine \
	create \
	-d virtualbox \
	consul-machine

	# Start Consul
	docker $(docker-machine config consul-machine) run -d --restart=always \
	-p "8500:8500" \
	-h "consul" \

	# Install the required plugins
	addon-install-from-git --url https://github.com/forge/wildfly-swarm-addon.git
	addon-install-from-git --url https://github.com/forge/keycloak-addon.git

	# Create the project and configure the WildFly Swarm maven plugin
	project-new --named demo --stack JAVA_EE_7 --type wildfly-swarm

	# Create the JPA entity
	jpa-new-entity --named Customer

	CREATE TABLE json_store (
	    name varchar(255) NOT NULL,
	    data text,
	    name_uuid uuid PRIMARY KEY DEFAULT gen_random_uuid()
	);

	ALTER TABLE json_store OWNER TO fusionpbx;

	CREATE TABLE agents (
	name character varying(255),

	# if error writing `export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"`


	import tensorflow as tf
	from tensorflow.examples.tutorials.mnist import input_data

	def init_weights(shape, name):
	    return tf.Variable(tf.random_normal(shape, stddev=0.01), name=name)

	# Step 1 - Add some items to graph section of Tensorboard