Start up a Lambda-like Docker container:
docker run -i -t -v /tmp:/var/task lambci/lambda:build /bin/bash
Install some dependencies inside the container:
yum install gperf freetype-devel libxml2-devel git libtool -y
easy_install pip
SSH into the instance and bring it up to date:
ssh -i keyfile.pem ubuntu@<ip>
sudo apt -y update && sudo apt -y upgrade
sudo apt install -y p7zip-full build-essential linux-image-extra-virtual linux-source
Disable the nouveau driver (it conflicts with NVIDIA's driver):
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot  # to activate the latest kernel
Just a quickie test in Python 3 (using Requests) to see whether Google Cloud Vision can effectively OCR a scanned data table and preserve its structure, the way products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: no. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which would be needed to calculate the data delimiters.
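For reference, the test itself amounts to one POST against the public v1 REST endpoint. Here is a minimal sketch, assuming an API key in the GOOGLE_API_KEY environment variable and a scanned table saved as table.png (both placeholders):

import base64
import os

import requests

VISION_ENDPOINT = 'https://vision.googleapis.com/v1/images:annotate'

def ocr_image(path, api_key):
    # The v1 REST API takes base64-encoded image bytes in the JSON body.
    with open(path, 'rb') as f:
        content = base64.b64encode(f.read()).decode('ascii')
    body = {'requests': [{
        'image': {'content': content},
        'features': [{'type': 'TEXT_DETECTION'}],
    }]}
    resp = requests.post(VISION_ENDPOINT, params={'key': api_key}, json=body)
    resp.raise_for_status()
    # textAnnotations[0] holds the full detected text; later entries
    # carry boundingPoly coordinates for smaller pieces.
    return resp.json()['responses'][0].get('textAnnotations', [])

for annotation in ocr_image('table.png', os.environ['GOOGLE_API_KEY']):
    print(annotation['description'])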
On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:
### 1. A low-resolution photo of road signs
import Crawler from 'crawler';
import url from 'url';

const BASE_ADDRESS = 'https://en.wikipedia.org/';

// Match "Visa requirements for <country> citizens" page URLs, and the
// "visa required" vs. "visa not required" phrasings in page content.
const COUNTRY_PATTERN = /.*?Visa_requirements_for_(.*?)_citizens.*?/i;
const VISA_REQUIRED_PATTERN = /.*?visa\s+required.*?/i;
const VISA_NOT_REQUIRED_PATTERN = /.*?visa\s+not\s+required.*?/i;

// Accumulates results keyed by country as pages are crawled.
const visaRequirements = {};
var AWS = require('aws-sdk');
var http = require('http');
var httpProxy = require('http-proxy');
var express = require('express');
var bodyParser = require('body-parser');
var stream = require('stream');

// Expect exactly one argument: the Elasticsearch cluster endpoint to proxy to.
if (process.argv.length !== 3) {
  console.error('usage: aws-es-proxy <my-cluster-endpoint>');
  process.exit(1);
}
""" | |
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy) | |
BSD License | |
""" | |
import numpy as np | |
# data I/O | |
data = open('input.txt', 'r').read() # should be simple plain text file | |
chars = list(set(data)) | |
data_size, vocab_size = len(data), len(chars) |
Python syntax here: 2.7 (online REPL)
JavaScript ES6 via Babel transpilation (online REPL)
import math
# Change YOUR_TOKEN to your prerender token
# Change example.com (server_name) to your website URL
# Change /path/to/your/root to the correct value

server {
    listen 80;
    server_name example.com;
    root /path/to/your/root;
    index index.html;
When the directory structure of your Node.js application (not library!) has some depth, you end up with a lot of annoying relative paths in your require calls like:
const Article = require('../../../../app/models/article');
Those suck for maintenance and they're ugly.
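One commonly cited workaround is to expose a root-relative require once from the entry point. A sketch (the rootRequire name is made up here, and putting anything on global is a trade-off in itself):

// index.js — the app entry point, which lives at the project root
const path = require('path');

// Resolve module paths from the project root instead of the calling file.
global.rootRequire = (name) => require(path.join(__dirname, name));

// Then, from any file at any depth:
// const Article = rootRequire('app/models/article');

Alternatives in the same spirit include setting the NODE_PATH environment variable or extracting shared code into its own package.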
1) Start with only one known domain from a botnet: qwmrxczhrcmbcagehqwxlvsnj.ru

2) Get the intersection of names looked up by the IPs that have looked up this domain. It takes less than one minute.

$ curl https://sgraph.umbrella.com/dnsdb/clientlookups/i/name/qwmrxczhrcmbcagehqwxlvsnj.ru | sort -rn > /tmp/a

3) Remove popular domains (see the sketch of filter-popular below):

$ cut -f2 /tmp/a | filter-popular > /tmp/aa
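filter-popular is an internal tool, so its exact behavior is an assumption here. As a stand-in, a minimal Python sketch that drops anything on a locally cached popularity list (popular.txt, one domain per line, is a placeholder) might look like:

#!/usr/bin/env python
# filter-popular (stand-in): drop any stdin domain found on a popularity list.
import sys

with open('popular.txt') as f:
    popular = {line.strip().lower() for line in f}

for line in sys.stdin:
    domain = line.strip().lower()
    if domain and domain not in popular:
        print(domain)

It slots into the same pipeline: cut -f2 /tmp/a | ./filter-popular > /tmp/aa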