- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
Standard for US Driver's Licenses defines 9 different barcode standards (AAMVA versions) with over 80 different fields encoded inside a barcode. Some fields exist on all barcode standards, other exist only on some. To standardize the API, we have structured the fields in the following sections:
#cloud-config | |
coreos: | |
etcd: | |
# generate a new token for each unique cluster from https://discovery.etcd.io/new | |
discovery: https://discovery.etcd.io/<token> | |
# multi-region deployments, multi-cloud deployments, and droplets without | |
# private networking need to use $public_ipv4 | |
addr: $private_ipv4:4001 | |
peer-addr: $private_ipv4:7001 |
Using WebSockets, React and Reflux together can be a beautiful thing, but the intial setup can be a bit of a pain. The below examples attempt to offer one (arguably enjoyable) way to use these tools together.
This trifect works well if you think of things like so:
- Reflux Store: The store fetches, updates and persists data. A store can be a list of items or a single item. Most of the times you reach for
this.state
in react should instead live within stores. Stores can listen to other stores as well as to events being fired. - Reflux Actions: Actions are triggered by components when the component wants to change the state of the store. A store listens to actions and can listen to more than one set of actions.
#!/usr/bin/env python | |
import random | |
import struct | |
import sys | |
# Most of the Fat32 class was cribbed from https://gist.github.com/jonte/4577833 | |
def ppNum(num): | |
return "%s (%s)" % (hex(num), num) |
Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:
####### 1. A low-resolution photo of road signs
package main | |
import ( | |
"net/http" | |
"compress/gzip" | |
"io/ioutil" | |
"strings" | |
"sync" | |
"io" | |
) |
default['sshd']['sshd_config']['AuthenticationMethods'] = 'publickey,keyboard-interactive:pam' | |
default['sshd']['sshd_config']['ChallengeResponseAuthentication'] = 'yes' | |
default['sshd']['sshd_config']['PasswordAuthentication'] = 'no' |
This is a collection of the things I believe about software development. I have worked for years building backend and data processing systems, so read the below within that context.
Agree? Disagree? Feel free to let me know at @JanStette.
Keep it simple, stupid. You ain't gonna need it.