Chris Zubak-Skees chriszs

@veltman
veltman / README.md
Created October 10, 2016 16:08
Geosupport w/ JS and node-ffi

Geocoding 10,000 addresses a second with NYC's Geosupport library and Node FFI

Following on Chris Whong's excellent writeup of how to make calls directly to NYC's Geosupport client and this first attempt at generalizing it, here's a way that let me geocode about 10,000 addresses a second on Ubuntu using Node FFI.

Note: this assumes Ubuntu - other Linux is probably fine but may need adjustments.

First, install the basics:

# Update, install Node and unzip (if needed)
@dannguyen
dannguyen / README.md
Last active July 29, 2025 14:26
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:

###### 1. A low-resolution photo of road signs

@migurski
migurski / Compare G-Econ.ipynb
Last active February 15, 2016 22:23
Testing Compare G-Econ.ipynb
@themeteorchef
themeteorchef / format.js
Created January 22, 2016 14:47
RFC-822 formatting for Moment.js
// RFC-822 formatting string for Moment.js.
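// Uses "HH" (24-hour clock); RFC-822 times run from 00:00:00 to 23:59:59, so the 12-hour "hh" token would be ambiguous.
let timestamp = 'ddd, DD MMM YYYY HH:mm:ss';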
// Note, the "z" character responsible for setting the three-character timezone for a date in Moment.js is now deprecated.
// An easy workaround for this is to generate your date/time using the string above, appending the timezone manually.
//
// e.g. return `${ timestamp } CST`;
//
// Here, "CST" would need to be calculated on your own and would likely be a variable unless timezone is fixed.
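
For comparison, here is a minimal Python sketch of the same workaround: format the date and time, then append a timezone abbreviation supplied by the caller. The function name `rfc822_timestamp` and its `zone` argument are hypothetical and not part of the gist or of Moment.js. Day and month names are hard-coded so the output does not depend on the system locale.

```python
from datetime import datetime

# English abbreviations, hard-coded so the result is locale-independent.
DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def rfc822_timestamp(moment, zone):
    """Format a datetime RFC-822 style with a caller-supplied zone abbreviation.

    `zone` (e.g. "CST") is assumed to be worked out by the caller,
    mirroring the manual-append workaround described above.
    """
    return "{}, {:02d} {} {:04d} {:02d}:{:02d}:{:02d} {}".format(
        DAYS[moment.weekday()],      # weekday(): Monday is 0
        moment.day,
        MONTHS[moment.month - 1],
        moment.year,
        moment.hour,                 # 24-hour clock
        moment.minute,
        moment.second,
        zone,
    )


example = rfc822_timestamp(datetime(2016, 1, 22, 14, 47, 0), "CST")
print(example)
```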
@apolloclark
apolloclark / Twitter API with Curl
Last active December 15, 2024 00:46
Twitter API with Curl
# create an account, create an app
# @see https://apps.twitter.com/
# retrieve the access tokens
# @see https://dev.twitter.com/oauth/reference/post/oauth2/token
# create the file ~/twitter_api
nano ~/twitter_api
Authorization: OAuth oauth_consumer_key="XXXXXX", oauth_nonce="11111111", oauth_signature="XXXXXX", oauth_signature_method="HMAC-SHA1", oauth_timestamp="1450728725", oauth_token="99999-XXXXXX", oauth_version="1.0"
@fivetentaylor
fivetentaylor / gist:9595d55a434c326bfd16
Created June 23, 2015 23:26
Streaming unzip of s3 resource back into s3
aws s3 cp s3://<bucket_name>/<resource>.gz - | pv | zcat | aws s3 cp - s3://<bucket_name>/<resource>
@clhenrick
clhenrick / get-bbls.py
Last active November 15, 2017 05:41
grab BBL numbers from a list of street addresses using the NYC Geo Client API
# Read CSV of rent stabilized properties and grab BBL from NYC's GeoClient API
# takes an input CSV file name and output CSV file name as argv
# first two columns of input csv must be address number and address name
# hardcoded for manhattan only, will update in the future
# run script by doing: python geo-client-api-test.py input.csv output.csv
from sys import argv
from nyc_geoclient import Geoclient
import csv
import json
@summer4096
summer4096 / fancyCanvas.js
Created November 29, 2014 20:20
Hill shading w/ leaflet!
L.tileLayer.fancyCanvas = function(url){
var layer = L.tileLayer.canvas({async: true});
layer.setUrl(url);
var dataSource = function(x, y, z, done){
var url = layer.getTileUrl({x: x, y: y, z: z});
d3.xhr(url).responseType('arraybuffer').get(done);
};
layer.data = function(fn){
dataSource = fn;
@briantjacobs
briantjacobs / storytelling_from_space.md
Last active August 28, 2024 07:14
Storytelling from Space

Storytelling from Space: Tools/Resources

This list of resources is all about acquiring and processing aerial imagery. It's generally broken up in three ways: how to go about this in Photoshop/GIMP, using command-line tools, or in GIS software, depending on what's most comfortable for you. Often these tools can be used in conjunction with each other.

Acquiring Landsat & MODIS

Web Interface

  • Landsat archive
@onyxfish
onyxfish / README.md
Last active May 6, 2025 21:05
Google Spreadsheets script to generate slugs from a range of cells

This script for Google Spreadsheets allows you to generate slugs for your data, such as might be used for creating unique URLs.

Use it like this!

| # | A   | B        | C               |
|---|-----|----------|-----------------|
| 1 | a   | b        | slug            |
| 2 | foo | baz bing | =slugify(A2:B4) |
| 3 | bar | BAZ      |                 |
| 4 | FOO | baz-bing |                 |
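
The script itself is not shown here, so the following is only a rough Python sketch of what slug generation over a range of cells might look like. The rules used (lowercase, runs of non-alphanumeric characters collapsed to a single hyphen, leading and trailing hyphens trimmed) and the per-cell, shape-preserving behavior of the hypothetical `slugify_range` helper are assumptions, not the gist's actual implementation.

```python
import re


def slugify(value):
    """Turn one cell value into a slug (assumed rules, see the note above)."""
    text = str(value).lower()
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    # Drop hyphens left over at either end.
    return text.strip("-")


def slugify_range(rows):
    """Apply slugify to every cell of a 2D range, keeping the range's shape."""
    return [[slugify(cell) for cell in row] for row in rows]


# The values of A2:B4 from the example sheet.
cells = [["foo", "baz bing"], ["bar", "BAZ"], ["FOO", "baz-bing"]]
slugs = slugify_range(cells)
print(slugs)
```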