Ken Schwencke schwanksta

@schwanksta
schwanksta / parse_scotus.py
Created March 27, 2013 20:41
Quick and dirty script to parse a Supreme Court transcript and output a JSON file of [speaker, words]
import re
import json

# Raw strings keep the regex backslashes from being read as string escapes.
ws_re = re.compile(r"\s+")
line_num_re = re.compile(r"\s\d+\s{2,}", re.M)

# first, pdftotext -layout <pdf> <text>
with open("12-307_jnt1.txt", "r") as f:
    data = f.read()
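The two patterns above suggest a clean-up pass over the `pdftotext -layout` output. Below is a minimal sketch of one way they might be applied, assuming the goal is to drop the margin line numbers and collapse whitespace. The `clean_transcript` helper and the sample text are hypothetical and are not part of the gist.

```python
import re

ws_re = re.compile(r"\s+")
line_num_re = re.compile(r"\s\d+\s{2,}", re.M)

def clean_transcript(text):
    """Hypothetical helper: strip margin line numbers, then collapse whitespace."""
    # Replace each "<space><digits><two or more spaces>" run with one space.
    without_numbers = line_num_re.sub(" ", text)
    # Collapse every remaining whitespace run, including newlines.
    return ws_re.sub(" ", without_numbers).strip()

# Made-up sample in the layout pdftotext tends to produce.
sample = "\n 1   MR. SMITH: Thank you,\n 2   Your Honor.\n"
print(clean_transcript(sample))
```

Note that this pattern would also strip a genuine number in the text if it happens to be followed by two or more spaces, so the output is worth spot-checking.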
schwanksta / byline_re
Created June 21, 2013 23:16
Extract authors from byline string
import re

def _extract_authors(byline):
    """
    Takes one of our bylines returned from the Oxygen API, and
    tries to split it out into individual authors. Only works up
    to a triple byline.
    """
    # Each capture group matches one author name.
    fallback_byline = re.compile(r"([\-\w.&; ]+)")
    single_byline = re.compile(r"By ([\-\w.&; ]+)")
    double_byline = re.compile(r"By ([\-\w.&; ]+) and ([\-\w.&; ]+)")
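The preview stops before the patterns are used. One plausible way to apply them is to try the most specific pattern first and fall back to looser ones. This is a sketch under assumptions: the `triple_byline` pattern, the `extract_authors` name and the sample bylines are hypothetical, and the gist's real matching order is not shown.

```python
import re

NAME = r"([\-\w.&; ]+)"

# triple_byline is an assumed extension; the other three mirror the gist.
triple_byline = re.compile(r"By " + NAME + r", " + NAME + r",? and " + NAME)
double_byline = re.compile(r"By " + NAME + r" and " + NAME)
single_byline = re.compile(r"By " + NAME)
fallback_byline = re.compile(NAME)

def extract_authors(byline):
    """Hypothetical sketch: return a list of author names from a byline."""
    # Most specific pattern first, so "A and B" is not read as a single name.
    patterns = (triple_byline, double_byline, single_byline, fallback_byline)
    for pattern in patterns:
        match = pattern.match(byline)
        if match:
            return [name.strip() for name in match.groups()]
    return []

print(extract_authors("By Jane Doe and John Roe"))
```

Ordering matters here because `single_byline` also matches a double byline and would return both names as one string.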
schwanksta / index.html
Created August 15, 2013 18:40
Quick dump of downtown into OSMBuildings
This file has been truncated, but you can view the full file.
<link rel="stylesheet" href="http://osmbuildings.org/js/leaflet-0.5.1/leaflet.css">
<script src="http://osmbuildings.org/js/leaflet-0.5.1/leaflet.js"></script>
<script src="http://osmbuildings.org/js/OSMBuildings-Leaflet.js"></script>
<div id="map" style="width:100%; height:100%"></div>
<script type="text/javascript">
var data = {"type": "FeatureCollection", "features": [{"geometry": {"type": "MultiPolygon", "coordinates": [[[[-118.22371230636624, 34.014919515466296], [-118.22370767851801, 34.01478070140346], [-118.22387331923136, 34.0147768774469], [-118.22387794727932, 34.014915663556124], [-118.22371230636624, 34.014919515466296]]]]}, "type": "Feature", "properties": {"ain": "5168023018", "height": 29.06}}, {"geometry": {"type": "MultiPolygon", "coordinates": [[[[-118.22353078318748, 34.01503069999131], [-118.22352664599238, 34.01493969843404], [-118.22363417872324, 34.0149363123561], [-118.22363831495248, 34.01502731391164], [-118.22353078318748, 34.01503069999131]]]]}, "type": "Feature", "properties": {
schwanksta / parse_inspections.py
Last active December 25, 2015 16:08
Parse LA County restaurant inspection data
import mechanize
from BeautifulSoup import BeautifulSoup

def parse_table(html):
    """
    Returns a list of lists. Each list is a row in the table.
    """
    soup = BeautifulSoup(html)
    # The table at index 9 (zero-based) is the one with all the info.
    table = soup.findAll('table')[9]
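The preview ends before the rows are extracted. As a rough illustration of the "list of lists" shape the docstring describes, here is a sketch that swaps in the standard library's `html.parser` for BeautifulSoup so it runs without third-party packages. The `TableRows` class, the `index` parameter and the sample HTML are hypothetical, and the sketch does not handle nested tables.

```python
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Hypothetical helper: collect cell text for every <table> on a page."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table, in document order
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that falls inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

def parse_table(html, index=0):
    """Return the rows of the table at `index` as a list of lists."""
    parser = TableRows()
    parser.feed(html)
    return parser.tables[index]

# Made-up page with two tables; the second one holds the data.
page = (
    "<table><tr><td>nav</td></tr></table>"
    "<table><tr><th>Name</th><th>Score</th></tr>"
    "<tr><td>Cafe A</td><td>95</td></tr></table>"
)
print(parse_table(page, index=1))
```

Selecting a table by position is brittle: if the page layout changes, the index silently points at a different table, so checking the header row before trusting the data is a sensible guard.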
schwanksta / index.html
Last active December 27, 2015 10:39
high homicide tracts
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="http://cdn.leafletjs.com/leaflet-0.6.4/leaflet.css" />
<!--[if lte IE 8]>
<link rel="stylesheet" href="http://cdn.leafletjs.com/leaflet-0.6.4/leaflet.ie.css" />
<![endif]-->
<script src="http://cdn.leafletjs.com/leaflet-0.6.4/leaflet.js"></script>
<script type="text/javascript">
geojson = {"type":"FeatureCollection","features":[{"type":"Feature","id":651,"properties":{"GEOID10":"06037600304","CT10":"600304","LABEL":"6003.04","X_Center":6472482.0,"Y_Center":1798989.0,"Shape_area":4798305.86046,"Shape_len":9662.240288860000874,"PNTCNT":15.0,"SumLev":"140","AreaName":"CensusTract6003.04","State":"06","AreaLand":"445759","AreaWatr":"0","TotPop":"3424","Wht":"50","Blk":"1330","Asn":"7","Hisp":"1986","Oth":"51","NAMELSAD10":"CensusTract6003.04","per_capita":0.0043808411,"PopInt":3424},"geometry":{"type":"Polygon","coordinates":[[[-118.291637000512452,33.939880999892225],[-118.291639000350102,33.939056000142166],

Smoke, mirrors and madlibs: How to break news while you sleep

Around 6:25 a.m. I was awakened by a jolt from slipping tectonic plates. The tremor didn't last very long, and as soon as my window stopped rattling my first thought was to check for an email.

Here it was:

L.A. Now: Ready for copyedit: Earthquake: 4.7 quake strikes near Westwood, California

This is a robopost from your friendly earthquake robot. Please copyedit & publish the

schwanksta / parallelcopy.py
Created February 25, 2016 20:57
parallelized COPY in postgres
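import multiprocessing
import subprocess
from glob import glob
import os

# Column names passed to COPY, in file order.
fields = ["field1", "field2", "field3"]

def work(fname):
    # Parenthesized print works in both Python 2 and Python 3.
    print("starting %s" % fname)
    fname = os.path.abspath(fname)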
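The rest of `work` is not shown. One way the idea could be completed is to have each worker process shell out to `psql` with a `\copy` command for a single file, while a `multiprocessing.Pool` fans the files out across cores. This is a sketch, not the gist's code: the `build_copy_command` helper, the table name `mytable`, the database name `mydb`, the `data/*.csv` glob and the CSV format are all assumptions.

```python
import multiprocessing
import os
import subprocess
from glob import glob

fields = ["field1", "field2", "field3"]

def build_copy_command(fname, table="mytable", dbname="mydb"):
    """Hypothetical helper: build a psql \\copy invocation for one CSV file."""
    fname = os.path.abspath(fname)
    copy_sql = "\\copy %s (%s) FROM '%s' WITH (FORMAT csv)" % (
        table, ", ".join(fields), fname)
    return ["psql", "-d", dbname, "-c", copy_sql]

def work(fname):
    print("starting %s" % fname)
    # check_call raises CalledProcessError if psql exits with a non-zero status.
    subprocess.check_call(build_copy_command(fname))
    return fname

if __name__ == "__main__":
    # Pool() defaults to one worker per CPU core; each worker loads one file.
    pool = multiprocessing.Pool()
    try:
        for done in pool.imap_unordered(work, sorted(glob("data/*.csv"))):
            print("finished %s" % done)
    finally:
        pool.close()
        pool.join()
```

Each `psql` call opens its own connection, so the loads run as independent transactions. A failed file raises in the parent when its result is fetched, and files that already finished stay loaded.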