Ken Schwencke schwanksta

@schwanksta
schwanksta / byline_re
Created June 21, 2013 23:16
Extract authors from byline string
import re

def _extract_authors(byline):
    """
    Takes one of our bylines returned from the Oxygen API, and
    tries to split it out into individual authors. Only works up
    to a triple byline.
    """
    fallback_byline = re.compile(r"([\-\w.&; ]+)")
    single_byline = re.compile(r"By ([\-\w.&; ]+)")
    double_byline = re.compile(r"By ([\-\w.&; ]+) and ([\-\w.&; ]+)")
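The preview above cuts off before the patterns are used. A minimal sketch of how the matching could proceed, assuming bylines look like "By A", "By A and B", or "By A, B and C". The triple pattern, the most-specific-first ordering, and the name `extract_authors_sketch` are assumptions for illustration, not the gist's actual code.

```python
import re

# Character class copied from the gist's patterns; everything else is assumed.
NAME = r"([\-\w.&; ]+)"

# Most specific pattern first, so a double byline is not swallowed
# by the single-author pattern.
PATTERNS = [
    re.compile(r"By " + NAME + r", " + NAME + r",? and " + NAME),  # assumed triple form
    re.compile(r"By " + NAME + r" and " + NAME),
    re.compile(r"By " + NAME),
    re.compile(NAME),  # fallback: no "By " prefix
]

def extract_authors_sketch(byline):
    """Return a list of author names, or an empty list if nothing matches."""
    for pattern in PATTERNS:
        match = pattern.match(byline)
        if match:
            return [name.strip() for name in match.groups()]
    return []
```

Trying the patterns from most to least specific means the greedy name group has to backtrack to find " and " before the single-author pattern gets a chance to claim the whole string.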
@schwanksta
schwanksta / parse_scotus.py
Created March 27, 2013 20:41
Quick and dirty script to parse a Supreme Court transcript and output a JSON file of [speaker, words]
import re
import json

ws_re = re.compile(r"\s+")
line_num_re = re.compile(r"\s\d+\s{2,}", re.M)

# first, pdftotext -layout <pdf> <text>
with open("12-307_jnt1.txt", "r") as f:
    data = f.read()
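The rest of the script is not shown in the preview. A rough sketch of the [speaker, words] step, assuming line numbers have already been stripped and that speaker labels are all-caps names with a title, followed by a colon. The label pattern and the name `parse_transcript_sketch` are assumptions, not taken from the gist.

```python
import json
import re

ws_re = re.compile(r"\s+")

# Assumed label format: a title, an all-caps surname, then a colon.
speaker_re = re.compile(
    r"((?:CHIEF JUSTICE|JUSTICE|MR\.|MS\.|GENERAL) [A-Z][A-Z'\-]+):"
)

def parse_transcript_sketch(text):
    """Return a list of [speaker, words] pairs from transcript text."""
    flat = ws_re.sub(" ", text)
    # With one capturing group, split() returns
    # [preamble, speaker1, words1, speaker2, words2, ...].
    parts = speaker_re.split(flat)
    return [[parts[i], parts[i + 1].strip()] for i in range(1, len(parts) - 1, 2)]

sample = (
    "CHIEF JUSTICE ROBERTS: We will hear\n"
    "   argument first.\n"
    "MS. KAPLAN: Mr. Chief Justice, and\n"
    "   may it please the Court."
)
pairs = parse_transcript_sketch(sample)
output = json.dumps(pairs, indent=2)
```

The label pattern is case-sensitive, so "Mr. Chief Justice" inside a sentence is not mistaken for a new speaker.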
@schwanksta
schwanksta / gist:1596749
Created January 11, 2012 21:03
Setting up python-elections from scratch
This should get you there:
easy_install virtualenv
easy_install pip
virtualenv --no-site-packages elections
cd elections
. bin/activate
mkdir project
cd project
pip install python-elections
@schwanksta
schwanksta / angie.txt
Created November 4, 2011 04:29
angie
https://fbcdn-sphotos-a.akamaihd.net/hphotos-ak-snc6/254057_10101100830683301_2024910_79469509_7840255_n.jpg
@schwanksta
schwanksta / DC_orgs.txt
Created November 1, 2011 18:25
The organizations in DocumentCloud
The Boston Globe
1105 Government Information Group
11Alive
13WMAZ News
60 Minutes
A.Q. Miller School of Journalism and Mass Communications
Accent Online
ACLU Massachusetts
ACLU National Security Project
Adirondack Daily Enterprise
Message from Randy Michaels/Lee Abrams Suspended
Tribune Communications
Sent: Wednesday, October 13, 2010 11:28 AM
I want to let you know that today we made the decision to suspend Lee Abrams from his position as Tribune’s Chief Innovation Officer. He will remain on suspension indefinitely and without pay while we review the circumstances surrounding the email and video link he distributed on Monday. We’re in the process of determining further disciplinary action.
Lee recognizes that the video was in extremely bad taste and that it offended employees—he has also apologized publicly. He reiterated those feelings again to me privately today. But, this is the kind of serious mistake that can’t be tolerated; we intend to address it promptly and forcefully.
As I said last week, a creative culture must be built on a foundation of respect. Our culture is not about being offensive or hurtful. We encourage employees to speak up when they see or hear something that they find offensive, as a number o
@schwanksta
schwanksta / All-caps Django comments filter
Created July 6, 2010 19:49
Counts the number of ALL-CAPS words and blocks the comment if they exceed a certain percentage.
# in toolbox/text.py:
def count_capital_words(str):
    """
    Counts the number of capital words in a string and returns a % of ALL-CAPS WORDS. Issues with this:
    "words" made up of non-letters get counted as all-caps (including emoticons: :(, :), 8===D, etc). Also,
    CoMmEnTs lIkE tHiZ don't get counted. Perhaps a better solution would be to use a regex to count all occurrences of
    [A-Z]. This works decently though.
    """
    count = 0
    words = str.split()
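The docstring suggests counting occurrences of [A-Z] with a regex instead. A minimal sketch of that alternative, assuming the ratio should be taken over ASCII letters only; the name `capital_letter_ratio` is hypothetical, and this is not the gist's implementation.

```python
import re

def capital_letter_ratio(text):
    """Return the fraction of ASCII letters in text that are uppercase."""
    letters = re.findall(r"[A-Za-z]", text)
    if not letters:
        # No letters at all (for example, only emoticons): nothing to flag.
        return 0.0
    capitals = re.findall(r"[A-Z]", text)
    return len(capitals) / len(letters)
```

Counting letters rather than words means punctuation-only "words" no longer count as all-caps, and mixed-case text contributes in proportion to its capitals.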
Good day!
Check out
a marvelous search engine –
Warning: mysql_connect(): Too many connections in /var/www/html/helper.php on line 136
P.S. Yahoo – everything will be found! Google: nothing was really lost…
Bye to everyone!
@schwanksta
schwanksta / apdate_to_datetime.py
Created May 5, 2010 00:10
Converts an AP Style date string to a Python datetime object.
def apdate_to_datetime(date):
    """
    Takes an AP-formatted date string and returns a Python datetime
    object. This will also work on any date formatted as either
    '%b. %d, %Y' or '%B %d, %Y'

    Examples:
    >>> apdate_to_datetime("Sept. 4, 1986")
    datetime.datetime(1986, 9, 4, 0, 0)
    >>> apdate_to_datetime("Sep. 4, 1986")
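The function body is cut off in the preview. One way the conversion could work, assuming an English (or C) locale for month names and that "Sept." is the only AP abbreviation that `%b` does not already accept. The name `apdate_to_datetime_sketch` and the normalization step are assumptions, not the gist's code.

```python
from datetime import datetime

def apdate_to_datetime_sketch(date):
    """Parse dates such as 'Sept. 4, 1986' or 'March 4, 1986'."""
    # AP style writes "Sept."; strptime's %b expects "Sep" in an English locale.
    normalized = date.replace("Sept.", "Sep.")
    for fmt in ("%b. %d, %Y", "%B %d, %Y"):
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue
    raise ValueError("Unrecognized AP-style date: %r" % date)
```

Trying each format in turn keeps the abbreviated and spelled-out month cases separate, and an unparseable string raises `ValueError` rather than returning `None`.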
@schwanksta
schwanksta / js-in_array.js
Created March 31, 2010 23:56
JavaScript's "in" operator tests keys (array indices), not values, so you can use this in_array function to check whether an array contains a value instead.
function in_array(val, arr, sep){
    /* Takes an array, joins it using sep (default = '|'), then
     * uses indexOf to test if val is in arr. Returns true or false.
     */
    if(!sep){
        sep = "|";
    }
    var join = ["", arr.join(sep), ""].join(sep);