Andrew Montalenti amontalenti

import re
from pymongo import MongoReplicaSetClient
from pymongo.read_preferences import ReadPreference
REPLICA_SET = "ptrack-mongo1" # or, parsely_articles for old CrawlDB
conn = MongoReplicaSetClient(read_preference=ReadPreference.SECONDARY, replicaSet=REPLICA_SET)
# optional collection filter for crawlDB in particular
database = "parsely_articles"
collections = range(10) # populate with collections similarly to before
script_cmds = []
for collection in collections:
    cmd =\
        "echo restoring {db}/{col}"\
        " && "\
        "lbzip2 --decompress {db}-{col}.bson.bz2"\
        " && "\
        "mongorestore -d {db} -c {col} {db}-{col}.bson".format(
@amontalenti
amontalenti / livereload-server.py
Created February 10, 2014 19:34
Example server using livereload 2.0, Flask, and formic to monitor filesystem changes, re-build files, and run a simple static web server
#!/usr/bin/env python
#
# simple static Flask fileserving app
# with livereload integration
#
from flask import Flask
STATIC_FOLDER = "."
<script>
/*
* $sf.ext.meta looks up metadata from parent window. Here's
* the relevant documentation from the SafeFrame standard:
*
* "Use to retrieve metadata about the SafeFrame position that
* was specified by the host. The host may specify additional
* metadata about this 3rd party content. The host specifies
* this metadata using the $sf.host.PosMeta class."
*
function getCurrentDateString() {
    // JavaScript Date API does not return zero-padded date strings, so we need this utility
    var pad = function(number) {
        if (number < 10) {
            return '0' + number;
        }
        return number;
    };
    // get the current day and format it as YYYY-MM-DD-HH
    var date = new Date(),
@amontalenti
amontalenti / urltools.py
Last active August 29, 2015 13:57
example take home assignment using string splitting, joining, tuples, lists, dictionaries, and basic functions
"""urltools.py - parse and format web URLs.
HINT:
>>> "http://google.com".split("://")
['http', 'google.com']
>>> "google.com/hangout/parsely.com/am".split("/")
['google.com', 'hangout', 'parsely.com', 'am']
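
The hint above points toward building a URL parser out of plain `str.split` and `str.join`. Here is a minimal sketch of that idea. The function names `parse_url` and `format_url` are hypothetical and are not taken from the original assignment, and the sketch assumes the URL always contains a `://` separator.

```python
def parse_url(url):
    """Split a URL into (scheme, host, path_parts) using only str.split.

    Hypothetical helper for illustration; assumes "://" is present.
    """
    # Split once on "://" so the rest of the URL stays in one piece
    scheme, rest = url.split("://", 1)
    parts = rest.split("/")
    host = parts[0]
    # Drop empty strings produced by a trailing or doubled slash
    path_parts = [part for part in parts[1:] if part]
    return scheme, host, path_parts


def format_url(scheme, host, path_parts):
    """Rebuild a URL string from the pieces returned by parse_url."""
    return scheme + "://" + "/".join([host] + list(path_parts))


scheme, host, path_parts = parse_url("http://google.com/hangout/parsely.com/am")
print(scheme, host, path_parts)
print(format_url(scheme, host, path_parts))
```

Returning a tuple keeps the example within the assignment's stated toolkit of string splitting, joining, tuples, and lists.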

Hello again, growing Pythonistas!

Pleased and excited

So, first of all, I have to say that I was simultaneously pleased & impressed with the quality of submissions I got from you guys for our little Python take-home assignment. Pleased, because the problem seemed to be accessible enough that each of you could work on it with little background in Python beyond the bits you've been exposed to over the last few months. Impressed, because all of your submissions passed all of my test cases!

I was also happy with how different the submissions were. We Pythonistas often like to brag that, unlike in other languages, in Python there is "rarely more than one way to do it". Yet even for this simple example, the solutions varied widely. (And indeed, in programming, no matter how simple the language, there's always more than one way to do it.) This gives us a nice window into various understandings of Python code style, architecture, and idioms.

Patterns among code

@amontalenti
amontalenti / letter_editor.py
Created March 19, 2014 02:08
muckhacker letter to editor analysis for fun (NLTK, Pandas, utilities)
from nltk import FreqDist
from nltk.corpus import stopwords
from nltk import wordpunct_tokenize
# my little NLP utility library
import nlp2
import pandas as pd
df = pd.read_csv("hook_letters.csv")
from sst.actions import *
import time
import json
from settings import DASH_USERNAME, DASH_PASSWORD, APIKEYS
envs = {
"bri": "dash.parsely.com",
"ue1a": "ue1a-dash-web1.cogtree.com",
}
import time
import random
import requests
words = [word.strip() for word in open("/usr/share/dict/words")]
names = ["John", "Bob", "Peter", "Joe", "Sarah", "Clare"]
sections = ["politics", "entertainment", "life", "sports"]
apikeys = ["arstechnica_com", "foxnews_com"]
def measured_query(apikey, q):