| NAME | EXPLANATION | EXAMPLES |
|------|-------------|----------|
| Common Name | The fully qualified domain name (FQDN) of your server. This must match exactly what you type in your web browser, or you will receive a name mismatch error. | *.google.com |
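As a quick way to see this matching in action, here is a sketch using Python's standard library (the hostname is just an example): the TLS handshake raises `ssl.CertificateError` when the certificate's Common Name / subjectAltName does not cover the name you connected to.

```python
import socket
import ssl

hostname = "www.google.com"  # example host; swap in your own server
context = ssl.create_default_context()

# The handshake below fails with ssl.CertificateError on a name mismatch
with socket.create_connection((hostname, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        print(tls.getpeercert()["subject"])
```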
```python
# List unique values in a DataFrame column
pd.unique(df.column_name.ravel())

# Convert Series datatype to numeric, coercing any non-numeric values to NaN
# (pd.to_numeric replaces the removed convert_objects())
df['col'] = pd.to_numeric(df['col'], errors='coerce')

# Grab DataFrame rows where column has certain values
value_list = ['value1', 'value2', 'value3']
df = df[df.column.isin(value_list)]
```
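For context, here is the same trio of idioms run against a tiny invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'column': ['value1', 'x', 'value3'],
                   'col': ['1', '2', 'oops']})

print(pd.unique(df['column']))                         # ['value1' 'x' 'value3']
df['col'] = pd.to_numeric(df['col'], errors='coerce')  # 'oops' becomes NaN

value_list = ['value1', 'value2', 'value3']
print(df[df['column'].isin(value_list)])               # keeps rows 0 and 2
```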
```sh
#!/bin/sh
# WARNING: REQUIRES /bin/sh
#
# Install Puppet with shell... how hard can it be?
#
# 0.0.1a - Here Be Dragons
#
# Set up colours, but only when attached to a terminal
# (variable names below are an illustrative completion of the truncated original)
if tty -s; then
    RED=$(tput setaf 1)
    GREEN=$(tput setaf 2)
    RST=$(tput sgr0)
else
    RED=''; GREEN=''; RST=''
fi
```
```ini
[MASTER]
profile=no
persistent=yes
ignore=migrations
cache-size=500

[BASIC]
# Regular expression which should only match correct module names
module-rgx=([a-z][a-z0-9_]*)$
```
```python
import pprint

import requests

def get_blackhawks_schedule():
    url = "http://blackhawks.nhl.com/schedule/full.csv"
    response = requests.get(url)
    if response.status_code == 200:
        # Drop blank lines, then split the header row from the data rows
        rows = [line for line in response.text.split('\r\n') if line]
        headers = rows[0].split(',')
        return [dict(zip(headers, row.split(','))) for row in rows[1:]]

if __name__ == '__main__':
    pprint.pprint(get_blackhawks_schedule())
```
```python
from dateutil import rrule
import pytz

def main(datetime, weekdays):
    tz_day = datetime.weekday()
    print("TZ Day:", tz_day)
    utc_day = datetime.astimezone(pytz.utc).weekday()
    print("UTC Day:", utc_day)
    # Look the constants up on the rrule module rather than eval()-ing strings
    return [getattr(rrule, day).weekday for day in weekdays]
```
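A quick usage sketch (the timezone, dates and weekday list are my own example values): localising a late Monday evening in Chicago shows the local and UTC weekdays disagreeing.

```python
from datetime import datetime
import pytz

# 23:30 on Monday 2014-01-06 in Chicago is already Tuesday in UTC
chicago = pytz.timezone("America/Chicago")
main(chicago.localize(datetime(2014, 1, 6, 23, 30)), ["MO", "WE"])
# TZ Day: 0
# UTC Day: 1
```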
Here are the areas I've been researching, some things I've read and some open source packages...
Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model
Often it uses transforms such as TF-IDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf
Collocation detection is a technique for spotting when two or more words occur together more often than they would by chance (e.g. "wishy-washy" in English). I use it to group words into n-gram tokens, because many NLP techniques treat each word as if it were independent of all the others in a document, ignoring word order: http://matpalm.com/blog/2011/10/22/collocations_1/
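To make the two ideas above concrete, here is a minimal sketch using scikit-learn (my choice of library; the toy documents are invented). `TfidfVectorizer` applies the TF-IDF weighting, and `ngram_range=(1, 2)` keeps bigrams as a crude stand-in for proper collocation detection.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, purely illustrative
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat and the dog are wishy-washy",
]

# TF-IDF down-weights terms that appear in every document and up-weights
# rarer, more informative ones; bigram tokens preserve some word order
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
matrix = vectorizer.fit_transform(docs)  # sparse documents-by-terms matrix

print(matrix.shape)
print(list(vectorizer.get_feature_names_out())[:5])
```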
```python
import time

import requests

def run():
    start = time.time()  # time the 100 sequential requests
    output = []
    for x in range(100):
        resp = requests.get("http://perf.herokuapp.com")
        output.append(resp.text)
    print(output)
    print("elapsed: %.2fs" % (time.time() - start))
```
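Because the loop above issues its requests strictly back to back, a threaded variant makes a natural comparison. This is my own sketch (the function name and worker count are invented), not part of the original:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

def run_threaded():
    # Same 100 GETs, issued concurrently from a small thread pool
    with ThreadPoolExecutor(max_workers=10) as pool:
        output = list(pool.map(
            lambda _: requests.get("http://perf.herokuapp.com").text,
            range(100),
        ))
    print(output)
```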
```javascript
// grab your file object from a file input
$('#fileInput').change(function () {
    sendFile(this.files[0]);
});

// can also be from a drag-from-desktop drop
$('#dropZone')[0].ondragover = function (e) { e.preventDefault(); };  // or drop never fires
$('#dropZone')[0].ondrop = function (e) {
    e.preventDefault();
    sendFile(e.dataTransfer.files[0]);
};

// minimal sendFile: POST the raw File object ('/upload' is an assumed endpoint)
function sendFile(file) {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/upload', true);
    xhr.send(file);
}
```
```sh
### BEGIN INIT INFO
# Provides:          nginx
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: starts the nginx web server
# Description:       starts nginx using start-stop-daemon
### END INIT INFO
```