| NAME | EXPLANATION | EXAMPLES |
|------|-------------|----------|
| Common Name | The fully qualified domain name (FQDN) of your server. This must match exactly what you type in your web browser, or you will receive a name mismatch error. | *.google.com |
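As a quick way to see this matching in action, here is a sketch using Python's standard library (the hostname is just an example): the TLS handshake raises `ssl.CertificateError` when the certificate's Common Name / subjectAltName does not cover the name you connected to.

```python
import socket
import ssl

hostname = "www.google.com"  # example host; swap in your own server
context = ssl.create_default_context()

# The handshake below fails with ssl.CertificateError on a name mismatch
with socket.create_connection((hostname, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        print(tls.getpeercert()["subject"])
```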
```python
# List unique values in a DataFrame column
pd.unique(df.column_name.ravel())

# Convert Series datatype to numeric, coercing any non-numeric values to NaN
# (pd.to_numeric replaces the removed convert_objects())
df['col'] = pd.to_numeric(df['col'], errors='coerce')

# Grab DataFrame rows where column has certain values
value_list = ['value1', 'value2', 'value3']
df = df[df.column.isin(value_list)]
```
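For context, here is the same trio of idioms run against a tiny invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'column': ['value1', 'x', 'value3'],
                   'col': ['1', '2', 'oops']})

print(pd.unique(df['column']))                         # ['value1' 'x' 'value3']
df['col'] = pd.to_numeric(df['col'], errors='coerce')  # 'oops' becomes NaN

value_list = ['value1', 'value2', 'value3']
print(df[df['column'].isin(value_list)])               # keeps rows 0 and 2
```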
```sh
#!/bin/sh
# WARNING: REQUIRES /bin/sh
#
# Install Puppet with shell... how hard can it be?
#
# 0.0.1a - Here Be Dragons
#
# Set up colours, but only when attached to a terminal
# (variable names below are an illustrative completion of the truncated original)
if tty -s; then
    RED=$(tput setaf 1)
    GREEN=$(tput setaf 2)
    RST=$(tput sgr0)
else
    RED=''; GREEN=''; RST=''
fi
```
```ini
[MASTER]
profile=no
persistent=yes
ignore=migrations
cache-size=500

[BASIC]
# Regular expression which should only match correct module names
module-rgx=([a-z][a-z0-9_]*)$
```
```python
import pprint

import requests

def get_blackhawks_schedule():
    url = "http://blackhawks.nhl.com/schedule/full.csv"
    response = requests.get(url)
    if response.status_code == 200:
        # Drop blank lines, then split the header row from the data rows
        rows = [line for line in response.text.split('\r\n') if line]
        headers = rows[0].split(',')
        return [dict(zip(headers, row.split(','))) for row in rows[1:]]

if __name__ == '__main__':
    pprint.pprint(get_blackhawks_schedule())
```
```python
from dateutil import rrule
import pytz

def main(datetime, weekdays):
    tz_day = datetime.weekday()
    print("TZ Day:", tz_day)
    utc_day = datetime.astimezone(pytz.utc).weekday()
    print("UTC Day:", utc_day)
    # Look the constants up on the rrule module rather than eval()-ing strings
    return [getattr(rrule, day).weekday for day in weekdays]
```
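A quick usage sketch (the timezone, dates and weekday list are my own example values): localising a late Monday evening in Chicago shows the local and UTC weekdays disagreeing.

```python
from datetime import datetime
import pytz

# 23:30 on Monday 2014-01-06 in Chicago is already Tuesday in UTC
chicago = pytz.timezone("America/Chicago")
main(chicago.localize(datetime(2014, 1, 6, 23, 30)), ["MO", "WE"])
# TZ Day: 0
# UTC Day: 1
```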
Here are the areas I've been researching, some things I've read and some open source packages...
Nearly all text processing starts by transforming text into vectors: http://en.wikipedia.org/wiki/Vector_space_model
Often it uses transforms such as TF-IDF to normalise the data and control for outliers (words that are too frequent or too rare confuse the algorithms): http://en.wikipedia.org/wiki/Tf%E2%80%93idf
Collocation detection is a technique for spotting when two or more words occur together more often than they would by chance (e.g. "wishy-washy" in English). I use it to group words into n-gram tokens, because many NLP techniques treat each word as if it were independent of all the others in a document, ignoring word order: http://matpalm.com/blog/2011/10/22/collocations_1/
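To make the two ideas above concrete, here is a minimal sketch using scikit-learn (my choice of library; the toy documents are invented). `TfidfVectorizer` applies the TF-IDF weighting, and `ngram_range=(1, 2)` keeps bigrams as a crude stand-in for proper collocation detection.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, purely illustrative
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat and the dog are wishy-washy",
]

# TF-IDF down-weights terms that appear in every document and up-weights
# rarer, more informative ones; bigram tokens preserve some word order
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
matrix = vectorizer.fit_transform(docs)  # sparse documents-by-terms matrix

print(matrix.shape)
print(list(vectorizer.get_feature_names_out())[:5])
```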
```python
import time

import requests

def run():
    start = time.time()  # time the 100 sequential requests
    output = []
    for x in range(100):
        resp = requests.get("http://perf.herokuapp.com")
        output.append(resp.text)
    print(output)
    print("elapsed: %.2fs" % (time.time() - start))
```
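Because the loop above issues its requests strictly back to back, a threaded variant makes a natural comparison. This is my own sketch (the function name and worker count are invented), not part of the original:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

def run_threaded():
    # Same 100 GETs, issued concurrently from a small thread pool
    with ThreadPoolExecutor(max_workers=10) as pool:
        output = list(pool.map(
            lambda _: requests.get("http://perf.herokuapp.com").text,
            range(100),
        ))
    print(output)
```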
```javascript
// grab your file object from a file input
$('#fileInput').change(function () {
    sendFile(this.files[0]);
});

// can also be from a drag-from-desktop drop
$('#dropZone')[0].ondragover = function (e) { e.preventDefault(); };  // or drop never fires
$('#dropZone')[0].ondrop = function (e) {
    e.preventDefault();
    sendFile(e.dataTransfer.files[0]);
};

// minimal sendFile: POST the raw File object ('/upload' is an assumed endpoint)
function sendFile(file) {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/upload', true);
    xhr.send(file);
}
```
```sh
### BEGIN INIT INFO
# Provides:          nginx
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: starts the nginx web server
# Description:       starts nginx using start-stop-daemon
### END INIT INFO
```