Sam Zeitlin szeitlin

Mapping Infected Users On Khan Academy

Introduction

Using a graph model to represent Users on the site, we can test what happens when we roll out changes so that associated Users will all see the same version of the site. In other words, we can make changes such that one User gets 'infected' with a new version of the site, and some or all of their connected Users will receive that same version.
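As a rough illustration of the idea above, the sketch below spreads a site version through a graph of users with a breadth-first traversal. The names `infect`, `graph`, and `limit`, and the sample users, are hypothetical and not taken from the original gist; the adjacency-dict representation is also an assumption.

```python
from collections import deque


def infect(graph, start, version, limit=None):
    """Give `version` to `start` and to users connected to it.

    `graph` is an assumed adjacency mapping: user -> set of connected users.
    With `limit=None` every reachable user is infected; otherwise the
    traversal stops once `limit` users have the new version.
    """
    infected = {start: version}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        # sorted() only makes the traversal order deterministic
        for neighbor in sorted(graph.get(user, ())):
            if limit is not None and len(infected) >= limit:
                return infected
            if neighbor not in infected:
                infected[neighbor] = version
                queue.append(neighbor)
    return infected


# Hypothetical undirected graph: each connection is listed in both directions.
graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
    "dave": {"erin"},
    "erin": {"dave"},
}

everyone_connected = infect(graph, "alice", "B")
only_some = infect(graph, "alice", "B", limit=2)
```

Here `everyone_connected` covers alice's whole connected component, while `only_some` stops after two users, which corresponds to infecting "some" rather than "all" connected Users.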

Fork your own Gist

This is a bookmarklet that adds a fully functional Fork button to your own Gist.

If a Fork button is already present in the page, this bookmarklet will set focus to it instead of adding another one.

The change is temporary and the button will disappear as soon as you navigate away from that Gist (clicking the Fork button does this for you as well).

TraitSync k-Nearest Neighbors and Cosine Similarity

Introduction

In k-nearest neighbors (k-NN), the similarity between every pair of elements in the data set is calculated using a distance or similarity metric [1] chosen by the researcher (many such metrics exist). The distance or similarity between any two elements is computed from the two elements' attributes. A data element's k-NN are the k closest data elements according to this metric.

1. A distance metric measures distance: the higher the distance, the further apart the neighbors. A similarity metric measures similarity: the higher the similarity, the closer the neighbors.
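The description above can be sketched in a few lines of Python. This is a minimal illustration that assumes each element is a plain list of numeric attributes and that cosine similarity is the chosen metric; `cosine_similarity`, `k_nearest`, and the `traits` data are hypothetical names and values, not code from TraitSync.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def k_nearest(data, target, k):
    """Return the names of the k elements most similar to `target`."""
    scores = [
        (cosine_similarity(data[target], vector), name)
        for name, vector in data.items()
        if name != target
    ]
    # Higher similarity means closer, so sort in descending order.
    scores.sort(reverse=True)
    return [name for _, name in scores[:k]]


# Made-up attribute vectors for illustration only.
traits = {
    "a": [1.0, 0.0, 1.0],
    "b": [1.0, 0.1, 0.9],
    "c": [0.0, 1.0, 0.0],
    "d": [0.5, 0.5, 0.5],
}

neighbors = k_nearest(traits, "a", 2)
```

With a distance metric instead of a similarity metric, the sort order would be ascending, since a lower distance means a closer neighbor.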

	dlist = [{'Bilbo':'Ian','Frodo':'Elijah'},
	{'Bilbo':'Martin','Thorin':'Richard'}]

	k = 'Bilbo'

	#this works as expected
	for i in dlist:
	...     if k in i:
	...         i[k]
	...     else:

	__author__ = 'szeitlin'

	import argparse
	import os
	import re
	import sys

	'''

	Helper function for converting python files exported from IPython notebooks into actual programs.

	def remove_symbols(name):
	    """ Remove symbols from string and return as one word.
	    Replace '&' with '_and_'
	    Replace '/' with '_or_'
	    Remove spaces
	    Replace '(' with _
	    Remove ')'

	    (str) -> (str)

	def remove_spaces(name):
	    '''Helper function removes spaces from string and returns it as a single word.

	    (str) -> (str)

	    >>> remove_spaces('Poly CO2')
	    'PolyCO2'

	    '''
	    nname = name.replace(" ", "")

	def drop_max_value(df, column):
	    '''
	    Re-usable function that finds the max value and drops it.

	    (df, column) -> df

	    >>> drop_max_value(df, 'steep')
	    df
	    '''
	    dropthis = df[column].max()

	["I may opt for a top yam for amy ."]
	["Elvis lives on a dirty dorm room floor over the moor ."]

	__author__ = 'szeitlin'

	#helper script to designate allowed range for grouping

	def delta_range(delta):
	    '''
	    Takes a delta and applies it to generate a reference list for grouping items.

	    (int) --> list of ints (except zero)