@bmschmidt
bmschmidt / gist:1eedbc2bcd21e059af4d
Last active August 29, 2015 14:19
Use Python instead of Apache to host a Bookworm
mkdir GUI
make linechartGUI webDirectory=GUI
git clone http://github.com/Bookworm-Project/BookwormAPI GUI/cgi-bin
git clone http://github.com/bmschmidt/BookwormD3 GUI/BookwormD3
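# CGIHTTPServer is a Python 2 module; on Python 3 the equivalent is: python3 -m http.server --cgi 8000
cd GUI; python -m CGIHTTPServer 8000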
all:
	find ~ | parallel -n 1 -P 4 python dummy.py
bmschmidt / lazierLoad.R
Last active September 11, 2015 16:35 — forked from zkamvar/lazierLoad.R
Does a lazy load across all files when provided a knitr cache directory.
#' Performs lazy load on a directory
#'
#' @param path a filepath containing the necessary files for lazy loading
#' @return NULL
#'
#' @details This function will go into a directory, search for all the files
#' that seem like they can be lazily loaded and attempt to load them.
#'
lazierLoad <- function(path){
  files <- dir(path)

An idea that I proved unable to express within Twitter's character limit:

Train two word2vec models on the same corpus with 100 dimensions apiece; one with window size 5, and one with window size 15 (say).

Now you have two 100-dimensional vector spaces with the same words in each.

That's the same as one 200-dimensional vector space: you just concatenate each word's two vectors.

That vector space has all the information from each of the original models in it: you can use linear algebra to project it back down onto either of the original 100-dimensional spaces.
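
The concatenation and projection steps can be sketched with NumPy. This is only an illustration under stated assumptions: the two matrices are random stand-ins for trained word2vec models, and the names `vocab`, `model_w5`, `model_w15`, `project_w5`, and `project_w15` are hypothetical, not taken from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared vocabulary; both models must have rows in the same word order.
vocab = ["she", "he", "teacher", "doctor"]

# Random stand-ins (assumption) for two 100-dimensional word2vec models
# trained on the same corpus with window sizes 5 and 15.
model_w5 = rng.normal(size=(len(vocab), 100))
model_w15 = rng.normal(size=(len(vocab), 100))

# Concatenate each word's two vectors: one 200-dimensional space.
combined = np.hstack([model_w5, model_w15])

# Linear maps that project the 200-dimensional space back onto each original space:
# an identity block selects one half, a zero block discards the other.
project_w5 = np.vstack([np.eye(100), np.zeros((100, 100))])
project_w15 = np.vstack([np.zeros((100, 100)), np.eye(100)])

recovered_w5 = combined @ project_w5
recovered_w15 = combined @ project_w15
```

Because the projections are ordinary matrices, any other 200-by-k matrix could be substituted to look at directions that mix the two window sizes.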

import re
import gzip
import sys
def stripBadText(string):
    if string is None:
        return ""
    # No html tags
    string = re.sub("<[^>]+>", "", string)
    # People don't talk in [brackets] or (inside parentheses), so I strip them.
bmschmidt / slopegraph.R
Created December 9, 2016 20:11
A slopegraph in R for use with the wordVectors package
library(ggplot2)
library(wordVectors)
slopegraph = function(
  set1 = "RMP",
  set2 = "genderless_RMP",
  word = "she",
library(birdnik)
random_word <- function(key,
                        pos="adjective", min_count=100, n=1,
                        min_length = 5, max_length = 15){
  param <- paste0("words.json/randomWords?hasDictionaryDef=true",
                  "&minCorpusCount=",min_count,
                  "&minLength=",min_length,
                  "&maxLength=",max_length,
bmschmidt / .block
Last active May 24, 2017 16:32 — forked from mbostock/.block
County Spheres a la Guy-Harold Smith
license: gpl-3.0
scrolling: yes
bmschmidt / jupyter_shortcuts.md
Created March 20, 2018 19:55 — forked from kidpixo/jupyter_shortcuts.md
Keyboard shortcuts for ipython notebook 3.1.0 / jupyter

Toc

Keyboard shortcuts

The IPython Notebook has two different keyboard input modes. Edit mode allows you to type code/text into a cell and is indicated by a green cell border. Command mode binds the keyboard to notebook level actions and is indicated by a grey cell border.

MacOS modifier keys:

  • ⌘ : Command
CIPFamily,CIPCode,Type,General,TextChange,CIPTitle,CIPDefinition,CrossReferences,Examples
01,01,Agriculture,Pre-professional,no,"AGRICULTURE, AGRICULTURE OPERATIONS, AND RELATED SCIENCES.","Instructional programs that focus on agriculture and related sciences and that prepare individuals to apply specific knowledge, methods, and techniques to the management and performance of agricultural operations.",,
01,01.00,Agriculture,Pre-professional,no,"Agriculture, General.",Instructional content is defined in code 01.0000.,,
01,01.0000,Agriculture,Pre-professional,no,"Agriculture, General.","A program that focuses on the general principles and practice of agricultural research and production and that may prepare individuals to apply this knowledge to the solution of practical agricultural problems. Includes instruction in basic animal, plant, and soil science; animal husbandry and plant cultivation; soil conservation; and agricultural operations such as farming, ranching, and agricultural business.",14.0301 - Agric