mutedial (gregdl) · tokyo
@darinwilson
darinwilson / ambient1
Created August 14, 2015 19:46
Ambient experiment using Sonic Pi
# Ambient experiment for Sonic Pi (http://sonic-pi.net/)
#
# The piece consists of three long loops, each of which plays one of
# two randomly selected pitches. Each note has different attack,
# release and sleep values, so that they move in and out of phase
# with each other. This can play for quite a while without
# repeating itself :)
live_loop :note1 do
  use_synth :hollow
  # illustrative completion: the preview cuts off here; each loop plays one of
  # two pitches with its own attack/release/sleep so the loops drift out of
  # phase (the specific notes and durations below are assumptions)
  play [:D4, :E4].choose, attack: 4, release: 4
  sleep 8
end
anonymous
anonymous / findsimilar.py
Created June 9, 2015 17:16
from whoosh.index import open_dir
from whoosh.index import create_in
from whoosh.fields import *
from whoosh.qparser import QueryParser
import glob
import os
# USER SET PARAMETERS ############
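The preview stops at the parameter block. As a minimal sketch of the find-similar workflow these imports suggest (the file paths, field names, and the more_like_this step are assumptions, not the gist's exact code):

import glob
import os
from whoosh.index import create_in
from whoosh.fields import Schema, TEXT, ID
from whoosh.qparser import QueryParser

corpus_glob = "corpus/*.txt"  # hypothetical parameters
index_dir = "indexdir"

# index every text file in the corpus folder
schema = Schema(path=ID(stored=True, unique=True), content=TEXT(stored=True))
if not os.path.exists(index_dir):
    os.mkdir(index_dir)
ix = create_in(index_dir, schema)
writer = ix.writer()
for path in glob.glob(corpus_glob):
    with open(path) as f:
        writer.add_document(path=path, content=f.read())
writer.commit()

# rank documents similar to the top hit for a seed query
with ix.searcher() as searcher:
    query = QueryParser("content", ix.schema).parse("example topic")
    hits = searcher.search(query, limit=1)
    if hits:
        for sim in hits[0].more_like_this("content"):
            print(sim["path"])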
@soodoku
soodoku / text_classifier.R
Last active December 15, 2016 17:44
Basic Text Classifier
"
Basic Text Classifier
- Takes a csv with a text column and a column of labels
- Splits into train and test
- Preprocesses text using tm: bag-of-words plus 1st/2nd-order Markov (n-gram) features
- Uses SVM and Lasso
@author: Gaurav Sood
"
@drjwbaker
drjwbaker / 2015-04-17_IOR.py
Last active August 29, 2015 14:19
Python script that searches a TSV file for strings listed in a text file and writes a new TSV containing only the matching lines
#!/usr/bin/env python
# I have a list of strings in a text file (`mylist.txt`). I want to search for these
# strings in a tsv file (`somestuff.tsv`) and keep only the lines in which they appear.
# Some strings in the text file will not appear in the tsv file.
# See https://gist.github.com/MartinPaulEve/c0610fa89da4df4d546a
# use "with" blocks to automatically close I/O streams
with open('mylist.txt') as word_list:
    words = [line.strip() for line in word_list if line.strip()]
# illustrative completion beyond the preview (the output filename is an assumption):
with open('somestuff.tsv') as tsv:
    output = [line for line in tsv if any(w in line for w in words)]
with open('matches.tsv', 'w') as out:
    out.writelines(output)
@jennybc
jennybc / 2014-10-12_stop-working-directory-insanity.md
Last active August 7, 2025 01:00
Stop the working directory insanity

There are packages for this now!

2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:

  • rprojroot: This is the main package with functions to help you express paths in a way that will "just work" when developing interactively in an RStudio Project and when you render your file.
  • here: A lightweight wrapper around rprojroot that anticipates the most likely scenario: you want to write paths relative to the top-level directory, defined as an RStudio project or Git repo. TRY THIS FIRST.

I love these packages so much I wrote an ode to here.

I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆

@brianckeegan
brianckeegan / backbone_extractor.py
Last active August 2, 2023 19:26
Given a networkx graph with weighted edges and a significance threshold alpha, return a new networkx graph containing the "backbone" of the original: the subset of weighted edges that pass the disparity filter of Serrano et al. 2008.
# Serrano, Boguna, Vespignani backbone extractor
# from http://www.pnas.org/content/106/16/6483.abstract
# Thanks to Michael Conover and Qian Zhang at Indiana with help on earlier versions
# Thanks to Clay Davis for pointing out an error
import networkx as nx

def extract_backbone(g, weight='weight', alpha=.05):
    # illustrative completion beyond the preview, following the paper's
    # disparity filter: keep edge (i, j) when (1 - w_ij/s_i)**(k_i - 1) < alpha
    backbone_graph = nx.Graph()
    for node in g:
        k = g.degree(node)
        if k > 1:
            s = float(sum(g[node][nbr][weight] for nbr in g[node]))
            for nbr in g[node]:
                w = g[node][nbr][weight]
                if (1.0 - w / s) ** (k - 1) < alpha:
                    backbone_graph.add_edge(node, nbr, **{weight: w})
    return backbone_graph
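A quick usage sketch with made-up weights (alpha = 0.3 here just to keep a tiny toy graph interesting; the default 0.05 prunes much harder):

import networkx as nx

g = nx.Graph()
g.add_weighted_edges_from([
    ("a", "b", 10.0), ("a", "c", 0.1), ("a", "d", 0.1),
    ("b", "c", 5.0), ("c", "d", 0.2),
])
# only the heavily weighted edges a-b and b-c survive the filter
print(extract_backbone(g, alpha=0.3).edges(data=True))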
@benmarwick
benmarwick / citation-analysis-sketch.R
Last active February 1, 2025 15:00
sketch of citation analysis
# sources:
# http://www.jgoodwin.net/?p=1223
# http://orgtheory.wordpress.com/2012/05/16/the-fragile-network-of-econ-soc-readings/
# http://nealcaren.web.unc.edu/a-sociology-citation-network/
# http://kieranhealy.org/blog/archives/2014/11/15/top-ten-by-decade/
# http://www.jgoodwin.net/lit-cites.png
###########################################################################
# This first section scrapes content from the Web of Science webpage. It takes
# coding=UTF-8
import nltk
from nltk.corpus import brown
# This is a fast and simple noun phrase extractor (based on NLTK)
# Feel free to use it, just keep a link back to this post
# http://thetokenizer.com/2013/05/09/efficient-way-to-extract-the-main-topics-of-a-sentence/
# Created by Shlomi Babluki
# May, 2013
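The preview ends before the extractor itself. As a rough illustration of the same idea (a regex chunk grammar over POS tags; this is not Babluki's exact method):

import nltk

# one-time downloads: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")  # det? adjectives* nouns+

def noun_phrases(text):
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    return [" ".join(word for word, tag in subtree.leaves())
            for subtree in chunker.parse(tagged).subtrees()
            if subtree.label() == "NP"]

print(noun_phrases("The quick brown fox jumped over the lazy dog."))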
@benmarwick
benmarwick / R2MALLET.r
Last active April 12, 2021 10:27
R code to operate MALLET entirely from within R. Set variables, send commands to the Windows command console, and get MALLET's results back into R for further analysis.
# Set working directory
dir <- "C:\\"  # adjust to suit
setwd(dir)
# configure variables and filenames for MALLET
## here using MALLET's built-in example data and
## variables from http://programminghistorian.org/lessons/topic-modeling-and-mallet
# folder containing txt files for MALLET to work on
importdir <- "C:\\mallet-2.0.7\\sample-data\\web\\en"
# illustrative continuation beyond the preview: build MALLET's import command
# and send it to the Windows console (the variable names here are assumptions)
mallet <- "C:\\mallet-2.0.7\\bin\\mallet"
import <- paste(mallet, "import-dir --input", importdir,
                "--output topic-input.mallet --keep-sequence --remove-stopwords")
shell(import)
@gupul2k
gupul2k / pos_tagging.py
Created November 2, 2012 13:32
NER and POS Tagging with NLTK and Python
# Script tags POS and NER (Named Entity Recognition) for a supplied text file.
# Date: Nov 2 2012
# Author: Hota Sobhan
import nltk

f = open(r'C:\Python27\Test_File.txt')  # raw string so the backslashes are literal
data = f.readlines()
# Parse the text file for NER with POS tagging
# (illustrative completion beyond the preview; needs nltk's tagger and chunker models)
for line in data:
    tagged = nltk.pos_tag(nltk.word_tokenize(line))
    print(nltk.ne_chunk(tagged))