mutedial (gregdl) · tokyo
@darinwilson
darinwilson / ambient1
Created August 14, 2015 19:46
Ambient experiment using Sonic Pi
# Ambient experiment for Sonic Pi (http://sonic-pi.net/)
#
# The piece consists of three long loops, each of which plays one of
# two randomly selected pitches. Each note has different attack,
# release and sleep values, so that they move in and out of phase
# with each other. This can play for quite a while without
# repeating itself :)
live_loop :note1 do
  use_synth :hollow
  # illustrative completion: the preview cuts off here; each loop plays one of
  # two pitches with its own attack/release/sleep so the loops drift out of
  # phase (the specific notes and durations below are assumptions)
  play [:D4, :E4].choose, attack: 4, release: 4
  sleep 8
end
anonymous
anonymous / findsimilar.py
Created June 9, 2015 17:16
from whoosh.index import open_dir
from whoosh.index import create_in
from whoosh.fields import *
from whoosh.qparser import QueryParser
import glob
import os
# USER SET PARAMETERS ############
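The preview stops at the parameter block. As a minimal sketch of the find-similar workflow these imports suggest (the file paths, field names, and the more_like_this step are assumptions, not the gist's exact code):

import glob
import os
from whoosh.index import create_in
from whoosh.fields import Schema, TEXT, ID
from whoosh.qparser import QueryParser

corpus_glob = "corpus/*.txt"  # hypothetical parameters
index_dir = "indexdir"

# index every text file in the corpus folder
schema = Schema(path=ID(stored=True, unique=True), content=TEXT(stored=True))
if not os.path.exists(index_dir):
    os.mkdir(index_dir)
ix = create_in(index_dir, schema)
writer = ix.writer()
for path in glob.glob(corpus_glob):
    with open(path) as f:
        writer.add_document(path=path, content=f.read())
writer.commit()

# rank documents similar to the top hit for a seed query
with ix.searcher() as searcher:
    query = QueryParser("content", ix.schema).parse("example topic")
    hits = searcher.search(query, limit=1)
    if hits:
        for sim in hits[0].more_like_this("content"):
            print(sim["path"])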
@soodoku
soodoku / text_classifier.R
Last active December 15, 2016 17:44
Basic Text Classifier
"
Basic Text Classifier
- Takes a csv with a text column and a column of labels
- Splits into train and test
- Preprocesses text using tm: bag-of-words plus 1st/2nd-order Markov (n-gram) features
- Uses SVM and Lasso
@author: Gaurav Sood
"
@drjwbaker
drjwbaker / 2015-04-17_IOR.py
Last active August 29, 2015 14:19
Python script that searches a TSV file for strings listed in a text file and writes a new TSV containing only the matching lines
#!/usr/bin/env python
# I have a list of strings in a text file (`mylist.txt`). I want to search for these
# strings in a tsv file (`somestuff.tsv`) and keep only the lines in which they appear.
# Some strings in the text file will not appear in the tsv file.
# See https://gist.github.com/MartinPaulEve/c0610fa89da4df4d546a
# use "with" blocks to automatically close I/O streams
with open('mylist.txt') as word_list:
    words = [line.strip() for line in word_list if line.strip()]
# illustrative completion beyond the preview (the output filename is an assumption):
with open('somestuff.tsv') as tsv:
    output = [line for line in tsv if any(w in line for w in words)]
with open('matches.tsv', 'w') as out:
    out.writelines(output)
@jennybc
jennybc / 2014-10-12_stop-working-directory-insanity.md
Last active August 7, 2025 01:00
Stop the working directory insanity

There are packages for this now!

2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:

  • rprojroot: This is the main package with functions to help you express paths in a way that will "just work" when developing interactively in an RStudio Project and when you render your file.
  • here: A lightweight wrapper around rprojroot that anticipates the most likely scenario: you want to write paths relative to the top-level directory, defined as an RStudio project or Git repo. TRY THIS FIRST.

I love these packages so much I wrote an ode to here.

I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆

@brianckeegan
brianckeegan / backbone_extractor.py
Last active August 2, 2023 19:26
Given a networkx graph with weighted edges and a significance threshold alpha, return a new networkx graph containing the "backbone" of the original: the subset of weighted edges that pass the disparity filter of Serrano et al. 2008.
# Serrano, Boguna, Vespignani backbone extractor
# from http://www.pnas.org/content/106/16/6483.abstract
# Thanks to Michael Conover and Qian Zhang at Indiana with help on earlier versions
# Thanks to Clay Davis for pointing out an error
import networkx as nx

def extract_backbone(g, weight='weight', alpha=.05):
    # illustrative completion beyond the preview, following the paper's
    # disparity filter: keep edge (i, j) when (1 - w_ij/s_i)**(k_i - 1) < alpha
    backbone_graph = nx.Graph()
    for node in g:
        k = g.degree(node)
        if k > 1:
            s = float(sum(g[node][nbr][weight] for nbr in g[node]))
            for nbr in g[node]:
                w = g[node][nbr][weight]
                if (1.0 - w / s) ** (k - 1) < alpha:
                    backbone_graph.add_edge(node, nbr, **{weight: w})
    return backbone_graph
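A quick usage sketch with made-up weights (alpha = 0.3 here just to keep a tiny toy graph interesting; the default 0.05 prunes much harder):

import networkx as nx

g = nx.Graph()
g.add_weighted_edges_from([
    ("a", "b", 10.0), ("a", "c", 0.1), ("a", "d", 0.1),
    ("b", "c", 5.0), ("c", "d", 0.2),
])
# only the heavily weighted edges a-b and b-c survive the filter
print(extract_backbone(g, alpha=0.3).edges(data=True))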
@benmarwick
benmarwick / citation-analysis-sketch.R
Last active February 1, 2025 15:00
sketch of citation analysis
# sources:
# http://www.jgoodwin.net/?p=1223
# http://orgtheory.wordpress.com/2012/05/16/the-fragile-network-of-econ-soc-readings/
# http://nealcaren.web.unc.edu/a-sociology-citation-network/
# http://kieranhealy.org/blog/archives/2014/11/15/top-ten-by-decade/
# http://www.jgoodwin.net/lit-cites.png
###########################################################################
# This first section scrapes content from the Web of Science webpage. It takes
# coding=UTF-8
import nltk
from nltk.corpus import brown
# This is a fast and simple noun phrase extractor (based on NLTK)
# Feel free to use it, just keep a link back to this post
# http://thetokenizer.com/2013/05/09/efficient-way-to-extract-the-main-topics-of-a-sentence/
# Created by Shlomi Babluki
# May, 2013
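The preview ends before the extractor itself. As a rough illustration of the same idea (a regex chunk grammar over POS tags; this is not Babluki's exact method):

import nltk

# one-time downloads: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")  # det? adjectives* nouns+

def noun_phrases(text):
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    return [" ".join(word for word, tag in subtree.leaves())
            for subtree in chunker.parse(tagged).subtrees()
            if subtree.label() == "NP"]

print(noun_phrases("The quick brown fox jumped over the lazy dog."))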
@benmarwick
benmarwick / R2MALLET.r
Last active April 12, 2021 10:27
R code to operate MALLET entirely from within R. Set variables, send commands to the Windows command console, and get MALLET's results back into R for further analysis.
# Set working directory
dir <- "C:\\"  # adjust to suit
setwd(dir)
# configure variables and filenames for MALLET
## here using MALLET's built-in example data and
## variables from http://programminghistorian.org/lessons/topic-modeling-and-mallet
# folder containing txt files for MALLET to work on
importdir <- "C:\\mallet-2.0.7\\sample-data\\web\\en"
# illustrative continuation beyond the preview: build MALLET's import command
# and send it to the Windows console (the variable names here are assumptions)
mallet <- "C:\\mallet-2.0.7\\bin\\mallet"
import <- paste(mallet, "import-dir --input", importdir,
                "--output topic-input.mallet --keep-sequence --remove-stopwords")
shell(import)
@gupul2k
gupul2k / pos_tagging.py
Created November 2, 2012 13:32
NER and POS Tagging with NLTK and Python
# Script tags POS and NER (Named Entity Recognition) for a supplied text file.
# Date: Nov 2 2012
# Author: Hota Sobhan
import nltk

f = open(r'C:\Python27\Test_File.txt')  # raw string so the backslashes are literal
data = f.readlines()
# Parse the text file for NER with POS tagging
# (illustrative completion beyond the preview; needs nltk's tagger and chunker models)
for line in data:
    tagged = nltk.pos_tag(nltk.word_tokenize(line))
    print(nltk.ne_chunk(tagged))