dpressel’s gists

dpressel / gist:d2961a8775b3ed7798d0

Created February 14, 2015 21:12

Distance between points on the earth (Vincenty algorithm) in JS

	// Vincenty distance in JS
	(function (geof) {
	var A = 6378137.0 // Meters
	// Flattening
	var F = 1/298.257223563
	// Semi-minor axis
	var B = A * (1.0 - F) // 6356752.31424518
	// First eccentricity squared
	var ESQR = 2*F - (F*F)
	var METERS_TO_MILES = 0.000621371192
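The preview stops after the ellipsoid constants. As a rough illustration of where they are used, here is a Python sketch of Vincenty's inverse formula on the WGS-84 ellipsoid. It is not the gist's JavaScript: the function name `vincenty_distance`, the iteration cap and the convergence tolerance are assumptions made for this example.

```python
import math

A = 6378137.0              # WGS-84 semi-major axis, meters
F = 1 / 298.257223563      # WGS-84 flattening
B = A * (1.0 - F)          # semi-minor axis, meters
METERS_TO_MILES = 0.000621371192


def vincenty_distance(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Geodesic distance in meters between two points given in degrees.

    max_iter and tol are illustrative choices, not values from the gist.
    """
    if lat1 == lat2 and lon1 == lon2:
        return 0.0
    # Reduced latitudes
    u1 = math.atan((1 - F) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - F) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # cos_sq_alpha is zero along the equator; the term vanishes there
        cos_2sigma_m = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
                        if cos_sq_alpha != 0 else 0.0)
        c = F / 16 * cos_sq_alpha * (4 + F * (4 - 3 * cos_sq_alpha))
        lam_prev = lam
        lam = big_l + (1 - c) * F * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sigma_m + c * cos_sigma * (-1 + 2 * cos_2sigma_m ** 2)))
        if abs(lam - lam_prev) < tol:
            break
    else:
        # Nearly antipodal points can fail to converge
        raise ValueError("Vincenty iteration did not converge")

    u_sq = cos_sq_alpha * (A * A - B * B) / (B * B)
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (
        cos_2sigma_m + big_b / 4 * (
            cos_sigma * (-1 + 2 * cos_2sigma_m ** 2)
            - big_b / 6 * cos_2sigma_m
            * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sigma_m ** 2)))
    return B * big_a * (sigma - delta_sigma)
```

Multiplying the result by `METERS_TO_MILES` gives statute miles. Along the equator the formula reduces to `A` times the longitude difference in radians, which makes a convenient sanity check.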

dpressel / counts.py

Last active August 29, 2015 14:19

PMI,Sent score for AffLex/NegLex s140 lexicon

	import os.path
	from collections import Counter
	from csv import *
	import math
	import sys
	LOG2_SCALE = 1 / math.log(2)

	NEG = 'neg'
	POS = 'pos'

dpressel / counts.py

Last active August 29, 2015 14:19

Sentiment score calculation as described in State-of-the-Art in Sentiment Analysis of Short Informal Texts

	import os.path
	from collections import Counter
	from csv import *
	import math
	import sys
	LOG2_SCALE = 1 / math.log(2)

	NEG = 'neg'
	POS = 'pos'
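The preview ends before any scoring logic. A minimal sketch of the PMI-based score the description refers to, score(w) = PMI(w, pos) - PMI(w, neg), could look like the following. The function name `sentiment_scores`, the `(label, tokens)` input shape and the add-one smoothing are assumptions for illustration, not taken from the gist.

```python
import math
from collections import Counter

LOG2_SCALE = 1 / math.log(2)  # converts natural log to log base 2

NEG = 'neg'
POS = 'pos'


def sentiment_scores(labeled_docs, smoothing=1.0):
    """Return {word: PMI(word, pos) - PMI(word, neg)} in bits.

    labeled_docs is an iterable of (label, tokens) pairs (assumed shape).
    smoothing is added to each word count to avoid log(0) (assumption).
    """
    counts = {POS: Counter(), NEG: Counter()}
    for label, tokens in labeled_docs:
        counts[label].update(tokens)

    total_pos = sum(counts[POS].values())
    total_neg = sum(counts[NEG].values())

    scores = {}
    for word in set(counts[POS]) | set(counts[NEG]):
        freq_pos = counts[POS][word] + smoothing
        freq_neg = counts[NEG][word] + smoothing
        # The difference of the two PMI terms simplifies to one log ratio
        ratio = (freq_pos * total_neg) / (freq_neg * total_pos)
        scores[word] = LOG2_SCALE * math.log(ratio)
    return scores
```

A positive score means the word co-occurs more with positive documents, a negative score the opposite, and a score near zero means the word carries little sentiment signal.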

dpressel / counts-single.py

Last active August 29, 2015 14:19

Actual sentiment score calculation performed for State-of-the-Art in Sentiment Analysis of Short Informal Texts. Does not actually start from two corpora that are split, but rather just assumes negated contexts are suffixed in a single corpus.

	import os.path
	from collections import Counter
	from csv import *
	import math
	import sys
	LOG2_SCALE = 1 / math.log(2)

	"""
	This version of counts assumes a single corpus, with negated contexts already written.
	Here we generate a single lexicon file and

dpressel / ConsumerProducerQueue.h

Created September 16, 2015 14:42

C++11 Consumer Producer Buffer with a single Condition Variable

	#ifndef __CONSUMERPRODUCERQUEUE_H__
	#define __CONSUMERPRODUCERQUEUE_H__

	#include <queue>
	#include <mutex>
	#include <condition_variable>

	/*
	* Some references in order
	*

dpressel / SumWordVecDatasetReader.java

Created November 12, 2015 13:26

Turn a set of space delimited words and a label into a sum of dense (word vector) representations using medallia's Word2Vec impl.

	package org.n3rd.util;

	import com.google.common.collect.ImmutableList;
	import com.medallia.word2vec.Searcher;
	import com.medallia.word2vec.Word2VecModel;
	import org.sgdtk.ArrayDouble;
	import org.sgdtk.DenseVectorN;
	import org.sgdtk.FeatureVector;

	import java.io.BufferedReader;

dpressel / OrderedEmbeddedDatasetReader.java

Created November 12, 2015 13:30

Read in a sentence of word vectors with optional padding for conv. using medallia's Word2vec impl.

	package org.n3rd.util;

	import com.google.common.collect.ImmutableList;
	import com.medallia.word2vec.Searcher;
	import com.medallia.word2vec.Word2VecModel;
	import org.sgdtk.DenseVectorN;
	import org.sgdtk.FeatureVector;

	import java.io.BufferedReader;
	import java.io.File;

dpressel / train_from_json.py

Created April 1, 2016 19:33

Train on preprocessed politeness input

	# Check performance of baseline on a held out set that is decimation sampled
	# Over the ranks descending
	#
	# http://www.mpi-sws.org/~cristian/Politeness_files/politeness.pdf
	#
	import sys
	import cPickle
	import numpy as np
	import nltk.data
	import json

dpressel / train_from_csv.py

Created April 1, 2016 19:43

Train on preprocessed politeness corpus sentences to make Linear SVM BoW (baseline)

	# Check performance of baseline on a held out set that is decimation sampled
	# Over the ranks descending
	#
	# http://www.mpi-sws.org/~cristian/Politeness_files/politeness.pdf
	#
	#-----------------------------------------------------------------
	# Sample output with and without Punkt sentence processing seems
	# to not have much effect
	import cPickle
	import numpy as np

dpressel / polite-deps.js

Created April 1, 2016 19:49

Preprocess TSV files using Stanford Core NLP to generate dependency and sentence features required to train politeness API

	// Nashorn (JS) script to create input documents from post-processed TSV files
	// from Stanford Politeness corpus.
	// TSV looks like
	// 1|-1\tText content
	//
	// Target data looks much like the data described here:
	// https://github.com/sudhof/politeness
	//
	// Original paper is here
	// http://www.mpi-sws.org/~cristian/Politeness_files/politeness.pdf

Daniel Pressel dpressel