@madaan
madaan / FeatureGenFromRaw.java
Last active August 29, 2015 14:04
Preprocessing + Feature generation
package edu.washington.multir.experiment;
import java.io.IOException;
import java.util.List;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.Pair;
@madaan
madaan / leastSquares.sce
Last active August 29, 2015 14:04
Least Squares Fitting : CS 215
// load the comma-separated data and convert the string entries to numbers
data = evstr(read_csv("~/cs215/session_22_07/data.txt"));
X = data(:, 1:2);
X = [ones(size(X, 1), 1) X];   // prepend a column of ones for the intercept term
y = data(:, 3);
W = inv(X' * X) * X' * y;      // normal-equation solution: W = (X'X)^(-1) X'y
predictions = X * W;
plot(predictions, 'r');
plot(y, 'g');
legend(["Predictions"; "True Value"]);
title("Linear Regression")
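The W above is the closed-form normal-equation solution. As a quick cross-check outside Scilab (a minimal sketch, not part of the gist, assuming the same comma-separated three-column data.txt), the equivalent fit in NumPy:

import numpy as np

# two feature columns plus a target column, as in the Scilab script above
data = np.genfromtxt('data.txt', delimiter=',')
X = np.column_stack([np.ones(len(data)), data[:, :2]])  # prepend the intercept column
y = data[:, 2]

# lstsq solves the same least-squares problem without forming inv(X'X) explicitly
W = np.linalg.lstsq(X, y, rcond=None)[0]
predictions = X @ W
print(W, np.abs(predictions - y).mean())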
@madaan
madaan / titanic.sce
Last active August 29, 2015 14:04
CS 215 Session : Titanic Data Analysis Script
// 0. read and load the dataset
data = read_csv("~/cs215/session_22_07/train.csv");
header = data(1, :);          // save the header row for column descriptions
data = evstr(data(2:$, :));   // drop the header and convert the strings to numbers

// column indices used in the analysis below
ageIn = 4;
genderIn = 3;
classIn = 2;
fareIn = 7;
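For anyone following the session in Python instead of Scilab, the same columns of the Kaggle train.csv can be pulled with pandas; the snippet below is a rough illustrative sketch, not part of the gist, and the groupby is just one example of the kind of breakdown the column indices above are set up for:

import pandas as pd

# Kaggle Titanic training file: includes Survived, Pclass, Sex, Age, Fare columns
df = pd.read_csv('train.csv')

# example breakdown: survival rate by passenger class and sex
print(df.groupby(['Pclass', 'Sex'])['Survived'].mean())
print(df[['Age', 'Fare']].describe())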
@madaan
madaan / aida_download_monitor.sh
Created January 22, 2014 14:40
AIDA download monitoring script
#!/bin/bash
# keep restarting the AIDA entity-repository download until curl succeeds
command='curl -O -C - http://www.mpi-inf.mpg.de/yago-naga/aida/download/entity-repository/AIDA_entity_repository_2010-08-17v5-1.sql.bz2'
restarts=0
CLICK_TIME=5
until $command; do
    echo "Download stopped!"
    # keep taking snapshots of the partial file every CLICK_TIME restarts
    if [ $((restarts % CLICK_TIME)) -eq 0 ]; then
        cp AIDA_entity_repository_2010-08-17v5-1.sql.bz2 aida_backup
@madaan
madaan / gist:8400899
Last active January 3, 2016 03:18
Simple tester for our package.
// shf.go -- minimal driver for the shuffler package
package main

import (
	"github.com/madaan/shuffler"
)

func main() {
	shuffler.Shuffle_sentence("All work and no play makes jack a dull boy")
}
@madaan
madaan / gist:8116195
Created December 24, 2013 17:51
KNN with AdaBoost
import numpy as np

def cvalidate():
    from sklearn import cross_validation
    # skip the header row of train.csv
    trainset = np.genfromtxt(open('train.csv', 'r'), delimiter=',')[1:]
    X = np.array([x[1:8] for x in trainset])   # feature columns
    y = np.array([x[8] for x in trainset])     # label column
    #print X,y
    import math
    for i, x in enumerate(X):
@madaan
madaan / gist:7947316
Created December 13, 2013 16:52
Using SVM as a weak learner for AdaBoost. Working edition.
import numpy as np

def main():
    targetset = np.genfromtxt(open('trainLabels.csv', 'r'), dtype='f16')
    target = [x for x in targetset]
    trainset = np.genfromtxt(open('train.csv', 'r'), delimiter=',', dtype='f16')
    train = np.array([x for x in trainset])
    testset = np.genfromtxt(open('test.csv', 'r'), delimiter=',', dtype='f16')
    # reduce train and test to the same PCA components before boosting
    train, testset = decomposition_pca(train, testset)
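The preview cuts off before decomposition_pca and the boosting step. As a rough illustration of the same idea rather than the gist's actual code, AdaBoost with an SVC weak learner can be wired up in scikit-learn as below (toy data and parameters are made up; recent releases use the estimator keyword, older ones base_estimator):

import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# toy data standing in for the PCA-reduced features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# SAMME only needs hard predictions, so SVC works without probability estimates
clf = AdaBoostClassifier(
    estimator=SVC(kernel='rbf', C=1.0),   # base_estimator= on older scikit-learn
    algorithm='SAMME',
    n_estimators=10,
)
print(cross_val_score(clf, X, y, cv=3).mean())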
@madaan
madaan / gist:7946708
Created December 13, 2013 16:17
Using AdaBoost with SVM as a weak learner [fails]
import numpy as np
from sklearn import cross_validation   # sklearn.model_selection in newer scikit-learn releases

def cvalidate():
    targetset = np.genfromtxt(open('trainLabels.csv', 'r'), dtype='f16')
    y = [x for x in targetset]
    trainset = np.genfromtxt(open('train.csv', 'r'), delimiter=',', dtype='f16')
    X = np.array([x for x in trainset])
    # hold out 30% of the data for evaluation
    X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.3, random_state=0)
    X_train, X_test = decomposition_pca(X_train, X_test)
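decomposition_pca itself is not visible in either preview; judging only from how it is called, it presumably fits a PCA on the training split and applies the same projection to the test split. A hypothetical reconstruction (the component count is a placeholder):

from sklearn.decomposition import PCA

def decomposition_pca(train, test, n_components=12):
    # hypothetical sketch: fit the PCA on the training data only,
    # then project both splits onto the same components
    pca = PCA(n_components=n_components)
    return pca.fit_transform(train), pca.transform(test)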