Vlad Niculae vene

vene / socket.io.js
Last active March 8, 2019 09:27
socketio-client 0.9.16 fork
/*! Socket.IO.js build:0.9.16, development. Copyright(c) 2011 LearnBoost <[email protected]> MIT Licensed */
/* Modifications by Vlad Niculae <[email protected]>
Available at https://gist.github.com/vene/c0657d854ae74a4511d2
Forked from https://raw.githubusercontent.com/Automattic/socket.io-client/ \
0.9.16/dist/socket.io.js
Changes:
vene / siegel.py
Last active August 29, 2015 14:09
# Author: Vlad Niculae <[email protected]>
# License: 2-clause BSD
"""2D implementation of the robust Siegel Repeated Median slope estimator
This estimator tolerates corruption of up to 50% of the input points in either
the X or the Y dimension.
Vectorized implementation, and a naive implementation for sanity-check.
"""
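As a rough illustration of the estimator described in the docstring above, here is a minimal naive sketch of the repeated median slope: for each point, take the median of the slopes to every other point, then take the median of those medians. This is not the gist's vectorized code; the function name `repeated_median_slope` is hypothetical, and the sketch assumes all x values are distinct.

```python
from statistics import median


def repeated_median_slope(x, y):
    """Naive O(n^2) repeated median slope (hypothetical helper).

    Assumes all x values are distinct, so no pairwise slope divides by zero.
    """
    n = len(x)
    inner = []
    for i in range(n):
        # Slopes of the lines joining point i to every other point j.
        slopes = [(y[j] - y[i]) / (x[j] - x[i]) for j in range(n) if j != i]
        inner.append(median(slopes))
    # Outer median over the per-point medians.
    return median(inner)


# Points on the line y = 2x + 1, with 4 of the 10 y values corrupted.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
for k in (1, 4, 7, 8):
    y[k] = 100
print(repeated_median_slope(x, y))  # prints 2.0
```

With 4 of 10 points corrupted, each clean point still sees a majority of clean pairwise slopes, so both medians land on the true slope.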
vene / keybase.md
Last active February 12, 2019 19:25

Keybase proof

I hereby claim:

  • I am vene on github.
  • I am vladn (https://keybase.io/vladn) on keybase.
  • I have a public key ASDuOJyHfNOqEqi3_3T0noSsAbKFt2dTowwoihXfRoguwAo

To claim this, I am signing this object:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
deaths = [596577, 142942, 73831, 41374, 39518, 21176, 7683, 6849]
money = [54.1, 7, 4.2, 257.85, 3.2, 147, 14, 22.9]
names = ["Heart disease", "COPD", "Diabetes", "Breast cancer",
         "Suicide", "Prostate cancer", "HIV/AIDS", "Motor neuron disease"]
sns.set_style("white")
vene / LICENSE
Last active August 29, 2015 14:05
TweetNLP POS tagger with Stanford WordToSentence joining
The full tagger software package is licensed as GPL version 2.
src/ -- All original code we've written -- the files in src/ with one
exception below -- we license under the Apache License version 2.0. However,
we have several GPL'd dependencies that we include in this package, which,
as we understand it, force the full package to be GPL.
src/cmu/arktweetnlp/impl/OWLQN.java -- is licensed GPL, originally from the
Stanford POS Tagger version 2010-05-26.
vene / lemmatize.pl
Last active August 29, 2015 14:05
Lemmatize CONLL-style (tabular) POS-tagged file using Treex
#!/usr/bin/env perl
# Lemmatize CONLL-style (tabular) POS-tagged file using Treex
# Prerequisites: cpan -i -f Treex::Tool::EnglishMorpho::Lemmatizer
# (I think the -f is needed because some tests are failing)
# Usage example:
# $ echo "1\tgoes\t_\tVBZ\n" > example
# $ <example ./lemmatize.pl
# 1 goes go VBZ
#
# Author: Vlad Niculae <[email protected]>
# Licence: BSD
from __future__ import division, print_function
import numpy as np
from sklearn.utils import check_random_state
class SquaredLoss(object):
    def loss(self, y, pred):
vene / lbfgs_l1logistic.py
Last active January 14, 2023 20:30
Solving L1-regularized problems with l-bfgs-b
"""l-bfgs-b L1-Logistic Regression solver"""
# Author: Vlad Niculae <[email protected]>
# Suggested by Mathieu Blondel
from __future__ import division, print_function
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
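The preview above stops at the imports, so the following is only a sketch of one common way to fit an L1-penalized logistic regression with L-BFGS-B, and not necessarily the gist's own code. It assumes the standard reformulation: write the weights as w = u - v with u, v >= 0, so the non-smooth penalty alpha * ||w||_1 becomes the linear term alpha * sum(u + v) under simple bound constraints. The helper name `l1_logistic_lbfgsb`, the `alpha` parameter, and the {-1, +1} label convention are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
from scipy.special import expit


def l1_logistic_lbfgsb(X, y, alpha=1.0):
    """Hypothetical helper: L1-penalized logistic regression, y in {-1, +1}."""
    n_features = X.shape[1]

    def objective(uv):
        # Split the stacked variable into positive and negative parts.
        u, v = uv[:n_features], uv[n_features:]
        w = u - v
        z = y * X.dot(w)
        # Logistic loss plus the (now linear) L1 penalty.
        loss = np.sum(np.logaddexp(0, -z)) + alpha * np.sum(uv)
        # Gradient of the logistic loss with respect to w.
        grad_w = -X.T.dot(y * expit(-z))
        # Chain rule: dw/du = +1, dw/dv = -1; the penalty adds alpha to both.
        grad = np.concatenate([grad_w, -grad_w]) + alpha
        return loss, grad

    uv0 = np.zeros(2 * n_features)
    bounds = [(0, None)] * (2 * n_features)  # enforce u >= 0 and v >= 0
    uv, _, _ = fmin_l_bfgs_b(objective, uv0, bounds=bounds)
    return uv[:n_features] - uv[n_features:]


# Toy data: the first feature separates the classes, the second is all zeros.
X = np.array([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
print(l1_logistic_lbfgsb(X, y, alpha=0.1))
```

Because the objective returns both the value and the gradient, `fmin_l_bfgs_b` is called without `fprime` or `approx_grad`. A large enough `alpha` keeps every coordinate pinned at its lower bound, which is how the sparsity shows up in this formulation.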
from __future__ import print_function
from sklearn.feature_extraction.text import CountVectorizer
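# sklearn.grid_search and sklearn.cross_validation were removed in
# scikit-learn 0.20; both classes now live in sklearn.model_selection.
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.pipeline import make_pipeline
from sklearn.dummy import DummyClassifier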
docs = ["the cat lives in the hat", "the quick brown fox jumps over a dog",
        "a clockwork orange"]
import re
from collections import OrderedDict
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
class LexicalSetVectorizer(BaseEstimator, TransformerMixin):
    def __init__(self, word_sets=None, normalize=False, lower=False,
                 token_pattern=r'(?u)\b\w\w+\b'):  # ur'' is a syntax error on Python 3
        self.word_sets = word_sets
        self.normalize = normalize