Dan Ofer ddofer

@karpathy
karpathy / min-char-rnn.py
Last active December 31, 2025 01:12
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
from lasagne.layers import Layer
class HighwayLayer(Layer):
    def __init__(self, incoming, layer_class, gate_nonlinearity=None,
                 **kwargs):
        super(HighwayLayer, self).__init__(incoming)
        self.H_layer = layer_class(incoming, **kwargs)
        if gate_nonlinearity:
@tokestermw
tokestermw / preprocess-twitter.py
Last active January 2, 2023 07:16
Python version of Ruby script to preprocess tweets for use in GloVe featurization http://nlp.stanford.edu/projects/glove/
"""
preprocess-twitter.py
python preprocess-twitter.py "Some random text with #hashtags, @mentions and http://t.co/kdjfkdjf (links). :)"
Script for preprocessing tweets by Romain Paulus
with small modifications by Jeffrey Pennington
with translation to Python by Motoki Wu
Translation of Ruby script to create features for GloVe vectors for Twitter data.
@kastnerkyle
kastnerkyle / conv_deconv_vae.py
Last active October 19, 2024 08:20
Convolutional Variational Autoencoder, modified from Alec Radford at (https://gist.github.com/Newmu/a56d5446416f5ad2bbac)
# Alec Radford, Indico, Kyle Kastner
# License: MIT
"""
Convolutional VAE in a single file.
Bringing in code from IndicoDataSolutions and Alec Radford (NewMu)
Additionally converted to use default conv2d interface instead of explicit cuDNN
"""
import theano
import theano.tensor as T
from theano.compat.python2x import OrderedDict
import seaborn as sns
from scipy.optimize import curve_fit
# Function for linear fit
def func(x, a, b):
    return a + b * x
# Seaborn conveniently provides the data for
# Anscombe's quartet.
df = sns.load_dataset("anscombe")
@larsmans
larsmans / supervised_tf.py
Created October 9, 2014 11:19
Supervised tf (tf-chi², tf-rf) for scikit-learn
import numpy as np
#from scipy.special import chdtrc
from scipy.sparse import spdiags
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import LabelBinarizer
def _chisquare(f_obs, f_exp, reduce):
    """Replacement for scipy.stats.chisquare with custom reduction.
@roycoding
roycoding / forest.md
Last active January 23, 2018 08:05
Beat the Benchmark: Forest Cover Type Prediction

Beating the Forest Cover Type Prediction benchmark

Day 4 of the Beat 5 Kaggle Benchmarks in 5 Days challenge

For the Forest Cover Type Prediction competition on Kaggle, the goal is to predict the predominant type of trees in a given section of forest. The score is based on average classification accuracy for the 7 different tree cover classes.

To beat the all-fir/spruce benchmark I obviously tried a random forest. Using the default settings of scikit-learn's RandomForestClassifier, I was able to beat the benchmark with an accuracy score of 0.72718 on the competition leaderboard. By using 100 estimators (versus the default of 10 at the time), I raised that accuracy score to 0.75455.
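
As a rough illustration of the 10-versus-100-trees comparison, here is a minimal sketch. It does not use the Kaggle data: the 7-class data set from `make_classification`, the hold-out split, and every variable name are assumptions made for the example, so the printed accuracies will not match the leaderboard scores quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the competition data (assumption): 7 classes,
# mirroring the 7 cover types, with made-up numeric features.
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=8,
    n_classes=7, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit one forest per ensemble size and record hold-out accuracy.
scores = {}
for n_trees in (10, 100):
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    forest.fit(X_train, y_train)
    scores[n_trees] = forest.score(X_val, y_val)  # mean classification accuracy
    print(n_trees, "trees:", round(scores[n_trees], 3))
```

Setting `n_estimators` explicitly keeps the comparison independent of whichever default the installed scikit-learn version uses.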

Random Forest Cover Types

Using pandas I loaded the train and test data sets into Python.
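
A loading step along these lines might look like the sketch below. The in-memory CSV text, the column names (`Id`, `Elevation`, `Slope`, `Cover_Type`), and the values are illustrative assumptions standing in for the competition files; with the real data you would pass the file paths to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Tiny in-memory stand-ins for the train and test CSV files (made-up values).
train_csv = io.StringIO(
    "Id,Elevation,Slope,Cover_Type\n"
    "1,2596,3,5\n"
    "2,2590,2,5\n"
    "3,2804,9,2\n"
)
test_csv = io.StringIO(
    "Id,Elevation,Slope\n"
    "4,2680,8\n"
    "5,2683,13\n"
)

train = pd.read_csv(train_csv)
test = pd.read_csv(test_csv)

# Separate the features from the label; the test set has no label column.
X_train = train.drop(columns=["Id", "Cover_Type"])
y_train = train["Cover_Type"]
X_test = test.drop(columns=["Id"])

print(X_train.shape, y_train.shape, X_test.shape)
```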

@syhw
syhw / dnn.py
Last active October 19, 2024 08:20
A simple deep neural network with or w/o dropout in one file.
"""
A deep neural network with or w/o dropout in one file.
License: Do What The Fuck You Want to Public License http://www.wtfpl.net/
"""
import numpy, theano, sys, math
from theano import tensor as T
from theano import shared
from theano.tensor.shared_randomstreams import RandomStreams
@bsweger
bsweger / useful_pandas_snippets.md
Last active December 21, 2025 05:17
Useful Pandas Snippets

Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
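
One way to do this conversion is `pd.to_numeric`, sketched below. The preview does not show the original snippet, so this may differ from it, and the Series names and values are made up for illustration. The second half shows the error on non-numeric values and the `errors="coerce"` option, which turns unparseable entries into NaN instead.

```python
import pandas as pd

# A string-typed Series whose values are all parseable as numbers.
clean = pd.Series(["1", "2.5", "3"])
numeric = pd.to_numeric(clean)
print(numeric.dtype)

# A Series with a non-numeric entry: the default behaviour is to raise.
dirty = pd.Series(["1", "oops", "3"])
try:
    pd.to_numeric(dirty)
    raised = False
except ValueError:
    raised = True
print("raised:", raised)

# errors="coerce" replaces unparseable values with NaN instead of raising.
coerced = pd.to_numeric(dirty, errors="coerce")
print(coerced.isna().tolist())
```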

@syhw
syhw / dnn_compare_optims.py
Created July 21, 2014 09:07
comparing SGD vs SAG vs Adadelta vs Adagrad
"""
A deep neural network with or w/o dropout in one file.
"""
import numpy
import theano
import sys
import math
from theano import tensor as T
from theano import shared