@dgrtwo
dgrtwo / mnist_pairs.R
Created May 31, 2017 18:56
Comparing pairs of MNIST digits based on one pixel
library(tidyverse)
# Data is downloaded from here:
# https://www.kaggle.com/c/digit-recognizer
kaggle_data <- read_csv("~/Downloads/train.csv")
pixels_gathered <- kaggle_data %>%
  mutate(instance = row_number()) %>%
  gather(pixel, value, -label, -instance) %>%
  extract(pixel, "pixel", "(\\d+)", convert = TRUE)
@raineorshine
raineorshine / human-readable-hash-comparisons.md
Last active July 21, 2024 20:31
An aesthetic comparison of a few human-readable hashing functions.

An Aesthetic Comparison of Human-Readable Hashing Functions

The following compares the output of several creative hash functions designed for human readability.

SHA-1 hashes are used here merely as arbitrary, longer, well-distributed input values.

input 1 word output 2 word output 3 word output
@calstad
calstad / TDA_resources.md
Last active February 6, 2025 02:27
List of resources for TDA

Quick List of Resources for Topological Data Analysis with Emphasis on Machine Learning

This is just a quick list of resources on TDA that I put together for @rickasaurus after he asked on Twitter for links to papers, books, etc. It is by no means an exhaustive list.

Survey Papers

Both Carlsson's and Ghrist's survey papers offer a very good introduction to the subject.

Other Papers and Web Resources

@mblondel
mblondel / multiclass_svm.py
Last active March 3, 2023 07:57
Multiclass SVMs
"""
Multiclass SVMs (Crammer-Singer formulation).
A pure Python re-implementation of:
Large-scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex.
Mathieu Blondel, Akinori Fujino, and Naonori Ueda.
ICPR 2014.
http://www.mblondel.org/publications/mblondel-icpr2014.pdf
"""
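The docstring above names Euclidean projection onto the simplex as the core step. As a rough illustration only, the sketch below shows one common sort-based way to compute that projection in plain Python. The function name `project_simplex` and the radius parameter `z` are assumptions made for this example; they are not taken from the gist or the paper, and the gist's own code may differ.

```python
def project_simplex(v, z=1.0):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == z}.

    Illustrative sort-based sketch (hypothetical helper, O(n log n)).
    """
    if z <= 0:
        raise ValueError("z must be positive")
    u = sorted(v, reverse=True)          # entries in decreasing order
    running_sum = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        running_sum += u_j
        candidate = (running_sum - z) / j
        # The indices that satisfy this test form a prefix of the sorted
        # list, so the last candidate that passes is the final threshold.
        if u_j - candidate > 0:
            theta = candidate
    # Shift every entry by the threshold and clip at zero.
    return [max(x - theta, 0.0) for x in v]


if __name__ == "__main__":
    print(project_simplex([2.0, 0.0]))
    print(project_simplex([0.5, 0.5, 0.5]))
```

A point that already lies on the simplex is returned unchanged, which makes for a quick sanity check when experimenting with the sketch.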
Wordlist ver 0.732 - EXPECT INCOMPATIBLE CHANGES;
acrobat africa alaska albert albino album
alcohol alex alpha amadeus amanda amazon
america analog animal antenna antonio apollo
april aroma artist aspirin athlete atlas
banana bandit banjo bikini bingo bonus
camera canada carbon casino catalog cinema
citizen cobra comet compact complex context
credit critic crystal culture david delta
dialog diploma doctor domino dragon drama
@omangin
omangin / nmf_kl.py
Last active August 9, 2019 06:30
Non-negative matrix factorization for I divergence
""" Non-negative matrix factorization for I divergence
This code implements Lee and Seung's multiplicative updates algorithm
for NMF with I divergence cost.
Lee D. D., Seung H. S., Learning the parts of objects by non-negative
matrix factorization. Nature, 1999
"""
# Author: Olivier Mangin <[email protected]>
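For readers who want to see the shape of the multiplicative updates the docstring refers to, here is a minimal pure-Python sketch for the I divergence (generalized Kullback-Leibler) cost. It is not the gist's code: the names `nmf_kl`, `matmul` and `i_divergence`, the random initialization, and the small `eps` guard against division by zero are all assumptions made for this example.

```python
import math
import random


def matmul(a, b):
    """Plain-list matrix product (hypothetical helper)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def i_divergence(v, wh, eps=1e-12):
    """Sum over entries of v*log(v/wh) - v + wh."""
    total = 0.0
    for v_row, wh_row in zip(v, wh):
        for x, y in zip(v_row, wh_row):
            if x > 0:
                total += x * math.log(x / (y + eps))
            total += y - x
    return total


def nmf_kl(v, rank, n_iter=50, seed=0, eps=1e-12):
    """Approximate v (n x m, non-negative) by w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    history = [i_divergence(v, matmul(w, h))]
    for _ in range(n_iter):
        # Update h: scale each entry by a weighted average of v / (w h).
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(m)] for i in range(n)]
        for k in range(rank):
            col_sum = sum(w[i][k] for i in range(n)) + eps
            for j in range(m):
                num = sum(w[i][k] * ratio[i][j] for i in range(n))
                h[k][j] *= num / col_sum
        # Update w with the symmetric rule, using the refreshed h.
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(m)] for i in range(n)]
        for k in range(rank):
            row_sum = sum(h[k][j] for j in range(m)) + eps
            for i in range(n):
                num = sum(h[k][j] * ratio[i][j] for j in range(m))
                w[i][k] *= num / row_sum
        history.append(i_divergence(v, matmul(w, h)))
    return w, h, history


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
    _, _, hist = nmf_kl(data, rank=2)
    print(hist[0], hist[-1])
```

Because each update only multiplies by non-negative factors, `w` and `h` stay non-negative without any explicit clipping, which is the main appeal of this family of updates.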
@dmglab
dmglab / git_bible.md
Last active March 9, 2024 02:59
how to git

Note: this is a summary of different git workflows, put together into a small git bible. References appear inline in the text.


How to Branch

Try to keep your hacking out of master and create feature branches. The [feature-branch workflow][4] is a good middle ground between noobs (I have no idea how to branch) and git veterans (let's do some rocket science with git branches!). Everybody gets the idea!

Basic usage examples

@debasishg
debasishg / gist:8172796
Last active April 20, 2025 12:45
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&amp;rep=rep1&amp;t
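To make the frequent-items problem mentioned in the list above a little more concrete, here is a small sketch of the Misra-Gries summary, one classical counter-based approach. It is offered only as an illustration and is not taken from any of the linked papers; the function name `misra_gries` and its parameters are assumptions for this example.

```python
def misra_gries(stream, k):
    """Keep at most k - 1 counters while scanning the stream once.

    Any item occurring more than len(stream) / k times is guaranteed to
    survive in the result; the stored counts are underestimates.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free slot: decrement every counter and drop the zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    events = ["a", "b", "a", "c", "a", "a", "b", "a", "a", "b"]
    print(misra_gries(events, k=2))
```

The summary can contain false positives, so a second pass over the data (when one is possible) is the usual way to confirm the true counts of the surviving candidates.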
@jboner
jboner / latency.txt
Last active April 23, 2025 18:02
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
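The comparison column in the table is plain arithmetic on the nanosecond figures. The short sketch below recomputes a few of those ratios and unit conversions; the dictionary `LATENCY_NS` and the helpers `ratio` and `ns_to_us` are hypothetical names introduced for this example, and the values are copied from the rows shown above.

```python
# Latencies in nanoseconds, copied from the rows above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 1K bytes over 1 Gbps network": 10_000,
    "Read 4K randomly from SSD": 150_000,
}


def ratio(slower, faster):
    """How many times longer `slower` takes than `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def ns_to_us(ns):
    """Convert nanoseconds to microseconds."""
    return ns / 1_000


if __name__ == "__main__":
    print(ratio("L2 cache reference", "L1 cache reference"))
    print(ratio("Main memory reference", "L1 cache reference"))
    print(ns_to_us(LATENCY_NS["Read 4K randomly from SSD"]))
```

From these figures, main memory versus L2 works out to roughly 14x, so the table's 20x is probably best read as a rounded rule of thumb rather than an exact quotient.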
@agramfort
agramfort / nngarotte.py
Created April 10, 2012 12:31
Non Negative Garotte
"""
Non-Negative Garotte implementation with scikit-learn
"""
# Author: Alexandre Gramfort <[email protected]>
# Jaques Grobler (__main__ script) <[email protected]>
#
# License: BSD Style.
import numpy as np