@michiexile
michiexile / gap.py
Created May 23, 2013 10:59
A Python implementation of the Gap Statistic from Tibshirani, Walther, Hastie to determine the inherent number of clusters in a dataset with k-means clustering.
# gap.py
# (c) 2013 Mikael Vejdemo-Johansson
# BSD License
#
# SciPy function to compute the gap statistic for evaluating k-means clustering.
# Gap statistic defined in
# Tibshirani, Walther, Hastie:
# Estimating the number of clusters in a data set via the gap statistic
# J. R. Statist. Soc. B (2001) 63, Part 2, pp 411-423

Keybase proof

I hereby claim:

  • I am michiexile on github.
  • I am michiexile (https://keybase.io/michiexile) on keybase.
  • I have a public key whose fingerprint is 7E64 04C5 2F0F B8E7 7449 5D3D B549 A90C C07C CCCD

To claim this, I am signing this object:

@michiexile
michiexile / RStanExamplesToPyMC3.ipynb
Last active November 30, 2019 21:32
Translating the MTH594 RStan examples to PyMC3
@michiexile
michiexile / fakecancellations.md
Created August 10, 2024 15:21
Fake Cancellations that Work

Did you ever notice that $\frac{16}{64} = \frac{1}{4}$? In fact, $\frac{1\not 6}{\not 64} = \frac{1}{4}$.

What other such examples are there?

In [7]: nummaker = lambda x,y: 10*x+y

In [8]: nummakers = [lambda x,y: nummaker(x,y), lambda x,y: nummaker(y,x)]

In [9]: [(a, b, c, top(a, b), bottom(b, c))
   ...:  for a in range(1, 10) for b in range(1, 10) for c in range(1, 10)
   ...:  for top in nummakers for bottom in nummakers
   ...:  if top(a, b) != bottom(b, c)
   ...:  if top(a, b) * c == bottom(b, c) * a]
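
The same search can be written as a standalone script. The following is a minimal sketch, not the original session: the names `fake_cancellations`, `results`, and `proper` are introduced here for illustration. It assumes the same setup as the comprehension above: digits `a`, `b`, `c` run from 1 to 9, the shared digit `b` is the one being "cancelled", and the trivial case where numerator and denominator are equal is skipped.

```python
from fractions import Fraction
from itertools import product


def fake_cancellations():
    """Find two-digit fractions where striking the shared digit b leaves a/c unchanged."""
    found = set()
    for a, b, c in product(range(1, 10), repeat=3):
        # The numerator is built from digits a and b, in either order.
        for top in (10 * a + b, 10 * b + a):
            # The denominator is built from digits b and c, in either order.
            for bottom in (10 * b + c, 10 * c + b):
                # Skip top == bottom (trivially equal to 1), then compare exactly.
                if top != bottom and Fraction(top, bottom) == Fraction(a, c):
                    found.add((top, bottom, a, c))
    return sorted(found)


results = fake_cancellations()

# Keep only the fractions below 1; the rest are their reciprocals.
proper = [(top, bottom) for top, bottom, _, _ in results if top < bottom]

for top, bottom, a, c in results:
    print(f"{top}/{bottom} = {a}/{c}")
```

Using `Fraction` instead of cross-multiplying is only a readability choice; both tests are exact on integers. Filtering to `top < bottom` separates the fractions below 1 from their reciprocals, which the search also finds.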