Created
August 16, 2012 01:03
Never use scipy.stats!!
import scipy.stats as ss
import math
import numpy as np
from timeit import timeit
np_log_poisson = lambda k, l: -l + k * np.log(l) - math.lgamma(k+1)
log_poisson = lambda k, l: -l + k * math.log(l) - math.lgamma(k+1)
%timeit -n 10000 ss.poisson.logpmf(5, 4) # 164.0 us
%timeit -n 10000 ss.poisson._logpmf(5, 4) # 23.1 us
%timeit -n 10000 np_log_poisson(5, 4) # 11.2 us
%timeit -n 10000 log_poisson(5, 4) # 1.2 us
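The `%timeit` lines above are IPython magics and will not run in a plain Python script. A minimal sketch of the same scalar measurement using the standard-library `timeit` function (which the gist imports) might look like the following. The variable names `direct`, `n`, and `per_call_us` are illustrative and not part of the gist, and the measured time will vary by machine and library version.

```python
import math
from timeit import timeit

def log_poisson(k, lam):
    """Log of the Poisson pmf at count k with rate lam (scalar inputs only)."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

# Sanity check against the pmf computed directly from its definition:
# P(k; lam) = exp(-lam) * lam**k / k!
direct = math.log(math.exp(-4) * 4**5 / math.factorial(5))
assert math.isclose(log_poisson(5, 4), direct)

# timeit returns the total seconds for `number` calls;
# divide by the call count and scale to microseconds per call.
n = 10000
per_call_us = timeit(lambda: log_poisson(5, 4), number=n) / n * 1e6
print(f"log_poisson: {per_call_us:.2f} us per call")
```

The lambda wrapper adds a small call overhead of its own, so the figure is an upper bound on the cost of the formula rather than an exact match for `%timeit`.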
In [34]: x=np.array(flatten(np.arange(100) for _ in range(1000)))  # flatten: helper not shown in this comment
In [36]: %timeit ss.poisson.logpmf(x,4)
100 loops, best of 3: 18.6 ms per loop
In [37]: %timeit [np_log_poisson(y,4) for y in x]
1 loops, best of 3: 1.59 s per loop
(to be pedantic about it)
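The list comprehension in the comment above is slow because it calls the scalar formula once per element. A hand-written formula can be vectorized as well, which makes for a fairer comparison on arrays. The sketch below assumes `scipy.special.gammaln` as the array-capable replacement for `math.lgamma`, and assumes `np.tile(np.arange(100), 1000)` builds the same array as the `flatten` call, whose definition is not shown. The name `np_log_poisson_vec` is hypothetical.

```python
import math
import numpy as np
from scipy.special import gammaln  # log-gamma that accepts arrays

def np_log_poisson_vec(k, lam):
    """Hypothetical array version of the gist's formula (no input checking)."""
    k = np.asarray(k, dtype=float)
    return -lam + k * np.log(lam) - gammaln(k + 1)

# Assumed equivalent of the comment's array: 1000 concatenated copies of 0..99.
x = np.tile(np.arange(100), 1000)

vec = np_log_poisson_vec(x, 4)

# Cross-check the first 100 entries against the scalar formula.
loop = np.array([-4 + y * math.log(4) - math.lgamma(y + 1) for y in x[:100]])
assert np.allclose(vec[:100], loop)
```

Timing `np_log_poisson_vec(x, 4)` against `ss.poisson.logpmf(x, 4)` with `%timeit` would show how much of the SciPy cost on arrays comes from argument checking rather than the arithmetic itself.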
If you don't need all the features, but just want the function for a single scalar, then of course you don't want to pay the cost of all the checking and the vectorization. It is well known that math.log is faster than numpy.log on a single scalar. It's an exaggeration and misleading to say never use scipy.stats. Perhaps a nice interface to fast scalar-equivalents would be a useful thing, though.
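As a rough illustration of the "fast scalar-equivalents" idea, the following sketch dispatches on the input type: plain Python numbers take the `math` path, anything else takes a NumPy path. The function name `poisson_logpmf` is hypothetical and is not a SciPy API, and the sketch deliberately skips the validation that `scipy.stats` performs (for example, that `k` is a non-negative integer and `lam` is positive).

```python
import math
import numpy as np
from scipy.special import gammaln  # log-gamma that accepts arrays

def poisson_logpmf(k, lam):
    """Hypothetical helper: scalar fast path, array fallback, no validation."""
    if isinstance(k, (int, float)) and isinstance(lam, (int, float)):
        # Fast path: pure-Python scalars, no array overhead.
        return -lam + k * math.log(lam) - math.lgamma(k + 1)
    # General path: broadcast over arrays.
    k = np.asarray(k, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return -lam + k * np.log(lam) - gammaln(k + 1)

scalar = poisson_logpmf(5, 4)                # takes the math path
array = poisson_logpmf(np.array([5, 6]), 4)  # takes the NumPy path
```

The type check itself costs a little on every call, so a caller who knows the inputs are always scalars would still be slightly better off calling the scalar formula directly.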