See: https://github.com/arhourigan/graveler
This is the simplest improvement to the speed of the original.
Some of the improvements turned out to be micro-optimisations, such as replacing repeat(None, 231)
with range(231)
and only counting the ones. The biggest effect was replacing random.choice([1, 2, 3, 4])
with math.floor(random.random() * 4)
. Initially, I'd assumed that random.randint(0, 3)
would be faster than using random.choice()
, but it turned out to be slightly slower, which was a surprise. There's quite a bit of checking, which is the reason why. After reading the implementation of the random
module, I remembered random.getrandbits()
. Which gives you n-bits of randomness back. As this is a four-sided die, that means we need two bits of data, which means random.getrandbits(2)
is sufficient.
[Aside: I've read that the reason for how convoluted random.randint()
is has to do with ensuring it has a uniform distribution, but given the implementation of random.uniform()
, I have my doubts on that.]
Original:
keith@cessair ~> time ./graveler_orig.py
Highest Ones Roll: 88
Number of Roll Sessions: 100000
________________________________________________________
Executed in 10.45 secs fish external
usr time 10.39 secs 0.00 millis 10.39 secs
sys time 0.05 secs 1.50 millis 0.05 secs
Using if math.floor(random.random() * 4) == 0
(about 3x faster):
keith@cessair ~> time ./graveler.py
Highest ones roll: 87
Number of roll sessions: 100000
________________________________________________________
Executed in 3.37 secs fish external
usr time 3.28 secs 0.71 millis 3.28 secs
sys time 0.08 secs 1.17 millis 0.08 secs
Using if random.getrandbits(2) == 0
(just over 6x faster):
keith@cessair ~> time ./graveler.py
Highest ones roll: 90
Number of roll sessions: 100000
________________________________________________________
Executed in 1.71 secs fish external
usr time 1.62 secs 939.00 micros 1.62 secs
sys time 0.10 secs 965.00 micros 0.10 secs
There's almost certainly some cleverness with NumPy somebody could do. Running everything on multiple threads might speed things up, but the GIL will probably be a factor.
I stuck to 100,000 because I'm still using a laptop I bought about a decade ago. That's enough to get a decent benchmark though.
I wrote a version in Go that's essentially identical to the Python version that gave the 3x speedup. It was actually nowhere near as much faster than the Python version as I'd expected. Here it is with 1,000,000 rolls:
keith@cessair ~> time ./graveler
Highest one rolls: 91
Number of roll sessions: 1000000
________________________________________________________
Executed in 3.04 secs fish external
usr time 3.04 secs 624.00 micros 3.04 secs
sys time 0.01 secs 184.00 micros 0.01 secs
That puts it at about 6x faster than my final Python version. It would've spat the answer out for 1,000,000,000 in about 50mins on my machine, given those numbers.