I've recently encountered an article that showed some ways to speed up one's Python code. Some of these methods seemed rather peculiar, so I decided to do some extra complex data analysis to try to understand whether these methods actually work.
Python version: 3.7.1
Here's a quick summary of the methods proposed in that article that I found odd.
The article said that, since for loops are "dynamic" (not sure what this means), they're slower than while loops. I compared the following two loops and found that, on average, the for loop was about 2.5 times faster than the corresponding while loop:
# Fast!
for _ in range(50):
    pass

# Slow...
i = 0
while i < 50:
    i += 1
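For reference, a comparison like this can be timed with the standard timeit module. Here's a minimal sketch (the 2.5x figure above is the author's measurement; the exact ratio will vary with machine and Python version):

```python
import timeit

# Time the for loop: iterate 50 times doing nothing.
for_time = timeit.timeit("for _ in range(50):\n    pass", number=100_000)

# Time the equivalent while loop with a manual counter.
while_time = timeit.timeit("i = 0\nwhile i < 50:\n    i += 1", number=100_000)

print(f"for loop:   {for_time:.4f}s")
print(f"while loop: {while_time:.4f}s")
print(f"while/for ratio: {while_time / for_time:.2f}")
```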
The article also claimed that assignment via tuple unpacking is faster than consecutive assignments. Indeed, in the following code, tuple unpacking was on average 1.47 times faster:
# Fast!
a, b, c, d = 2, 3, 5, 7
# Slow...
a = 2
b = 3
c = 5
d = 7
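The same kind of timeit sketch can be used to check the unpacking claim (the 1.47x figure is the author's measurement; your numbers will differ):

```python
import timeit

# One statement: the right-hand side is a constant tuple, unpacked in place.
unpack_time = timeit.timeit("a, b, c, d = 2, 3, 5, 7", number=1_000_000)

# Four separate assignment statements.
consec_time = timeit.timeit("a = 2\nb = 3\nc = 5\nd = 7", number=1_000_000)

print(f"tuple unpacking: {unpack_time:.4f}s")
print(f"consecutive:     {consec_time:.4f}s")
```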
The article claimed that using while 1 instead of while True "will reduce some runtime". Timing an infinite loop isn't feasible (it would, by definition, take quite some time), so I tested if statements instead, since a while loop is essentially an if statement with a jump. It turns out if 1 is indeed faster, but the difference is small:
# About 5% faster
if 1:
    pass

if True:
    pass
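This comparison, too, can be reproduced with a short timeit sketch (the ~5% gap is the author's measurement on 3.7.1; other versions may show no difference at all):

```python
import timeit

# Time an if statement guarded by the integer literal 1.
one_time = timeit.timeit("if 1:\n    pass", number=1_000_000)

# Time the same statement guarded by the keyword True.
true_time = timeit.timeit("if True:\n    pass", number=1_000_000)

print(f"if 1:    {one_time:.4f}s")
print(f"if True: {true_time:.4f}s")
```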
If you run this test with test.py and then perform the Welch Two Sample t-test on the data in R, you'll find that the 95% confidence interval is (-0.0005672652, -0.0005465111), which does not include zero (so the difference is significant), but is ridiculously tiny and close to zero:
> one_true = read.csv("one_true.csv")
> t.test(one_true$one, one_true[["true"]])
Welch Two Sample t-test
data: one_true$one and one_true[["true"]]
t = -105.19, df = 14599, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.0005672652 -0.0005465111
sample estimates:
mean of x mean of y
0.01025956 0.01081645
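The original test.py isn't shown above; a hypothetical harness producing a one_true.csv in the shape the R session expects (columns named one and true, one pair of timings per row) might look like this. The row count and iteration count here are assumptions, not the author's actual parameters:

```python
import csv
import timeit

ROWS = 100  # repeated measurements per variant (assumed, not from the article)

with open("one_true.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["one", "true"])  # column names matching the R session
    for _ in range(ROWS):
        # One timing per variant per row, so the two columns are paired samples.
        one = timeit.timeit("if 1:\n    pass", number=100_000)
        true = timeit.timeit("if True:\n    pass", number=100_000)
        writer.writerow([one, true])
```

R's read.csv then exposes the columns as one_true$one and one_true[["true"]], as in the transcript above.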