@mikeckennedy
Created January 20, 2020 22:46
Compare the speed of a complex multiple-if test and its set-containment replacement.
import datetime


def main():
    count = 1000
    run_sets(count)
    run_ifs(count)


def run_sets(times):
    # Time `times` membership checks, building a new set on every iteration.
    t0 = datetime.datetime.now()
    data = list(range(1, 21))
    for n in range(0, times):
        if n in set(data):
            pass
    dt = datetime.datetime.now() - t0
    print(f"Sets done in {dt.total_seconds() * 1000:,} ms.")


def run_ifs(times):
    # Time `times` evaluations of the equivalent chained == / or test.
    t0 = datetime.datetime.now()
    for n in range(0, times):
        if (n == 1 or n == 2 or n == 3 or n == 4 or n == 5 or n == 6 or n == 7 or n == 8 or n == 9 or n == 10
                or n == 11 or n == 12 or n == 13 or n == 14 or n == 15 or n == 16 or n == 17 or n == 18 or n == 19 or n == 20):
            pass
    dt = datetime.datetime.now() - t0
    print(f"Ifs done in {dt.total_seconds() * 1000:,} ms.")


if __name__ == '__main__':
    main()
@mikeckennedy
Author

Note, on Python 3.8 on macOS, I get these results:

  • Sets done in 0.966 ms.
  • Ifs done in 0.518 ms.

Making the set version nearly twice as slow (0.966 ms vs. 0.518 ms).

@valorien

valorien commented Jan 21, 2020

These examples aren't equivalent, as you're creating a set object with every iteration of run_sets(), while creating no object in the run_ifs() loop. When the set creation is done before entering the loop, the in check is significantly faster: See here
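
A minimal sketch of the comparison described in this comment (not the linked notebook): it times the chained comparison, a set rebuilt on every check as in run_sets(), and a set built once before the loop. The function names and timeit settings are assumptions made for illustration, and absolute numbers will vary by machine and Python version.

```python
import timeit

VALUES = list(range(1, 21))
PREBUILT = set(VALUES)  # built once, outside any timed loop


def rebuilt_set_check(n):
    # Builds a fresh set on every call, like the loop body in run_sets().
    return n in set(VALUES)


def prebuilt_set_check(n):
    # Reuses the set created once at import time.
    return n in PREBUILT


def chained_check(n):
    # The same test written as a chain of equality comparisons.
    return (n == 1 or n == 2 or n == 3 or n == 4 or n == 5 or n == 6 or n == 7
            or n == 8 or n == 9 or n == 10 or n == 11 or n == 12 or n == 13
            or n == 14 or n == 15 or n == 16 or n == 17 or n == 18 or n == 19
            or n == 20)


def time_all(number=100):
    # Returns seconds per variant for `number` passes over n = 0..999.
    checks = (rebuilt_set_check, prebuilt_set_check, chained_check)
    return {
        check.__name__: timeit.timeit(
            "for n in range(1000): check(n)",
            globals={"check": check},
            number=number,
        )
        for check in checks
    }


if __name__ == "__main__":
    for name, seconds in time_all().items():
        print(f"{name}: {seconds * 1000:,.3f} ms")
```

All three functions return the same answer for every n, so any difference in the timings comes from how the test is evaluated, not from what it computes.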

@mikeckennedy
Author

Hi @valorien, that's exactly why it's slower. Yes, in this case, it's a set of constants. But if you had any fluctuation, for example:

def choose(n, m):
    if (m == n + 1 or m == n + 7 or m == n + 21): ...

You'd have no choice but to recreate the set on every call, hence it's slower in the general case.
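
The "fluctuating" case can be measured directly. The sketch below is an illustration only: choose_or and choose_set are hypothetical names, and the arguments (10, 31) are an arbitrary choice that forces the or-chain to evaluate all three comparisons. Because the set's elements depend on n, Python has to build the set each time choose_set runs.

```python
import timeit


def choose_or(n, m):
    # Chained comparisons; short-circuits on the first match.
    return m == n + 1 or m == n + 7 or m == n + 21


def choose_set(n, m):
    # The set elements depend on n, so the set is rebuilt on every call.
    return m in {n + 1, n + 7, n + 21}


def compare(number=100_000):
    # Returns seconds for `number` calls of each variant.
    return {
        "or": timeit.timeit("f(10, 31)", globals={"f": choose_or}, number=number),
        "set": timeit.timeit("f(10, 31)", globals={"f": choose_set}, number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1000:,.3f} ms")
```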

@valorien

valorien commented Jan 28, 2020

Hi @mikeckennedy,
Yes, that's exactly why it's slower, and exactly the reason why it isn't a good example.
Needlessly creating a set from an unchanging sequence with every iteration doesn't mean that comparing values is faster than set in checks, especially when dealing with cases where we have no "fluctuations".
This is also inconsistent with the example shown in your video and I've already demonstrated the claim made in it is incorrect.
As for "fluctuating" cases, the time delta is almost non-existent: https://colab.research.google.com/drive/1smZC3vbTSYk_Lt5D21cuHWdYBchivSxW

I do hope you consider correcting that part of the course (which I do enjoy very much).
Thanks for your time and thank you for making such awesome content.
Have a great day :-)

@mikeckennedy
Author

Hi,

The iteration is just there to time it properly, since it's really hard to measure a fast operation accurately when you run it only once. I guess the takeaway is: if your test is against truly fixed data, then a set is fine for perf, but if you have to take inputs, then all of a sudden it's slower. That is often the case, which is why I said that, generally speaking, the set style may be slower.
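
For the timing-accuracy point above, the standard library's timeit module handles the repetition itself. A small sketch, assuming a hypothetical helper best_time_ms and arbitrary number/repeat values; it takes the minimum of several runs, which the timeit documentation suggests as the most useful figure:

```python
import timeit


def best_time_ms(stmt, setup="pass", number=1000, repeat=5):
    # Run `stmt` `number` times per trial, for `repeat` trials,
    # and report the fastest trial in milliseconds.
    trials = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(trials) * 1000


if __name__ == "__main__":
    ms = best_time_ms("7 in s", setup="s = set(range(1, 21))")
    print(f"set membership: {ms:,.3f} ms per 1000 checks")
```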

@valorien

I understand - the use of timeit in the notebook I've linked to is done for the same reasons.
As shown in the example I linked, the time difference is negligible. In addition, if you take into account the time it takes to write or maintain the x==y approach, the use of set() becomes even more appealing.

Thanks again for your time, much appreciated.
