@tiagocardosos
Created April 18, 2019 18:45
What's the fastest way to recursively search for files in Python?
# https://stackoverflow.com/questions/50948391/whats-the-fastest-way-to-recursively-search-for-files-in-python
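import os
import glob


def walk():
    # Walk the tree with os.walk and keep names ending in '.py'.
    # Note: this collects bare file names, not paths relative to '.'.
    pys = []
    for p, d, f in os.walk('.'):
        for file in f:
            if file.endswith('.py'):
                pys.append(file)
    return pys


def iglob():
    # Glob everything recursively, then filter by suffix in Python.
    pys = []
    for file in glob.iglob('**/*', recursive=True):
        if file.endswith('.py'):
            pys.append(file)
    return pys


def iglob2():
    # Let the glob pattern itself do the '.py' filtering.
    pys = []
    for file in glob.iglob('**/*.py', recursive=True):
        pys.append(file)
    return pys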
# I also tried pathlib.Path.glob, but it was slow and error-prone, sadly
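# The pathlib variant is not shown above. A minimal sketch of what such a
# variant might look like, for comparison with iglob2(); the function name
# pathlib_rglob and the root parameter are assumptions, not the code that
# was actually timed.

```python
from pathlib import Path


def pathlib_rglob(root='.'):
    # Hypothetical pathlib counterpart of iglob2(); an illustration only.
    # Path.rglob('*.py') behaves like Path.glob('**/*.py').
    return [str(p) for p in Path(root).rglob('*.py')]
```

# It can be timed the same way as the other functions, e.g. %timeit pathlib_rglob()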
%timeit walk()
3.95 s ± 13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit iglob()
5.01 s ± 19.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit iglob2()
4.36 s ± 34 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
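# %timeit is an IPython magic and does not run in a plain Python script.
# A rough stdlib equivalent, as a sketch: the helper name time_call and its
# defaults (7 runs, 1 call each, mirroring the output above) are assumptions.

```python
import timeit


def time_call(func, repeat=7, number=1):
    # Hypothetical helper: run `func` `number` times per run, for `repeat` runs,
    # and return (mean, best) seconds per call.
    runs = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [r / number for r in runs]
    return sum(per_call) / len(per_call), min(per_call)
```

# Usage, given the functions defined earlier: mean, best = time_call(walk)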
Using GNU find (4.6.0) on cygwin (4.6.0-1)
"""
$ time find . -name '*.py' > /dev/null
real 0m8.827s
user 0m1.482s
sys 0m7.284s
"""