@danvk
Created October 24, 2014 20:36
import sys

si = sys.stdin
CHUNK_SIZE = 100000  # amount to read per call
n = 0
while True:
    b = si.read(CHUNK_SIZE)
    if not b:  # empty read means EOF
        break
    n += b.count('\n')
    if len(b) < CHUNK_SIZE:  # short read means EOF was reached
        break
print(n)
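
For reference, here is a minimal sketch of the same chunked-counting idea written as a reusable function for Python 3. It assumes a binary stream (for a pipe, that would be `sys.stdin.buffer`), and the name `count_newlines` is hypothetical, not something from the gist. It has not been benchmarked against the script above or against `wc -l`.

```python
import io

CHUNK_SIZE = 100000  # same chunk size as the script above


def count_newlines(stream, chunk_size=CHUNK_SIZE):
    """Count newline bytes in a binary stream, one chunk at a time."""
    n = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object means EOF
            break
        n += chunk.count(b"\n")
    return n


# Example on an in-memory stream; for a pipe, pass sys.stdin.buffer instead.
sample = io.BytesIO(b"y\n" * 1000)
print(count_newlines(sample, chunk_size=64))  # prints 1000
```

Working on bytes avoids text decoding, and only one chunk is held in memory at a time regardless of input size.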

danvk commented Oct 24, 2014

This is about as fast as the C version of wc -l:

$ time yes | head -10000000 | python wc.py
10000000
yes  0.76s user 0.01s system 58% cpu 1.319 total
head -10000000  1.29s user 0.01s system 98% cpu 1.318 total
python wc.py  0.04s user 0.02s system 4% cpu 1.319 total
$ time yes | head -10000000 | wc -l
 10000000
yes  0.76s user 0.01s system 59% cpu 1.294 total
head -10000000  1.28s user 0.01s system 99% cpu 1.293 total
wc -l  0.02s user 0.01s system 2% cpu 1.292 total

@ryan-williams

very cool. let's speed up the ol' streaming version then! feel free to file issues there
