Created
November 13, 2015 22:13
beating grep
$ ls -lah asx.psv
-rw-r--r-- 1 amos staff 59G 13 Nov 18:07 asx.psv
$ time grep -v EntryError asx.psv > /dev/null
real 14m33.885s
user 14m10.081s
sys 0m22.390s
$ time icicle-bench data/example/AsxDictionary.toml asx.psv output.psv
icicle-bench: starting compilation
icicle-bench: compilation time = 34.08s
icicle-bench: starting snapshot
icicle-bench: snapshot time = 278.51s (217.80MB/s)
real 5m12.648s
user 4m41.321s
sys 0m24.160s
The icicle-bench query computes a number of means and a number of Pearson's product-moment correlation coefficients over the different fields. The data is scraped from ASX stock prices.
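For readers unfamiliar with the statistics involved, here is a minimal sketch of how a mean and a Pearson correlation coefficient can be computed in a single pass over a stream of records. It is not Icicle's implementation and says nothing about how Icicle compiles its queries; the function name `streaming_mean_and_pearson` and the input shape (an iterable of `(x, y)` float pairs) are assumptions made for illustration. It uses the well-known Welford-style running update.

```python
import math


def streaming_mean_and_pearson(pairs):
    """Return (mean_x, mean_y, r) for an iterable of (x, y) pairs, in one pass.

    Hypothetical helper for illustration only; not taken from Icicle.
    """
    n = 0
    mean_x = mean_y = 0.0
    # Running sums of squared deviations (m2_*) and of co-deviations (c_xy).
    m2_x = m2_y = c_xy = 0.0
    for x, y in pairs:
        n += 1
        dx = x - mean_x            # deviation from the previous mean of x
        dy = y - mean_y            # deviation from the previous mean of y
        mean_x += dx / n
        mean_y += dy / n
        m2_x += dx * (x - mean_x)  # previous-mean deviation times updated-mean deviation
        m2_y += dy * (y - mean_y)
        c_xy += dx * (y - mean_y)
    if n < 2 or m2_x == 0.0 or m2_y == 0.0:
        # Correlation is undefined for fewer than two points or a constant field.
        return mean_x, mean_y, float("nan")
    return mean_x, mean_y, c_xy / math.sqrt(m2_x * m2_y)
```

Because the state is a handful of floats per pair of fields, a computation like this never needs to hold the 59G file in memory, which is part of why a streaming benchmark against grep is a reasonable comparison.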
The point is that, even after spending 34 seconds compiling and optimising the query program, we are nearly three times faster than grep in wall-clock time (5m12s versus 14m33s). For significantly smaller files the compilation overhead would presumably dominate, though.
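The arithmetic behind that comparison can be sketched as follows. The timings and the 59G file size are copied from the transcript above; everything else is an assumption. In particular, the break-even estimate assumes that both grep and the snapshot phase scale linearly with file size and that compilation time is a fixed cost, neither of which was measured here. The function names are hypothetical.

```python
# Figures taken from the transcript above.
GREP_SECONDS = 14 * 60 + 33.885          # "real 14m33.885s"
ICICLE_TOTAL_SECONDS = 5 * 60 + 12.648   # "real 5m12.648s"
COMPILE_SECONDS = 34.08                  # "compilation time = 34.08s"
SNAPSHOT_SECONDS = 278.51                # "snapshot time = 278.51s"
FILE_SIZE_G = 59.0                       # size reported by `ls -lah`


def wall_clock_speedup():
    """Ratio of grep's wall-clock time to icicle-bench's, compilation included."""
    return GREP_SECONDS / ICICLE_TOTAL_SECONDS


def break_even_size_g():
    """File size (in the same unit as FILE_SIZE_G) at which both tools tie.

    Assumes linear scaling with file size and a constant compilation cost;
    this is an extrapolation, not a measurement.
    """
    grep_seconds_per_g = GREP_SECONDS / FILE_SIZE_G
    snapshot_seconds_per_g = SNAPSHOT_SECONDS / FILE_SIZE_G
    return COMPILE_SECONDS / (grep_seconds_per_g - snapshot_seconds_per_g)


if __name__ == "__main__":
    print(f"speedup: {wall_clock_speedup():.2f}x")
    print(f"break-even file size: {break_even_size_g():.2f}G")
```

Under those assumptions the speedup comes out at roughly 2.8x, and the two tools would tie on a file of around 3.4G; below that size the fixed compilation cost would outweigh the faster scan.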