- Download all the .py scripts and run.sh
- `pip install json-stream`
- `sh run.sh`

run.sh calls the run_*.py scripts, which will run gen_json.py to generate three JSON test files of varying size.
The generated JSON looks like:

    {
    "0": {"foo": "bar"},
    "1": {"foo": "bar"},
    "2": {"foo": "bar"},
    "3": {"foo": "bar"},
    ...
    }

for 100_000, 1_000_000, and 10_000_000 `{"foo": "bar"}` objects.
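To illustrate why the two generation methods differ so much in memory, here is a minimal sketch that writes the JSON shape shown above in two ways. It is not the actual gen_json.py and does not use json-stream: `write_standard` and `write_streaming` are hypothetical names, and the streaming variant is a hand-rolled stand-in that uses only the standard library.

```python
import io
import json


def write_standard(fp, n):
    """Build the whole dict in memory, then serialize it in one call."""
    data = {str(i): {"foo": "bar"} for i in range(n)}
    json.dump(data, fp)


def write_streaming(fp, n):
    """Write one entry at a time, so only a single item is held in memory.

    Hand-rolled stand-in for a streaming writer (an assumption, not the
    json-stream API).
    """
    fp.write("{")
    for i in range(n):
        if i:
            fp.write(", ")
        fp.write(json.dumps(str(i)) + ": " + json.dumps({"foo": "bar"}))
    fp.write("}")


if __name__ == "__main__":
    buf = io.StringIO()
    write_streaming(buf, 4)
    print(buf.getvalue())
```

The standard variant's peak memory grows with `n` because the full dict exists before anything is written; the streaming variant's does not, which is the pattern the stats below show.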
read_json.py and transform_json.py then read and transform the generated JSON.
The three run_*.py runners call /usr/bin/time and capture its output to
get a rough measure of run time and memory usage for the two methods,
standard and stream.
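The measurement step described above could be sketched roughly as follows. This is not the runners' actual code: it assumes GNU time (the `-f` option and the `%e %U %S %M` format specifiers are GNU-specific and will not work with BSD/macOS `time`), and `parse_time_line` and `measure` are hypothetical helper names.

```python
import subprocess

# GNU time format: real seconds, user seconds, sys seconds, max RSS in KB.
TIME_FORMAT = "%e %U %S %M"


def parse_time_line(line):
    """Turn one 'real user sys max_rss_kb' line into a stats dict."""
    real, user, sys_, max_rss_kb = line.split()
    return {
        "real_s": float(real),
        "user_s": float(user),
        "sys_s": float(sys_),
        "mem_mb": int(max_rss_kb) / 1024,
    }


def measure(argv):
    """Run argv under /usr/bin/time and return its parsed stats.

    GNU time writes its report to stderr after the child's own output,
    so the report is taken from the last stderr line.
    """
    proc = subprocess.run(
        ["/usr/bin/time", "-f", TIME_FORMAT, *argv],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_time_line(proc.stderr.strip().splitlines()[-1])
```

A call such as `measure(["python", "read_json.py"])` would then return one row's worth of numbers; any arguments the real scripts expect are not shown here.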
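The stats show that json-stream has a flat memory curve: peak memory stays at roughly 7 to 8.5 MB for 100_000, 1_000_000, and 10_000_000 objects, while the standard method grows from about 46 MB to about 3.9 GB. Generation is no slower with json-stream, but reading takes roughly 4 to 6 times longer and transforming about 3 times longer: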
Generate
| Method | Items | Real (s) | User (s) | Sys (s) | Mem (MB) |
|---|---|---|---|---|---|
| standard | 1e+05 | 0.19 | 0.17 | 0.01 | 45.84 |
| standard | 1e+06 | 2.00 | 1.93 | 0.06 | 372.97 |
| standard | 1e+07 | 21.67 | 20.46 | 1.03 | 3480.29 |
| stream | 1e+05 | 0.18 | 0.15 | 0.00 | 7.28 |
| stream | 1e+06 | 1.43 | 1.41 | 0.02 | 7.69 |
| stream | 1e+07 | 14.41 | 14.07 | 0.20 | 7.58 |
Read
| Method | Items | Real (s) | User (s) | Sys (s) | Mem (MB) |
|---|---|---|---|---|---|
| standard | 1e+05 | 0.05 | 0.04 | 0.01 | 48.28 |
| standard | 1e+06 | 0.58 | 0.50 | 0.05 | 390.17 |
| standard | 1e+07 | 7.69 | 6.73 | 0.80 | 3875.81 |
| stream | 1e+05 | 0.32 | 0.31 | 0.01 | 7.70 |
| stream | 1e+06 | 2.96 | 2.94 | 0.02 | 7.69 |
| stream | 1e+07 | 29.88 | 29.65 | 0.17 | 7.77 |
Transform
| Method | Items | Real (s) | User (s) | Sys (s) | Mem (MB) |
|---|---|---|---|---|---|
| standard | 1e+05 | 0.19 | 0.17 | 0.01 | 48.05 |
| standard | 1e+06 | 1.83 | 1.75 | 0.07 | 388.83 |
| standard | 1e+07 | 20.16 | 19.15 | 0.91 | 3875.49 |
| stream | 1e+05 | 0.63 | 0.61 | 0.01 | 7.61 |
| stream | 1e+06 | 6.06 | 6.02 | 0.03 | 7.92 |
| stream | 1e+07 | 61.44 | 60.89 | 0.35 | 8.44 |
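To put the trade-off in the tables above into numbers, this small sketch computes the time and memory ratios for the 10_000_000-item rows. The values are copied from the tables; the `ratios` helper is a hypothetical name used only for illustration.

```python
# (real seconds, peak memory in MB) for the 1e+07 rows of each table.
results = {
    "generate": {"standard": (21.67, 3480.29), "stream": (14.41, 7.58)},
    "read": {"standard": (7.69, 3875.81), "stream": (29.88, 7.77)},
    "transform": {"standard": (20.16, 3875.49), "stream": (61.44, 8.44)},
}


def ratios(results):
    """Return stream/standard time ratio and standard/stream memory ratio."""
    out = {}
    for step, methods in results.items():
        std_time, std_mem = methods["standard"]
        stream_time, stream_mem = methods["stream"]
        out[step] = {
            "time_ratio": stream_time / std_time,  # > 1 means stream is slower
            "mem_ratio": std_mem / stream_mem,     # > 1 means stream uses less
        }
    return out


for step, r in ratios(results).items():
    print(f"{step}: time x{r['time_ratio']:.2f}, memory /{r['mem_ratio']:.0f}")
```

A time ratio below 1 (as for generation) means the stream method finished sooner than the standard one.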