Every executable is compiled five times; the two extreme values are thrown away and the compile time is the average of the remaining three runs.
Encapped versions are wrapped in a `do []` block.
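For illustration, a minimal sketch of what this wrapping looks like; the header and the body are placeholders of my own, not the codec's actual source:

```red
Red [Title: "csv codec, encapped"]

; wrapping the codec body in a do [] block means the compiler embeds
; it as data and the interpreter evaluates it at run time, instead of
; compiling it statically
do [
    print "codec body would go here"
]
```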
The source code of the CSV codec is 10181 bytes for the full version and 4988 bytes for the lite version (~49% of the full size).
The lite version supports only a block of records as the Red format; the full version also supports a block of maps and a map of columns, plus some additional features like header handling.
name | compile time (ms) | % of original compile time | difference (ms) | size (bytes) | % of original size | difference (bytes) |
---|---|---|---|---|---|---|
nocsv | 31791.67 | 100% | 0 | 1116924 | 100% | 0 |
csv (master) | 34828.67 | 109.55% | 3037 | 1158352 | 103.71% | 41428 |
csv-encap | 35373.0 | 111.26% | 3581.33 | 1136208 | 101.73% | 19284 |
csv-lite | 33141.33 | 104.25% | 1349.67 | 1136512 | 101.75% | 19588 |
csv-lite-encap | 32980.67 | 103.74% | 1189 | 1126532 | 100.86% | 9308 |
Speed is tested on a block of 343 records, each with 343 values (columns), each value being 343 bytes.
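A rough sketch of how such data could be constructed and timed in Red; this is my reconstruction, not the author's benchmark script:

```red
; one 343-byte value, shared by every cell (fine for a size benchmark)
value: append/dup copy "" #"x" 343

; one record with 343 values, then a block of 343 such records
record: append/dup copy [] value 343
data: append/dup/only copy [] record 343

t: now/time/precise
csv: to-csv data
print ["to-csv took:" now/time/precise - t]
```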
name | load-csv (sec) | to-csv (sec) |
---|---|---|
csv (master) | 0.675 | 4.459 |
csv-encap | 0.544 | 4.234 |
csv-lite | 0.626 | 4.129 |
csv-lite-encap | 0.573 | 4.312 |
Compiling has no noticeable effect on speed; in fact, the encapped load-csv seems to be a bit (10-20%) faster, which is interesting. As expected, using the lite version has no impact on speed.
The CSV codec supports four different storage methods (see the sketch after this list):
- block of blocks
- flat block
- block of maps
- map of columns
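A hedged sketch of the four shapes, using the load-csv refinements that appear in the tables below; the shapes described in the comments are approximate, and details such as header handling and key type may differ:

```red
; illustrative two-column input
csv: {a,b^/1,2^/3,4}

probe load-csv csv              ; block of blocks: one block per row
probe load-csv/flat csv         ; flat block: all values in a single block
probe load-csv/as-records csv   ; block of maps: one map per row, keyed by column
probe load-csv/as-columns csv   ; map of columns: one block of values per column
```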
See the following tables for how the different methods compare in terms of speed and memory usage. Each test was done for three different data sources:
- wide table: 1000 columns, 50 rows, each value 100 bytes
- tall table: 50 columns, 1000 rows, each value 100 bytes
- huge table: 500 columns, 500 rows, each value 100 bytes
Wide table (1000 columns, 50 rows):

Code | Time (sec) | Memory (bytes) | Relative time | Relative memory |
---|---|---|---|---|
load-csv | 0.162 | 46'389'348 | 100% | 100% |
to-csv | 0.645 | 51'443'740 | 100% | 100% |
load-csv/flat | 0.161 | 46'383'544 | 99.38% | 99.99% |
to-csv/skip | 0.647 | 53'837'040 | 100.31% | 104.65% |
load-csv/as-columns | 0.185 | 49'444'008 | 114.20% | 106.58% |
to-csv (columns) | 0.688 | 56'936'872 | 106.67% | 110.68% |
load-csv/as-records | 0.193 | 64'854'576 | 119.14% | 139.80% |
to-csv (records) | 0.748 | 77'702'712 | 115.97% | 151.04% |
Tall table (50 columns, 1000 rows):

Code | Time (sec) | Memory (bytes) | Relative time | Relative memory |
---|---|---|---|---|
load-csv | 0.165 | 46'722'984 | 100% | 100% |
to-csv | 0.667 | 49'366'684 | 100% | 100% |
load-csv/flat | 0.165 | 46'722'832 | 100% | 100% |
to-csv/skip | 0.672 | 51'652'420 | 100.75% | 104.63% |
load-csv/as-columns | 0.189 | 49'160'020 | 114.54% | 105.22% |
to-csv (columns) | 0.704 | 56'862'556 | 105.55% | 115.18% |
load-csv/as-records | 0.210 | 67'397'440 | 127.27% | 144.25% |
to-csv (records) | 0.730 | 70'747'636 | 109.45% | 143.31% |
Huge table (500 columns, 500 rows):

Code | Time (sec) | Memory (bytes) | Relative time | Relative memory |
---|---|---|---|---|
load-csv | 0.712 | 231'456'924 | 100% | 100% |
to-csv | 3.094 | 237'046'516 | 100% | 100% |
load-csv/flat | 0.728 | 231'456'772 | 102.25% | 100.00% |
to-csv/skip | 3.328 | 248'988'520 | 107.56% | 105.04% |
load-csv/as-columns | 0.873 | 243'786'988 | 122.61% | 105.33% |
to-csv (columns) | 3.366 | 262'646'188 | 108.80% | 110.80% |
load-csv/as-records | 0.861 | 323'046'568 | 120.93% | 139.57% |
to-csv (records) | 4.487 | 366'363'296 | 145.02% | 154.55% |
As you can see, the basic format, block of blocks, is the most efficient in terms of both speed and memory usage, together with the flat block. Converting a map of columns to CSV is almost as fast, only about 6-9% slower. Loading CSV into columns is 15-23% slower, while memory usage is only about 5-7% higher than for blocks. The slowest and most memory-hungry option is the block of maps (records): loading from CSV takes about 20-27% more time, conversion to CSV 10-45% more, and memory usage is 40-55% higher. On the other hand, this format is the most user friendly.
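For example, a minimal sketch of why records read nicely; the file name and the column are hypothetical, and whether the codec keys maps by word or by string may differ:

```red
; each row is a map keyed by column name, so fields can be picked
; by name instead of by position
rows: load-csv/as-records read %people.csv
foreach row rows [print select row 'name]
```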
This is awesome!