The cbor-x's packed implementation only packs whole strings that occur multiple times, it does not search for repeated prefixes or postfixes, as they would almost certainly be vastly more expensive. Strings are packed if they occur multiples in a data structures. When using packed + record tags, strings as keys are not searched for string repetition (since it assumed repetition will mostly be eliminated by the structure reuse).
The table shows encoded size for each technique, and the encoding and decoding performance. The last column also includes the gzipped size for comparison sake (no gzip performance, but generally is about 2-4x slower with gzipping in my tests). The table compares plain CBOR encoding, packed, record structures with a 1+1 definition tag and 1+2 tag, and the combination of packed and record structures.
The first comparison test uses an 8KB JSON data structure from our database of medical studies, that has a fairly complicated and dynamic structure: https://github.com/kriszyp/cbor-x/blob/master/tests/example4.json
Method | size | encode/sec | decode/sec | gzip size |
---|---|---|---|---|
CBOR | 6376 | 140000 | 99900 | 2308 |
CBOR Packed | 4734 | 37300 | 103800 | 2456 |
CBOR with record tags (1+1) | 5227 | 105000 | 113000 | 2425 |
CBOR with record tags (1+2) | 5243 | 105000 | 113000 | 2429 |
CBOR Packed + records | 4515 | 48000 | 110400 | 2440 |
CBOR with stringrefs | 5138 | 99000 | 101600 |
The second comparison test uses an 25KB JSON data structure from Twitter's example response from their search API, which is much more homogenous and repetitive in structure: https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/api-reference/get-search-tweets
Method | size | encode/sec | decode/sec | gzip size |
---|---|---|---|---|
CBOR | 12213 | 76000 | 54000 | 3000 |
CBOR Packed | 6795 | 23000 | 63000 | 3260 |
CBOR with record tags (1+1) | 7633 | 82000 | 62000 | 3081 |
CBOR with record tags (1+2) | 7643 | 80000 | 62000 | 3084 |
CBOR Packed + records | 6008 | 39000 | 62000 | 3076 |
CBOR with stringrefs | 7295 | 65000 | 63000 |