Some thoughts...
- What about putting the `type` in the `Content-Type` field? Something like `x-crdt-g-set-v1`
- What's the rationale for representing things as a hash with lists as values instead of two hashes (in the OR and LWW set, for example)? Two hashes seems closer to how it'll be represented in code (or at least how I've chosen to do it in knockbox).
On JSON vs. Protobuf:
JSON has the advantage of being easy to use in the browser, so you could actually use the browser's timestamp for some of the operations. This can be useful if, for a given key, a single browser is the main actor doing writes. JSON also works easily with JavaScript MapReduce. If size is a concern, I'd be curious to see the size difference between compressed (Snappy, maybe?) JSON and Protobuf. I'm not saying Protobuf is a bad choice; these are just things to consider.
Sets should be able to store arbitrary elements, not just strings. As all JSON structures have an equality relation, sets like:
{ "type": "2p-set", "a": [{"foo": 1}, {"foo": 2}], "r": [] }
are well-defined.
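To make the equality point concrete, here is a minimal sketch of a 2P-set that accepts arbitrary JSON values by keying them on a canonical serialization. The `canonical` helper and the `TwoPhaseSet` class are made up for illustration and are not part of any proposed spec. One caveat I'm assuming away: Python serializes `1` and `1.0` differently, so they count as distinct elements here.

```python
import json


def canonical(element):
    """Serialize a JSON value deterministically (hypothetical helper).

    Object keys are sorted, so {"a": 1, "b": 2} and {"b": 2, "a": 1}
    produce the same string. Array order is preserved and still matters.
    """
    return json.dumps(element, sort_keys=True, separators=(",", ":"))


class TwoPhaseSet:
    """Illustrative 2P-set over arbitrary JSON values (a sketch, not a spec)."""

    def __init__(self):
        self.added = {}    # canonical string -> original element
        self.removed = {}  # canonical string -> original element

    def add(self, element):
        self.added[canonical(element)] = element

    def remove(self, element):
        # In a 2P-set only an element that was added can be removed,
        # and a removed element never comes back.
        key = canonical(element)
        if key in self.added:
            self.removed[key] = element

    def __contains__(self, element):
        key = canonical(element)
        return key in self.added and key not in self.removed

    def to_json(self):
        # Mirrors the shape of the example above: "a" for adds, "r" for removes.
        return {
            "type": "2p-set",
            "a": list(self.added.values()),
            "r": list(self.removed.values()),
        }
```

Usage would look like `s = TwoPhaseSet(); s.add({"foo": 1}); {"foo": 1} in s`. Any language with a deterministic JSON serializer (or a deep-equality function) could do the same thing.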
What I like about storing the elements in LWW/OR sets this way (in addition to space savings, which can be sizeable) is that you can build either in-memory representation efficiently. You'll be paying for the iteration over all keys regardless, so you can build two maps, multi-keyed maps, or a map/vector representation. My gut feeling is that a vector of tags is probably the fastest for small numbers of changes to each key, since you benefit from cache locality... but I'll probably use all three in my Ruby implementation.
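As a rough sketch of that point, assume a hypothetical LWW-set payload in which each element maps to a two-item list `[add_timestamp, remove_timestamp]`, with `null` for "never removed". The `"lww-set"` type name, the `"e"` key, and the function names below are all invented for illustration. A single pass over the keys can then build either in-memory layout:

```python
def to_two_maps(payload):
    """Build separate add and remove maps in a single pass over the elements."""
    adds, removes = {}, {}
    for element, (add_ts, remove_ts) in payload["e"].items():
        if add_ts is not None:
            adds[element] = add_ts
        if remove_ts is not None:
            removes[element] = remove_ts
    return adds, removes


def to_pair_map(payload):
    """Build a single map from element to an (add_ts, remove_ts) tuple."""
    return {element: tuple(stamps) for element, stamps in payload["e"].items()}


def members(pair_map):
    """Return the elements whose latest add is not older than their latest remove.

    Ties go to the add here, which is an arbitrary choice for this sketch.
    """
    return {
        element
        for element, (add_ts, remove_ts) in pair_map.items()
        if add_ts is not None and (remove_ts is None or add_ts >= remove_ts)
    }


# Hypothetical wire payload, already parsed from JSON (null becomes None).
payload = {"type": "lww-set", "e": {"a": [3, 1], "b": [2, 5], "c": [4, None]}}
adds, removes = to_two_maps(payload)
current = members(to_pair_map(payload))
```

Either way the cost is one iteration over the payload, so the wire format doesn't force a particular in-memory structure on the implementation.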