reiddraper/meangirls.md

Created January 5, 2012 14:37

Star (0) You must be signed in to star a gist
Fork (0) You must be signed in to fork a gist

Select an option

Learn more about clone URLs
Clone this repository at <script src="https://gist.github.com/reiddraper/1565520.js"></script>
Save reiddraper/1565520 to your computer and use it in GitHub Desktop.

Download ZIP

Raw

meangirls.md

Some thoughts...

What about putting the type in the Content-Type field? Something like x-crdt-g-set-v1
What's the rationale for representing things as a hash with lists as values instead of two hashes (in the OR and LWW set for example)? This to me seems closer to how it'll be represented in code (or at least how I've chosen to do with knockbox).

On JSON vs. Protobuf:

JSON has the advantage of easily being used in the browser, so you could actually use the browser's timestamp for some of the operations. This can be useful if for a single key, the single browser is the main actor doing writes. JSON also works easily with javascript mapreduce. If size is a concern, I'd be curious to see the size difference between compressed (snappy maybe?) JSON and Protobuf. I'm not saying Protobuf is a bad choice, just things to consider.

aphyr commented Jan 5, 2012

What about putting the type in the Content-Type field? Something like x-crdt-g-set-v1

I was aiming for this originally, but I think composability may be more important. If we place type hints in the structure, you can walk any json object recognizing crdt types for merge:

{
'name': {'type': 'lww-register', ...},
'following': {'type': '2p-set', ...},
...
}

So, I would vote for optional subtype params for indicating the JSON object can be parsed with CRDT types inside.

What's the rationale for representing things as a hash with lists as values instead of two hashes (in the OR and LWW set for example)? This to me seems closer to how it'll be represented in code (or at least how I've chosen to do with knockbox).

Can't use hashes to store arbitrary keys.
Symmetry between OR and LWW-sets.
Cuts serialized size significantly for OR sets.
Constant-space parsing, doesn't require building two intermediate maps to efficiently resolve member?. Cuts in-memory size too, I suspect. The in-memory structure I'd use is something like

{
element [[insert-tag1, insert-tag2], [remove-tag1]]
}

Might need to adapt to sets for large numbers of inserts and deletes, but that's a metaprogramming issue.

Allows for interesting tricks, like solving member? and subset? in O(n) and constant space. Think mapreduce queries.
Leaving myself wiggle room for order-preserving lists.

JSON vs Protobuf:

Why not both? I plan to export a .proto file for these types as well; you can drop them into your objects wherever. Downside is you can't type the sub-object, so I guess we'd have to leave the elements, tags, etc. as binary. We also have to standardize issues like string equality, sort orders, etc., instead of just "do whatever ecmascript does".

Author

reiddraper commented Jan 6, 2012

I was aiming for this originally, but I think composability may be more important. If we place type hints in the structure, you can walk any json object recognizing crdt types for merge:

Yeah this is a good point. +1 for the optional subtype params

Can't use hashes to store arbitrary keys.

Maybe I'm misremembering JSON/Javascript, but I believe hashes can have arbitrary Strings as keys, just like the values in your lists. Looking here briefly seems to support my understanding.

respresentation

I'm not convinced of any real speed differences in representation. One method requires two hash lookups (O(1)), and the other requires one hash lookup and accessing a vector (in javascript accessing a vector is really a hash with integers as keys, iirc). Maybe I'm totally off base here though?

Having both protobuf and JSON is hard to argue with :)

aphyr commented Jan 6, 2012

Sets should be able to store arbitrary elements, not just strings. As all JSON structures have an equality relation, sets like:

{ type: '2p-set', a: [{foo: 1}, {foo: 2}], r: [] }

are well-defined.

What I like about storing the elements in LWW/OR sets this way (in addition to space savings, which can be sizeable) is that you can build either in-memory representation efficiently. You'll be paying for the iteration over all keys regardless, so you can build two maps, multi-keyed maps, or a map/vector representation. My gut feeling is that a vector of tags is probably the fastest for small numbers of changes to each key, since you benefit from cache coherency... but I'll probably use all three in my Ruby implementation.

Author

reiddraper commented Jan 6, 2012

Sets should be able to store arbitrary elements, not just strings. As all JSON structures have an equality relation, sets like:

Good point. This alone may have me convinced to use lists.

I'll hold off on speculating more about performance :)

seancribbs commented Jan 6, 2012 via email

How do you guys feel about using object/hash/maps in counters? What are the downsides of requiring string actor IDs?

Author

reiddraper commented Jan 6, 2012

I can't immediately think of any downsides to requiring strings.

reiddraper/meangirls.md

Select an option

No results found

Select an option

No results found

aphyr commented Jan 5, 2012

Uh oh!

reiddraper commented Jan 6, 2012

Uh oh!

aphyr commented Jan 6, 2012

Uh oh!

reiddraper commented Jan 6, 2012

Uh oh!

seancribbs commented Jan 6, 2012 via email

Uh oh!

reiddraper commented Jan 6, 2012

Uh oh!