Building a distributed database/api on an eventually consistent distribution mechanism like scuttlebutt.
There are users and (possibly anonymous) posts with permalinks that only take strings/buffers.
Think of information you...
- need to store permanently, possibly in multiple countries at the same time
- need to crud in realtime, possibly collaboratively
- storing lists like timelines in k/v-pairs
- how to connect nodes
- following people vs. rss feeds
- can replication delay be kept under 2s (time to send someone a link)
This is a server/node that stores, replicates and merges data.
It emits its
- current revision
- mean response time
- geolocation
- error rate
- heartbeat
A seperate DB used to check the availablity of unique names like usernames or slugs consistently across the network. Its data model is a simple set that contains all registered names and can say if one name is in it and if not, add it. Queried on every registration and anonymous post creation.
If we don't have this we can only use sha's like git does, no usernames and stuff.
This needs to be a seperate DB.
Requirements:
- Atomic: ask and write in an atomic step
- Consistent: that's the whole point of it
The main http/websocket/whatever frontend (clustered).
Picks a host to route to based on score s
, calculated with values from host h
, dataset d
and client c
:
- replication delay
- mean reponse time
- proximity
- error rate
and parameters i,j,k,l
:
s = 1/(i*h.replication_delay(d) + j*h.mean_reponse_time + k*|(client.location - h.location)| + l*h.error_rate)
- Checks each host's heartbeat
- Creates & kills hosts on demand
- Connects hosts
Can nodes (re)connect randomly or is monitor required?
a <-> c <-> b <-> d c = 1|2
- if a node in the middle fails, the graph is split and monitor has to connect nodes
a <-> c <-> b <-> d c = 2
\---------------/
- if a node in the middle fails, the graph degrades to the chain above
/> b <\
a | d c = 2|3
\> c </
/> b <\
a <-+-> d c = 3
\> c </
/> b <\
/ | \
a <- c -> e c = 3|4
\ | /
\> d </
a n = 1
a <-> b n = 2
a <-> b n = 3
\ c /
/ d \
a <(+)> b n = 4
\ c /
var name = 'juliangruber';
var password = 'foobar';
if (!namestore.register('user:'+name)) return false;
model.set('user:'+name, {
name: name,
password: sha(password)
});
Post URL: /<user>/<title>/
.
var user = 'juliangruber';
var title = title || new Date;
var msg = 'my post';
var path = 'user:'+user+':post:'+title;
if (model.get(path)) return false;
model.set(path, { msg: msg });
// Add to public timeline
// Add to user timeline
// Add to follower inboxes
Post URL: /<slug>/
.
var slug = slug || generateSHA();
var msg = 'my post';
// check title or regenerate sha
var path = 'post:'+slug;
while (!namestore.register(path)) {
if (!isSHA(slug)) return false;
path = 'post:'+(slug = generateSHA());
}
model.set(path, { msg: msg });
// Add to public timeline
thinking about stuff like this too, have a look at https://github.com/dominictarr/crdt
It's a subclass of scuttlebutt but has more a sophisticated model, that has partial updates, sets, and ordered sets.
I really like this approach: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html working towards something like this.