@fire
Last active August 29, 2015 14:07
What are the better key-value stores in Elixir/Erlang?
Session: Sun 28 Sep 2014 10:46:01 PDT
<iFire> My question is actually what are the better key value stores in elixir/erlang
<asonge> iFire: if you just want storage engines, you can't go wrong just grabbing shit from basho and basho-labs' github. there's an "eleveldb" nif from basho for leveldb, as well as bitcask (a log-structured kv store)
<asonge> also, there's an innostore wrapper somewhere in there too.
<iFire> asonge: I mean I was tempted to just use postgres
<asonge> iFire: it really depends on what you need it for
<pigmej> it depends what kind of key/value do you need
<tristan3> and why not ets?
<asonge> dets and ets exist
<tristan3> (unless I missed that)
<asonge> no reason to pass that up
<iFire> They don't seem to be 64bit though
<asonge> you need to store more than 4GB of data and you can't use consistent hashing to partition it? :)
<iFire> is it actually 4GB or is it more like 2GB?
<asonge> iirc, dets is 4GB
<pigmej> 4G because it's 32bit integer offset position limit
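[Note] The 4GB-versus-2GB question comes down to whether the 32-bit file offset is treated as unsigned or signed. A minimal Python sketch of that arithmetic follows; the helper name `max_addressable_bytes` is made up for illustration. As far as I know, the Erlang `dets` documentation states that Dets files cannot exceed 2 GB, which would match the signed case, but check the docs for your OTP release.

```python
GIB = 2 ** 30  # bytes in one gibibyte

def max_addressable_bytes(bits: int, signed: bool) -> int:
    """Number of distinct byte offsets an integer of this width can address."""
    # A signed integer spends one bit on the sign, halving the usable range.
    usable_bits = bits - 1 if signed else bits
    return 2 ** usable_bits

print(max_addressable_bytes(32, signed=False) // GIB)  # 4
print(max_addressable_bytes(32, signed=True) // GIB)   # 2
```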
<iFire> Is there progress on 64bit dets?
<edub> Anyone know of a way to open iex in emacs like when you run it with 'iex -S mix' in the terminal?
<beamie> To start an interactive shell with your mix project loaded in it, run `iex -S mix`
<pigmej> iFire: that's not 64bit dets, it's just a file offset
<pigmej> also, big files are slower than small ones
<pigmej> so if you really need to keep more than 4GB in one file with key/value you probably have wrong design
<asonge> yeah, i can't see a situation where you wouldn't just want to partition it
<asonge> that's a good way to increase the probability of file corruption
<iFire> can you have multiple dets in a single server?
<asonge> you can have as many dets tables as you want.
<pigmej> why not? It's just a file
<pigmej> iFire: you can use mnesia partitioning (works quite ok)
<iFire> I'll just think of it as file splitting then
<pigmej> iFire: if you can predict / hardcode number of partitions, then it's very easy
<pigmej> just one function that will assign key to correct 'partition'
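[Note] The "one function" pigmej describes could look roughly like the sketch below, written in Python for illustration rather than Elixir. It assumes a fixed partition count chosen up front; `NUM_PARTITIONS`, `partition_for`, and `table_name` are hypothetical names, not part of dets or mnesia.

```python
import hashlib

NUM_PARTITIONS = 8  # assumption: a hardcoded partition count

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition index with a stable hash."""
    # Python's built-in hash() is randomized per process for strings,
    # so use a cryptographic digest to get the same answer on every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def table_name(key: str) -> str:
    """Hypothetical naming scheme: one table (file) per partition."""
    return f"kv_part_{partition_for(key)}"

print(table_name("user:42"))
```

With plain modulo hashing, changing `NUM_PARTITIONS` later remaps most keys, which is why this only stays easy while the partition count is fixed.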
<asonge> consistent hashing is something that there's a lot of easy how-to literature on now
<asonge> not that much magic voodoo
<iFire> asonge: is it hard to do rebalancing?
<pigmej> iFire: it all depends on requirements
<pigmej> offline rebalance == easy
<pigmej> online => problematic but not hardcore
<pigmej> also it depends what do you expect from consistency, etc
<asonge> iFire: one solution is to create virtual partitions then move them around piecemeal so that rebalancing isn't devastating
<iFire> I presume this is a solved problem?
<asonge> pretty much.
<asonge> there's different strategies with different tradeoffs, and it seems mostly well-known
<pigmej> iFire: you might also get http://erlang.org/doc/apps/mnesia/Mnesia_chap5.html#5.3
<asonge> iFire: the way riak does it is that it has (by default, n=64) N partitions called vnodes, and you have some number of hosts that's smaller than N that claim 1/64th pieces of the ring...when you add a node, you simply reallocate the vnodes among the hosts approximately equally, and slowly move the data over in some safe semi-coordinated way. that's one of the more complex examples.
<asonge> so you basically map "virtual" partitions onto a much smaller practical number of nodes.
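[Note] A rough Python sketch of the virtual-partition idea asonge outlines: keys hash to one of a fixed number of vnodes, vnodes are assigned to hosts, and adding a host only hands over some vnodes. This is a simplified illustration, not Riak's actual claim algorithm; `vnode_for`, `initial_assignment`, and `add_host` are hypothetical names, and the count of 64 is taken from the chat.

```python
import hashlib
from collections import Counter

NUM_VNODES = 64  # fixed number of virtual partitions, per the chat

def vnode_for(key: str) -> int:
    """Keys always map to the same vnode, no matter how many hosts exist."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VNODES

def initial_assignment(hosts: list[str]) -> dict[int, str]:
    """Spread vnodes over the starting hosts round-robin."""
    return {v: hosts[v % len(hosts)] for v in range(NUM_VNODES)}

def add_host(assignment: dict[int, str], new_host: str) -> dict[int, str]:
    """Give the new host a fair share by taking vnodes from overloaded hosts."""
    result = dict(assignment)
    counts = Counter(result.values())
    counts[new_host] = 0
    target = len(result) // len(counts)  # fair share per host, rounded down
    for v in sorted(result):
        if counts[new_host] >= target:
            break
        owner = result[v]
        if owner != new_host and counts[owner] > target:
            result[v] = new_host
            counts[owner] -= 1
            counts[new_host] += 1
    return result

before = initial_assignment(["a", "b", "c"])
after = add_host(before, "d")
moved = [v for v in before if before[v] != after[v]]
print(len(moved), Counter(after.values()))
```

Only the vnodes in `moved` need their data copied to the new host; every other key stays where it was, which is what keeps rebalancing piecemeal.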
<pigmej> asonge: the most important thing is, what exactly iFire requires from CAP
<asonge> pigmej: you could probably meet sane CAP guarantees with a variety of strategies :P
<asonge> mostly talking about A
<pigmej> depends which combination of CAP :)
<pigmej> A is quite easy ;)
<asonge> well, A has different definitions.
<pigmej> yeah, depending if AP or AC :)
<asonge> well, rather, A has a parameter in its definition...A has to do with getting a successful response back for an operation in a specific timeframe