- https://dancres.github.io/Pages/
- https://ferd.ca/a-distributed-systems-reading-list.html
- http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/
- https://github.com/palvaro/CMPS290S-Winter16/blob/master/readings.md
- http://muratbuffalo.blogspot.com/2015/12/my-distributed-systems-seminars-reading.html
- http://christophermeiklejohn.com/distributed/systems/2013/07/12/readings-in-distributed-systems.html
- http://michaelrbernste.in/2013/11/06/distributed-systems-archaeology-works-cited.html
- http://rxin.github.io/db-readings/
- http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html
- http://pdos.csail.mit.edu/dsrg/papers/
Now that we live in the Big Data, Web 3.14159 era, lots of people want to build databases that are too big to fit on a single machine. But there's a problem in the form of the CAP theorem, which states that if your network ever partitions (a machine goes down, or part of the network loses its connection to the rest) then you can keep consistency (all machines return the same answer to a given query) or availability (every request gets a response), but not both.
`riak-admin force-remove` should not exist.
It's Friday evening, almost time to head out of the office for a nice long weekend. Your Riak cluster has been running along, everything fine. All of a sudden, the SSD in one of your Riak nodes decides to go up in a ball of flames. So you, being the good sysadmin that you are, immediately hop on the phone with your hardware vendor and order a new SSD. They tell you that you'll have it on Monday morning. Clearly you can't leave a broken node in your Riak environment, so you'll want to remove it from the cluster. You sit down at your terminal, hop on to a working Riak node and type
riak-admin force-remove [email protected]
NOOOOOOOOOOOOOOOOOOOOOOOOO!!!!
Here's where I understand the state of the art to be:
- In this INRIA tech report, Shapiro, Preguiça, Baquero and Zawirski (SPBZ) prove, amongst other things, that a sufficient condition for CRDTs to achieve eventual consistency on networks which may reorder and duplicate packets (which I'll call flaky networks henceforth) is that:
  1. the underlying datatype forms a semilattice,
  2. messages are full states,
  3. incoming messages are combined with the node's current state using the least-upper-bound operation in the semilattice.
- It's possible to relax condition 2 and still achieve eventual consistency over flaky networks by fragmenting the state into independent parts and transmitting updates to each part separately. For instance, in the G-Set CRDT (an add-only bitset) one can transmit only the index of the element to be added. (A G-Set sketch follows this list.)
- In [these slides from a talk at Dagstuhl](http://www.dagstuhl.de/mat/Files/13/13081/13081.BaqueroCarlos.Sl
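To make the semilattice framing concrete, here is a minimal sketch of a state-based G-Set in Erlang. This is my own illustration of the conditions above, not Riak's implementation, and the module and function names (`gset`, `new/0`, `add/2`, `merge/2`, `value/1`) are made up for the example: the state is a set, merge is set union (the least upper bound), and because merge is commutative, associative and idempotent, reordered or duplicated full-state messages cannot break convergence.

```erlang
-module(gset).
-export([new/0, add/2, merge/2, value/1]).

%% State-based grow-only set (G-Set). States form a semilattice under
%% union, so merging reordered or duplicated states is harmless.

new() -> ordsets:new().

%% Local update: add an element to this replica's state.
add(Element, Set) -> ordsets:add_element(Element, Set).

%% Least upper bound of two states: set union.
%% Commutative, associative and idempotent.
merge(SetA, SetB) -> ordsets:union(SetA, SetB).

%% Read the converged value.
value(Set) -> ordsets:to_list(Set).
```

Condition 2 corresponds to shipping the whole state to peers on every update; the relaxation in the second bullet corresponds to shipping only the newly added element and folding it in with `add/2` on the receiving side.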
- Starting: https://github.com/basho/riak_kv/blob/1.4.2/src/riak_kv_wm_object.erl#L619
- We create a new `riak_object` and populate the various fields with the headers and metadata supplied by the client.
- Big surprise: we eventually call `riak_client:put` (https://github.com/basho/riak_kv/blob/1.4.2/src/riak_client.erl#L143).
- If/when the client returns any errors, these are handled in `handle_common_errors`, and it's nice to return human-readable errors to the client :)
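For reference, the same path can be poked at by hand from a console attached to a Riak node of that era. A rough sketch, assuming the `riak:local_client/0` helper and the 1.4.x parameterized `riak_client` API (the bucket, key and value below are made up):

```erlang
%% Rough console sketch of the put path (Riak 1.4.x era); assumes
%% riak:local_client/0 and the parameterized riak_client module.
{ok, C} = riak:local_client(),
Obj = riak_object:new(<<"groceries">>, <<"mine">>, <<"eggs">>),
C:put(Obj, 1).   %% W = 1: ack once a single vnode has accepted the write
```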
Entry point for all object operations: https://github.com/basho/riak_kv/blob/1.4.2/src/riak_kv_wm_object.erl
`delete_resource/2` takes RequestData (request headers, e.g. the vclock) and Context (a record containing the Bucket, Key and Client): https://github.com/basho/riak_kv/blob/1.4.2/src/riak_kv_wm_object.erl#L888
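For orientation, a Webmachine `delete_resource/2` callback generally has the shape below. This is a schematic sketch, not `riak_kv_wm_object`'s actual code, and `do_delete/2` is a hypothetical stand-in for the real delete logic:

```erlang
%% Schematic Webmachine callback shape; do_delete/2 is a hypothetical helper.
delete_resource(RD, Ctx) ->
    case do_delete(RD, Ctx) of
        ok               -> {true, RD, Ctx};    %% delete was enacted
        {error, _Reason} -> {false, RD, Ctx}    %% delete was not enacted
    end.
```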
```erlang
-module(riak_metrics).
-compile(export_all).

main([NodeName0, Cookie, Length, Command]) ->
    LocalName = '[email protected]',
    NodeName = list_to_atom(NodeName0),
    case net_kernel:start([LocalName]) of
        {ok, _} ->
            erlang:set_cookie(node(), list_to_atom(Cookie)),
            case net_kernel:hidden_connect_node(NodeName) of
                true ->
                    %% connected as a hidden node; the rest of the original
                    %% script (running Command against the node for Length)
                    %% is elided here
                    ok;
                false -> io:format("could not connect to ~p~n", [NodeName])
            end;
        {error, Reason} -> io:format("net_kernel:start failed: ~p~n", [Reason])
    end.
```
I've had one question about subpar performance in Riak 2.0's CRDTs. I thought I'd write this so that people can more easily diagnose these issues, without the CRDT team having to step in every time.
An example: a client was having problems with the performance of fetching and updating sets. The issue manifested itself as poor fetch performance.
So, how do you go about debugging/diagnosing this?
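A hypothetical first step, assuming the `riakc` Erlang client and a set bucket type named `<<"sets">>` (the bucket and key below are made up): time a raw fetch of the datatype and look at how many elements it holds, since a set that has grown very large is a common culprit for slow fetches.

```erlang
%% Hypothetical diagnostic sketch using the riakc client: time one fetch of
%% the set and report how many elements it contains.
{ok, Pid} = riakc_pb_socket:start_link("127.0.0.1", 8087),
{Micros, {ok, Set}} =
    timer:tc(riakc_pb_socket, fetch_type,
             [Pid, {<<"sets">>, <<"mybucket">>}, <<"mykey">>]),
io:format("fetch took ~p ms for ~p elements~n",
          [Micros div 1000, length(riakc_set:value(Set))]).
```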
- Note: always set `umask 022` for system-shared libraries. See http://blog.equanimity.nl/blog/2014/02/09/erlang-r17-rc1-on-osx-with-wx-and-a-working-observer/ for the details.
- wxWidgets 3.0.0 works the same in R16B03-1 and 17.0-rc2.
- Note well: `wx:demo()` on OS X 10.9.2 with wxWidgets 3.0.0 is still unstable, though `observer:start()` is more stable.
- If you really don't have time, try Erlang Solutions' 32-bit (not 64-bit) distribution at https://www.erlang-solutions.com/downloads/download-erlang-otp and use it as a debugging console.
- Update 28-FEB-2014 0230UTC: Leo Liu reports `brew install wxmac --disable-monolithic` will do. See http://erlang.org/pipermail/erlang-questions/2014-February/077952.html.
```erlang
-module(fj).
-export([parallel/2]).
%%%-----------------------------------------------------------------------------
%%% @doc Executes the given function on every task in Tasks in parallel.
%%% @spec parallel(Function, Tasks) -> Results
%%%   where Results is a list matching the length of the input list Tasks,
%%%   but containing the result of invoking Function on each task
%%% @end
%% Minimal sketch of an implementation (the original body was not included
%% here): fork one process per task, then join the results in input order.
parallel(Function, Tasks) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), Function(Task)} end) || Task <- Tasks],
    [receive {Pid, Result} -> Result end || Pid <- Pids].
```
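A quick usage sketch of the function described above:

```erlang
%% Squares each element in a separate process; results come back in input order.
[1, 4, 9] = fj:parallel(fun(X) -> X * X end, [1, 2, 3]).
```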