Note
For the actual food for mind see https://gist.github.com/sa2ajj/5323326#file-summary-rst
This document _tries_ to outline important items that need to be covered in order to get riak running with docker.
Please note that this is an outline of what I'm trying to do, not a step-by-step instruction (though it might become one some day).
(It's possible that this document will end up somewhere else, but for now it just lives here @ gist.github.com)
This document is vaguely based on the excellent documentation offered by basho.
Overview
docker offers a different approach to using Linux containers.
The main difference is that a docker container does not need a full installation of a guest OS: you can install as little as necessary.
docker is being very actively developed, so grab the latest version from github:
$ git clone git://github.com/dotcloud/docker.git
Please bear in mind that this might be a suboptimal setup.
The goal is to create a riak cluster with 5 nodes (the minimum number of nodes recommended for riak).
Prepare the host directory structure:
$ mkdir ~/riak-cluster
$ cd ~/riak-cluster
$ for i in $(seq 1 5); do mkdir -p node$i/{etc,data,log}; done
Things to check:
- directory ownership
- host and container user "mapping"(?)
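A minimal sketch of the ownership check, assuming the directory layout created above. It prints the numeric uid:gid of each directory, since user names on the host and inside the container need not map to the same uid. The function name list_owners and the throwaway demo tree are made up for illustration.

```shell
#!/bin/sh
# Print "uid:gid path" for every etc/data/log directory under a cluster root.
# Numeric IDs are shown on purpose: the same uid may carry different user
# names on the host and in the container.
list_owners() {
    root=$1
    for d in "$root"/node*/etc "$root"/node*/data "$root"/node*/log; do
        [ -d "$d" ] || continue
        # ls -ldn prints: mode links uid gid size date name
        ls -ldn "$d" | awk -v dir="$d" '{ print $3 ":" $4, dir }'
    done
}

# Demo on a throwaway tree so nothing under ~/riak-cluster is touched.
demo_root=$(mktemp -d)
for i in $(seq 1 5); do
    mkdir -p "$demo_root/node$i/etc" "$demo_root/node$i/data" "$demo_root/node$i/log"
done
list_owners "$demo_root"
```

Against the real tree this would be called as list_owners ~/riak-cluster; the uid to compare with is whatever user riak runs as inside the container, which this sketch does not try to discover.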
One way to start is to use the base image available at docker's registry. (Please note that, at least for now, the registry server does not offer any fancy UI.)
Warning
The command below will actually fail, as riak packages are not available from the standard repositories.
The next version of this document will address this properly.
$ docker pull base
$ C_ID=`docker run base apt-get install riak`
$ docker commit -m 'added riak package' $C_ID my/riak-base
Riak has two configuration files: /etc/riak/app.config and /etc/riak/vm.args. Both files have parameters that you'd probably like to share among all nodes, as well as node-specific ones. (Detailed information about the available configuration parameters can be found at http://docs.basho.com/riak/latest/references/Configuration-Files/.)
How you perform the actual configuration is not covered here (for now). Suppose, for example, that a script called riak-magic has magically appeared in your image and does all the configuration for you. After you run it, create a new image:
$ C_ID1=`docker run my/riak-base /usr/sbin/riak-magic`
$ docker commit -m 'common configuration is applied' $C_ID1 my/riak-configured
At this point, I'd like to extract the configuration files to the host (as I do not really know how to maintain them otherwise):
TODO
And place the files in the host hierarchy:
$ for i in $(seq 1 5); do cp /tmp/app.config /tmp/vm.args node$i/etc/; done
BIG QUESTION: does this need to be done inside the container?
Important (for this use case) parameters are the various directories where riak stores data. Most notable are /var/lib/riak and /var/log/riak. These do not have to be changed (fewer changes, easier to maintain).
The other important parameter is the IP address. You can make riak listen for connections coming from anywhere (which is not a problem if you run it in a dedicated network): use 0.0.0.0 as the IP address for the various services:
- http(s) interface ({riak_core, http | https})
- protobuffer api interface ({riak_api, pb_ip})
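As a rough sketch, the address change can be scripted with sed. This assumes the extracted app.config still carries the loopback default "127.0.0.1" for these listeners and that every quoted 127.0.0.1 in the file is a listen address you want to open up; check your own file before relying on that. The function name listen_everywhere and the trimmed sample config are illustrative only, not a complete app.config.

```shell
#!/bin/sh
# Rewrite every quoted 127.0.0.1 in an app.config to 0.0.0.0.
# Assumption: the file still has the loopback defaults for pb_ip and http.
listen_everywhere() {
    cfg=$1
    sed 's/"127\.0\.0\.1"/"0.0.0.0"/g' "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Demo on a trimmed, illustrative fragment (a real app.config is much longer).
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
[
 {riak_api, [
     {pb_ip,   "127.0.0.1" },
     {pb_port, 8087 }
 ]},
 {riak_core, [
     {http, [ {"127.0.0.1", 8098 } ]}
 ]}
].
EOF
listen_everywhere "$demo_cfg"
cat "$demo_cfg"
```

Run against the host copies, this would be applied once per node, for example to node1/etc/app.config.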
The Erlang VM lets nodes communicate with each other only if they have a common cookie set up, hence the -setcookie parameter is the most important common one.
If you use 0.0.0.0 as the address to accept connections on, nothing needs to be done at this step.
The -name parameter specifies the node's name. The name includes the node's IP address, so if each node has its own IP address, only that part needs to be modified. If some nodes share the same IP address, then the name part (before @) must be modified as well.
So modify the extracted vm.args for each node in node<I>/etc/vm.args.
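A minimal sketch of that per-node edit, assuming vm.args contains a line starting with "-name " and one starting with "-setcookie " (as the stock file does). The addresses 10.0.0.1 to 10.0.0.5, the cookie value, and the helper name set_node_identity are all made up for illustration; substitute each container's real address.

```shell
#!/bin/sh
# Set the -name and -setcookie lines in one node's vm.args.
# Arguments: path to vm.args, full node name (name@address), shared cookie.
set_node_identity() {
    vmargs=$1
    node_name=$2
    cookie=$3
    sed -e "s/^-name .*/-name $node_name/" \
        -e "s/^-setcookie .*/-setcookie $cookie/" \
        "$vmargs" > "$vmargs.tmp" && mv "$vmargs.tmp" "$vmargs"
}

# Demo on a throwaway tree that mimics ~/riak-cluster.
demo_cluster=$(mktemp -d)
COOKIE=riak-docker-demo            # hypothetical shared cookie
for i in $(seq 1 5); do
    mkdir -p "$demo_cluster/node$i/etc"
    # Only the two relevant lines of a vm.args; the rest of the file is omitted.
    printf '%s\n' '-name riak@127.0.0.1' '-setcookie riak' \
        > "$demo_cluster/node$i/etc/vm.args"
    # 10.0.0.$i is a made-up address, one per node.
    set_node_identity "$demo_cluster/node$i/etc/vm.args" "riak@10.0.0.$i" "$COOKIE"
done
cat "$demo_cluster/node3/etc/vm.args"
```

If several nodes end up sharing one address, the same helper can be given a different name part instead, e.g. "riak$i@10.0.0.1".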
Start the first node:
$ NODE1=`docker run -volume rw:/var/lib/riak=$HOME/riak-cluster/node1/data \
      -volume rw:/var/log/riak=$HOME/riak-cluster/node1/log \
      -d my/riak-configured ...`
Good question: how do I get the container's IP address?
Another good question: do I really need that address? Maybe I could resort to a locally resolvable FQDN? (In that case, how would docker handle this?)
For each of the other nodes:
Start the container:
$ NODEX=`docker run ...`
Add the node to the cluster:
$ riak-admin cluster join riak@first-node-ip-address
After all nodes are added, review and commit your changes to the cluster:
$ riak-admin cluster plan
$ riak-admin cluster commit
Now it should be set...
Just run the thing:
$ docker run -d ...
The important bit is that we need to retain certain things between runs:
- IP address
- content of /var/lib/riak (or whichever other location was specified to store riak's data)
Nothing special:
- Stop the node
- Upgrade my/riak-configured
- If necessary, update common configuration
- Start the node
Have you given any thought to the actual setup of the Erlang cluster via EPMD? I think this is non-trivial: each Erlang node (in a container) would need to connect to EPMD, and the EPMDs need to be able to find each other.
You couldn't let all containers run their own EPMD on the default port (because that would conflict on the same host). So ideally you would run a container dedicated to EPMD and have some orchestration going on, OR you would run an EPMD at the host level (which is perfectly fine as well).
Then the problem becomes that the Erlang nodes in the containers need to talk to the host-level (not container-level) EPMD, so that probably requires a custom distribution protocol library in Erlang (easy, I did that before).
Anyway, I see lots of issues...