It's relatively easy to scale out stateless web applications; often a reverse proxy is all you need. Stateful web applications, however, especially ones that embed websocket services, are always a pain to distribute across a cluster. The traditional approach is to introduce an external service such as Redis to handle pub/sub, but that usually means changing your code. Can Erlang/Elixir, the "concurrency oriented programming languages", beat other languages in this use case? Does the Phoenix framework already ship with a solution for horizontally scaling websockets? I'll run an experiment to prove (or disprove) that.
You can download the source code from https://gitee.com/aetherus/gossipy.
Distribute a Phoenix chatroom application (websocket only) across multiple nodes in a cluster, without introducing extra services (e.g. Redis, RabbitMQ) and without changing a single line of the project's source code. The only changes allowed are adding or modifying config files.
- Ubuntu 16.04 or a derivative distro (I use Elementary Loki)
- Docker (I have only 1 PC)
- Docker Compose (same reason as above)
- The Elixir SDK
- A bare-bones Phoenix chatroom application, with no database, no Ecto and no assets
I'll use Distillery as the release tool. Just add the following line to `mix.exs` and run `mix deps.get` to fetch it.
defp deps do
  [
    {:distillery, "~> 1.5", runtime: false} # <--- This line
  ]
end
The `runtime: false` here means Distillery is not a runtime dependency of your web application; it's only used during the release process.
First, let Distillery generate some config files for us.
$ mix release.init
You'll find a new directory `rel` at the root of your application. Inside it there is a `config.exs` and an empty `plugins` directory. We'll ignore `rel/plugins` for the whole experiment. Let's have a look at the content of `rel/config.exs` now.
Path.join(["rel", "plugins", "*.exs"])
|> Path.wildcard()
|> Enum.map(&Code.eval_file(&1))

use Mix.Releases.Config,
    default_release: :default,
    default_environment: Mix.env()

environment :dev do
  set dev_mode: true
  set include_erts: false
  set cookie: :"<&9.`Eg/{6}.dwYyDOj>R6R]2IAK;5*~%JN(bKuIVEkr^0>jH;_iBy27k)4J1z=m"
end

environment :prod do
  set include_erts: true
  set include_src: false
  set cookie: :">S>1F/:xp$A~o[7UFp[@MgYVHJlShbJ.=~lI426<9VA,&RKs<RyUH8&kCn;F}zTQ"
end

release :gossipy do
  set version: current_version(:gossipy)
  set applications: [
    :runtime_tools
  ]
end
You can see two environments there, `:dev` and `:prod`. We'll focus on `:prod`, because releasing `:dev` doesn't make much sense to me. It's worth noting that the environments in `rel/config.exs` are NOT `MIX_ENV`! They are just names of build strategies. You can also see the cookie, which will be passed to the `-setcookie` option when starting the Erlang virtual machine. The option `:include_erts` tells Distillery whether to embed the whole ERTS (Erlang RunTime System) into the release package. If it's set to `true`, you don't have to install Erlang or Elixir on the target nodes in the cluster, but you do have to make sure the nodes run the same operating system as the host building the release.
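For reference, that cookie becomes the node's Erlang cookie at runtime, and all nodes in the cluster must share the same cookie or they will refuse to connect to each other. A quick, purely illustrative check from a console attached to a running release (Distillery provides `bin/gossipy remote_console` for that):

# Distribution connections are only established between nodes whose cookies match.
iex> Node.get_cookie()
:">S>1F/:xp$A~o[7UFp[@MgYVHJlShbJ.=~lI426<9VA,&RKs<RyUH8&kCn;F}zTQ"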
By default, this configuration builds releases that start the node with `-name gossipy@127.0.0.1`, which is not good for Docker deployment because it's difficult to get the IP address of a running Docker container. We need to use `-sname` instead. For this purpose, I created `rel/vm.args` with the content below:
## Name of the node
# -name gossipy@127.0.0.1
-sname gossipy
## Cookie for distributed erlang
-setcookie >S>1F/:xp$A~o[7UFp[@MgYVHJlShbJ.=~lI426<9VA,&RKs<RyUH8&kCn;F}zTQ
## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
##-heart
## Enable kernel poll and a few async threads
##+K true
##+A 5
## Increase number of concurrent ports/sockets
##-env ERL_MAX_PORTS 4096
## Tweak GC to run more often
##-env ERL_FULLSWEEP_AFTER 10
# Enable SMP automatically based on availability
-smp auto
This is an exact copy of the default `vm.args` file Distillery generates in the release, with the only modification being that I removed the `-name` option and added `-sname`.
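Just to illustrate what a short name looks like at runtime, here is a hypothetical session attached with `bin/gossipy remote_console` to a node whose hostname is `ws1` (the container hostname we'll define later):

# With -sname, the node name is <release name>@<hostname>, no IP address involved.
iex(gossipy@ws1)1> Node.self()
:gossipy@ws1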
The next step is telling Distillery to use my modified `vm.args`:
environment :prod do
  ...
  set vm_args: "rel/vm.args" # <---- add this line
end
According to Distillery's official documentation, I also need to add the following lines to `config/prod.exs`:
config :gossipy, GossipyWeb.Endpoint,
  ...
  check_origin: false,
  server: true,
  root: ".",
  version: Application.spec(:gossipy, :vsn)
- `check_origin: false` is ONLY for the convenience of this experiment. Never do this in production, as it weakens security!
- `server: true` makes the endpoint start its web server (Cowboy) when the release boots.
- `root: "."` sets the root path of the static assets, of which, in this experiment, there are none.
- `version` sets the version of the release; `Application.spec(:gossipy, :vsn)` fetches that value from `mix.exs`.
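As a quick illustration of that last call (the value comes from the `version` field in `mix.exs`; `0.0.1` matches the release version referenced later in the Dockerfile):

# :vsn is returned as an Erlang charlist, which is fine for the endpoint config.
iex> Application.spec(:gossipy, :vsn)
'0.0.1'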
Furthermore, we need to make the nodes aware of each other. We list all the nodes in `config/prod.exs`:
config :kernel,
  sync_nodes_optional: [:"gossipy@ws1", :"gossipy@ws2"],
  sync_nodes_timeout: 10_000 # milliseconds
`sync_nodes_optional` means that if a node cannot be reached within the time specified by `sync_nodes_timeout`, that node is simply ignored. There is another option, `sync_nodes_mandatory`, which crashes the current node if any of the listed nodes is unavailable. We obviously don't want that behavior.
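Once the cluster is up (we'll start it with Docker Compose below), a quick way to verify that the nodes actually found each other is to attach a remote console to one of them with `bin/gossipy remote_console` and list the connected nodes. An illustrative session:

# Run on gossipy@ws1; an empty list here would mean the nodes never connected.
iex(gossipy@ws1)1> Node.list()
[:gossipy@ws2]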
Now build the release:
$ MIX_ENV=prod mix release --env=prod
Then we enter the deployment phase.
Create a `Dockerfile` with the following content:
FROM ubuntu:xenial
EXPOSE 4000
ENV PORT=4000
RUN mkdir -p /www/gossipy && \
apt-get update && \
apt-get install -y libssl-dev
ADD ./_build/prod/rel/gossipy/releases/0.0.1/gossipy.tar.gz /www/gossipy
WORKDIR /www/gossipy
ENTRYPOINT ./bin/gossipy foreground
As mentioned before, we have to make sure the servers run the same OS as the build machine, so I have to choose Ubuntu 16.04. Honestly, I should use the Docker image `elixir:1.6-alpine` both to build and to run the application, but I'm lazy.
I need to install `libssl-dev` because, for some reason, Phoenix needs the file `crypto.so` to run; maybe it's because of the support for HTTPS and WSS.
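If you want to convince yourself that the `:crypto` application (the one backed by `crypto.so`) loads correctly inside the container, here is an illustrative check you could run from a remote console:

# This call goes through the crypto NIF and raises if crypto.so failed to load.
iex(gossipy@ws1)1> :crypto.hash(:sha256, "hello") |> Base.encode16()
"2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824"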
I'm lazy, so I created a `docker-compose.yml` to save some keystrokes.
version: '3.2'
services:
  ws1:
    build: .
    hostname: ws1
    ports:
      - 4001:4000
  ws2:
    build: .
    hostname: ws2
    ports:
      - 4002:4000
There are two containers. Docker maps port 4001 of the host to port 4000 of the first container, and port 4002 to port 4000 of the second. We do this so that we know exactly which container each websocket connection goes to. We wouldn't do this in production, though.
When all the configuration work is done, it's time to start the cluster!
$ docker-compose up
Then you can try connecting to each container and see whether the connections communicate with each other. If you don't know how to connect, see APPENDIX 1.
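You can also push a message from the server side to watch the nodes cooperating. Here is an illustrative remote-console session on `ws1` (the endpoint module comes from the config above, and the topic and event names match the test client in APPENDIX 1). Phoenix's default PG2 PubSub adapter relays the broadcast over distributed Erlang, so clients connected to `ws2` receive it too, with no Redis involved:

# Broadcast on gossipy@ws1; subscribers of "room:1" on both nodes get the message.
iex(gossipy@ws1)1> GossipyWeb.Endpoint.broadcast("room:1", "shout", %{message: "hello from ws1"})
:ok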
Kill the `ws2` container.
$ docker-compose kill ws2
All the connections to `ws2` are lost, but the connections to `ws1` are still alive. The connection loss is not an issue, because in production we would use the same origin (scheme + domain name + port) for all the nodes. This means that as long as the clients implement a proper reconnection mechanism, they will soon reconnect to the nodes that are still alive.
We start `ws2` again.
$ docker-compose start ws2
Everything recovers.
The point is not merely adding a node, but adding a node without rebooting the nodes that are already running.
We first add another service to the `docker-compose.yml`:
  ws3:
    build: .
    hostname: ws3
    ports:
      - 4003:4000
and then modify `config/prod.exs` accordingly:
config :kernel,
  sync_nodes_optional: [:"gossipy@ws1", :"gossipy@ws2", :"gossipy@ws3"], # <---- Note the new ws3
  sync_nodes_timeout: 10_000
Rebuild the release, then start `ws3`:
$ MIX_ENV=prod mix release.clean
$ MIX_ENV=prod mix release --env=prod
$ docker-compose up ws3
The newly added node instantly connected to the old nodes, as expected.
Though tweaking the config files took some time, we successfully scaled out the websocket application without changing a single word of our source code! And that is the beauty of Location Transparency!
APPENDIX 1
<!doctype html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Phoenix Channel Demo</title>
  </head>
  <body>
    <pre id="messages"></pre>
    <input id="shout-content">
    <script>
      window.onload = function () {
        var wsPort = window.location.search.match(/\bport=(\d+)\b/)[1];
        var messageBox = document.getElementById('messages');
        var ws = new WebSocket('ws://localhost:' + wsPort + '/socket/websocket');

        ws.onopen = function () {
          ws.send(JSON.stringify({
            topic: 'room:1',
            event: 'phx_join',
            payload: {},
            ref: 0
          }));
        };

        ws.onmessage = function (event) {
          var data = JSON.parse(event.data);
          if (data.event !== 'shout') return;
          messageBox.innerHTML += data.payload.message + "\n";
        };

        document.getElementById('shout-content').onkeyup = function (event) {
          if (event.which !== 13) return;
          if (!event.target.value) return;
          ws.send(JSON.stringify({
            topic: "room:1",
            event: "shout",
            payload: {message: event.target.value},
            ref: 0
          }));
          event.target.value = '';
        };
      };
    </script>
  </body>
</html>
You can host it any way you like; just make sure the browser is not accessing it via `file://`. I saved the content in a file named `ws.html` and served it with `$ python -m SimpleHTTPServer` (the port defaults to 8000), then accessed it at http://localhost:8000/ws.html?port=4001. You choose which node to connect to by setting the `port` parameter.