It's relatively easy to scale out stateless web applications; often a reverse proxy is all you need. But for stateful web applications, especially ones that embed websocket services, distributing them across a cluster has always been a pain. The traditional way is to introduce an external service like Redis to handle pubsub, but that usually means changing your code. Can Erlang/Elixir, the "concurrency oriented programming languages", best other languages in this use case? Does the Phoenix framework already have a built-in solution for horizontally scaling websockets? I'll run an experiment to prove (or disprove) that.
You can download the source code from https://gitee.com/aetherus/gossipy.
The goal: distribute a Phoenix chatroom application (websocket only) across multiple nodes in a cluster, without introducing extra services (e.g. Redis, RabbitMQ) and without changing a single line of the project's source code. The only thing allowed is adding or modifying config files.
- Ubuntu 16.04 or a derivative distro (I use Elementary Loki)
- Docker (I have only 1 PC)
- Docker Compose (same reason as above)
- Elixir SDK
- A bare-bones Phoenix chatroom application without a database, Ecto, or assets
I'll use Distillery as the release tool.
Just add the following line to mix.exs and run mix deps.get to fetch it.
defp deps do
  [
    {:distillery, "~> 1.5", runtime: false}   #<--- This line
  ]
end

The runtime: false here means Distillery is not a runtime dependency of your web application; it's only used during the release procedure.
First, let distillery generate some config files for us.
$ mix release.init
You'll find a new directory rel at the root of your application. Inside it are a config.exs file and an empty plugins directory. We'll ignore rel/plugins for the whole experiment. Let's have a look at the content of rel/config.exs now.
Path.join(["rel", "plugins", "*.exs"])
|> Path.wildcard()
|> Enum.map(&Code.eval_file(&1))
use Mix.Releases.Config,
    default_release: :default,
    default_environment: Mix.env()
environment :dev do
  set dev_mode: true
  set include_erts: false
  set cookie: :"<&9.`Eg/{6}.dwYyDOj>R6R]2IAK;5*~%JN(bKuIVEkr^0>jH;_iBy27k)4J1z=m"
end
environment :prod do
  set include_erts: true
  set include_src: false
  set cookie: :">S>1F/:xp$A~o[7UFp[@MgYVHJlShbJ.=~lI426<9VA,&RKs<RyUH8&kCn;F}zTQ"
end
release :gossipy do
  set version: current_version(:gossipy)
  set applications: [
    :runtime_tools
  ]
end

You can see 2 environments there, :dev and :prod. We'll focus on :prod, because releasing :dev doesn't make any sense to me. It's worth noting that the environments in rel/config.exs are NOT MIX_ENV!! They are just names of build profiles. You can see the cookie there, which will be passed to the Erlang virtual machine through the -setcookie option at startup. The option :include_erts tells Distillery whether to embed the whole ERTS (Erlang Runtime System) in the release package. If set to true, you don't have to install Erlang or Elixir on the target nodes in the cluster, but you have to make sure the nodes run the same operating system as the host building the release.
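The cookie is what gates connections between nodes: two nodes can only talk to each other if their cookies match. Here's a minimal sketch of that, assuming Elixir is installed locally and that myhost stands for your machine's short hostname (two terminals):

# Terminal 1: start a node named foo with an explicit cookie
$ iex --sname foo --cookie s3cret

# Terminal 2: start a second node with the same cookie and connect
$ iex --sname bar --cookie s3cret
iex(bar@myhost)1> Node.connect(:"foo@myhost")
true
iex(bar@myhost)2> Node.list()
[:"foo@myhost"]

Start bar with a different cookie and Node.connect/1 returns false instead.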
By default, this configuration builds releases with the startup option -name [email protected], which is not good for Docker deployment because it's hard to know the IP address of a running Docker container in advance. We need to use -sname instead. For this purpose, I created rel/vm.args with the content below:
## Name of the node
# -name [email protected]
-sname gossipy
## Cookie for distributed erlang
-setcookie >S>1F/:xp$A~o[7UFp[@MgYVHJlShbJ.=~lI426<9VA,&RKs<RyUH8&kCn;F}zTQ
## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
##-heart
## Enable kernel poll and a few async threads
##+K true
##+A 5
## Increase number of concurrent ports/sockets
##-env ERL_MAX_PORTS 4096
  
## Tweak GC to run more often
##-env ERL_FULLSWEEP_AFTER 10  
  
# Enable SMP automatically based on availability
-smp auto
This is an exact copy of the default vm.args that Distillery generates in the release, the only modification being that I removed the -name option and added -sname.
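To make the difference concrete, here's a quick hypothetical illustration: -sname takes a bare name and appends the machine's short hostname (which, with Docker, we control through the container's hostname), while -name expects a fully qualified node name:

$ iex --sname gossipy
iex(gossipy@myhost)1>          # short name: gossipy@<short hostname>

$ iex --name [email protected]
iex([email protected])1>    # long name: the host part must be fully qualified

Note also that short-named and long-named nodes cannot connect to each other, so every node in the cluster has to use the same style.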
Next step is telling distillery to use my modified vm.args.
environment :prod do
  ...
  set vm_args: "rel/vm.args"  # <---- add this line
end

According to Distillery's official documentation, I also need to add the following lines to config/prod.exs:
config :gossipy, GossipyWeb.Endpoint,
  ...
  check_origin: false,
  server: true,
  root: ".",
  version: Application.spec(:gossipy, :vsn)

- check_origin: false is ONLY for the convenience of this experiment. Never do this in production, as it weakens security!
- server: true means Cowboy will be started to serve requests when the release boots.
- root: "." sets the root path of static assets, of which this experiment has none.
- version sets the version of the release. Application.spec(:gossipy, :vsn) fetches the value from mix.exs.
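It's worth spelling out why the channel code itself needs no changes: Phoenix broadcasts channel messages through Phoenix.PubSub, and its default adapter in Phoenix 1.x is PG2, which is built on distributed Erlang's process groups, so broadcasts automatically reach every connected node. If the project was generated by Phoenix 1.3, config/config.exs should already contain a line like this (shown for reference only, nothing to add):

config :gossipy, GossipyWeb.Endpoint,
  ...
  pubsub: [name: Gossipy.PubSub, adapter: Phoenix.PubSub.PG2]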
Furthermore, we need to let the nodes know about each other. We list all the nodes in config/prod.exs:
config :kernel,
  sync_nodes_optional: [:"gossipy@ws1", :"gossipy@ws2"],
  sync_nodes_timeout: 10_000  # milliseconds

sync_nodes_optional means that if a node cannot be reached within the time specified by sync_nodes_timeout, that node is simply ignored. There is another option, sync_nodes_mandatory, which crashes the current node if any of the listed nodes is unavailable. We obviously don't want that behavior here.
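For comparison, the strict variant would look like the snippet below; a node booted with this config crashes unless every listed peer shows up within the timeout:

config :kernel,
  sync_nodes_mandatory: [:"gossipy@ws1", :"gossipy@ws2"],
  sync_nodes_timeout: 10_000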
With all the configuration in place, build the release:

$ MIX_ENV=prod mix release --env=prod
Then we enter the deployment phase.
Create a Dockerfile with the following content:
FROM ubuntu:xenial
EXPOSE 4000
ENV PORT=4000
RUN mkdir -p /www/gossipy && \
    apt-get update && \
    apt-get install -y libssl-dev
ADD ./_build/prod/rel/gossipy/releases/0.0.1/gossipy.tar.gz /www/gossipy
WORKDIR /www/gossipy
ENTRYPOINT ./bin/gossipy foreground
As mentioned before, since include_erts: true requires the target servers to run the same OS as the build machine, I have to choose Ubuntu 16.04. Honestly, I should use the Docker image elixir:1.6-alpine both to build and to run the application, but I'm lazy.
I need to install libssl-dev because, for some reason, Phoenix needs the file crypto.so to run; maybe it's because of the support for HTTPS and WSS.
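For the record, a multi-stage build along the lines of what I should have done might look like the sketch below. It's untested, and the extra Alpine package and the tarball path are my assumptions, but it would remove the OS-matching constraint entirely:

FROM elixir:1.6-alpine AS build
WORKDIR /app
COPY . .
RUN mix local.hex --force && mix local.rebar --force && \
    mix deps.get && MIX_ENV=prod mix release --env=prod

FROM elixir:1.6-alpine
# Distillery's boot scripts are written in bash, which Alpine lacks by default
RUN apk add --no-cache bash
EXPOSE 4000
ENV PORT=4000
WORKDIR /www/gossipy
# COPY --from does not auto-extract tarballs the way ADD does
COPY --from=build /app/_build/prod/rel/gossipy/releases/0.0.1/gossipy.tar.gz .
RUN tar -xzf gossipy.tar.gz && rm gossipy.tar.gz
ENTRYPOINT ["./bin/gossipy", "foreground"]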
I'm lazy, so I created docker-compose.yml to save my keyboard strokes.
version: '3.2'
services:
  ws1:
    build: .
    hostname: ws1
    ports:
      - 4001:4000
  ws2:
    build: .
    hostname: ws2
    ports:
      - 4002:4000
There are 2 containers. Docker NATs port 4001 of the host to port 4000 of the first container, and port 4002 to port 4000 of the second. We do this so that we can be sure which container each websocket connection goes to. We wouldn't do this in production, though.
When all the configuration job is done, it's time to start the cluster!
$ docker-compose up
Then you can try connecting to each container and see whether the connections communicate with each other. If you don't know how to connect, see APPENDIX 1.
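You can also verify the clustering from the inside. Attach a remote console to ws1 (the service names come from docker-compose.yml) and list the connected nodes:

$ docker-compose exec ws1 ./bin/gossipy remote_console
iex(gossipy@ws1)1> Node.list()
[:"gossipy@ws2"]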
Now kill the ws2 container.
$ docker-compose kill ws2
All the connections to ws2 are lost, but the connections to ws1 are still alive. The connection loss is not an issue, because in production we'd serve all the nodes from the same origin (scheme + domain name + port). This means that as long as the clients implement a proper reconnection mechanism, they will soon reconnect to the nodes that are still alive.
We start ws2 again.
$ docker-compose start ws2
Everything recovers.
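To double-check that messages still flow across nodes, you can also broadcast from one node's console and watch the message appear in a browser connected to the other node (the topic and event names match the client in APPENDIX 1):

$ docker-compose exec ws2 ./bin/gossipy remote_console
iex(gossipy@ws2)1> GossipyWeb.Endpoint.broadcast("room:1", "shout", %{message: "hello from ws2"})
:ok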
The point here is not merely adding a node, but adding a node without rebooting the nodes that are already running.
We first add another service in the docker-compose.yml
  ws3:
    build: .                   
    hostname: ws3              
    ports:                     
      - 4003:4000
and then modify config/prod.exs accordingly:
config :kernel,
  sync_nodes_optional: [:"gossipy@ws1", :"gossipy@ws2", :"gossipy@ws3"],  #<---- Note the new ws3
  sync_nodes_timeout: 10_000

Rebuild the release, then start ws3.
$ MIX_ENV=prod mix release.clean
$ MIX_ENV=prod mix release --env=prod
$ docker-compose up ws3
The newly added node instantly links up with the old nodes, as expected.
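Note that the old nodes never needed the new config: ws3 initiates the connections at boot, and Erlang distribution is transitive by default (the nodes form a fully connected mesh), so ws3 shows up in everyone's Node.list(). If rebuilding the release ever feels too heavy, a node can also be introduced by hand from a remote console, as in this hypothetical session:

$ docker-compose exec ws1 ./bin/gossipy remote_console
iex(gossipy@ws1)1> Node.connect(:"gossipy@ws3")
true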
Though tweaking the config files took some time, we successfully scaled out the websocket application without changing a single word of the source code. And that is the beauty of location transparency!
APPENDIX 1: the test client

<!doctype html>
<html> 
  <head>
    <meta charset="utf-8">
    <title>Phoenix Channel Demo</title>     
  </head>
  <body> 
    <pre id="messages"></pre>  
    <input id="shout-content"> 
  
    <script>
      window.onload = function () {   
        var wsPort = window.location.search.match(/\bport=(\d+)\b/)[1];
        var messageBox = document.getElementById('messages');
        var ws = new WebSocket('ws://localhost:' + wsPort + '/socket/websocket');
  
        ws.onopen = function () {       
          ws.send(JSON.stringify({        
            topic: 'room:1',
            event: 'phx_join',
            payload: {},
            ref: 0
          }));
        };
        ws.onmessage = function (event) {
          var data = JSON.parse(event.data);
          if (data.event !== 'shout') return;
          messageBox.innerHTML += data.payload.message + "\n"; 
        }
        document.getElementById('shout-content').onkeyup = function (event) {
          if (event.which !== 13) return; 
          if (!event.target.value) return;
          ws.send(JSON.stringify({        
            topic: "room:1",
            event: "shout",
            payload: {message: event.target.value},
            ref: 0
          }));
          event.target.value = '';
        };
      }
    </script>
  </body>
</html>

You can host it any way you like; just make sure the browser is not accessing it via file://. I saved the content in a file ws.html and served it with $ python -m SimpleHTTPServer (the port defaults to 8000; on Python 3 the equivalent is python3 -m http.server), then accessed http://localhost:8000/ws.html?port=4001. You choose which node to connect to by setting the port parameter.