Elixir language is a love child of Ruby and Erlang giving you the best of both worlds: easy concurrency and programmer friendliness with advanced metaprogramming capabilities. Running tests concurrently enhances development cycle time. Nothing comes without drawbacks though. Compilation time is usually quite big. It doesn't matter that much during development because you compile only changed files. However, building from scratch takes time which might hurt CI time.
What makes compiling larger Elixir project slow? Is it inherent to the problem or is there a room for improvement?
I'm going to perform measurements on my work laptop:
13 inch MacBook Pro early 2015
3,1 GHz Intel Core i7 with 16GB of RAM
Let's measure how much time, CPU and memory does Elixir compilation take.
I'm using latest master commit 5947ed751eb0276cb7fda666c319dba451a5e9e7
time make compile
...
make compile 116,55s user 10,70s system 153% cpu 1:22,98 total
I was using htop in the other window and here is what I learned.
- apart from one brief moment, the compilation didn't use all CPUs
- the starting memory before compilation was over 5GB and it never went over 7GB
So the compilation is neither memory nor CPI bound. Does it automatically mean that it is IO bound. Let's retry the experiment on RAM disk.
time make compile
...
make compile 112,84s user 9,92s system 154% cpu 1:19,36 total
The results look very similar and those couple of seconds might be just a statistical error. If compilation is neither CPU, nor memory, nor IO bound, maybe it will be possible to parallelize it more?
José wrote a great article on inner workings of parallel compiler. https://elixir-lang.org/blog/2012/04/24/a-peek-inside-elixir-s-parallel-compiler/ It explains the main idea behind while leaving out all the optimizations.
The main idea behind it is like this:
- there is one main process called "parent" or "elixir_compiler"
- for each file spawn new compiling process
- when it finishes: report to parent
- when it needs other module - report it to parent and wait for the module to be compiled
Very often performance problems in Erlang systems are caused by wrong communication patterns: lock contention or one process being a bottleneck. Let's dive into the source some more.
Firstly, I wanted to run the compilation task from IEx, so that I am able to apply any profiling code. I started with Observer because it allows to check if any process has a growing message queue which means it doesn't keep up and becomes a bottleneck
#!/usr/bin/env iex
Mix.start #start mix
:observer.start #start observer before compilation
{time, retval} = :timer.tc(fn() ->
Mix.CLI.main ["compile"]
end)
IO.inspect time/1_000_000 #convert from microseconds to seconds
IO.inspect retval
I dropped that to my work project to make sure that most of the compilation uses Elixir. Elixir source itself is first bootstrapped with Erlang.
Using observer I checked the process queues lengths and none of them jumped higher than two. This rules out my theory with bottleneck.
I wanted to check real time CPU usage while building, but htop doesn't work, so I needed to go with Activity monitor.
Another common problem is lock contention. Erlang VM uses locks internally to protect resources and it is possible that those make the VM slower. http://erlang.org/doc/apps/tools/lcnt_chapter.html
Is Lexical Tracker guilty? https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/kernel/lexical_tracker.ex
% CNT ACC OWN
[{ "<0.874.0>", 1050,undefined, 20.921}, %%
{ spawned_by, "<0.535.0>"},
{ spawned_as, {proc_lib,init_p,
["<0.535.0>",[],gen,init_it,
[gen_server,"<0.535.0>","<0.535.0>",
'Elixir.Kernel.LexicalTracker',nil,[]]]}},
{ initial_calls, [{proc_lib,init_p,5},{erlang,put,2}]}].
{[{undefined, 1,20977.726, 0.037}],
{ {proc_lib,init_p,5}, 1,20977.726, 0.037}, %
[{{proc_lib,init_p_do_apply,3}, 1,20848.597, 0.081},
:lcnt.conflicts
lock id #tries #collisions collisions [%] time [us] duration [%]
----- --- ------- ------------ --------------- ---------- -------------
proc_link 1399 755506 59627 7.8923 1537139 7.6958
run_queue 6 3439147 49701 1.4452 948310 4.7478
atom_tab 1 2803735 1881 0.0671 67882 0.3399
:lcnt.conflicts
lock id #tries #collisions collisions [%] time [us] duration [%]
----- --- ------- ------------ --------------- ---------- -------------
code_server 6 1393762 73801 5.2951 1155458 5.7849
run_queue 6 3439147 49701 1.4452 948310 4.7478
atom_tab 1 2803735 1881 0.0671 67882 0.3399
process_table 1 89258 5607 6.2818 32691 0.1637
:lcnt.inspect(:proc_link)
lock id #tries #collisions collisions [%] time [us] duration [%] histogram [log2(us)]
----- --- ------- ------------ --------------- ---------- ------------- ---------------------
proc_link code_server 149268 59609 39.9342 1140048 5.7078 |X ............... |
proc_link <[email protected]> 774 0 0.0000 13260 0.0664 | X .. . . |
proc_link <[email protected]> 1227 0 0.0000 11648 0.0583 |X . .. xx Xx.x |