On Unix, mv is an atomic operation when the source and destination are on the same filesystem. This enables a well-known "symlink replacement trick" for race-condition-free website deployment, among other things. Let's create a script that encapsulates the process for general-purpose use.
When deploying an update to a website, if we do something like
- git pull within our deployment directory, or
- rsync to our deployment directory, or even
- run a script that quickly replaces the deployment directory with a new one,
... then there can be some number of milliseconds where the files we are trying to serve are nonexistent or in a state of change.
There is a filesystem-based mechanism that enables one to deploy updates with zero risk of this race condition occurring.[1] It relies on the fact that mv is an atomic operation and that Unix supports symlinks. The basic idea is that we specify our document root as a symlink to a directory containing the current version.
In the following examples, our document root is www. First, we copy all the files for our website into www.A:
$ rsync -CrP 'remote:~/website/' 'www.A/'
Then, we create the symlink, pointing the document root to www.A.
$ ln -s www.A www
Here's the directory structure in its entirety so far:
$ tree -AF --noreport
.
├── www -> www.A/
└── www.A/
    └── index.html
When it's time to deploy an update, we prepare the next version in a different directory. We copy the updated files into www.B:
$ rsync -CrP 'remote:~/website/' 'www.B/'
Giving:
$ tree -AF --noreport
.
├── www -> www.A/
├── www.A/
│   └── index.html
└── www.B/
    ├── blah.html
    └── index.html
Then, to deploy, we replace the www symlink currently pointing to www.A with one pointing to the new version, www.B.
$ ln -s www.B www.new
$ mv -T www.new www
$ tree -AF --noreport
.
├── www -> www.B/
├── www.A/
│   └── index.html
└── www.B/
    ├── blah.html
    └── index.html
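Why the -T? Because www is a symlink to a directory, a plain mv follows it and moves the new link into that directory instead of replacing www. The following is a small demonstration, not part of the deployment itself. It assumes GNU coreutils (for mv -T and mktemp -d) and works in a scratch directory so nothing real is touched.

```shell
#!/bin/sh
# Demonstration only: what mv does with and without -T (GNU coreutils).
set -e
demo=$(mktemp -d)          # scratch directory
cd "$demo"
mkdir www.A www.B
ln -s www.A www            # the live site points at www.A

# Without -T: www resolves to a directory, so mv moves www.new INTO it.
ln -s www.B www.new
mv www.new www
without_T=$(readlink www)  # still www.A; the new link landed in www.A/www.new
rm www.A/www.new           # clean up the misplaced link

# With -T: www is treated as a plain name and is replaced by a single rename.
ln -s www.B www.new
mv -T www.new www
with_T=$(readlink www)     # now www.B

echo "without -T: www -> $without_T"
echo "with -T:    www -> $with_T"
```

Without -T the live site silently stays on the old version, which is an easy mistake to miss.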
You can verify that this is atomic:
$ inotifywait -m . &
[1] 14628
Setting up watches.
Watches established.
$ ln -s www.B www.new
./ CREATE www.new
$ mv -T www.new www
./ MOVED_FROM www.new
./ MOVED_TO www
Whereas simply asking ln to overwrite the existing symlink is not:
$ ln -sfn www.B www
./ DELETE www
./ CREATE www
There is an unlikely but possible moment between that DELETE and the subsequent CREATE where a webserver might attempt to serve a file and find that its directory is missing!
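The create-then-rename sequence can be wrapped in a small function so the two steps always travel together. This is only a sketch: the name switch_link and the temporary-name scheme are made up for illustration, and it assumes GNU mv for the -T option.

```shell
#!/bin/sh
# switch_link TARGET LINK: atomically repoint the symlink LINK at TARGET.
# Hypothetical helper, not one of the scripts in this article; needs GNU mv.
switch_link() {
    target=$1
    link=$2
    tmp="$link.new.$$"             # temporary name, unique per process ID
    ln -s "$target" "$tmp" || return 1
    if ! mv -T "$tmp" "$link"; then
        rm -f "$tmp"               # do not leave the temporary link behind
        return 1
    fi
}
```

With that defined, deploying is "switch_link www.B www", and switching back to the old version is "switch_link www.A www".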
One of the interesting things about this technique is that if you find shortly after deploy that your updates are broken, you can switch the symlink back to the previous version. With no extra effort you've gained the ability to do deployment rollbacks.
Another is that if you instruct some webserver to use the "next" directory www.B as its document root, you gain a staging or preview site where you can inspect your changes before they "go live."
If we can assume that the new versions will be mostly the same as the previous versions, we can take advantage of a bandwidth-saving feature of rsync. Cloning the "live" site into the "stage" site before using rsync to transfer updated files will result in a bandwidth reduction, as rsync skips unmodified parts of files.
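Here is a rough sketch of that clone-then-sync step. The function name refresh_stage and its three arguments are assumptions for illustration. It relies on cp -a preserving modification times, so that rsync's default size-and-time check can skip files that have not changed.

```shell
#!/bin/sh
# refresh_stage LIVE STAGE SRC: hypothetical helper for illustration.
# Seeds STAGE with a copy of LIVE, then lets rsync transfer only the differences.
refresh_stage() {
    live=$1
    stage=$2
    src=$3                          # e.g. 'remote:~/website/' as in the examples above
    rm -rf "$stage"
    cp -aH "$live" "$stage"         # -H follows LIVE if it is a symlink
    # -a preserves attributes; --delete removes files that are gone from SRC
    rsync -a --delete "$src" "$stage/"
}
```

For the layout used so far, this would be called as "refresh_stage www www.B 'remote:~/website/'".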
If we do this just before transferring files, we encounter a mildly complex multiple-connection process:
- Create staging area with clone on host
- Push all files up from development boxes
- Perform symlink switch
However, if we are willing to let the 'stage' directory persist on disk, we can re-use it and prepare it ahead of time, after each deploy.
- Perform symlink switch
- Prepare next staging area
- Development boxes copy files into staging area at their leisure
Let's encapsulate this technique into a set of scripts so we don't have to remember, say, what the -T option to mv is and why it's needed.
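#!/bin/sh
# deploy.sh
# Usage: deploy.sh NAME (run from the directory containing the NAME symlink)
N="$(readlink "$1.prev")"   # directory behind .prev; it is recycled as the next stage
cp -PT "$1" "$1.prev"       # .prev now points at the version that is live right now
mv -T "$1.stage" "$1"       # atomic switch: the stage becomes the live site
ln -s "$N" "$1.stage"       # recreate the stage link on the recycled directory
rm -rf "$N"                 # discard the old contents...
cp -aH "$1" "$N"            # ...and seed the new stage with a copy of the live site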
Next we provide a script to perform rollbacks.
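#!/bin/sh
# rollback.sh
# -e follows the symlink, so a dangling .prev means there is nothing to roll back to
[ ! -e "$1.prev" ] && echo "Can't roll back." && exit 1
N="$(readlink "$1.stage")"  # directory behind .stage; it is given up below
cp -PT "$1" "$1.stage"      # the broken live version becomes the stage, ready to fix
mv -T "$1.prev" "$1"        # atomic switch back to the previous version
ln -s "$N" "$1.prev"        # recreate the .prev link...
rm -rf "$N"                 # ...and leave it dangling: only one level of rollback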
Finally, we can also automate the initial setup of a set of directories and symlinks that this process expects to work with. Note that this initial setup step can't take advantage of the 'symlink replacement' trick.
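#!/bin/sh
# initialize-deployable.sh
mkdir "$1.d"                 # container for the version directories
mv "$1" "$1.d/1"             # the existing site becomes version 1
ln -s "$1.d/1" "$1"          # NAME is now a symlink (not atomic: NAME is briefly missing)
cp -aH "$1" "$1.d/2"         # clone the live site as the first stage
ln -s "$1.d/2" "$1.stage"
ln -s "$1.d/3" "$1.prev"     # dangling on purpose: nothing to roll back to yet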
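Since the setup step moves a real directory out of the way, it may be worth checking first that it is safe to run. The following is a hypothetical pre-flight check, not one of the article's scripts. The name can_initialize is made up, and it only assumes the names the initialization script above creates.

```shell
#!/bin/sh
# can_initialize NAME: hypothetical pre-flight check for the setup step.
# Succeeds only if NAME is a real directory and none of the names that the
# initialization script creates are already taken.
can_initialize() {
    name=$1
    if [ -L "$name" ] || [ ! -d "$name" ]; then
        echo "$name must be a real directory, not a symlink" >&2
        return 1
    fi
    for suffix in d stage prev; do
        # -e misses dangling symlinks, so test -L as well
        if [ -e "$name.$suffix" ] || [ -L "$name.$suffix" ]; then
            echo "$name.$suffix already exists" >&2
            return 1
        fi
    done
}
```

One way to use it would be "can_initialize www && initialize-deployable.sh www".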
Let's create an example site.
$ mkdir www
$ echo First > www/index.html
$ tree -AF --noreport
.
└── www/
    └── index.html
$ cat www/index.html
First
Now, let's set it up to be usable with our deployment script. Note that this step does not employ the symlink switch trick.
$ initialize-deployable.sh www
$ ls -F
www@ www.d/ www.prev@ www.stage@
The contents of the www directory remain the same:
$ cat www/index.html
First
The contents of the stage are currently the same as the live:
$ cat www.stage/index.html
First
And as expected there's nothing to roll back to:
$ rollback.sh www
Can't roll back.
Let's create a new version:
$ echo Second > www.stage/index.html
$ echo foo > www.stage/foo.html
$ ls -F www/
index.html
$ ls -F www.stage/
foo.html index.html
Now, let's deploy it.
$ deploy.sh www
The code deployed as expected:
$ ls -F www/
foo.html index.html
$ cat www/index.html
Second
And the .prev link works now:
$ ls -F www.prev/
index.html
$ cat www.prev/index.html
First
Let's do another!
$ echo Therd > www.stage/index.html
$ deploy.sh www
$ cat www/index.html
Therd
Oops! I did something wrong. Let's try a rollback:
$ rollback.sh www
$ cat www/index.html
Second
Phew! How about another? We shouldn't be able to.
$ rollback.sh www
Can't roll back.
Let's see the state of the three directories.
$ cat www.prev/index.html
cat: www.prev/index.html: No such file or directory
$ cat www/index.html
Second
$ cat www.stage/index.html
Therd
As you can see, the changes made in the stage are still there. Adjusting the rollback script to reset the stage to the contents of the live site is trivial. Finally, let's fix the problem and deploy again.
$ echo Third > www.stage/index.html
$ deploy.sh www
$ cat www/index.html
Third
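As for resetting the stage to the contents of the live site, mentioned above as trivial, it might look like the following sketch. The function name reset_stage is made up for illustration. Like the scripts above, it assumes it is run from the directory that holds the symlinks, since the link targets are relative.

```shell
#!/bin/sh
# reset_stage NAME: hypothetical helper; throws away whatever is in the stage
# and replaces it with a fresh copy of the live site.
reset_stage() {
    name=$1
    dir=$(readlink "$name.stage") || return 1   # directory the stage link points to
    rm -rf "$dir"
    cp -aH "$name" "$dir"                        # -H follows the live symlink
}
```

Calling "reset_stage www" at the end of rollback.sh would give each fix a clean starting point, at the cost of losing the broken files you might want to inspect.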
Finally let's look at the entire directory structure as it exists now.
$ tree -AF --noreport
.
├── www -> www.d/3/
├── www.d/
│   ├── 1/
│   │   ├── foo.html
│   │   └── index.html
│   ├── 2/
│   │   ├── foo.html
│   │   └── index.html
│   └── 3/
│       ├── foo.html
│       └── index.html
├── www.prev -> www.d/2/
└── www.stage -> www.d/1/
The directory www.d contains three arbitrarily-named directories to hold the previous, current, and next versions as needed. The current directory contains three symlinks, www, www.prev, and www.stage.
With trivial changes to the management scripts these names could be modified to taste, or the three version directories need not live in their own containing directory. For example, changing '.d/1' to '.A', '.d/2' to '.B' and '.d/3' to '.C' in the initialization script is all that is needed to produce a filesystem layout like this instead:
$ tree -AF --noreport
.
├── www -> www.C/
├── www.A/
│   ├── foo.html
│   └── index.html
├── www.B/
│   ├── foo.html
│   └── index.html
├── www.C/
│   ├── foo.html
│   └── index.html
├── www.prev -> www.B/
└── www.stage -> www.A/
The next version of the scripts eschews the rollback feature. Since I keep my files in version control, as should you, I can let this deployment mechanism stay ignorant of how a "rollback" differs from just another deploy. Attempting to eschew the staging directory as well makes the script more complex, so we'll keep it.
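#!/bin/sh
# deploy.sh
N="$(readlink "$1")"        # directory that is live right now; it becomes the next stage
mv -T "$1.stage" "$1"       # atomic switch: the stage becomes the live site
ln -s "$N" "$1.stage"       # recreate the stage link on the old live directory
rm -rf "$N"                 # discard the old version...
cp -aH "$1" "$N"            # ...and seed the new stage with a copy of the live site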
And the setup script:
#!/bin/sh
# initialize-deployable.sh
mkdir "$1.d"
mv "$1" "$1.d/1"
ln -s "$1.d/1" "$1"
cp -aH "$1" "$1.d/2"
ln -s "$1.d/2" "$1.stage"
Let's create an example site.
$ mkdir www
$ echo First > www/index.html
$ tree -AF --noreport
.
└── www/
    └── index.html
$ cat www/index.html
First
Now, just like last time, let's set it up to be usable with our deployment script. Note that we have no www.prev symlink.
$ initialize-deployable.sh www
$ ls -F
www@ www.d/ www.stage@
The contents of the www directory remain the same:
$ cat www/index.html
First
The contents of the stage are currently the same as the live:
$ cat www.stage/index.html
First
Let's create a new version:
$ echo Second > www.stage/index.html
$ echo foo > www.stage/foo.html
$ ls -F www/
index.html
$ ls -F www.stage/
foo.html index.html
Now, let's deploy it.
$ deploy.sh www
The code deployed as expected:
$ ls -F www/
foo.html index.html
$ cat www/index.html
Second
Finally let's look at the entire directory structure as it exists now.
$ tree -AF --noreport
.
├── www -> www.d/2/
├── www.d/
│   ├── 1/
│   │   ├── foo.html
│   │   └── index.html
│   └── 2/
│       ├── foo.html
│       └── index.html
└── www.stage -> www.d/1/
We have seen that this technique employs two directories, but with three directories we gain one level of rollback.
With 4 directories, could we have 2 levels of rollback? Can we write a script that works for N directories and achieves N-2 levels of rollback?
This is an interesting question from an abstraction and reduction perspective that I may explore at some point, but I don't really have a practical use for more than one rollback directory. (In fact, I prefer none for my own work.)
Rather than have multiple scripts on our $PATH, let's bake them together and add some sanity-checking, argument parsing, and error handling for a more robust tool.
FIXME TODO: ...
Footnotes
[1] There are other means of avoiding this problem. Reconfiguring your webserver to serve from the new directory, and then sending it a signal to begin using the updated configuration, would also work. In this document we only consider the case where we are deploying filesystem updates and can't restart our webserver, as when deploying a static website on a cheap commodity web host.