My notes from the Meet Chef course at http://pluralsight.com/training/Courses/TableOfContents/meet-chef
Chef is a Ruby framework for automating, reusing and documenting server configuration. It's like Unit tests for your servers.
One of the primary features of Chef is that its recipes are idempotent - You can run a script several times, but it won't change anything after the first run. If none of your inputs to Chef change, running it over and over should not try to run all the commands over and over.
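As a rough illustration of idempotence outside of Chef, here is a minimal plain-Ruby sketch. The ensure_line helper and the config array are hypothetical, and this is not how Chef is implemented; it only shows the same idea of describing a desired state and acting only when the current state differs.

```ruby
# Hypothetical helper: make sure `line` is present in `lines`.
# Returns true if a change was made, false if nothing needed doing.
def ensure_line(lines, line)
  return false if lines.include?(line)
  lines << line
  true
end

config = ["worker_processes 4;"]

first_run  = ensure_line(config, "keepalive_timeout 5;")
second_run = ensure_line(config, "keepalive_timeout 5;")

puts "first run changed something:  #{first_run}"
puts "second run changed something: #{second_run}"
```

The second call finds the line already present and leaves the array alone, which is the behavior we want from a recipe that is run repeatedly.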
Chef will not magically configure your server. It only runs recipes that you specify with the inputs you give it.
It's important that you do not blindly use other Chef cookbooks or recipes. No two companies have exactly the same server architecture.
Chef does not monitor the runtime behavior of any of the software it configures. It cannot tell you whether or not a service is running. It is a short-lived deployment service, not a continuously running one. It was built to be run occasionally to keep a server's configuration in a specific state.
Chef doesn't have any concept of undoing changes. It's up to us to roll back any changes if we feel a mistake has been made. It does back up some configuration files when it makes changes, but it doesn't have any tools to restore these files in the event of a mistake.
We should always test our cookbooks and recipes in a virtual environment before deploying.
Chef makes server configuration readable, accessible, and repeatable. In addition, automated deployments are faster than connecting via SSH and running one command after another via the shell terminal.
Because Chef cookbooks are both reusable and idempotent, Chef can take some of the fear and guesswork out of sysadmin work. While Chef doesn't save you from the need to understand what you're doing as a sysadmin, it gives you a much more friendly environment for dealing with it.
Chef recipes are a great way to learn how a piece of software works, rather than reading through a lot of disparate manuals.
It's also useful for keeping development in sync with production. Doing so minimizes deployment issues.
Node: a server, or host. It could be for the web, database, worker, etc.
Chef client: the command-line program that configures the server. It runs on the node, or host.
Chef server: stores information about the nodes. It is REST-based. Opscode is an example of a hosted one, but it could also be self-hosted.
Chef solo: a standalone version of the Chef client that doesn't require a Chef server. We point it at a recipe instead.
Recipe: a Ruby file containing instructions for building a node, each executed in order.
Resources: the files, directories, users and services needed by a node.
Cookbook: a collection of recipes and associated files, such as configuration file templates.
Role: reusable configuration across multiple nodes, e.g. a web role that configures a server distributed over 5 nodes, or a database role that configures slave/master database nodes.
Run list: an array of recipes and roles defining what gets executed on a node.
Attributes: variables that are passed through Chef and used in recipes and templates, such as a software version.
Template: a file with placeholders for attributes.
Notification: when one resource is changed, it can trigger an update in another resource. Example: if an nginx configuration file is updated, notify the nginx resource to restart.
We need
- VirtualBox (https://www.virtualbox.org/)
- Starter code (https://github.com/jsierles/peepcode-chef-recipes.git)
- Vagrant (http://vagrantup.com)
The starter code includes a Vagrantfile that's set up to run a preconfigured Debian Squeeze 64 VM. It also comes bundled with Ruby 1.9.3 and RubyGems 1.8.17.
vagrant up
vagrant ssh
In a new Terminal tab, from the starter code root:
cd nginx
rm -rf recipes/ templates/ attributes/ # because we want to start from scratch
mkdir recipes
mkdir templates
mkdir attributes
touch recipes/default.rb
default.rb is the recipe that is run by default when no specific recipe is specified.
The first step will be to install the nginx package. (A package is a Chef resource which relies on a package management system on the host OS, e.g. apt.)
Specifying a version is only useful if you want to stop Chef from upgrading to a later version in the future.
package "nginx" do
version "1.0.3"
end
We can just stick with the default version, so:
package "nginx"
If we run the recipe now, chef will try to install nginx (using apt-get) if it hasn't already been installed.
We also want to be able to start and stop nginx, so we'll install it as a service:
package "nginx"
service "nginx"
During an update, other recipes can now determine the state of the nginx service.
Most Unix systems run process daemons from the /etc/init.d directory. Chef relies on the package installer (e.g. apt-get) to set this up, then sends the appropriate commands (start, stop, etc.) and assumes the init script will do the right thing.
We want to be able to get the status of, restart, and reload nginx. And we want it to be started when the OS boots and when Chef runs. So:
package "nginx"
service "nginx" do
  supports :status => true, :restart => true, :reload => true
  action [:enable, :start]
end
enable ensures the service is started when the OS boots; start means Chef should start the service if it isn't already running.
The config file is where we can customize nginx, based on our config variables (with web directory variables, etc)
We could start with nginx's default config file, since that has a lot of settings we can use already. But... we don't have nginx installed yet. We can install it temporarily to get the file:
sudo su
apt-get update
apt-get install nginx
cat /etc/nginx/nginx.conf
I had to run apt-get update before I could install nginx.
We can update any of these options dynamically using a chef template.
Chef templates use ERB syntax. We tell Chef where the template lives. If we specify a block, we can do other things such as trigger notifications.
Continuing in default.rb:
template "/etc/nginx/nginx.conf" do
  notifies :reload, "service[nginx]"
end
The :reload refers to reload on line 4 of the recipe (the supports line of the service block). (This is an nginx shortcut for reading config changes that don't require a full restart.) service[nginx] refers to line 3 of the recipe, where we declared nginx as a service.
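To make the notification idea concrete, here is a toy model in plain Ruby. It does not use Chef's real classes: ToyService, ToyTemplate and their methods are invented for illustration. The point is only that the template notifies the service when its content actually changed, and stays quiet otherwise.

```ruby
# Toy stand-ins for Chef resources; every name here is hypothetical.
class ToyService
  attr_reader :reloads

  def initialize
    @reloads = 0
  end

  def reload
    @reloads += 1
  end
end

class ToyTemplate
  def initialize(content, on_change:)
    @content = content
    @on_change = on_change # callable that runs only when the content changes
  end

  # Write new content; fire the notification only on a real change.
  def update(new_content)
    return false if new_content == @content
    @content = new_content
    @on_change.call
    true
  end
end

nginx = ToyService.new
conf  = ToyTemplate.new("keepalive_timeout 65;", on_change: -> { nginx.reload })

conf.update("keepalive_timeout 5;") # changed, so nginx is reloaded
conf.update("keepalive_timeout 5;") # unchanged, so nothing happens
puts nginx.reloads
```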
When Chef is run, it will look for a template file named templates/nginx.conf.erb. We could also explicitly set the source template file within the block using source "nginx.conf.erb", but this is unnecessary because Chef uses convention over configuration to determine the template file name.
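The naming convention can be pictured with a couple of lines of plain Ruby. This is only a sketch of the rule described above, with a made-up helper name (default_template_source); it is not Chef's actual code.

```ruby
# Hypothetical helper mirroring the convention: take the file name of the
# target path and append ".erb".
def default_template_source(target_path)
  "#{File.basename(target_path)}.erb"
end

puts default_template_source("/etc/nginx/nginx.conf")
```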
We can create the template file by copying what we cat'ed from nginx's default config file and pasting it into nginx.conf.erb. This is good practice to ensure we're as close to the standard setup as possible.
Before we do that, create a default folder in the templates folder. This has nothing to do with the default.rb file we created earlier. It refers instead to the default host server. This templates directory can have templates customized for each host machine. For example, we could have made a beta directory for a beta server. default will be the directory that gets used for all hosts in this recipe.
Now we create the file there. It won't have any ERB tags in it just yet.
Going back to the VM we are SSH'd into, let's install chef-solo:
gem install chef --no-rdoc --no-ri
We now have the chef-solo command-line utility available to us. If we run it now, it will complain that it can't find any cookbooks or a configuration file. We can tell chef-solo where to look for cookbooks by setting up a config file like so:
mkdir /etc/chef
echo "cookbook_path \"/cookbooks\"" > /etc/chef/solo.rb
Now if we run chef-solo -l info it no longer complains, but we can see that the run list (recipes or roles) is empty. Since we only want nginx, we can pass it in directly using a JSON file. We can use vim /etc/chef/node.json to create this:
{
  "run_list": ["recipe[nginx]"]
}
This refers to the nginx cookbook, and the default recipe. If we had another recipe, for example a "client" recipe, we could specify it after a double colon:
"run_list": ["recipe[nginx::client]"]
When writing this file, always use double-quotes, and be sure to close all brackets and quotes. Now save the file and quit vim.
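One way to check that the JSON is well formed, and to see how a run list entry breaks down into cookbook and recipe, is a few lines of plain Ruby. This is a sketch only: parse_run_list_item is a hypothetical helper that handles just the recipe[...] form shown above (not roles), and it is not part of Chef.

```ruby
require "json"

# Hypothetical helper: split "recipe[cookbook::recipe]" into its parts.
# A missing "::recipe" suffix means the cookbook's default recipe.
def parse_run_list_item(item)
  match = item.match(/\Arecipe\[([^\]:]+)(?:::([^\]]+))?\]\z/)
  raise ArgumentError, "unsupported run list item: #{item}" unless match
  { cookbook: match[1], recipe: match[2] || "default" }
end

node_json = '{ "run_list": ["recipe[nginx]", "recipe[nginx::client]"] }'

# JSON.parse raises JSON::ParserError on single quotes or unbalanced brackets.
run_list = JSON.parse(node_json).fetch("run_list")

run_list.each do |item|
  parsed = parse_run_list_item(item)
  puts "#{item} -> cookbook=#{parsed[:cookbook]} recipe=#{parsed[:recipe]}"
end
```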
Now we tell chef that we're going to use our own run list:
chef-solo -l info -j /etc/chef/node.json
I had to add the -l info option to the command to get the same level of output as was shown in the video.
The output looks like:
[2014-03-18T15:34:05+01:00] INFO: Forking chef instance to converge...
Starting Chef Client, version 11.10.4
[2014-03-18T15:34:05+01:00] INFO: *** Chef 11.10.4 ***
[2014-03-18T15:34:05+01:00] INFO: Chef-client pid: 1905
[2014-03-18T15:34:07+01:00] INFO: Setting the run_list to ["recipe[nginx]"] from JSON
[2014-03-18T15:34:07+01:00] INFO: Run List is [recipe[nginx]]
[2014-03-18T15:34:07+01:00] INFO: Run List expands to [nginx]
[2014-03-18T15:34:07+01:00] INFO: Starting Chef Run for vagrant-debian-squeeze-64.vagrantup.com
[2014-03-18T15:34:07+01:00] INFO: Running start handlers
[2014-03-18T15:34:07+01:00] INFO: Start handlers complete.
Compiling Cookbooks...
Converging 3 resources
Recipe: nginx::default
* package[nginx] action install[2014-03-18T15:34:07+01:00] INFO: Processing package[nginx] action install (nginx::default line 1)
(up to date)
* service[nginx] action enable[2014-03-18T15:34:07+01:00] INFO: Processing service[nginx] action enable (nginx::default line 3)
(up to date)
* service[nginx] action start[2014-03-18T15:34:07+01:00] INFO: Processing service[nginx] action start (nginx::default line 3)
[2014-03-18T15:34:08+01:00] INFO: service[nginx] started
- start service service[nginx]
* template[/etc/nginx/nginx.conf] action create[2014-03-18T15:34:08+01:00] INFO: Processing template[/etc/nginx/nginx.conf] action create (nginx::default line 8)
(up to date)
[2014-03-18T15:34:08+01:00] INFO: Chef Run complete in 0.953383215 seconds
Running handlers:
[2014-03-18T15:34:08+01:00] INFO: Running report handlers
Running handlers complete
[2014-03-18T15:34:08+01:00] INFO: Report handlers complete
Chef Client finished, 1/4 resources updated in 2.514633292 seconds
As we can see in the output above, Chef doesn't actually perform any actions other than starting the nginx service. That's because we already have nginx installed, and our template file is exactly the same as the nginx.conf file already in use. Let's go back to our template and change the keepalive_timeout setting to 5. Then return to the VM and re-run the chef-solo command. Now the output includes info about backing up the old config file and updating the original. It even includes a diff of the changes. If we go back and look at the live config file we'll see our changes are now present.
This is great, but not terribly useful to pass in static values. A better use is using attributes.
Cookbooks can define attributes for use anywhere in one of their templates. It's always good to provide a default value for any attribute used in your cookbooks. Note that attribute values have different levels of precedence depending on where you set them. The order of precedence (highest -> lowest) looks like this:
Node <- Role <- Environment <- Cookbook
Environment means the same thing here as it would in Rails: development, test, staging, production, etc. Here's an example:
Say we want to define the number of worker processes for our servers. We might define that number to be 4 in our cookbook since that's a safe default. Our cookbook will also be used to setup our development environment, so we could let that environment specify that it only needs a single worker process. We then have a Role for setting up web servers, and we decide that our hardware specs mean they can comfortably handle 6 worker processes instead. Then for one of those nodes, perhaps an older legacy server that we've yet to retire, we don't want to risk overtaxing the CPU so we can set it to use a value of "2" worker processes instead.
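The worker-process example can be sketched as a chain of hash merges in plain Ruby. This is a deliberate simplification of the ordering given above (the resolve helper is hypothetical, and real Chef attribute resolution has more levels than this), but it shows how a higher-precedence source wins over a lower one.

```ruby
# Attribute sources, with values taken from the example in the text.
cookbook_defaults = { worker_processes: 4 }
dev_environment   = { worker_processes: 1 }
web_role          = { worker_processes: 6 }
legacy_node       = { worker_processes: 2 }

# Hypothetical helper: pass the sources from lowest to highest precedence.
# Later hashes win, so merging in order mimics the precedence chain.
def resolve(*levels)
  levels.reduce({}) { |merged, level| merged.merge(level) }
end

puts resolve(cookbook_defaults)[:worker_processes]                        # cookbook default only
puts resolve(cookbook_defaults, dev_environment)[:worker_processes]       # development machine
puts resolve(cookbook_defaults, web_role)[:worker_processes]              # ordinary web server
puts resolve(cookbook_defaults, web_role, legacy_node)[:worker_processes] # legacy web server
```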
In addition, Chef has different kinds of attributes, each with their own precedence:
Automatic <- Override <- Normal/Set <- Default
The lowest-level place where we can create attributes is inside the attributes directory of the cookbook itself.
touch attributes/nginx.rb
Some people prefer to use the name default.rb, but that can be confusing given how many other things are already named default. It's also easier to search the cookbook for the name of the package (i.e. nginx).
Inside this file we'll use what are called "cookbook default attributes". It's almost always best to use "default attributes" in practice. (If default attributes aren't powerful enough, we can explore chef's other attributes, but it's likely we'll never need to.) Default attributes are attributes that can be overridden at the environment or role level.
In this file, we'll specify:
default[:nginx][:dir] = "/etc/nginx"
default[:nginx][:worker_processes] = 4
Attributes can be set using symbols (as above), or string keys, or using dot notation. The line above could also be written as:
default["nginx"][:worker_processes] = 4
or
default.nginx[:worker_processes] = 4
It's best to choose one style and stick with it.
Attributes are embedded in templates using the standard ERB output notation via the @node instance variable. In our nginx.conf.erb file, we can use ours like so:
worker_processes <%= @node[:nginx][:worker_processes] %>;
The @node object holds all of Chef's attributes, whether from the cookbook, environment, role or node.
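To see the substitution without running Chef, the same template line can be rendered with Ruby's standard ERB library. This is a sketch under assumptions: ToyTemplateContext is an invented stand-in for whatever Chef uses internally, and node is a plain hash rather than a real Chef node object.

```ruby
require "erb"

# Toy stand-in for the template context: it only exposes @node.
class ToyTemplateContext
  def initialize(node)
    @node = node
  end

  # Evaluate the ERB source with this object's instance variables in scope.
  def render(source)
    ERB.new(source).result(binding)
  end
end

node   = { nginx: { dir: "/etc/nginx", worker_processes: 4 } }
source = "worker_processes <%= @node[:nginx][:worker_processes] %>;"

puts ToyTemplateContext.new(node).render(source)
```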
If we run our recipe again, we'll see that our nginx.conf file gets updated again using our attribute value.
When constructing a template, you might be tempted to replace every possible configuration option with a matching attribute. Most of the time you really only need to change a select few options, so try to constrain yourself to major options.
There's another way to specify attributes directly in template blocks in recipe files. It's not used very often, but here's what it would look like if we used it in recipes/default.rb:
template "/etc/nginx/nginx.conf" do
  notifies :reload, "service[nginx]"
  variables :user => 'www-data'
end
Here we pass a hash to the variables method. Then in our template file we'd reference it as @user:
user <%= @user %>;
Whenever possible, use default attributes rather than variables. We should now back out these changes from our own recipe and template files.
The resources and providers directories are usually not needed, but are useful for auxiliary configurations. The directories are used to create an LWRP (Lightweight Resource and Provider). An example LWRP might be a resource for managing the creation of an nginx virtual host file. As with services and notifications, we'd use these resources inside a recipe. It's a way of taking groups of resources and giving them more functionality and integration within Chef by emulating the syntax of built-in resources, extending the Chef DSL for things we do in many places.
Taking a break from our nginx example, we'll build a recipe to set up a Rails app and deploy using Capistrano. This will be easier if we copy our SSH key to the server (aka node). Copy/paste the contents of ~/.ssh/id_rsa.pub on the local machine to ~/.ssh/authorized_keys on the VM. (Make sure you are not sudo'd.) There is already one key in that file, which is what Vagrant uses to authenticate us through vagrant ssh. We don't want to use this key for any public server since the same key is used with ALL vagrant gems, but we can leave it for our own use.
Confirm that nginx is running on the VM
ps auwx | grep nginx
We'd like to access nginx directly from our desktop web browser. Our Vagrantfile specifies a config.vm.network option that lets us connect to the VM from our host machine at IP address 33.33.33.10. We can add this to our local hosts file to access the server using a more friendly name. On the local machine, run sudo vim /etc/hosts and add the following:
33.33.33.10 kayak.test
If we try to access http://kayak.test in our local browser, we should now see a 404 error page from nginx. We can call this a success since it shows that nginx is handling our requests and serving responses.
Goals:
- Import the Nginx and Unicorn cookbooks
- Understand the existing Unicorn recipe
- Create a Rails recipe
- Understand metadata
- Install Ruby gems
- Create directories and files
- Edit the run list
- Create and configure templates
- Reuse variables
The starter code already has recipes for Rails and Unicorn. We're going to create the former from scratch, but leave the latter untouched. Looking at the Unicorn recipe we see it:
- Installs the Unicorn Ruby gem with the gem_package method. We could install this using the Gemfile in the Rails app, but we want to do it in Chef to ensure it is present.
- Creates a directory for storing configuration files.
- Installs a cookbook file, a new type of Chef resource that lets us copy plain, static files to the node. In this case, we're copying a Ruby script to control the Unicorn process.
We're going to write our Rails recipe from scratch:
rm -rf rails/
mkdir rails
mkdir rails/attributes rails/templates rails/recipes
touch rails/recipes/default.rb
We need our recipe to:
- Run the nginx and unicorn installers
- Setup a metadata file to define those dependencies
- Create some directories
- Configure the Unicorn app server
- Configure nginx to serve this as a virtual host
We need Chef to run the nginx and unicorn recipes first, and then use resources from those recipes. To get Chef to do this, we add the following to our recipes/default.rb file:
include_recipe "nginx"
include_recipe "unicorn"
An older syntax you might see is require_recipe, but don't use that; it's been deprecated.
We also need to declare those cookbooks as dependencies so that the Chef server will deliver them to the node when using the Chef client. This is done in the cookbook's metadata.rb file. We can copy the one from the unicorn cookbook to use as a starting point.
cp unicorn/metadata.rb rails/
It currently looks like this:
maintainer "Joshua Sierles"
maintainer_email "[email protected]"
description "Configures unicorn"
version "0.1"
We want to add these two lines to tell chef that it can't run without access to the other two cookbooks:
depends "nginx"
depends "unicorn"
Note that we are specifying the name of a top-level cookbook, not an individual recipe. This is for coordination with the chef server, but you should always do it even if you're only working with chef-solo.
Back in recipes/default.rb, add the following lines to ensure we have access to sqlite and the bundler gem, both of which we'll need to set up the Rails app:
package "libsqlite3-dev"
gem_package "bundler" # installs a Ruby gem, not a Debian package
The video left out the part about adding the "libsqlite3-dev" package, but the original files had it, and Capistrano was failing without, so I've added it back.
We need to create a few directories for the application itself, and for log and config files. Recipe files are just Ruby files, so we can use the Ruby language in chef recipes. We can add a Ruby hash for common variables:
common = {:name => "kayak", :app_root => "/u/apps/kayak"}
We'll use this throughout the recipe, as well as in templates.
We are hardcoding the app directory here because this recipe is specific to that Rails app
The directory method creates a directory with optional attributes.
Add the following:
directory common[:app_root] do
  owner "vagrant"
  recursive true
end
This creates our root directory. The original video left out any step that created the /u/apps folder, so I've added recursive true to the block to create it for us if it doesn't already exist. Note that recursive doesn't apply the owner to parent folders, but that's OK in this case as we want those folders to be owned by root, who we should be sudo'd in as.
Normally, Capistrano would symlink the latest release into common, but we're going to customize this by using a git-based deploy that checks out a git repo inside the current directory.
The video had us creating common[:app_root]+"/common", but we don't need it so I've left it off. The current directory will be created by Capistrano as necessary, with vagrant as the owner, so we don't need to do that here.
We need to add a few more directories. We can use a bit of Ruby metaprogramming:
The steps in the video assumed that common[:app_root]+"/shared" would be created recursively in the loop below (using the recursive true command), but I found that led to permissions errors later when deploying with Capistrano since shared was not owned by vagrant. (See the note above about recursive permissions.) So we'll create it explicitly here.
directory common[:app_root]+"/shared" do
  owner "vagrant"
end
%w(config log tmp sockets pids).each do |dir|
  directory "#{common[:app_root]}/shared/#{dir}"
  recursive true # create parent directories as needed
  mode 0755
end
We're leaving an intentional bug in our code so we can debug it. Also, we won't use that recursive statement. It will be replaced below.
Chef is to servers as unit tests are to code, and like unit tests, you should run your Chef recipes often along the way to help you catch errors early.
The last time we ran chef-solo on the VM we specified a run list using JSON. We want to update that to specify the new recipe we're building.
sudo su
vim /etc/chef/node.json
Then inside that file:
{
  "run_list": ["recipe[rails]"]
}
Note that we don't need to keep the old recipe in the run list, since the new recipe includes the old recipe as a dependency. The downside to this is that simply looking at a run list doesn't always tell you all of the recipes that will be run, since each recipe might run other recipes.
We also need to set up our apps directory on the VM:
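The way a one-item run list fans out into several recipes can be sketched in plain Ruby. The INCLUDES table and the expand helper are hypothetical; they only model the include_recipe lines in our rails recipe and are not Chef's own expansion code.

```ruby
# Hypothetical map of which recipes each recipe includes, in order.
INCLUDES = {
  "rails"   => ["nginx", "unicorn"],
  "nginx"   => [],
  "unicorn" => []
}.freeze

# Expand a run list depth-first: included recipes come before the recipe
# that includes them, and each recipe appears only once.
def expand(run_list, visited = {}, order = [])
  run_list.each do |recipe|
    next if visited[recipe]
    visited[recipe] = true
    expand(INCLUDES.fetch(recipe, []), visited, order)
    order << recipe
  end
  order
end

puts expand(["rails"]).inspect
```

The resulting order (nginx, unicorn, rails) matches the order of the Recipe: sections in the chef-solo output.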
mkdir /u
mkdir /u/apps
Now we can run our cookbook on the VM:
chef-solo -l info -j /etc/chef/node.json
This time we see an error in the output:
================================================================================
Recipe Compile Error in /cookbooks/rails/recipes/default.rb
================================================================================
NoMethodError
-------------
No resource or method named `recursive' for `Chef::Recipe "default"'
Cookbook Trace:
---------------
/cookbooks/rails/recipes/default.rb:18:in `block in from_file'
/cookbooks/rails/recipes/default.rb:16:in `each'
/cookbooks/rails/recipes/default.rb:16:in `from_file'
Relevant File Content:
----------------------
/cookbooks/rails/recipes/default.rb:
11:
12: directory common[:app_root]+"/common" do
13: owner "vagrant"
14: end
15:
16: %w(config log tmp sockets pids).each do |dir|
17: directory "#{common[:app_root]}/shared/#{dir}"
18>> recursive true # create parent directories as needed
19: mode 0755
20: end
21:
Fix the code by passing in a block:
%w(config log tmp sockets pids).each do |dir|
  directory "#{common[:app_root]}/shared/#{dir}" do
    owner "vagrant"
    mode 0755
  end
end
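Why the missing block produced a NoMethodError can be shown with a toy DSL in plain Ruby. Everything here (ToyRecipe, ToyDirectory) is invented for illustration and is much simpler than Chef's real classes. The idea it demonstrates is that attribute calls inside the block are evaluated against the resource, while the same calls outside a block are looked up on the recipe, which doesn't have them.

```ruby
# Toy resource: attribute calls inside the block are sent to this object.
class ToyDirectory
  attr_reader :path, :settings

  def initialize(path)
    @path = path
    @settings = {}
  end

  def owner(value)
    @settings[:owner] = value
  end

  def mode(value)
    @settings[:mode] = value
  end
end

# Toy recipe: it knows `directory`, but has no `owner` or `mode` of its own.
class ToyRecipe
  attr_reader :resources

  def initialize
    @resources = []
  end

  def directory(path, &block)
    resource = ToyDirectory.new(path)
    resource.instance_eval(&block) if block # run the block against the resource
    @resources << resource
    resource
  end
end

recipe = ToyRecipe.new

# With a block, `owner` and `mode` reach the directory resource.
recipe.instance_eval do
  directory "/u/apps/kayak/shared/log" do
    owner "vagrant"
    mode 0755
  end
end

# Without a block, `mode` is looked up on the recipe itself and fails.
begin
  recipe.instance_eval do
    directory "/u/apps/kayak/shared/tmp"
    mode 0755
  end
rescue NoMethodError => e
  puts "#{e.class}: #{e.message}"
end
```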
Run chef-solo again, and it works! The output should include the following (snipped) confirmations:
Recipe: nginx::default
* package[nginx] action install[2014-03-18T19:34:53+01:00] INFO: Processing package[nginx] action install (nginx::default line 1)
(up to date)
* service[nginx] action enable[2014-03-18T19:34:53+01:00] INFO: Processing service[nginx] action enable (nginx::default line 3)
(up to date)
* service[nginx] action start[2014-03-18T19:34:53+01:00] INFO: Processing service[nginx] action start (nginx::default line 3)
(up to date)
* template[/etc/nginx/nginx.conf] action create[2014-03-18T19:34:54+01:00] INFO: Processing template[/etc/nginx/nginx.conf] action create (nginx::default line 8)
(up to date)
Recipe: unicorn::default
* gem_package[unicorn] action install[2014-03-18T19:34:54+01:00] INFO: Processing gem_package[unicorn] action install (unicorn::default line 1)
(up to date)
* directory[/etc/unicorn] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/etc/unicorn] action create (unicorn::default line 5)
(up to date)
* cookbook_file[/usr/local/bin/unicornctl] action create[2014-03-18T19:34:54+01:00] INFO: Processing cookbook_file[/usr/local/bin/unicornctl] action create (unicorn::default line 9)
(up to date)
Recipe: rails::default
* gem_package[bundler] action install[2014-03-18T19:34:54+01:00] INFO: Processing gem_package[bundler] action install (rails::default line 4)
(up to date)
* directory[/u/apps/kayak] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak] action create (rails::default line 8)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak] created directory /u/apps/kayak
- create new directory /u/apps/kayak[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak] owner changed to 1000
- change owner from '' to 'vagrant'
* directory[/u/apps/kayak/common] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak/common] action create (rails::default line 13)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/common] created directory /u/apps/kayak/common
- create new directory /u/apps/kayak/common[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/common] owner changed to 1000
- change owner from '' to 'vagrant'
* directory[/u/apps/kayak/shared/config] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak/shared/config] action create (rails::default line 18)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/config] created directory /u/apps/kayak/shared/config
- create new directory /u/apps/kayak/shared/config[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/config] mode changed to 755
- change mode from '' to '0755'
* directory[/u/apps/kayak/shared/log] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak/shared/log] action create (rails::default line 18)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/log] created directory /u/apps/kayak/shared/log
- create new directory /u/apps/kayak/shared/log[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/log] mode changed to 755
- change mode from '' to '0755'
* directory[/u/apps/kayak/shared/tmp] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak/shared/tmp] action create (rails::default line 18)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/tmp] created directory /u/apps/kayak/shared/tmp
- create new directory /u/apps/kayak/shared/tmp[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/tmp] mode changed to 755
- change mode from '' to '0755'
* directory[/u/apps/kayak/shared/sockets] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak/shared/sockets] action create (rails::default line 18)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/sockets] created directory /u/apps/kayak/shared/sockets
- create new directory /u/apps/kayak/shared/sockets[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/sockets] mode changed to 755
- change mode from '' to '0755'
* directory[/u/apps/kayak/shared/pids] action create[2014-03-18T19:34:54+01:00] INFO: Processing directory[/u/apps/kayak/shared/pids] action create (rails::default line 18)
[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/pids] created directory /u/apps/kayak/shared/pids
- create new directory /u/apps/kayak/shared/pids[2014-03-18T19:34:54+01:00] INFO: directory[/u/apps/kayak/shared/pids] mode changed to 755
- change mode from '' to '0755'
Now we need to make a Unicorn config file. We'll base it on a template.
The unicorn cookbook defines attributes that we can access here. We access them via the standard node object, but this time without the leading "@" sigil.
It's good practice when writing Chef recipes to publish default config locations to other cookbooks by storing them as attributes within the corresponding cookbook.
# recipes/default.rb
template "#{node[:unicorn][:config_path]}/#{common[:name]}.conf.rb" do
  mode 0644 # readable/writeable by owner, readable by others, not executable
end
We're storing the config file on the server under the application name (common[:name]), but it could be confusing to have several files in this cookbook named kayak.conf.rb: exactly what is being configured? We can specify a template source name that is more obvious:
# recipes/default.rb
template "#{node[:unicorn][:config_path]}/#{common[:name]}.conf.rb" do
  mode 0644 # readable/writeable by owner, readable by others, not executable
  source "unicorn.conf.erb"
end
Using a generic name (unicorn versus kayak) means we could also use this cookbook to configure other Rails applications, and we'd only need to change the application name in one place.
It would be handy to have access to the common variables within the template:
# recipes/default.rb
template "#{node[:unicorn][:config_path]}/#{common[:name]}.conf.rb" do
  mode 0644 # readable/writeable by owner, readable by others, not executable
  source "unicorn.conf.erb"
  variables common
end
Remember that variables accepts a hash, and common is a hash, so now we have access to @name and @app_root within our template.
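How a hash turns into instance variables inside a template can be sketched with the standard ERB library. ToyVariablesContext is a made-up class for this illustration, and the template string is hypothetical; Chef's own mechanism may differ in detail, but the visible effect (each key becomes an @variable) is the same.

```ruby
require "erb"

# Toy context: each key of the variables hash becomes an instance variable.
class ToyVariablesContext
  def initialize(variables)
    variables.each { |key, value| instance_variable_set("@#{key}", value) }
  end

  # Evaluate the ERB source with those instance variables in scope.
  def render(source)
    ERB.new(source).result(binding)
  end
end

common = { name: "kayak", app_root: "/u/apps/kayak" }
source = "app <%= @name %> lives in <%= @app_root %>"

puts ToyVariablesContext.new(common).render(source)
```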
To create the template:
mkdir rails/templates/default
# Shortcut -v
git checkout rails/templates/default/unicorn.conf.erb
The unicorn.conf.erb file contains some interesting syntax:
app_root = "<%= @app_root %>"
worker_processes 10
working_directory "#{app_root}/current"
preload_app true
timeout 300
listen "#{app_root}/shared/sockets/unicorn.sock", :backlog => 2048
pid "#{app_root}/shared/pids/unicorn.pid"
stderr_path "#{app_root}/shared/log/unicorn.log"
stdout_path "#{app_root}/shared/log/unicorn.log"
if GC.respond_to?(:copy_on_write_friendly=)
  GC.copy_on_write_friendly = true
end
# handle zero-downtime restarts
before_fork do |server, worker|
  old_pid = "#{server.config[:pid]}.oldbin"
  if old_pid != server.pid
    begin
      sig = (worker.nr + 1) >= server.worker_processes ? :QUIT : :TTOU
      Process.kill(sig, File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
    end
  end
end
after_fork do |server, worker|
  # set process title to application name and git revision
  revision_file = "#{Rails.root}/REVISION"
  if ENV['RAILS_ENV'] != 'development' && File.exists?(revision_file)
    ENV["UNICORN_PROCTITLE"] = "<%= @name || "unicorn" %> " + File.read(revision_file)[0,6]
    $0 = ENV["UNICORN_PROCTITLE"]
  end
  # reset sockets created before forking
  ActiveRecord::Base.establish_connection
end
before_exec do |server|
  Dir.chdir("#{app_root}/current")
end
Note that on the first line we're accessing an instance variable named @app_root, but on the working_directory line we're accessing a local variable named app_root. This is because the unicorn configuration file is itself a Ruby script! So we're using Ruby (ERB) to write Ruby (the .rb config file).
This is called "passive code generation".
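The two layers can be seen by rendering a cut-down version of the template with Ruby's standard ERB library. This is only a sketch: the two-line template is trimmed from the file above, and setting @app_root by hand stands in for what the variables call would supply.

```ruby
require "erb"

# A cut-down version of the template: one ERB tag, one Ruby interpolation.
# The quoted heredoc keeps "#{app_root}" as literal text in the template.
template = <<~'TEMPLATE'
  app_root = "<%= @app_root %>"
  working_directory "#{app_root}/current"
TEMPLATE

@app_root = "/u/apps/kayak" # stands in for the value passed via `variables`

generated = ERB.new(template).result(binding)
puts generated
```

ERB fills in the <%= %> tag when the template is rendered, but leaves the #{app_root} interpolation untouched; that part only runs later, when Unicorn loads the generated .rb file.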
Also, inside the after_fork block we're using the @name variable, but setting a default value in case it isn't defined.
ENV["UNICORN_PROCTITLE"] = "<%= @name || "unicorn" %> " + File.read(revision_file)[0,6]
We know that we can set default values in the recipe file, or using default attributes. Setting default values in the template makes templates more bulletproof, but it comes at the cost of possibly spreading defaults over several different files. Keep this in mind as you develop a strategy for writing new cookbooks and templates.
Now that the template is in place, let's run chef again:
chef-solo -l info -j /etc/chef/node.json
We can see from the output that the template file has been written out to the server. We can confirm using vim /etc/unicorn/kayak.conf.rb.
Now we need to create the nginx configuration template for this application's virtual host. Nginx can serve many websites at once; we need to create a config file specific to this application that tells nginx how to handle requests for http://kayak.test. Debian's nginx package is preconfigured to keep site configuration files in the /etc/nginx/sites-available/ folder.
# recipes/default.rb
nginx_config_path = "/etc/nginx/sites-available/#{common[:name]}.conf"
template nginx_config_path do
  mode 0644
  source "nginx.conf.erb"
  variables common.merge(:server_names => "kayak.test")
  notifies :reload, "service[nginx]"
end
To create the template:
# Shortcut -v
git checkout rails/templates/default/nginx.conf.erb
The nginx.conf.erb file contains:
upstream <%= @name %> {
server unix:/u/apps/<%= @name %>/shared/sockets/unicorn.sock fail_timeout=0;
}
server {
listen 80;
server_name <%= @server_names %>;
root /u/apps/<%= @name %>/current/public;
access_log /u/apps/<%= @name %>/shared/log/access.log combined;
error_log /u/apps/<%= @name %>/shared/log/error.log;
location / {
if (-f $request_filename) {
break;
}
if (!-f $request_filename) {
proxy_pass http://<%= @name %>;
break;
}
}
error_page 500 502 503 504 /500.html;
error_page 404 /404.html;
location = /500.html {
root /u/apps/<%= @name %>/current/public;
}
location = /404.html {
root /u/apps/<%= @name %>/current/public;
}
}
As you can see, we're using the @name and @server_names variables to configure this instance. We can go one step further and replace all instances of /u/apps/<%= @name %> with <%= @app_root %>.
We have one more step: enable this site by linking it to the sites-enabled directory provided by Debian. We'll use the nginx_site resource defined by the nginx cookbook. (See nginx/resources/site.rb, where the resource is defined, and nginx/providers/site.rb, where the resource actions are defined.)
# recipes/default.rb
nginx_site "kayak" do
config_path nginx_config_path
action :enable
end
The nginx_site resource knows that the server should be reloaded after this is done, but we won't see any reload notification in Chef's log messages because this is an LWRP (Lightweight Resource and Provider) and, for better or worse, LWRPs don't log all of their actions.
Now we can run chef-solo. If we run it a second time, we can see that chef doesn't do anything else because it knows it doesn't need to.
Even though we've done all these steps, if we visit http://kayak.test in our local browser we still get a 404 error. This is because we haven't actually deployed our application yet.
Clone a preexisting Rails app from github into a new folder:
git clone https://github.com/jsierles/kayak.git
Why are we using a 3rd-party app (Capistrano) to deploy when we've already got chef? Chef does have a special resource named chef-deploy. It could be used for first-time deployment, but it's slower than using Capistrano. It's also more sensitive to errors: if another chef recipe fails along the way, your Rails app will never get deployed.
Look at the config/deploy.rb script:
require "bundler/capistrano"
require 'fast_git_deploy/enable'
set :application, "kayak"
set :repository, "https://github.com/jsierles/kayak.git"
set :deploy_to, "/u/apps/#{application}"
set :scm, :git
set :user, "vagrant"
set :branch, "master"
set :deploy_type, 'deploy'
set :use_sudo, false
default_run_options[:pty] = true
ssh_options[:forward_agent] = true
ssh_options[:keys] = [File.join(ENV["HOME"], ".vagrant.d", "insecure_private_key")]
role :app, "kayak.test"
role :web, "kayak.test"
role :db, "kayak.test", :primary => true
after "deploy:setup" do
deploy.fast_git_setup.clone_repository
run "cd #{current_path} && bundle install"
end
namespace :unicorn do
desc "Start unicorn for this application"
task :start do
run "cd #{current_path} && bundle exec unicorn -c /etc/unicorn/kayak.conf.rb -D"
end
end
namespace :deploy do
task :create_symlink do
# no-op to remove default symlink task, not needed by fast_git_deploy
end
end
Note that on line 2 we're using fast_git_deploy, which works a little differently than the standard Capistrano deploy task. It is recommended as a faster alternative to the standard deployment task.
On line 7 we set the user to "vagrant" since that user already exists.
On line 16 an ssh option is set to use the vagrant demo key since this is a private deployment. This works since the key is installed on our VM, but this isn't what you'd want to use for a production deployment. We could just remove that line and we'd be authenticated instead as the current user.
On lines 18-21 we set the target server for all roles to "kayak.test", which we set up before in /etc/hosts.
On line 27 we have a helper task that starts the unicorn server using the unicorn config file we created in our rails recipe. Note that a production app would likely use a process manager to handle unicorn processes, but for this demo we'll just run it manually.
On line 35 it originally used task :symlink, but we must be using a newer version of Capistrano because it instead calls deploy:create_symlink. I've updated the code above to override that task instead.
First we need to install Capistrano. On the local machine:
cd kayak
gem install bundler
bundle install
Now we'll deploy using deploy:cold since this is a fresh server and it's the first time we've deployed this application. It will:
- Create the applications folder at /u/apps/kayak/current
- Checkout the latest revision from git
From the local machine:
bundle exec cap deploy:cold
After a few seconds you should see it complete with
** transaction: commit
* executing `deploy:restart'
But if we try to load http://kayak.test in our browser we still get an error. We need to start Unicorn on the server. We can do that via the unicorn:start Capistrano task we created on line 27 of config/deploy.rb:
bundle exec cap unicorn:start
In a production app we'd use a daemon monitoring script (bluepill, monit, etc) to make sure Unicorn was up, and stayed up.
Attributes are variables or parameters given to Chef and are used for giving Chef instructions. They are passed in from the server, and from recipes themselves. They are great places for storing configuration values.
In the chapter on nginx, we set up default attributes and used them in our template for configuring the nginx service:
default[:nginx][:dir] = "/etc/nginx"
default[:nginx][:worker_processes] = 4
Chef's attributes are complex - the result of historical design decisions in early versions of Chef.
We know of three types of attributes:
- default
- normal
- override
Right now we're only concerned with default and normal attributes. Overrides are rarely required, usually in emergency situations when we're not sure where an attribute value is coming from and we need to explicitly override the value to get something to work.
Default attributes are the attributes we define in the code inside our cookbooks. Normal attributes are defined either in the role or the runlist file on the chef client.
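The ordering of the three levels can be sketched in a few lines of plain Ruby. This is a deliberately simplified model, assuming one flat hash per level; real Chef has more levels and merges nested values. The resolve helper and the value 8 are made up for illustration.

```ruby
# Simplified model: look a key up from the highest-precedence level down.
# Hypothetical helper, not part of Chef.
def resolve(attributes, key)
  [:override, :normal, :default].each do |level|
    values = attributes.fetch(level, {})
    return values[key] if values.key?(key)
  end
  nil
end

attributes = {
  default: { "worker_processes" => 4, "dir" => "/etc/nginx" }, # from the cookbook
  normal:  { "worker_processes" => 8 }                          # hypothetical node-level value
}

puts resolve(attributes, "worker_processes")  # 8
puts resolve(attributes, "dir")               # /etc/nginx
```

A normal value wins over a default with the same key, while keys that only exist as defaults still come through.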
To demonstrate these two types of attributes, we'll set up a new cookbook that will create user accounts on the client. For each user, we'll create the user account and set up SSH keys for them to log into the server with.
As with earlier exercises, there's already a completed users cookbook in the sample code. We'll start by emptying the recipes folder on the local computer:
cd users
rm recipes/default.rb
touch recipes/default.rb
As with the rails recipes we wrote, we'll pull in other completed recipes to complete the tasks in this one. We'll use the ruby-shadow recipe which gives us the ability to work with unix passwords from Ruby.
# users/recipes/default.rb
require_recipe 'ruby-shadow'
Looking at the ruby-shadow cookbook in TextMate, we see three resources that we haven't seen before:
- remote_directory copies an entire directory from the cookbook to the remote server. Similar to cookbook_file from the unicorn recipe, which copied a single file to the node. (remote_file is another resource that copies files from remote URLs.) The source method points to the directory that should be copied, relative to the cookbook's files/default directory.
- not_if is a "meta resource" or "meta command", which is useful for any type of resource. This one will cause the remote_directory command to be skipped if the files already exist on the remote node. This is important for keeping this cookbook idempotent.
- The bash resource will run a shell command with the options passed in the block.
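The effect of a not_if guard on idempotency can be sketched in plain Ruby. The run_unless helper below is hypothetical (it is not a Chef API), and a temporary directory stands in for the node's file system.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not a Chef API): run the block only when the guard
# returns false, which is the behaviour the notes describe for not_if.
def run_unless(guard)
  return :skipped if guard.call

  yield
  :ran
end

dir = Dir.mktmpdir
target = File.join(dir, "shadow.so")            # stand-in for the installed file
already_installed = -> { File.exist?(target) }  # the guard condition

first  = run_unless(already_installed) { FileUtils.touch(target) }
second = run_unless(already_installed) { FileUtils.touch(target) }

puts first   # ran
puts second  # skipped

FileUtils.remove_entry(dir)
```

The second call does nothing because the guard already sees the file, which is the same reason a second chef run leaves the node untouched.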
We'll start with one user, but we can imagine that we may have more users to set up in the future. First we'll define our user attributes. We can use a runlist in the root of our cookbooks directory:
# run_list.json
{
"run_list": ["rails", "users"],
"users": {
"joshua": {
"ssh_keys": {"mypublickey": "a_long_key_value"},
"password": "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"
}
}
}
To generate the encrypted password, which will work with ruby-shadow and won't expose sensitive data, we can use the openssl command in unix:
openssl passwd -1 mypass
These attributes will be passed into the recipe when we run it using chef-solo. They will be available via the node method. This method gives us access not only to these attributes, but also to the attributes collected from the underlying system, called "automatic attributes".
As we saw when setting attributes in Chapter 5, attributes can be referenced using symbols (as below), or string keys, or using dot notation. This type of object is called a Mash, meaning a syntactically flexible data structure.
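To make the three access styles concrete, here is a small sketch of a hash-like object that accepts all of them. FlexibleHash is a hypothetical class written for these notes; it only illustrates the idea and is not how Chef implements Mash or node attributes.

```ruby
# Hypothetical FlexibleHash: a tiny stand-in for the idea behind Mash,
# not Chef's actual implementation.
class FlexibleHash
  def initialize(data)
    # Store every key as a string so symbols and strings find the same entry.
    @data = data.each_with_object({}) do |(key, value), out|
      out[key.to_s] = value.is_a?(Hash) ? FlexibleHash.new(value) : value
    end
  end

  def [](key)
    @data[key.to_s]
  end

  # Dot notation: an unknown method name falls through to a key lookup.
  def method_missing(name, *args)
    return @data[name.to_s] if args.empty? && @data.key?(name.to_s)

    super
  end

  def respond_to_missing?(name, include_private = false)
    @data.key?(name.to_s) || super
  end
end

node = FlexibleHash.new("nginx" => { "dir" => "/etc/nginx", "worker_processes" => 4 })

puts node[:nginx][:dir]           # /etc/nginx
puts node["nginx"]["dir"]         # /etc/nginx
puts node.nginx.worker_processes  # 4
```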
# users/recipes/default.rb
require_recipe 'ruby-shadow'
node[:users].each do |name, conf|
home_dir = "/home/#{name}"
user name do
password conf[:password]
action [:create]
end
directory home_dir do
owner name
mode 0700
end
directory "#{home_dir}/.ssh" do
owner name
mode 0700
end
template "#{home_dir}/.ssh/authorized_keys" do
owner name
variables keys: conf[:ssh_keys]
mode 0600
end
end
Chef ships with a built-in user resource, which we see above. We pass in a user name and provide a password and an action. Other actions are available, but this is all we need for our purposes.
The template itself is already available in users/templates/default/authorized_keys.erb.
<% @keys.each do |name, key| %>
# <%= name %>
<%= key %>
<% end %>
With everything in place, we can run our recipe on the VM. Since our cookbooks directory is mirrored in the VM as /cookbooks, we can reference our runlist directly:
chef-solo -l info -j /cookbooks/run_list.json
We can verify this by changing to the "joshua" user on the VM:
su - joshua
pwd
# => /home/joshua
cat .ssh/authorized_keys
# => # mypublickey
# => a_long_key_value
ohai is a command-line tool that builds a data structure of automatic attributes from your system.
Automatic attributes behave like all other attributes in the chef system, except they are read-only. When chef was installed on the node it installed another gem named ohai. We can test this on the VM:
ohai
The output is a huge json list of attributes:
{
"languages": {
"ruby": {
"platform": "x86_64-linux",
"version": "1.9.3",
"release_date": "2012-02-16",
"target": "x86_64-unknown-linux-gnu",
"target_cpu": "x86_64",
"target_vendor": "unknown",
"target_os": "linux",
"host": "x86_64-unknown-linux-gnu",
"host_cpu": "x86_64",
"host_os": "linux-gnu",
"host_vendor": "unknown",
"bin_dir": "/usr/local/bin",
"ruby_bin": "/usr/local/bin/ruby",
"gems_dir": "/usr/local/lib/ruby/gems/1.9.1",
"gem_bin": "/usr/local/bin/gem"
},
"perl": {
"version": "5.10.1",
"archname": "x86_64-linux-gnu-thread-multi"
},
"python": {
"version": "2.6.6",
"builddate": "Dec 26 2010, 22:31:48"
}
},
# ...
We can filter them like this:
ohai ipaddress
# => [
# => "10.0.2.15"
# => ]
This will access the top level of the json hash. The hash will be merged into all the other attributes on every chef run. If you use chef-server, these attributes will be stored on the server so you can query your server farm by their attributes. Even for a single server, this is still very useful information for chef internally. Chef uses it to figure out how to install the right packages based on the platform architecture, find out what version of a package is installed, etc.
Inside your recipes, you can reference these through the node method. For an example look at ruby-shadow/recipes/default.rb:
not_if { File.exists?("#{node[:languages][:ruby][:bin_dir].gsub(/bin$/, "lib/ruby/site_ruby/1.9.1/")}#{node[:languages][:ruby][:platform]}/shadow.so") }
node[:languages][:ruby][:bin_dir] is an attribute defined by the ohai application.
We can see the raw output of those attributes:
ohai languages
# => {
# => "ruby": {
# => "platform": "x86_64-linux",
# => "version": "1.9.3",
# => "release_date": "2012-02-16",
# => "target": "x86_64-unknown-linux-gnu",
# => "target_cpu": "x86_64",
# => "target_vendor": "unknown",
# => "target_os": "linux",
# => "host": "x86_64-unknown-linux-gnu",
# => "host_cpu": "x86_64",
# => "host_os": "linux-gnu",
# => "host_vendor": "unknown",
# => "bin_dir": "/usr/local/bin",
# => "ruby_bin": "/usr/local/bin/ruby",
# => "gems_dir": "/usr/local/lib/ruby/gems/1.9.1",
# => "gem_bin": "/usr/local/bin/gem"
# => },
# => "perl": {
# => "version": "5.10.1",
# => "archname": "x86_64-linux-gnu-thread-multi"
# => },
# => "python": {
# => "version": "2.6.6",
# => "builddate": "Dec 26 2010, 22:31:48"
# => }
# => }
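Using the values from that output, we can work out which path the not_if guard in ruby-shadow actually checks. The sketch below uses a plain Ruby hash holding just the two ohai values involved, in place of the real node object.

```ruby
# Subset of the ohai output, as a plain Ruby hash standing in for `node`.
node = {
  languages: {
    ruby: { bin_dir: "/usr/local/bin", platform: "x86_64-linux" }
  }
}

ruby = node[:languages][:ruby]

# Swap the trailing "bin" for the site_ruby library directory.
lib_dir = ruby[:bin_dir].gsub(/bin$/, "lib/ruby/site_ruby/1.9.1/")
shadow_path = "#{lib_dir}#{ruby[:platform]}/shadow.so"

puts shadow_path
# /usr/local/lib/ruby/site_ruby/1.9.1/x86_64-linux/shadow.so
```

So on this VM the recipe is skipped whenever that shadow.so file already exists.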
What we've learned so far is great for setting up a single server. But what if we have different servers with different specs and uses? This is where chef roles come in.
Create a roles directory in our cookbooks directory. Normally we shouldn't store roles directly alongside cookbooks, because roles aren't part of cookbooks. But we're going to put it here for our demo because this directory is already mounted on the VM.
Create a file in the new directory named appserver.json. We'll imagine that we have an application server that we may have 3 or 4 of. We want the same attributes to appear on all app servers. Roles can also be created as Ruby files, but we'll stick with JSON for now.
# roles/appserver.json
{
"name": "appserver",
"description": "Rails application server",
"run_list": ["recipe[rails]", "recipe[users]"],
"default_attributes": {
}
}
We know that there is a hierarchy of attributes. In the run_list.json file we set normal attributes at the root of the data structure. Node attributes will override those coming from the role if the same key name is used for any attribute. Let's copy the "users" hash from that file and paste it into appserver.json:
# roles/appserver.json
{
"name": "appserver",
"description": "Rails application server",
"run_list": ["recipe[rails]", "recipe[users]"],
"default_attributes": {
"users": {
"joshua": {
"ssh_keys": {"mypublickey": "a_long_key_value"},
"password": "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"
}
}
}
}
Any server that uses this role will automatically have access to this hash of users. In addition to attributes, we can also specify a run list for all of our app servers. We can also nest run lists inside other run lists - for instance, a role can include another role.
We can now remove the node's runlist and tell it to use the role's runlist instead.
# run_list.json
{
"run_list": ["role[appserver]"]
}
Since we're using JSON and not Ruby to define our appserver role, we need to add one more line:
# roles/appserver.json
{
"json_class": "Chef::Role",
"name": "appserver",
"description": "Rails application server",
"run_list": ["recipe[rails]", "recipe[users]"],
"default_attributes": {
"users": {
"joshua": {
"ssh_keys": {"mypublickey": "a_long_key_value"},
"password": "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"
}
}
}
}
"json_class" connects this data with the corresponding Ruby class for roles. It won't work without this.
Now we're ready to tell chef-solo about the role. Edit the chef-solo configuration file on the VM:
vim /etc/chef/solo.rb
We'll add a role path to point to the roles directory:
cookbook_path "/cookbooks"
role_path "/cookbooks/roles"
Now we're ready to run chef-solo with our modified run list:
chef-solo -l info -j /cookbooks/run_list.json
It's important to understand how attributes interact along the hierarchy of cookbook, environment, role and node.
Let's update one of our nodes to use different SSH keys. Since we've already set the default attributes on the role for the "joshua" user, let's copy the "users" hash back to the original run list:
# run_list.json
{
"run_list": ["role[appserver]"],
"users": {
"joshua": {
"ssh_keys": {
"mypublickey": "different_value",
"another_key": "new_value"
},
"password": "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"
}
}
}
These will be merged together by Chef using a deep merge. Back on the VM:
chef-solo -l info -j /cookbooks/run_list.json
The rendered template for authorized_keys has been backed up and rewritten.
cat ~joshua/.ssh/authorized_keys
# => # mypublickey
# => different_value
# => # another_key
# => new_value
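The merge that produced this result can be approximated with a short recursive function. This is a simplified sketch, assuming only nested hashes and scalar values; the deep_merge helper is written for these notes and is not Chef's own merge code.

```ruby
# Recursive merge: nested hashes are merged key by key; for any other
# value the right-hand (higher precedence) side wins.
def deep_merge(base, overrides)
  base.merge(overrides) do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      deep_merge(old_value, new_value)
    else
      new_value
    end
  end
end

password = "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"

# Default attributes from roles/appserver.json
role_defaults = {
  "users" => { "joshua" => { "ssh_keys" => { "mypublickey" => "a_long_key_value" },
                             "password" => password } }
}

# Attributes from run_list.json on the node
node_attributes = {
  "users" => { "joshua" => { "ssh_keys" => { "mypublickey" => "different_value",
                                             "another_key" => "new_value" },
                             "password" => password } }
}

merged = deep_merge(role_defaults, node_attributes)
puts merged["users"]["joshua"]["ssh_keys"].inspect
```

The node's value for mypublickey replaces the role's value, and another_key is added alongside it rather than replacing the whole ssh_keys hash.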
Nodes aren't limited to a single role. Let's say we have a small server farm, and we want to define an app server, a database server, and a background job worker server. If we don't have many machines, all three roles could be assigned to the same machine. Or we could nest roles so that one role defers some or all of its behavior to another role.
Let's make a big app server that can run many more nginx worker processes than a normal app server. We'll call this role bigappserver:
touch roles/bigappserver.json
In a text editor:
# roles/bigappserver.json
{
"json_class": "Chef::Role",
"name": "bigappserver",
"description": "Big rails application server",
"run_list": ["role[appserver]"],
"default_attributes": {
"nginx": {
"worker_processes": 10
}
}
}
"run_list": ["role[appserver]"] is an instance of a nested role. Chef will first run the appserver role, then the bigappserver role.
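One way to picture a nested role is as a run list that gets expanded until only recipes remain. The sketch below models the two roles as plain hashes; the expand function is a hypothetical illustration, not Chef's run list code, and it does not guard against roles that include each other in a loop.

```ruby
# Roles as plain hashes, mirroring the run lists in the JSON role files.
roles = {
  "appserver"    => { "run_list" => ["recipe[rails]", "recipe[users]"] },
  "bigappserver" => { "run_list" => ["role[appserver]"] }
}

# Hypothetical helper: replace every role[...] entry with that role's own
# run list, recursively, and reduce recipe[...] entries to the recipe name.
def expand(run_list, roles)
  run_list.flat_map do |item|
    case item
    when /\Arole\[(.+)\]\z/   then expand(roles.fetch($1).fetch("run_list"), roles)
    when /\Arecipe\[(.+)\]\z/ then [$1]
    else [item] # bare names such as "rails" are treated as recipes
    end
  end
end

recipes = expand(["role[bigappserver]"], roles)
puts recipes.inspect  # ["rails", "users"]
```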
To use our new role, just update the run list:
# run_list.json
{
"run_list": ["role[bigappserver]"],
"users": {
"joshua": {
"ssh_keys": {
"mypublickey": "different_value",
"another_key": "new_value"
},
"password": "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"
}
}
}
On the VM:
chef-solo -l info -j /cookbooks/run_list.json
We see that the nginx config file has changed and the service was reloaded. Did it work?
head /etc/nginx/nginx.conf
# => user www-data;
# => worker_processes 10;
# =>
# => error_log /var/log/nginx/error.log;
# => pid /var/run/nginx.pid;
# =>
# => events {
# => worker_connections 1024;
# => # multi_accept on;
# => }
We'll examine cookbook_file and remote_directory - two useful resources. One works with files, one with directories, but their operation is essentially the same. (Their names are different for historical reasons.)
cookbook_file takes a file out of the cookbook's files/default folder and copies it to the node. Example:
# unicorn/recipes/default.rb
cookbook_file "/usr/local/bin/unicornctl" do
mode 0755
end
What if we have several scripts and we'd like to install them all in a directory? We can use remote_directory to do this without having to specify every file name. Let's create a new folder from our unicorn recipe:
# unicorn/recipes/default.rb
# ...
remote_directory "/usr/local/myscripts" do
files_mode 0755
end
files_mode specifies the permissions on the files in the directory, not the directory itself.
Create the files directory:
cd unicorn/files/default
mkdir myscripts
touch myscripts/script1
touch myscripts/script2
Now in the VM:
chef-solo -l info -j /cookbooks/run_list.json
ls -Al /usr/local/myscripts/
# => -rwxr-xr-x 1 root staff 0 Mar 19 20:04 script1
# => -rwxr-xr-x 1 root staff 0 Mar 19 20:04 script2
Let's say we're debugging and running into problems. Chef has a special debug log:
chef-solo -h
This shows us available options. We see that the -l option lets us specify the logger output level.
chef-solo -l debug -j /cookbooks/run_list.json
The output is much more verbose than the normal run.
There's also a new feature: whyrun mode. This mode attempts to show what would be changed without executing a real run.
Note that in the video this option is referred to as --whyrun, but this was not a valid option on my install.
chef-solo --why-run
# => Starting Chef Client, version 11.10.4
# => Compiling Cookbooks...
# => Converging 0 resources
# =>
# => Running handlers:
# => Running handlers complete
# =>
# => Chef Client finished, 0/0 resources would have been updated
Because recipes are interdependent and dependent on other parts of the system, such as running services, installed packages, or compiled libraries, there's always a possibility that this option won't work for certain resources or recipes.
To demonstrate, change the SSH key in our run list:
{
"run_list": ["role[bigappserver]"],
"users": {
"joshua": {
"ssh_keys": {
"mypublickey": "yet_another_value",
"another_key": "new_value"
},
"password": "$1$rIVP8MzN$C6A/X26wSngSNIJNLjjzc/"
}
}
}
Then on the VM:
chef-solo -l info -j /cookbooks/run_list.json --why-run
It finishes with:
Chef Client finished, 1/26 resources would have been updated
Looking at the rest of the output shows us:
- Would update content in file /home/joshua/.ssh/authorized_keys from ffab71 to 0893ed
--- /home/joshua/.ssh/authorized_keys 2014-03-19 19:42:09.000000000 +0100
+++ /tmp/chef-rendered-template20140319-8977-12jmtwh 2014-03-19 20:18:10.000000000 +0100
@@ -1,5 +1,5 @@
# mypublickey
-different_value
+yet_another_value
# another_key
new_value
As useful as automating things is, one of the worst things is automating a mistake and repeating that mistake across servers! So this option is useful for previewing changes before committing them.
Remote run lists let you pull cookbooks and run lists from remote URLs instead of from the local file system. So far we've had the convenience of running recipes from a folder mounted on our VM. But in production, we need another way to access run lists and cookbooks.
One solution is to write your own web service that stores your run lists and can serve the proper JSON run list for each node. This is how the EngineYard hosting service originally set up its cloud service, using chef-solo and an application that would generate run lists for servers.
To do this, we need to serve a run list from a URL. Let's just use our kayak server to do this. On the VM:
cp /cookbooks/run_list.json /u/apps/kayak/current/public/
Since kayak.test is a fake domain name, we also need to tell this node about it.
vim /etc/hosts
Edit the first line to add an alias to localhost:
127.0.0.1 localhost kayak.test
Now to run chef-solo we just need to specify a URL instead of a file.
chef-solo -l info -j http://kayak.test/run_list.json
It's good practice to use an encrypted SSL source for security, along with authentication. You can also use the -r flag in chef-solo to retrieve a gzipped tarball of cookbooks from a remote URL.