One of the earliest lessons in a user's journey with Chef is how to bootstrap a node with knife. Most tutorials very reasonably illustrate some form of knife bootstrap, but the immediate follow-on question is often: how do I do this in production? The answer is less clear-cut because different infrastructure needs lend themselves to different approaches. In what follows I want to break down "bootstrapping" and illustrate a few approaches. This is not meant to be a comprehensive review of everything possible but a deconstruction of the primitives in play.
Since Chef 12.2.0, the default bootstrapping method is "validatorless", which, as the name implies, stands in opposition to the former "validator"-based approach. The older approach makes use of the unique org "validator" key that is created as part of an organisation on chef-server (you may have seen it referenced as "validation.pem" or "org-validator"), which allows any node that holds the "validator" key to register itself during a run. The "validatorless" approach improves upon it in the following ways:
- No more need for the validation key
- No need to share it with other admins
- No need to later clean it up from nodes
- Use the workstation user's key to sign
- No more first-run failure problems
- Node and client are pre-created, if the chef-run fails the run_list and env aren't lost
- Prompt if node/client already exist instead of failing
If you are getting "validator"-style behaviour despite being on a recent build, make sure knife.rb doesn't have validation_client_name or validation_key set. In the general case you don't have to do anything to get the "validatorless" behaviour, but we're covering this nuance now as we'll reference it later.
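A quick way to check for those two settings is to scan the config text for them. The following is a minimal Ruby sketch, not part of knife itself: the helper name validator_settings_in and the sample knife.rb content are hypothetical, and it only looks for the plain top-level form of each setting.

```ruby
# Settings that, per the text above, switch knife back to validator-style bootstrapping.
VALIDATOR_SETTINGS = %w[validation_client_name validation_key].freeze

# Returns the names of validator settings that appear on active
# (non-commented) lines of the given knife.rb source text.
def validator_settings_in(config_text)
  active_lines = config_text.lines.map(&:strip).reject { |line| line.start_with?("#") }
  VALIDATOR_SETTINGS.select do |setting|
    active_lines.any? { |line| line.match?(/\A#{setting}\b/) }
  end
end

# Hypothetical knife.rb content, used purely for illustration.
sample = <<~KNIFE
  node_name       "workstation-user"
  client_key      "/home/me/.chef/workstation-user.pem"
  # validation_key "/home/me/.chef/org-validator.pem"
  validation_client_name "org-validator"
KNIFE

found = validator_settings_in(sample)
if found.empty?
  puts "no validator settings found"
else
  puts "remove from knife.rb: #{found.join(', ')}"
end
```

To check a real file, pass its contents instead of the sample, for example File.read on the path to your knife.rb.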
In the general case, a user will use an invocation like the following to bootstrap a non-Windows node using knife:
knife bootstrap 172.X.X.X --sudo --ssh-user notroot --ssh-password 'totallybadpass' --node-name potato-1 --run-list 'recipe[potatoes]'
The bootstrap sub-command presumes we're going to reach out to the node via SSH. If the machine being used can't reach the IP address or FQDN, this is game over right away.
In our simple example we are going to SSH to the target node as the user notroot using the supplied password, and use sudo to elevate our privileges so we can install chef-client and perform other tasks once we are on the node.
Now that we're on the node, we want to set its node name to potato-1 and its run list to the recipe potatoes.
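When the same invocation has to be repeated for many nodes, it can help to assemble it programmatically. Here is a minimal Ruby sketch that uses only the flags from the command above. The helper name knife_bootstrap_argv is hypothetical, and the values are the placeholders from the example.

```ruby
require "shellwords"

# Builds the argument vector for `knife bootstrap`, using only the
# flags that appear in the example invocation.
def knife_bootstrap_argv(host:, ssh_user:, ssh_password:, node_name:, run_list:)
  [
    "knife", "bootstrap", host,
    "--sudo",
    "--ssh-user", ssh_user,
    "--ssh-password", ssh_password,
    "--node-name", node_name,
    "--run-list", run_list
  ]
end

argv = knife_bootstrap_argv(
  host: "172.X.X.X",            # placeholder address from the example
  ssh_user: "notroot",
  ssh_password: "totallybadpass",
  node_name: "potato-1",
  run_list: "recipe[potatoes]"
)

# Print a shell-safe rendering for review. To execute it, pass the
# array form to system(*argv) so no shell quoting is involved.
puts Shellwords.join(argv)
```

Keeping the command as an array rather than a single string means values such as the run list never need manual quoting.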
*Note: A common source of confusion is that if/when the FQDN of a node is changed, the corresponding node name is not changed. This is expected behaviour as the node name is purely for internal chef/chef-server use. To get a node's current FQDN, use the Ohai provided attribute node['fqdn'].
We're now getting to the heart of the knife bootstrap process, which is a script populated by this ERB template: https://github.com/chef/chef/blob/master/lib/chef/knife/bootstrap/templates/chef-full.erb (Windows nodes use https://github.com/chef/knife-windows/blob/master/lib/chef/knife/bootstrap/windows-chef-client-msi.erb). The script performs the following steps:
- Install chef-client if it's not already present
- Create client.pem or validation.pem (see above)
- Copy an encrypted_data_bag_secret if provided
- Copy trusted_certs if provided
- Copy Ohai hints if provided
- Create client.rb
- Create first-boot.json
- Copy client.d files if provided
- Run chef-client (ex. chef-client -j /etc/chef/first-boot.json -E _default)
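To make the "create client.rb" and "create first-boot.json" steps more concrete, here is a minimal Ruby sketch of the idea of rendering those two files from a template. It is a simplified stand-in and not the contents of chef-full.erb. The template constant, the helper names, the server URL, and the organisation name are all assumptions made for illustration.

```ruby
require "erb"
require "json"

# Simplified stand-in for the client.rb portion of the bootstrap template.
# This is NOT the real chef-full.erb, only an illustration of the idea.
CLIENT_RB_TEMPLATE = ERB.new(<<~TEMPLATE, trim_mode: "-")
  chef_server_url "<%= chef_server_url %>"
  node_name       "<%= node_name %>"
  <%- if validation_client_name -%>
  validation_client_name "<%= validation_client_name %>"
  <%- end -%>
TEMPLATE

# Renders client.rb; validation_client_name is only emitted for the
# older validator-based flow.
def render_client_rb(chef_server_url:, node_name:, validation_client_name: nil)
  CLIENT_RB_TEMPLATE.result_with_hash(
    chef_server_url: chef_server_url,
    node_name: node_name,
    validation_client_name: validation_client_name
  )
end

# Builds the JSON handed to chef-client via -j on the first run.
def first_boot_json(run_list)
  JSON.pretty_generate("run_list" => run_list)
end

# Hypothetical server URL and organisation, for illustration only.
client_rb = render_client_rb(
  chef_server_url: "https://chef.example.com/organizations/spuds",
  node_name: "potato-1"
)

puts client_rb
puts first_boot_json(["recipe[potatoes]"])
```

In the validatorless case no validation_client_name is passed, so the rendered client.rb relies on the node's own client key. Passing one would model the older validator flow.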
We can combine a few of the steps above into broader, common tasks:
- Install chef-client
- Create client.rb, pointing at either a client key or a validation key
- Run chef-client
Further we can see that the bootstrapping process centers around having the correct keys in place, either by having a node pre-created or having a validation key. The installation and running of chef