Here is a description of the steps I took to make knife bootstrap
exit with an error when the remote bootstrap process fails. I was
already calling the bootstrap programmatically (because I was adding a lot
of business logic to it), so I didn't patch the bootstrap class; instead I
pulled out the pieces I needed.
Chef uses an ERB template to construct the remote bootstrap
command. We need to add set -e to this script so that it exits with an
error whenever a command exits with a non-zero exit code. Another
option would be to connect the various commands with &&. The main
idea is to stop the execution of the script when a command fails (at
least for the commands you care about). Otherwise the script always
reports the exit code of the last command, even if all the previous
commands failed to execute.
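To see the difference outside of Chef, here is a minimal sketch that runs a few tiny scripts through a local shell and records their exit codes. The helper name run_script is made up for this illustration, and the sketch assumes a POSIX sh is available on the PATH.

```ruby
require 'open3'

# Hypothetical helper: runs a script in a fresh POSIX shell and returns
# [combined stdout/stderr, exit code].
def run_script(script)
  output, status = Open3.capture2e('sh', '-c', script)
  [output, status.exitstatus]
end

# Without set -e the failure of `false` is masked by the final echo.
unchecked = run_script("false\necho finished")

# With set -e the script stops at the first failing command.
strict = run_script("set -e\nfalse\necho finished")

# Chaining with && stops at the failing command as well.
chained = run_script("false && echo finished")

# `|| true` exempts a command whose failure you are willing to tolerate.
tolerant = run_script("set -e\nfalse || true\necho finished")

p unchecked # => ["finished\n", 0]
p strict    # => ["", 1]
p chained   # => ["", 1]
p tolerant  # => ["finished\n", 0]
```

The last case is the same trick the template uses for apt-get update || true: the command may fail without aborting the whole bootstrap.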
Here are my modifications to the default template (from 0.10.8); you can compare it to the one included in the gem:
bash -c '
(
set -e
<%= "export http_proxy=\"#{knife_config[:bootstrap_proxy]}\"" if knife_config[:bootstrap_proxy] -%>
if [ ! -f /usr/bin/chef-client ]; then
apt-get update || true
apt-get install -y ruby ruby1.8-dev build-essential wget libruby-extras libruby1.8-extras
cd /tmp
wget <%= "--proxy=on " if knife_config[:bootstrap_proxy] %>http://production.cf.rubygems.org/rubygems/rubygems-1.6.2.tgz
tar zxf rubygems-1.6.2.tgz
cd rubygems-1.6.2
ruby setup.rb --no-format-executable
fi
gem update --no-rdoc --no-ri
gem install ohai --no-rdoc --no-ri --verbose
gem install chef --no-rdoc --no-ri --verbose <%= bootstrap_version_string %>
mkdir -p /etc/chef
(
cat <<'EOP'
<%= validation_key %>
EOP
) > /tmp/validation.pem
awk NF /tmp/validation.pem > /etc/chef/validation.pem
rm /tmp/validation.pem
<% if @chef_config[:encrypted_data_bag_secret] -%>
(
cat <<'EOP'
<%= encrypted_data_bag_secret %>
EOP
) > /tmp/encrypted_data_bag_secret
awk NF /tmp/encrypted_data_bag_secret > /etc/chef/encrypted_data_bag_secret
rm /tmp/encrypted_data_bag_secret
<% end -%>
(
cat <<'EOP'
<%= config_content %>
EOP
) > /etc/chef/client.rb
(
cat <<'EOP'
<%= { "run_list" => @run_list }.to_json %>
EOP
) > /etc/chef/first-boot.json
<%= start_chef %>
)'
I use the Chef::Knife::Bootstrap class to render the template. The
options I set are the ones I was interested in. See the source of the
Chef::Knife::Bootstrap class for a list of all available
configuration options.
require 'chef'
require 'chef/knife'
require 'chef/knife/bootstrap'
require 'chef/knife/core/bootstrap_context'

# Using knife's facilities to generate the ssh command to bootstrap our
# machines. We don't use knife directly because it doesn't report bad
# exit codes from the remote ssh commands.
#
# params:
# * name - The name to register in chef as
# * options - A Hash of options. Accepts the following options:
#   - :run_list => The run list.
#   - :env => Chef environment to use.
#   - :template_file => Knife bootstrap template file to use.
def generate_ssh_bootstrap_command(name, options)
  kb = Chef::Knife::Bootstrap.new
  Chef::Config[:environment] = options[:env]
  kb.config[:run_list] = options[:run_list]
  kb.config[:use_sudo] = true
  kb.config[:template_file] = options[:template_file]
  kb.config[:chef_node_name] = name
  kb.ssh_command
end
The function below returns the exit code of the remote ssh command. It also returns the output. Later we'll see how to use it.
(This function is copied from code I found on Stack Overflow.)
require 'net/ssh'

# A utility method that helps get the exit code of a remote ssh
# command. Idea taken from stackoverflow.com.
#
# params:
# * ssh - Net::SSH object to use (see example below)
# * command - The command (string) to execute.
#
# You can optionally pass a block that accepts data. The data is both stderr
# and stdout, so a block with 'print data' shows both stdout and stderr
# on the console.
#
# The stdout and stderr are combined because some commands write their
# normal output to stderr (like wget).
#
# Returns: data (stdout and stderr combined), exit_code, exit_signal (the
# name of the signal that terminated the remote command, or nil)
#
# Sample usage:
#
#   Net::SSH.start(server, Etc.getlogin) do |ssh|
#     puts ssh_exec!(ssh, "true").inspect
#     # => ["", 0, nil]
#     puts ssh_exec!(ssh, "false").inspect
#     # => ["", 1, nil]
#   end
def ssh_exec!(ssh, command)
  output_data = ""
  exit_code = nil
  exit_signal = nil
  ssh.open_channel do |channel|
    channel.exec(command) do |ch, success|
      unless success
        abort "FAILED: couldn't execute command (ssh.channel.exec)"
      end
      # stdout
      channel.on_data do |ch, data|
        output_data += data
        yield data if block_given?
      end
      # stderr (and any other extended data stream)
      channel.on_extended_data do |ch, type, data|
        output_data += data
        yield data if block_given?
      end
      channel.on_request("exit-status") do |ch, data|
        exit_code = data.read_long
      end
      # The exit-signal payload starts with the signal name as a string
      # (RFC 4254), so it is read with read_string rather than read_long.
      channel.on_request("exit-signal") do |ch, data|
        exit_signal = data.read_string
      end
    end
  end
  ssh.loop
  [output_data, exit_code, exit_signal]
end
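If the three return values are unclear, the following sketch reproduces them locally with Ruby's standard Open3 library instead of Net::SSH. It is only an analogy: local_exec! is a made-up name, the commands run through a local sh, and the local API reports the terminating signal as a number, whereas the SSH protocol reports it by name.

```ruby
require 'open3'

# Hypothetical local counterpart of ssh_exec!: runs the command through sh
# and returns [combined stdout/stderr, exit code, terminating signal number].
def local_exec!(command)
  output, status = Open3.capture2e('sh', '-c', command)
  [output, status.exitstatus, status.termsig]
end

# A command that exits on its own has an exit code and no signal.
p local_exec!('echo out; echo err 1>&2; exit 3')
# => ["out\nerr\n", 3, nil]

# A command that is killed has a signal and no exit code.
p local_exec!('kill -KILL $$')
# => ["", nil, 9]
```

Note that a killed command has no exit code at all, so a check like exit_code != 0 treats it as a failure too, which is what we want for a bootstrap.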
Here is how I call the above function. Notice that I supply a block that
prints the output of the commands without adding newlines (say is a
utility method in Thor).
Net::SSH.start(hostname, user) do |ssh|
  output, exit_code, signal = ssh_exec!(ssh, command) do |data|
    say(data, {}, false)
  end
  if exit_code != 0
    error = "Bootstrap script exited with exit code: #{exit_code}"
    log.error("output of failed bootstrap command:\n#{output}")
    raise error
  end
end
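The same stream-then-check pattern can be tried without a remote machine. The sketch below is a local stand-in built on Open3, not the Net::SSH code above; stream_exec and run_or_raise are made-up names. It also suggests why the block prints without adding newlines: the output arrives in arbitrary chunks that already contain their own line breaks.

```ruby
require 'open3'

# Hypothetical local stand-in for ssh_exec!: runs the command through sh,
# yields each chunk of combined stdout/stderr as it arrives, and returns
# [all output, exit code].
def stream_exec(command)
  output = +''
  Open3.popen2e('sh', '-c', command) do |stdin, out_err, wait_thr|
    stdin.close
    begin
      loop do
        chunk = out_err.readpartial(4096)
        output << chunk
        yield chunk if block_given?
      end
    rescue EOFError
      # The command closed its output; there is nothing more to read.
    end
    [output, wait_thr.value.exitstatus]
  end
end

# Hypothetical wrapper: echo the output while it streams, then fail loudly
# unless the command exited with code 0.
def run_or_raise(command)
  output, exit_code = stream_exec(command) { |chunk| print chunk }
  raise "Command exited with exit code: #{exit_code}" unless exit_code == 0
  output
end
```

Calling run_or_raise('echo ok') prints and returns the output, while run_or_raise('exit 4') raises a RuntimeError that names the exit code.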