Modifying knife bootstrap (from chef) to exit with the exit code from the bootstrap command.

Modifying 'knife bootstrap' to exit with error upon failure.

Here is a description of the steps I took to make knife bootstrap exit with an error when the remote bootstrap process fails. I was already calling the bootstrap programmatically (because I was adding a lot of business logic to it), so I didn't patch the bootstrap class but pulled out the pieces I needed.

The template.

Chef uses an ERB template to construct the remote bootstrap command. We need to add set -e to this script so that it exits with an error as soon as any command returns a non-zero exit code. Another option would be to chain the commands with &&. The main idea is to stop the script when a command fails (at least for the commands you care about). Otherwise the script's exit code is always that of the last command, even if all the previous commands failed.
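The difference is easy to check locally. The following is a minimal sketch, not part of the bootstrap code: script_status is a hypothetical helper, and it runs the snippets through sh (assumed to be on the PATH) rather than bash, since set -e behaves the same way in both for this purpose.

```ruby
# Hypothetical helper: run a small shell script and return its exit status.
# Assumes a POSIX `sh` is available on the PATH.
def script_status(script)
  system("sh", "-c", script)
  $?.exitstatus
end

# Without set -e the failure of `false` is masked by the last command.
without_guard = script_status("false; echo still-running >/dev/null")

# With set -e the script stops at the first failing command.
with_guard = script_status("set -e; false; echo never-reached >/dev/null")

# Chaining with && also stops at the failure and reports its status.
with_chain = script_status("false && echo never-reached >/dev/null")

puts "without set -e: #{without_guard}"
puts "with set -e:    #{with_guard}"
puts "with &&:        #{with_chain}"
```

Only the first variant reports success, which is the behaviour we want to avoid in the bootstrap script.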

Here are my modifications to the default (in 0.10.8) template (you can compare it to the one included in the gem):

bash -c '
(
set -e

<%= "export http_proxy=\"#{knife_config[:bootstrap_proxy]}\"" if knife_config[:bootstrap_proxy] -%>

if [ ! -f /usr/bin/chef-client ]; then
  apt-get update || true
  apt-get install -y ruby ruby1.8-dev build-essential wget libruby-extras libruby1.8-extras
  cd /tmp
  wget <%= "--proxy=on " if knife_config[:bootstrap_proxy] %>http://production.cf.rubygems.org/rubygems/rubygems-1.6.2.tgz
  tar zxf rubygems-1.6.2.tgz
  cd rubygems-1.6.2
  ruby setup.rb --no-format-executable
fi

gem update --no-rdoc --no-ri
gem install ohai --no-rdoc --no-ri --verbose
gem install chef --no-rdoc --no-ri --verbose <%= bootstrap_version_string %>

mkdir -p /etc/chef

(
cat <<'EOP'
<%= validation_key %>
EOP
) > /tmp/validation.pem
awk NF /tmp/validation.pem > /etc/chef/validation.pem
rm /tmp/validation.pem

<% if @chef_config[:encrypted_data_bag_secret] -%>
(
cat <<'EOP'
<%= encrypted_data_bag_secret %>
EOP
) > /tmp/encrypted_data_bag_secret
awk NF /tmp/encrypted_data_bag_secret > /etc/chef/encrypted_data_bag_secret
rm /tmp/encrypted_data_bag_secret
<% end -%>

(
cat <<'EOP'
<%= config_content %>
EOP
) > /etc/chef/client.rb

(
cat <<'EOP'
<%= { "run_list" => @run_list }.to_json %>
EOP
) > /etc/chef/first-boot.json

<%= start_chef %>
)'
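To see how the conditional proxy line in the template behaves, here is a small sketch that renders a cut-down version of it with Ruby's standard ERB library. The render_proxy_line helper and the plain hash standing in for knife_config are assumptions for illustration only; Chef passes its own bootstrap context to the real template. The "-" trim mode is what makes the -%> tag swallow the newline that follows it.

```ruby
require "erb"

# Hypothetical helper: render a shortened version of the template.
# `knife_config` is a plain Hash here, standing in for Chef's bootstrap context.
def render_proxy_line(knife_config)
  template = <<~'TEMPLATE'
    set -e
    <%= "export http_proxy=\"#{knife_config[:bootstrap_proxy]}\"" if knife_config[:bootstrap_proxy] -%>

    echo done
  TEMPLATE
  # trim_mode "-" enables the `-%>` tag, which removes the trailing newline.
  ERB.new(template, trim_mode: "-").result(binding)
end

with_proxy    = render_proxy_line({ bootstrap_proxy: "http://proxy.example:3128" })
without_proxy = render_proxy_line({})

puts with_proxy
puts without_proxy
```

Because -%> removes the newline after the tag, the blank line that follows it in the template is what keeps the export statement on its own line when a proxy is configured.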

Rendering the template

I use the Chef::Knife::Bootstrap class to render the template. The options are the ones I was interested in. See the source of the Chef::Knife::Bootstrap class to get a list of all available configuration options.

require 'chef'
require 'chef/knife'
require 'chef/knife/bootstrap'
require "chef/knife/core/bootstrap_context"

# Using knife's facilities to generate the ssh command to bootstrap our
# machines. We don't use knife directly because it doesn't report bad
# exit code from the remote ssh commands.
#
# params:
# * name    - The name to register in chef as
# * options - A Hash of options. Accepts the following options:
#           - :run_list       => The run list.
#           - :env            => Chef environment to use.
#           - :template_file  => Knife bootstrap template file to use.
def generate_ssh_bootstrap_command(name, options)
  kb = Chef::Knife::Bootstrap.new
  Chef::Config[:environment] = options[:env]
  kb.config[:run_list]       = options[:run_list]
  kb.config[:use_sudo]       = true
  kb.config[:template_file]  = options[:template_file]
  kb.config[:chef_node_name] = name
  kb.ssh_command
end

Getting the exit code of remote ssh command

The function below returns the exit code of the remote ssh command. It also returns the output. Later we'll see how to use it.

(This function is copied from code I found on Stack Overflow.)

require 'net/ssh'


# A utility method that returns the output and exit code of a remote
# ssh command. Idea taken from stackoverflow.com.
#
# params:
# * ssh     - Net::SSH object to use (see example below)
# * command - The command (string) to execute.
#
# You can optionally pass a block that accepts data. The data is both stderr
# and stdout, so you can pass a block with 'print data' to see both stdout
# and stderr on the console.
#
# The stdout and stderr are combined because some commands write
# to stderr (like wget).
#
# Returns: data (stdout and stderr combined), exit_code, exit_signal (set only
# if the remote process was terminated by a signal, otherwise nil)
#
# Sample usage:
#
#     Net::SSH.start(server, Etc.getlogin) do |ssh|
#       puts ssh_exec!(ssh, "true").inspect
#       # => ["", 0, nil]
#       puts ssh_exec!(ssh, "false").inspect
#       # => ["", 1, nil]
#     end
def ssh_exec!(ssh, command)
  output_data = ""
  exit_code = nil
  exit_signal = nil
  ssh.open_channel do |channel|
    channel.exec(command) do |ch, success|
      unless success
        abort "FAILED: couldn't execute command (ssh.channel.exec)"
      end
      channel.on_data do |ch,data|
        output_data+=data
        if block_given?
          yield data
        end
      end

      channel.on_extended_data do |ch,type,data|
        output_data+=data
        if block_given?
          yield data
        end
      end

      channel.on_request("exit-status") do |ch,data|
        exit_code = data.read_long
      end

      channel.on_request("exit-signal") do |ch, data|
        # exit-signal carries the signal name as a string, not an integer
        exit_signal = data.read_string
      end
    end
  end
  ssh.loop
  [output_data, exit_code, exit_signal]
end

Sample Usage of the ssh_exec! command

Here is how I call the above function. Notice I supply a block that prints the output of the commands without newlines (say is a utility method in Thor).

Net::SSH.start(hostname, user) do |ssh|
  output, exit_code, signal =  ssh_exec!(ssh, command) do |data|
    say(data, {}, false)
  end
  if exit_code != 0
    error = "Bootstrap script exited with exit code: #{exit_code}"
    log.error("output of failed bootstrap command:\n#{output}")
    raise error
  end
end
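One case the check above reports poorly is a remote process that is terminated by a signal: exit_code may then stay nil, so the error message shows an empty exit code. A possible refinement is sketched below. The bootstrap_failure helper is hypothetical, and it assumes ssh_exec! returns a non-nil exit_signal only when the remote process was terminated by a signal.

```ruby
# Hypothetical helper: map the exit_code / exit_signal pair returned by
# ssh_exec! to a failure message, or nil when the command succeeded.
def bootstrap_failure(exit_code, exit_signal)
  return "Bootstrap script was killed by signal: #{exit_signal}" if exit_signal
  return "Bootstrap script ended without reporting an exit code" if exit_code.nil?
  return "Bootstrap script exited with exit code: #{exit_code}" unless exit_code.zero?
  nil
end

puts bootstrap_failure(0, nil).inspect
puts bootstrap_failure(1, nil).inspect
puts bootstrap_failure(nil, "KILL").inspect
```

Inside the Net::SSH.start block, the result could replace the exit_code != 0 test: raise the returned message when it is not nil.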
