
@thieryl
Last active July 27, 2018 11:43
[Yesterday I learned...] Continuous learning #general

Diving deeper into Ansible


Sharing ideas about how to resolve issues is one of the best things we can do in the IT and open source world, so I went looking for help by submitting issues in Ansible and asking questions in roles others created.

Reading the documentation (including the following topics) is the best way to get started learning Ansible.

Getting started

Best practices

Ansible Lightbulb

Ansible FAQ

If you are trying to figure out what you can do with Ansible, take a moment and think about the daily activities you do, the ones that take a lot of time that would be better spent on other things. Here are some examples:

  • Managing accounts in systems: Creating users, adding them to the correct groups, and adding the SSH keys… these are things that used to take me days when we had a large number of systems to build. Even using a shell script, this process was very time-consuming.
  • Maintaining lists of required packages: This could be part of your security posture and include the packages required for your applications.
  • Installing applications: You can use your current documentation and convert application installs into tasks by finding the correct module for the job.
  • Configuring systems and applications: You might want to change /etc/ssh/sshd_config for different environments (e.g., production vs. development) by adding a line or two, or maybe you want a file to look a specific way in every system you're managing.
  • Provisioning a VM in the cloud: This is great when you need to launch a few virtual machines that are similar for your applications and you are tired of using the UI.

Now let's look at how to use Ansible to automate some of these repetitive tasks.

Managing users

If you need to create a large list of users and groups with the users spread among the different groups, you can use loops. Let's start by creating the groups:

- name: create user groups
  group:
    name: "{{ item }}"
  loop:
    - postgresql
    - nginx-test
    - admin
    - dbadmin
    - hadoop

You can create users with specific parameters like this:

- name: all users in the department
  user:
    name:  "{{ item.name }}"
    group: "{{ item.group }}"
    groups: "{{ item.groups }}"
    uid: "{{ item.uid }}"
    state: "{{ item.state }}"
  loop:
    - { name: 'admin1', group: 'admin', groups: 'nginx', uid: '1234', state: 'present' }
    - { name: 'dbadmin1', group: 'dbadmin', groups: 'postgres', uid: '4321', state: 'present' }
    - { name: 'user1', group: 'hadoop', groups: 'wheel', uid: '1067', state: 'present' }
    - { name: 'jose', group: 'admin', groups: 'wheel', uid: '9000', state: 'absent' }

Looking at the user jose, you may recognize that state: 'absent' deletes this user account, and you may be wondering why you need to include all the other parameters when you're just removing him. It's because this is a good place to keep documentation of important changes for audits or security compliance. By storing the roles in Git as your source of truth, you can go back and look at the old versions in Git if you later need to answer questions about why changes were made.

To deploy SSH keys for some of the users, you can use the same type of looping as in the last example.

- name: copy admin1 and dbadmin ssh keys
  authorized_key:
    user: "{{ item.user }}"
    key: "{{ item.key }}"
    state: "{{ item.state }}"
    comment: "{{ item.comment }}"
  loop:
    - { user: 'admin1', key: "{{ lookup('file', '/data/test_temp_key.pub') }}", state: 'present', comment: 'admin1 key' }
    - { user: 'dbadmin', key: "{{ lookup('file', '/data/vm_temp_key.pub') }}", state: 'absent', comment: 'dbadmin key' }

Here, we specify the user, how to find the key by using lookup, the state, and a comment describing the purpose of the key.

Installing packages

Package installation can vary depending on the packaging system you are using. You can use Ansible facts to determine which module to use. Ansible does offer a generic module called package that uses ansible_pkg_mgr and calls the proper package manager for the system. For example, if you're using Fedora, the package module will call the DNF package manager.

The package module will work if you're doing a simple installation of packages. If you're doing more complex work, you will have to use the correct module for your system. For example, if you want to ignore GPG keys and install all the security packages on a RHEL-based system, you need to use the yum module. You will have different options depending on your packaging module, but they usually offer more parameters than Ansible's generic package module.
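As a rough sketch of using facts to pick a module, the tasks below branch on the ansible_pkg_mgr fact mentioned above. This assumes fact gathering is enabled for the play; the choice of nginx and of the skip_broken option is only illustrative.

```yaml
# Sketch only: branch on the package manager fact gathered by Ansible.
- name: install nginx with yum-specific options
  yum:
    name: nginx
    state: present
    skip_broken: yes
  when: ansible_pkg_mgr == 'yum'

- name: install nginx with the generic module on other systems
  package:
    name: nginx
    state: present
  when: ansible_pkg_mgr != 'yum'
```

Only one of the two tasks runs on a given host; the other is reported as skipped.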

Here is an example using the package module:

  - name: install a package
    package:
      name: nginx
      state: present

The following uses the yum module to install NGINX, disable gpg_check from the repo, ignore the repository's certificates, and skip any broken packages that might show up.

  - name: install a package
    yum:
      name: nginx
      state: present
      disable_gpg_check: yes
      validate_certs: no
      skip_broken: yes

Here is an example using Apt. The Apt module tells Ansible to uninstall NGINX and not update the cache:

  - name: remove a package
    apt:
      name: nginx
      state: absent
      update_cache: no

You can use loop when installing packages, but then each package is processed individually. It is more efficient to pass a list to name:

  - name: install a list of packages
    yum:
      name:
        - nginx
        - postgresql-server
        - ansible
        - httpd
      state: present

NOTE: Make sure you know the correct name of the package you want in the package manager you're using. Some names change depending on the package manager.

Starting services

Much like packages, Ansible has different modules to start services. Like in our previous example, where we used the package module to do a general installation of packages, the service module does similar work with services, including with systemd and Upstart. (Check the module's documentation for a complete list.) Here is an example:

  - name: start nginx
    service: 
      name: nginx
      state: started

You can use Ansible's service module if you are just starting and stopping applications and don't need anything more sophisticated. But, like with the yum module, if you need more options, you will need to use the systemd module. For example, if you modify systemd unit files, you need to do a daemon-reload. The service module won't work for that; you will have to use the systemd module.

  - name: reload postgresql for new configuration and reload daemon
    systemd:
      name: postgresql
      state: reloaded
      daemon_reload: yes

This is a great starting point, but it can become cumbersome because the service will always reload/restart. This is a good place to use a handler.

If you used best practices and created your role using ansible-galaxy init "role name", then you should have the full directory structure. You can include the code above inside the handlers/main.yml and call it when you make a change with the application. For example:

handlers/main.yml

  - name: reload postgresql for new configuration and reload daemon
    systemd:
      name: postgresql
      state: reloaded
      daemon_reload: yes

This is the task that calls the handler:

  - name: configure postgresql
    template:
      src: postgresql.service.j2
      dest: /usr/lib/systemd/system/postgresql.service
    notify: reload postgresql for new configuration and reload daemon

It configures PostgreSQL by changing the systemd file, but instead of defining the restart in the tasks (like before), it calls the handler to do the restart at the end of the run. This is a good way to configure your application and keep it idempotent since the handler only runs when a task changes—not in the middle of your configuration.

Templates can also manage application configuration, such as the database options in the app.ini file for Gitea. Writing a template is similar to writing Ansible tasks, even though it is a configuration file, and makes it easy to define variables and make changes. This can be expanded further if you are using group_vars, which allows you to define variables for all systems and specific groups (e.g., production vs. development). This makes it easier to manage variables, and you don't have to specify the same ones in every role.
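As a sketch of the group_vars idea, the two files below define the same variables with different values for a production group and a development group. The file names, group names, host name, and gitea_db_* variables are all hypothetical.

```yaml
# group_vars/production.yml (hypothetical group and variable names)
gitea_db_host: db01.example.com
gitea_db_name: gitea

# group_vars/development.yml (shown in the same block for brevity;
# in practice this is a separate file)
gitea_db_host: localhost
gitea_db_name: gitea_dev
```

A template or task can then reference "{{ gitea_db_host }}" and pick up the right value for whichever group the host belongs to.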

Provisioning a system

We've gone over several things you can do with Ansible on your system, but we haven't yet discussed how to provision a system. Here's an example of provisioning a virtual machine (VM) with the OpenStack cloud solution.

  - name: create a VM in openstack
    os_server:
      name: cloudera-namenode
      state: present
      cloud: openstack
      region_name: andromeda
      image: 923569a-c777-4g52-t3y9-cxvhl86zx345
      flavor_ram: 20146
      flavor: big
      auto_ip: yes
      volumes: cloudera-namenode

All OpenStack modules start with os, which makes it easier to find them. The above configuration uses the os_server module, which lets you add or remove an instance. It includes the name of the VM, its state, its cloud options, and how it authenticates to the API. More information about clouds.yaml is available in the OpenStack docs, but if you don't want to use clouds.yaml, you can use a dictionary that lists your credentials using the auth option. If you want to delete the VM, just change state: to absent.
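A minimal sketch of the auth option with the os_server module might look like the following. The URL, user, project, and variable names are placeholders, and the exact keys your cloud needs (for example, domain settings) depend on its authentication setup.

```yaml
# Sketch only: credentials passed inline instead of via a named cloud entry.
- name: create a VM in openstack using inline credentials
  os_server:
    name: cloudera-namenode
    state: present
    auth:
      auth_url: https://identity.example.com:5000/v3  # placeholder URL
      username: admin                                  # placeholder user
      password: "{{ openstack_password }}"             # hypothetical variable
      project_name: admin                              # placeholder project
    image: "{{ vm_image }}"                            # hypothetical variable
    flavor: big
```

Keeping the password in a variable (ideally an encrypted one) avoids committing the credential itself to source control.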

Say you have a list of servers you shut down because you couldn't figure out how to get the applications working, and you want to start them again. You can use os_server_action to restart them (or rebuild them if you want to start from scratch).

Here is an example that starts the server and tells the module the name of the instance:

  - name: restart some servers
    os_server_action:
      action: start
      cloud: openstack
      region_name: andromeda
      server: cloudera-namenode

Most OpenStack modules use similar options. Therefore, to rebuild the server, we can use the same options but change the action to rebuild and add the image we want it to use:

  - os_server_action:
      action: rebuild
      cloud: openstack
      region_name: andromeda
      server: cloudera-namenode
      image: 923569a-c777-4g52-t3y9-cxvhl86zx345

Doing other things

There are modules for a lot of system admin tasks, but what should you do if there isn't one for what you are trying to do? Use the shell and command modules, which allow you to run any command just like you do on the command line. Here's an example using the OpenStack CLI:

  - name: run an opencli command
    command: "openstack hypervisor list"

There are so many ways you can do daily sysadmin tasks with Ansible. Using this automation tool can transform your hardest task into a simple solution, save you time, and make your work days shorter and more relaxed.

Git Tips and tricks


Git cli guides

thieryl@thieryl[12:42:19]: $ git help -g 
The common Git guides are:

   attributes   Defining attributes per path
   everyday     Everyday Git With 20 Commands Or So
   glossary     A Git glossary
   ignore       Specifies intentionally untracked files to ignore
   modules      Defining submodule properties
   revisions    Specifying revisions and ranges for Git
   tutorial     A tutorial introduction to Git (for version 1.5.1 or newer)
   workflows    An overview of recommended workflows with Git

'git help -a' and 'git help -g' list available subcommands and some
concept guides. See 'git help <command>' or 'git help <concept>'
to read about a specific subcommand or concept.

Git Aliases

git config --global alias.<handle> <command> 
git config --global alias.st status

List your Git configuration

thieryl@thieryl[09:52:44]: $ git config --list 
user.email=[redacted]
user.name=thiery louison
alias.aa=add --all
alias.bv=branch -vv
alias.ba=branch -ra
alias.bd=branch -d
alias.ca=commit --amend
alias.cb=checkout -b
alias.cm=commit -a --amend -C HEAD
alias.cam=commit -am
alias.ci=commit -a -v
alias.co=checkout
alias.di=diff
alias.ll=log --pretty=format:%C(yellow)%h%Cred%d\ %Creset%s%Cblue\ [%cn] --decorate --numstat
alias.ld=log --pretty=format:%C(yellow)%h\ %C(green)%ad%Cred%d\ %Creset%s%Cblue\ [%cn] --decorate --date=short --graph
alias.ls=log --pretty=format:%C(green)%h\ %C(yellow)[%ad]%Cred%d\ %Creset%s%Cblue\ [%cn] --decorate --date=relative
alias.mm=merge --no-ff
alias.st=status --short --branch
alias.tg=tag -a
alias.pu=push --tags
alias.un=reset --hard HEAD
alias.uh=reset --hard HEAD^
color.ui=auto

Get git bash completion

curl http://git.io/vfhol > ~/.git-completion.bash && echo '[ -f ~/.git-completion.bash ] && . ~/.git-completion.bash' >> ~/.bashrc

List of the remote branches

To see all the branches, try the following command:

thieryl@thieryl[09:50:31]: $ git branch -a 
  master
* my_newfeature
  remotes/origin/master
  remotes/tlo/master
[~/.dotfiles]

Tracking remote branch and creating local branch with the same name.

git checkout -t origin/{{branch_name}}

Remember the branch structure after a local merge

git merge --no-ff some-branch-name

Modify previous commit without modifying the commit message

git add --all && git commit --amend --no-edit

Add a local branch tracking the remote branch.

$ git branch --track style origin/style
Branch style set up to track remote branch style from origin.
$ git branch -a
  style
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/style
  remotes/origin/master
$ git hist --max-count=2
* 2faa4ea 2011-03-09 | Changed README in original repo (HEAD, origin/master, origin/HEAD, master) [Thiery Louison]
* 6e6c76a 2011-03-09 | Updated index.html (origin/style, style) [Thiery Louison]

Create a local branch that tracks a remote branch

thieryl@thieryl[09:47:38]: $ git checkout -b my_newfeature -t tlo/master
thieryl@thieryl[09:50:23]: $ git status -v 
On branch my_newfeature
Your branch is up to date with 'tlo/master'.

nothing to commit, working tree clean
[~/.dotfiles]
thieryl@thieryl[09:50:31]: $ 

thieryl@thieryl[09:48:43]: $ git ld
* e5480c7 2018-07-19 (HEAD -> my_newfeature, tlo/master, origin/master, master) Add new git alias [thiery louison]
* 508b1b4 2018-07-19 Add .gitconfig file to repo [thiery louison]
* 33ac493 2018-07-04 Add new functions [thiery louison]
* c0026b1 2018-07-02 Add new aliases and new variables [thiery louison]
*   210c4ed 2018-06-08 Merge branch 'master' of github.com:thieryl/dotfiles [thiery louison]
|\  
| * 9598e8c 2018-05-23 Delete .bashrc_old [GitHub]
| * ab974f7 2018-05-23 Update .bashrc_old [GitHub]
* | d832d62 2018-06-08 Add new aliases [thiery louison]
|/  
* 8a2acd3 2018-05-23 Add .bash_promt [thiery louison]
* c4fc360 2018-05-23 Add new alias [thiery louison]
* 2d1d74c 2017-09-25 Modified some functionality [thiery louison]
* 9ba5d65 2017-09-12 Add the password generator pgen to the .bash_aliases file [thiery louison]
* 1cf2a81 2017-08-31 Add new alias for git [thiery louison]
* 8033abc 2017-08-31 Add new alias and new functions [thiery louison]
* 43ba75d 2017-08-31 Initial commit [thiery louison]

Five Minute Server Troubleshooting

Check if too many cooks are spoiling the broth

This will show who is logged on and what they are doing.

rbd_thieryl@proxy[07:07:21]: $ w
 07:07:24 up 175 days, 30 min, 11 users,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
rbd_misb pts/0    197.224.132.11   05:22    6:28   0.05s  0.05s ssh [email protected]
rbd_fabi pts/1    197.224.132.11   05:38    5:00   0.22s  0.22s ssh -A 52.17.65.153
rbd_misb pts/2    197.224.132.11   06:30    6:28   0.05s  0.05s ssh [email protected]
rbd_misb pts/3    197.224.132.11   06:24    6:28   0.06s  0.06s ssh [email protected]
rbd_misb pts/4    197.224.132.11   06:30    6:28   0.05s  0.05s ssh [email protected]
rbd_misb pts/5    197.224.132.11   06:30    6:28   0.04s  0.04s ssh [email protected]
rbd_misb pts/6    197.224.132.11   06:30    6:28   0.04s  0.04s ssh [email protected]
rbd_misb pts/7    197.224.132.11   06:33    1:08   0.03s  0.03s ssh [email protected]
rbd_misb pts/8    197.224.132.11   06:33    1:08   0.03s  0.03s ssh [email protected]
rbd_misb pts/9    197.224.132.11   06:34    0.00s  0.07s  0.07s ssh [email protected]
rbd_thie pts/10   197.224.132.11   07:07    4.00s  0.00s  0.00s w
[~]
rbd_thieryl@proxy[07:07:24]: $ 

Check which processes are running

ps ux --sort +vsize | head -5
ps aux --sort -pcpu | head -5

Check File System Usage

Display files and directories, sorted by largest

Check TCP Connection

thieryl@thieryl[11:09:18]: $ netstat -ntlp
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -                   
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -                   
tcp6       0      0 :::22                   :::*                    LISTEN      -                   
tcp6       0      0 ::1:631                 :::*                    LISTEN      -                   
[~]
thieryl@thieryl[11:09:23]: $ sudo netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      978/sshd            
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      1688/cupsd          
tcp6       0      0 :::22                   :::*                    LISTEN      978/sshd            
tcp6       0      0 ::1:631                 :::*                    LISTEN      1688/cupsd          
[~]
thieryl@thieryl[11:09:29]: $ 

Check CPU and RAM

free -m

Check IO Performance

iostat -kx 2
vmstat 2 10
mpstat 2 10

List all files opened by an active process

lsof -p <process_id>

thieryl@thieryl[11:11:23]: $ sudo lsof -p 978
lsof: WARNING: can't stat() fuse.gvfsd-fuse file system /run/user/1000/gvfs
      Output information may be incomplete.
lsof: WARNING: can't stat() fuse.lepton file system /tmp/.mount_leptonpItJ1K
      Output information may be incomplete.
COMMAND PID USER   FD   TYPE             DEVICE SIZE/OFF   NODE NAME
sshd    978 root  cwd    DIR                8,8     4096      2 /
sshd    978 root  rtd    DIR                8,8     4096      2 /
sshd    978 root  txt    REG                8,8   901280 598597 /usr/sbin/sshd
sshd    978 root  mem    REG                8,8  6406312 262762 /var/lib/sss/mc/group
sshd    978 root  mem    REG                8,8  8406312 262737 /var/lib/sss/mc/passwd
sshd    978 root  mem    REG                8,8    40160 598427 /usr/lib64/libnss_sss.so.2
sshd    978 root  mem    REG                8,8   138720 551046 /usr/lib64/libgpg-error.so.0.24.2
sshd    978 root  mem    REG                8,8    32400 550982 /usr/lib64/libuuid.so.1.3.0
sshd    978 root  mem    REG                8,8   371872 574817 /usr/lib64/libblkid.so.1.1.0
sshd    978 root  mem    REG                8,8    15872 551133 /usr/lib64/libkeyutils.so.1.6
sshd    978 root  mem    REG                8,8    67792 576145 /usr/lib64/libkrb5support.so.0.1
sshd    978 root  mem    REG                8,8   533680 598399 /usr/lib64/libpcre2-8.so.0.7.0
sshd    978 root  mem    REG                8,8   154128 573008 /usr/lib64/libpthread-2.27.so
sshd    978 root  mem    REG                8,8   706232 263479 /usr/lib/libgcc_s.so.1
sshd    978 root  mem    REG                8,8  1208424 551051 /usr/lib64/libgcrypt.so.20.2.3
sshd    978 root  mem    REG                8,8   386656 574918 /usr/lib64/libmount.so.1.1.0
sshd    978 root  mem    REG                8,8    21872 551032 /usr/lib64/libcap.so.2.25
sshd    978 root  mem    REG                8,8    95344 551176 /usr/lib64/liblz4.so.1.8.1
sshd    978 root  mem    REG                8,8   173408 290256 /usr/lib/liblzma.so.5.2.4
sshd    978 root  mem    REG                8,8    43880 573016 /usr/lib64/librt-2.27.so
sshd    978 root  mem    REG                8,8    24376 536320 /usr/lib64/libcap-ng.so.0.0.0
sshd    978 root  mem    REG                8,8  2123224 533886 /usr/lib64/libc-2.27.so
sshd    978 root  mem    REG                8,8    15944 550947 /usr/lib64/libcom_err.so.2.1
sshd    978 root  mem    REG                8,8   138560 576139 /usr/lib64/libk5crypto.so.3.1
sshd    978 root  mem    REG                8,8  1048896 524886 /usr/lib64/libkrb5.so.3.3
sshd    978 root  mem    REG                8,8   368168 576133 /usr/lib64/libgssapi_krb5.so.2.2
sshd    978 root  mem    REG                8,8   179984 550553 /usr/lib64/libselinux.so.1
sshd    978 root  mem    REG                8,8    98144 573014 /usr/lib64/libresolv-2.27.so
sshd    978 root  mem    REG                8,8   141576 534751 /usr/lib64/libcrypt.so.1.1.0
sshd    978 root  mem    REG                8,8   101152 292646 /usr/lib/libz.so.1.2.11
sshd    978 root  mem    REG                8,8    14352 537107 /usr/lib64/libutil-2.27.so
sshd    978 root  mem    REG                8,8    19208 558854 /usr/lib64/libdl-2.27.so
sshd    978 root  mem    REG                8,8  2910640 533620 /usr/lib64/libcrypto.so.1.1.0h
sshd    978 root  mem    REG                8,8   687400 533664 /usr/lib64/libsystemd.so.0.22.0
sshd    978 root  mem    REG                8,8    68904 527375 /usr/lib64/libpam.so.0.84.2
sshd    978 root  mem    REG                8,8   132672 531842 /usr/lib64/libaudit.so.1.0.0
sshd    978 root  mem    REG                8,8    11832 559471 /usr/lib64/libfipscheck.so.1.2.1
sshd    978 root  mem    REG                8,8   187632 531840 /usr/lib64/ld-2.27.so
sshd    978 root    0r   CHR                1,3      0t0   1031 /dev/null
sshd    978 root    1u  unix 0x000000009ef6e0e3      0t0  29136 type=STREAM
sshd    978 root    2u  unix 0x000000009ef6e0e3      0t0  29136 type=STREAM
sshd    978 root    3r   REG                8,8  8406312 262737 /var/lib/sss/mc/passwd
sshd    978 root    4u  unix 0x000000003afd52f9      0t0  29143 type=STREAM
sshd    978 root    5u  IPv4              30871      0t0    TCP *:ssh (LISTEN)
sshd    978 root    6r   REG                8,8  6406312 262762 /var/lib/sss/mc/group
sshd    978 root    7u  IPv6              30873      0t0    TCP *:ssh (LISTEN)
[~]
thieryl@thieryl[11:11:30]: $ 

How to create an empty stand-alone branch in Git


I’m trying to refactor a web app over to Laravel. Initially, I checked out the code from Git, created a branch, and started making my changes. When I realized how drastic the changes were, I decided that it would be better to start with a blank slate.

I deleted everything in the branch and started from scratch, but the obvious problem is that the branch shares commit history with the master branch.

After doing some research, I came across a git command to create an “orphan” branch that doesn’t share any commit history with ‘master’. Perfect for maintaining different projects in the same repository without mixing up their commit histories. Here’s how to do it:

Before starting, upgrade to the latest version of Git. To check which version you’re running, run

git --version

If it spits out an old version, you may need to augment your PATH with the folder containing the version you just installed.

Ok, we’re ready. After doing a cd into the folder containing your git checkout, create an orphan branch. For this example, I’ll name the branch “mybranch.”

git checkout --orphan mybranch

Delete everything in the orphan branch

git rm -rf .

Make some changes

vi README.txt

Add and commit the changes

git add README.txt
git commit -m "Adding readme file"

That’s it. If you run

git log

you’ll notice that the commit history starts from scratch. To switch back to your master branch, just run

git checkout master

You can return to the orphan branch by running

git checkout mybranch

Ansible

Understanding Ansible

Ansible is a powerful, simple, and easy-to-use tool for managing computers. It is most often used to update programs and configuration on dozens of servers at once, but the abstractions are the same whether you're managing one computer or a hundred. Ansible can even do "fun" things like change the desktop photo or back up personal files to the cloud. It can take a while to learn how to use Ansible because it has an extensive terminology, but once you understand the why and the how of Ansible, its power is readily apparent.

Ansible's power comes from its simplicity. Under the hood, Ansible is just a domain specific language (DSL) for a task runner for a secure shell (ssh). You write ansible yaml (.yml) files which describe the tasks which must run to turn plain old / virtualized / cloud computers into production ready server-beasts. These tasks, in turn, have easy to understand names like "copy", "file", "command", "ping", or "lineinfile". Each of these turns into shell commands which are run on the client server. For example, "copy" is essentially secure copy, also known as scp, and is used to move files from the ansible runner onto the client.

The output from these commands is collected and sent back to the ansible runner. This output can then affect the task execution flow. For example, if a "copy" command fails, it will by default stop task execution, but ansible can be instructed to instead ignore the failure, retry the operation, or even select a new source to copy from. In this way, ansible is very much like an imperative domain-specific language. Tasks are run sequentially. If the first task copies a file to the computer, and the last removes it, it will still exist on the computer if the task pipeline fails somewhere in the middle.
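The failure handling described above can be sketched with the register, until, retries, delay, and ignore_errors task keywords. The file paths and the cleanup command are hypothetical.

```yaml
# Sketch only: retry a failing copy, and let a non-critical task fail quietly.
- name: copy the app config, retrying up to three times
  copy:
    src: app.conf                  # hypothetical source file
    dest: /etc/app/app.conf        # hypothetical destination
  register: copy_result
  until: copy_result is succeeded
  retries: 3
  delay: 5

- name: optional cleanup that must not stop the play
  command: /usr/local/bin/cleanup-cache   # hypothetical command
  ignore_errors: yes
```

Tasks still run top to bottom; these keywords only change what happens when one of them fails.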

However, a single task is generally declarative. It describes the state of the computer we want, and ansible ensures that the computer ends up in that state. Because ansible "gathers facts" about a system during setup and as it runs, it knows whether files it cares about exist on the computer. Instead of creating a directory with mkdir, you tell the "file" module to ensure that a certain path is set to directory mode, as everything is a file in Unix. The file module is smart enough to not do anything if the path is already a directory.
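For instance, a declarative directory task could look like this sketch; the path and mode are only examples.

```yaml
# Sketch only: describe the desired state rather than running mkdir.
- name: ensure the application directory exists
  file:
    path: /opt/myapp        # hypothetical path
    state: directory
    mode: "0755"
```

Running it a second time reports "ok" rather than "changed", because the path is already a directory with that mode.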

The glaring exceptions to this rule are the core modules "shell" and "command", though they can be treated declaratively with the creates option or the when task parameter, both of which let the task finish without reporting a change when their condition indicates there is nothing to do.
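A sketch of both approaches follows. The commands and paths are hypothetical; creates and when are the standard option and task keyword named above.

```yaml
# Sketch only: two ways to keep command/shell tasks from re-running.
- name: initialise the database only once
  command: /usr/local/bin/init-db           # hypothetical command
  args:
    creates: /var/lib/myapp/.initialised    # not run again once this file exists

- name: check whether the service binary exists
  stat:
    path: /usr/local/bin/myapp              # hypothetical path
  register: myapp_binary

- name: run the installer only when the binary is missing
  shell: /tmp/install-myapp.sh              # hypothetical script
  when: not myapp_binary.stat.exists
```

With creates, the marker file has to be produced by the command itself (or by a later task), otherwise the command runs on every play.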

Ansible attempts to be idempotent: when a playbook of tasks is run twice successively, or on two congruent computers, little should be different. There are many ways to subvert this in the imperative DSL, but for most ansible use cases, the same playbook should affect the computer in the same way every time. This presumption allows ansible to skip running tasks. For example, if the server already has the right node.js installed, or maybe just any version of node.js installed, the task will be "OK"'d and skipped. Note that "skipped" is a task end state for when a conditional isn't met, while "ok" is a task end state for when the computer was already in the end state.

This allows the ansible runner computer to not matter, as long as the runner has the correct files. This seemingly difficult task is fairly easy to ensure, as ansible encourages you to keep important configuration files along with ansible yaml files in source control, either as a configurable "template" or as a whole.

If every task were just a stateful function call, or a call to an object's method, then task include statements are how you create your own function calls. A task list can include tasks which simply pass arguments to other task lists. In this way, you can compose functions of task lists, effectively giving us meta-tasks.
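A minimal sketch of this "function call" pattern, assuming a hypothetical create_user.yml task file and variable names, uses include_tasks with vars:

```yaml
# tasks/main.yml (sketch): call the task list with arguments.
- name: create the web user
  include_tasks: create_user.yml
  vars:
    user_name: web
    user_group: www-data

# tasks/create_user.yml (sketch, a separate file): the reusable task list.
- name: ensure the requested user exists
  user:
    name: "{{ user_name }}"
    group: "{{ user_group }}"
    state: present
```

Including the same file again with different vars behaves like calling the same function with different arguments.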

Tasks and meta tasks can be included in either playbooks or roles. A "role" is a description of what a computer is: "mysql", "programmer", "youtube-streamer", etc. This is what makes ansible an idempotent task runner. Remember, ansible runs tasks in order to get a computer's software into some end state. A role describes the configuration needed to take a standard computer and transform it into a home media server. But what if you want your home media server to also be, perhaps, a SteamBox? You could use a new role, but this is a case for a playbook.

Playbooks are selections of roles which are applied to specific user logins and computer ip addresses. Your media serving home computer can also be a steam box, or a "bitcoin_miner", or whatever else you may want it to be. Of course, you can create conflicting roles, but that's what virtualization and containers help manage.
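As a sketch, a playbook that applies two roles to the same machine could look like this. The group name, login, and role names are hypothetical.

```yaml
# Sketch only: one host group, one login, several roles.
---
- hosts: living_room        # hypothetical inventory group
  remote_user: media        # hypothetical login
  roles:
    - media_server
    - steam_box
```

The roles are applied in the order listed.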

The inventory file provides a mapping between a group of computers, and the login information for each computer. That's all that ansible needs in order to ssh into your "tumblr-scrapers" and get them ready for action, without touching your ever-ready "airBnB for iguanas" service server. One day the world will catch up.
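Inventories are often written in INI form, but Ansible also accepts YAML. The sketch below uses the YAML form with made-up host names, addresses, and logins, and uses underscores in group names.

```yaml
# Sketch only: groups of hosts with per-host login details.
all:
  children:
    tumblr_scrapers:
      hosts:
        scraper1.example.com:
          ansible_user: deploy
        192.0.2.10:
          ansible_user: admin
          ansible_port: 2222
    iguana_bnb:
      hosts:
        iguana.example.com:
          ansible_user: deploy
```

A play with hosts: tumblr_scrapers would then touch only the first two machines.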

So, to recap, the inventory file provides logins for computers. A playbook maps groups of logins to specific computer roles: "wordpress" or "dev2" or "abc" for the cruel hearted. A role contains everything necessary to turn a computer into a server-beast, including task lists, configuration files, and templates, as well as meta data such as "this role needs this other role in order to work". Tasks describe specific pieces of state which must be true. And modules turn tasks into ssh commands!
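The "this role needs this other role" metadata lives in a role's meta/main.yml. A sketch with hypothetical role names:

```yaml
# roles/wordpress/meta/main.yml (sketch)
dependencies:
  - role: mysql
  - role: apache
```

When the wordpress role is applied, the listed roles are run first.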

Roles also have special "handler" tasks, which are "globally unique" and can be notified by any other task. They are best used to restart services such as apache servers or for triggering computer reboots.

The last key piece of ansible is the humble variable system. Ansible yaml files can contain variables which control their behavior. Often these variables instruct the computer to download a new or otherwise specific program version, such as OpenSSL version 1.0.1f. They are also often used for machine specific configuration, such as naming the machine specially on DNS so everyone knows not to touch "production-load-balancer-plz-no-fail".

Variable rules are pretty simple: you define default variable values, then later you can overwrite them. There's a straightforward (if confusing) precedence order that interested parties can find in the docs. It is similar to: command line variables always win, then shell environmental ansible variables, then multiple levels of ansible yaml file rules, then finally a role's defaults/main.yml.

Because variables can be set anywhere and everywhere, this can lead to confusing and hard to debug situations with variable name clashes, until precedence rules are internalized.
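A small sketch of the default-then-override pattern, using a hypothetical openssl role and variable name:

```yaml
# roles/openssl/defaults/main.yml (sketch): the lowest-precedence value.
openssl_version: "1.0.1f"

# playbook (sketch, a separate file): overrides the role default.
- hosts: web                       # hypothetical group
  roles:
    - role: openssl
      vars:
        openssl_version: "1.0.2u"
```

A value passed on the command line with -e (for example -e openssl_version=1.1.1) would win over both.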

A workflow for making a role

Let's walk through installing the bare essentials for any Mac OS X box: Google Chrome, Transmission torrent client, and VLC. You pay for HBO, but you want Game of Thrones anywhere, anytime, on any device.

It often makes sense to think at the role level of abstraction when writing ansible scripts. "This computer is a dev box configured with my settings, stored in environmental variables." You can use the ansible role manager (arm) application to scaffold new playbooks and roles with arm init -r {{ role_name }}. This will create the new role directory structure in the current working directory.

Once you've scaffolded the "media_mac" role, open the tasks/main.yml file (it may have the .arm suffix as well). Let's think about what needs to happen in order for the computer to be ready for use:

  1. Install Google Chrome
  2. Install VLC
  3. Install Transmission

Seems straightforward. Let's list these out:

---
# media_mac/tasks/main.yml
- name: Install Google Chrome
- name: Install VLC
- name: Install Transmission

How should we install these three apps? Why, the homebrew_cask module is perfect for this.

---
# media_mac/tasks/main.yml
- name: Install Google Chrome
  homebrew_cask: name=google-chrome state=present

Remember that we are declaring a state we want, in this case, please have google-chrome installed through homebrew_cask. We can also make the yaml more git line diff friendly by taking advantage of yaml syntax.

---
# media_mac/tasks/main.yml
- name: Install Google Chrome
  homebrew_cask: >
   name=google-chrome state=present

Now, we must test this role. Don't bother writing out the other two installations; there's no point if the Google Chrome one doesn't work. In order to imprint a role onto a computer, you need a playbook and a hosts file. Ansible can configure the computer it's run on, so your ansible_hosts file will look like this:

[self]
# IP         special host variable settings
127.0.0.1    ansible_connection=local

Now let's make a playbook, in playbooks/test.yml. Don't scaffold with arm yet, because we need to type this path often. This playbook is tiny:

---
- hosts: self
  roles:
  - role: media_mac

And now run ansible-playbook playbooks/test.yml (add -i ansible_hosts if that file is not your default inventory)... and the debugging starts. The command will work if you've installed homebrew, used homebrew to install the cask command, run the cask command once, set up ansible and its dependencies, ansible hasn't changed since this was written, this tutorial has all the required steps, and you're lucky.

Let's update the role yaml to prevent you from running into the homebrew problem in the future. We're going to check to see if homebrew exists on the media_mac already. If homebrew were more programmer friendly or I were smarter, we would simply ensure homebrew's existence or install it, but right now we're going to push the problem onto future you, using the ansible stat module.

The stat module lets you do light system fact checking at run time. You register the end result of the stat command, and then you can reference that result later. Here, we check to see if brew is installed, and choose to fail if it isn't.

---
# media_mac/tasks/main.yml
- name: check if homebrew is already installed
  stat: "path=/usr/local/bin/brew"
  register: brew_exists

- name: stop if homebrew is missing
  fail: msg="Please install homebrew with the ruby installer script, then cask, then run cask once for permissions reasons"
  when: not brew_exists.stat.exists

- name: Install Google Chrome
...

Now we've already started debugging, before we ever even got "hello world" working. Welcome to devops. Let's move on and hope nothing else bad happens and forces us to adjust our engineering estimate again.

Use Caskroom.io/search to discover that VLC and Transmission can also be installed with homebrew_cask. Other installations might require unpacking a tar archive somewhere, or running an installation script with the shell module. Luckily for us, these things all exist already.
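
One way to install all three apps without repeating yourself is a single looped task. This is a sketch: the cask names vlc and transmission are assumptions here, so confirm them with the cask search before relying on them.

```yaml
---
# media_mac/tasks/main.yml (sketch)
- name: Install media apps
  homebrew_cask:
    name: "{{ item }}"
    state: present
  # "vlc" and "transmission" are assumed cask names; google-chrome is from above
  with_items:
    - google-chrome
    - vlc
    - transmission
```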

Now that you can install everything you need, let's do some configuration. Media Macs should be friendly to everyone, even the family dog. Let's add these apps to the dock. Normally, on a mac, that's a matter of messing around with an XML file called a preference list. Preference lists (plists) are similar to YAML, but look like HTML with all those <words> tags.

Instead let's use dockutil, a python program which can manage the dock more easily than we can. Let's use brew for this.

- name: install /usr/local/bin/dockutil to manage the dock
  homebrew: >
    name=dockutil
    state=present

Note the /usr/local/bin/dockutil. This is the path the shell module uses to run dockutil. Prefer absolute paths if possible. Let's use dockutil to add Google Chrome to the Dock.

- name: "add google chrome to the dock"
  shell: /usr/local/bin/dockutil --add "/opt/homebrew-cask/Caskroom/google-chrome/latest/Google Chrome.app"

Note that this task must run after the dockutil install command, otherwise it won't work on untouched computers. If you run this command again, there will be two Chromes. Oops. Let's fix that. First, let's collect the output of dockutil --list and then if "Google Chrome" is in that output, don't add another dock item.

- name: read defaults to know what to add to the dock
  shell: /usr/local/bin/dockutil --list
  register: dock_list

- name: "add google chrome to the dock"
  shell: /usr/local/bin/dockutil --add "/opt/homebrew-cask/Caskroom/google-chrome/latest/Google Chrome.app"
  when: dock_list.stdout.find("Google Chrome") == -1

Do that for the other two apps, and you're good to go. If you want to do more, check out the list of ansible modules and how to use them. Also check out the tips section below, as it illustrates how I develop with ansible.
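
For the other two apps, a loop keeps it short. This is only a sketch: the labels and the .app paths below are hypothetical, so point them at wherever cask actually put the apps on your machine.

```yaml
- name: read defaults to know what to add to the dock
  shell: /usr/local/bin/dockutil --list
  register: dock_list

- name: add the remaining media apps to the dock
  shell: /usr/local/bin/dockutil --add "{{ item.path }}"
  # Skip any app whose label already shows up in the dock listing
  when: item.label not in dock_list.stdout
  with_items:
    # Hypothetical paths; adjust to your Caskroom or Applications layout
    - { label: "VLC", path: "/Applications/VLC.app" }
    - { label: "Transmission", path: "/Applications/Transmission.app" }
```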

Tips: to insure promptness

It takes a day or two to get used to ansible. This section should help you past most of the ansible humps.

Debugging

  1. Use the debug and assert modules to assist in debugging
  2. Use the --step CLI flag to enable interactive mode
  3. Use the --start-at-task CLI directive to skip to the step you're currently debugging
  4. Run ps aux | grep ansible on the remote host to track the ansible process.
  5. Run ps aux | grep {{ task_underlying_command }} to track the amount of CPU time a long running task has taken.
  6. Understand ssh, privilege escalation, and ssh remote agents.
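
Tips 1 through 3 might look like this in practice. The playbook file name is made up, and the brew path is the one used earlier in this post.

```yaml
---
# playbooks/debug.yml (hypothetical file name)
- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: check if homebrew is already installed
      stat:
        path: /usr/local/bin/brew
      register: brew_exists

    # debug prints the whole registered structure so you can see its fields
    - name: Print the registered result
      debug:
        var: brew_exists

    # assert fails the play when any listed condition is false
    - name: Stop early unless brew is present
      assert:
        that:
          - brew_exists.stat.exists

# Confirm each task before it runs:
#   ansible-playbook playbooks/debug.yml --step
# Jump straight to the task you are debugging:
#   ansible-playbook playbooks/debug.yml --start-at-task "Print the registered result"
```

Keep in mind that --start-at-task skips the earlier tasks, so anything they register will be undefined.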

Getting better at ansible

Also check out ansible galaxy, and read through some other roles to see what's possible. Favor an iterative approach when building playbooks, knocking out installation problems as you go along. Combine tasks into meta tasks, and use variables and loops to write less and do more. Favor actions which can be "OK"'d over "CHANGED", though this is not always necessary or possible.
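
As a sketch of the "OK" over "CHANGED" idea: shell tasks report "changed" on every run by default, even when they only read something. Marking a read-only task with changed_when lets it report "ok" instead.

```yaml
- name: read defaults to know what to add to the dock
  shell: /usr/local/bin/dockutil --list
  register: dock_list
  # Listing the dock never modifies the machine, so never report a change
  changed_when: false
```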

Try starting specific, then becoming more abstract as the role grows. Knock one problem down at a time, and refactor and add variables once you know your patterns.

If you have the data you need to know whether or not to run a task, and just need to get that data into ansible, there's usually a way. Aside from computer fact gathering, you can offer a prompt to a user to ask for input. You can also share encrypted data (such as ssh keys?) with ansible vault. You can control ansible with anything, as lookups allow you to communicate with external APIs. If you need certain programs to be installed on the same server rack, use ansible tags to control deployment to inventories.
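
A rough sketch of three of those inputs together: a prompt, a lookup, and a tag. The variable names, the prompt text, and the report tag are invented for the example.

```yaml
---
- hosts: localhost
  connection: local
  gather_facts: false
  vars_prompt:
    # Asks the person running the playbook for a value
    - name: media_user
      prompt: "Which account should own the media apps?"
      private: no
  vars:
    # The env lookup reads an environment variable on the machine running ansible
    home_dir: "{{ lookup('env', 'HOME') }}"
  tasks:
    - name: Show the collected input
      debug:
        msg: "user={{ media_user }} home={{ home_dir }}"
      tags:
        - report

# Run only the tagged tasks:
#   ansible-playbook playbook.yml --tags report
```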
