Birger Schmidt bs-github

Ports

1: Main Network in
2: Main Network out
3: Guest Network in
4: Guest Network out
5: MGMT

Interfaces

Application specific host grouping in Riemann-dash

It is generally desirable to group all the hosts for a specific service into a single dashboard view. For example, all the web servers are in a single view while all the database servers are in another.

This is usually not an issue when you send custom metrics with a Riemann client. However, there are cases where you do not control how the metrics are sent, for example when you use Riemann-tools.

Since Riemann-tools scripts are application-agnostic, we must inject some application-specific information into the tags field before the dashboard view can group hosts. The tags field is a collection of arbitrary strings, and Riemann-tools scripts let you pass those strings in on the command line.

	riemann-health --host 127.0.0.1 --tag "prod" --tag "webserver"
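To make the grouping idea concrete, here is a minimal sketch of how events carrying tags could be bucketed into per-service host lists. It does not use any Riemann library; the event dictionaries, the host names and the `hosts_by_tag` helper are all hypothetical and only mirror the `host`, `service` and `tags` fields described above.

```python
from collections import defaultdict

# Hypothetical sample events, shaped like the host/service/tags fields
# discussed above; the values are made up for illustration.
events = [
    {"host": "web1", "service": "cpu", "tags": ["prod", "webserver"]},
    {"host": "web2", "service": "cpu", "tags": ["prod", "webserver"]},
    {"host": "db1", "service": "cpu", "tags": ["prod", "database"]},
    {"host": "web1", "service": "load", "tags": ["prod", "webserver"]},
]


def hosts_by_tag(events):
    """Map each tag to the sorted list of distinct hosts that reported it."""
    groups = defaultdict(set)
    for event in events:
        for tag in event.get("tags", []):
            groups[tag].add(event["host"])
    return {tag: sorted(hosts) for tag, hosts in groups.items()}


groups = hosts_by_tag(events)
print(groups["webserver"])  # hosts that would land in the web server view
print(groups["database"])   # hosts that would land in the database view
```

In Riemann-dash itself the grouping would presumably be done with a view query that filters on the tag rather than with client-side code like this; the sketch only shows why a per-application tag is enough to separate the hosts.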

Use Flapjack as a bridge between ops and dev, and as a bridge from Nagios towards more freedom.

Flapjack is an alert umbrella for people that intelligently routes and rolls up alerts, integrates with check execution engines like Sensu & Nagios, and ships a well documented API for restart-less configuration.

There are two problems in monitoring:

check configuration
notification rules
and a well-known third one: check execution

I would consider the check configuration part more or less solved in times of automated setups via Puppet or Chef.

Leveling up your Flapjack stack

Too many alerts. Too many dashboards. Too much noise - and the alert fatigue isn't receding.

If you're frequently on the end of a pager (or pager-like device) and working with systems running in the cloud, you've probably noticed an increase in the volume of alerts over the last few years.

This is a problem that's not going away - in fact, with the proliferation of monitoring tools going on at the moment due to a renaissance in Open Source monitoring, coupled with the ever expanding sprawl of systems that make up modern businesses on the web, the problem is only getting worse.

Flapjack is an alert umbrella for people on-call that intelligently routes and rolls up alerts, integrates with check execution engines like Sensu & Nagios, and ships a well documented API for restart-less configuration.

Working in operations in 2014 is hard.

The infrastructures we manage are growing rapidly, and responsibility is being divided up across multiple teams.

Then something breaks. Your on-call engineer receives 900 SMS in 30 seconds. Her phone melts. You can’t distinguish the signal from the noise. It takes an hour to fix the problem.

Enter Flapjack: an event processing & monitoring alert routing system. Flapjack sits at the end of your monitoring pipeline and sends alerts to the right person.
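The routing and rollup idea can be sketched in a few lines. This is not Flapjack's implementation or API; the `build_notifications` function, the `routes` mapping, the contact names and the threshold value are all assumptions made for illustration. The sketch sends each failing check to the contact responsible for its entity and collapses the messages into one summary once a contact has more than a few failures.

```python
from collections import defaultdict

# Hypothetical threshold: above this many failing checks for one contact,
# send a single summary instead of one message per check.
ROLLUP_THRESHOLD = 3


def build_notifications(alerts, routes, threshold=ROLLUP_THRESHOLD):
    """Return {contact: [message, ...]} for a batch of failing checks.

    alerts: list of (entity, check) tuples that are currently failing.
    routes: dict mapping an entity name to the contact responsible for it.
    """
    per_contact = defaultdict(list)
    for entity, check in alerts:
        contact = routes.get(entity)
        if contact is None:
            continue  # no contact is registered for this entity; skip it
        per_contact[contact].append(f"{entity}:{check}")

    notifications = {}
    for contact, failing in per_contact.items():
        if len(failing) > threshold:
            # Roll up: one summary message instead of many.
            notifications[contact] = [f"{len(failing)} checks failing"]
        else:
            notifications[contact] = [f"{name} is failing" for name in failing]
    return notifications


routes = {"web1": "alice", "web2": "alice", "db1": "bob"}
alerts = [
    ("web1", "http"), ("web1", "cpu"),
    ("web2", "http"), ("web2", "disk"),
    ("db1", "replication"),
]
result = build_notifications(alerts, routes)
print(result["alice"])  # four failures exceed the threshold: one summary
print(result["bob"])    # a single failure is reported individually
```

With a rule like this, the on-call engineer in the scenario above would receive one summary instead of hundreds of individual messages.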

You should be interested in Flapjack if:

	#!/bin/bash

	# === INFO ===
	# altnetworking.sh
	# Description: Run the specified application in a custom networking environment.
	# Uses cgroups to run process(es) in a network environment of your own choosing (within limits!)
	VERSION="0.1.0"
	# Author: John Clark
	# Requirements: Debian 8 Jessie (plus iptables 1.6 from unstable)
	#

	#!/usr/bin/env bash

	BRANCH=$(git name-rev HEAD 2> /dev/null | awk "{ print \$2 }")


	TAG=${1:-"HEAD"}
	FILE=$2


	if [ -f "$TAG" ]; then

	#!/usr/bin/ruby
	# inspired by http://ariejan.net/2010/08/23/resque-how-to-requeue-failed-jobs
	# retry all failed Resque jobs except the ones that have already been retried
	# This is, for instance, useful if you have already retried some jobs via the web interface.
	require 'rubygems'
	require 'resque'
	# Enable auto flush
	STDOUT.sync = true
	hostname = 'redisjobs.moprodus.cust.bulletproof.net'

	Resque.redis = "redis://#{hostname}:6379/2"
	ENV['RAILS_ENV'] = 'production'

	# -*- mode: ruby -*-
	# vi: set ft=ruby :

	VAGRANTFILE_API_VERSION = "2"

	Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
	  config.vm.box = 'precise64'
	  config.vm.box_url = 'http://files.vagrantup.com/precise64.box'
	  #config.vm.box_url = '/space/CDs/precise64.box'
	  config.vm.hostname = 'buildbox.example.org'

	import sys
	import subprocess
	import tempfile
	import urllib

	text = sys.stdin.read()

	chart_url_template = ('http://chart.apis.google.com/chart?'
	                      'cht=qr&chs=300x300&chl={data}&chld=H|0')
	chart_url = chart_url_template.format(data=urllib.quote(text))