Birger Schmidt bs-github

  • SUSE S.A.
  • Germany
@bs-github
bs-github / altnetworking.sh
Created May 3, 2017 09:11 — forked from level323/altnetworking.sh
Run a command inside a customised networking environment (using cgroups)
#!/bin/bash
# === INFO ===
# altnetworking.sh
# Description: Run the specified application in a custom networking environment.
# Uses cgroups to run process(es) in a network environment of your own choosing (within limits!)
VERSION="0.1.0"
# Author: John Clark
# Requirements: Debian 8 Jessie (plus iptables 1.6 from unstable)
#
bs-github / qos-multi-network.md
Created April 20, 2017 11:12 — forked from rtreffer/qos-multi-network.md
QoS on a 5 ethernet port system with 2 networks and management.

Ports

  • 1: Main Network in
  • 2: Main Network out
  • 3: Guest Network in
  • 4: Guest Network out
  • 5: MGMT

Interfaces

Application specific host grouping in Riemann-dash

It is generally desirable to group all the hosts for a specific service into a single dashboard view. For example, all the web servers are in a single view while all the database servers are in another view.

This is usually not an issue when you send custom metrics with a Riemann client. However, there are cases where you rely on a tool whose way of sending metrics you do not control, e.g. Riemann-tools.

Since Riemann-tools scripts are application agnostic, in order for the dashboard view to group hosts, we must inject some application specific information into the tags field. Tags is a collection of arbitrary strings. In the case of Riemann-tools scripts you can pass in arbitrary strings on the command line.

riemann-health --host 127.0.0.1 --tag "prod" --tag "webserver"
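
As a rough illustration of why the tags help, the sketch below groups hosts by tag, much as a dashboard view filtered on one tag would. It is plain Python over made-up event dicts with a hypothetical `hosts_by_tag` helper, not Riemann or Riemann-dash code, and the host names are invented.

```python
from collections import defaultdict

# Hypothetical events, shaped loosely like what riemann-health might send
# when started with --tag "prod" --tag "webserver". Host names are invented.
events = [
    {"host": "web1", "service": "cpu", "tags": ["prod", "webserver"]},
    {"host": "web2", "service": "cpu", "tags": ["prod", "webserver"]},
    {"host": "db1", "service": "cpu", "tags": ["prod", "database"]},
]


def hosts_by_tag(events):
    """Map each tag to the sorted list of hosts that reported it."""
    grouped = defaultdict(set)
    for event in events:
        for tag in event["tags"]:
            grouped[tag].add(event["host"])
    return {tag: sorted(hosts) for tag, hosts in grouped.items()}


groups = hosts_by_tag(events)
print(groups["webserver"])
print(groups["database"])
```

In Riemann-dash itself the filtering would be done by the view's query, for example `tagged "webserver"`.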

bs-github / proposal devopsdays 2014 berlin.markdown
Last active August 29, 2015 14:04
Use flapjack as bridge between ops and dev, and bridge from nagios towards more freedom.

Flapjack is an alert umbrella for people that intelligently routes and rolls up alerts, integrates with check execution engines like Sensu & Nagios, and ships a well documented API for restart-less configuration.

There are two problems in monitoring:

  • check configuration
  • notification rules
  • and a well known third one: check execution

I would consider check configuration more or less solved in times of automated setups via Puppet or Chef.

Leveling up your Flapjack stack

Too many alerts. Too many dashboards. Too much noise - and the alert fatigue isn't receding.

If you're frequently on the end of a pager (or pager-like device) and working with systems running in the cloud, you've probably noticed an increase in the volume of alerts over the last few years.

This is a problem that's not going away - in fact, with the proliferation of monitoring tools going on at the moment due to a renaissance in Open Source monitoring, coupled with the ever expanding sprawl of systems that make up modern businesses on the web, the problem is only getting worse.

Flapjack is an alert umbrella for people on-call that intelligently routes and rolls up alerts, integrates with check execution engines like Sensu & Nagios, and ships a well documented API for restart-less configuration.
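
To make "rolls up alerts" concrete, here is a minimal sketch of the general idea: once a contact has more failing checks than some threshold, send one summary instead of one message per check. This is not Flapjack's implementation; the `notifications_for` function, the `rollup_threshold` parameter, and the contact and check names are all hypothetical.

```python
def notifications_for(contact, failing_checks, rollup_threshold=3):
    """Return the messages to send to one contact.

    Below the threshold, each failing check gets its own message.
    At or above it, everything is folded into a single summary.
    """
    if len(failing_checks) < rollup_threshold:
        return [f"{contact}: {check} is failing" for check in failing_checks]
    return [f"{contact}: {len(failing_checks)} checks are failing"]


# Two failures stay as individual alerts; a flood becomes one summary.
few = notifications_for("alice", ["web1:http", "web2:http"])
many = notifications_for("alice", [f"web{i}:http" for i in range(900)])
print(few)
print(many)
```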

#!/usr/bin/env bash
BRANCH=$(git name-rev HEAD 2> /dev/null | awk "{ print \$2 }")
TAG=${1:-"HEAD"}
FILE=$2
if [ -f "$TAG" ]; then

Working in operations in 2014 is hard.

The infrastructures we manage are growing rapidly, and responsibility is being divided up across multiple teams.

Then something breaks. Your on-call engineer receives 900 SMS in 30 seconds. Her phone melts. You can’t distinguish the signal from the noise. It takes an hour to fix the problem.

Enter Flapjack: an event processing & monitoring alert routing system. Flapjack sits at the end of your monitoring pipeline and sends alerts to the right person.

You should be interested in Flapjack if:

#!/usr/bin/ruby
# inspired by http://ariejan.net/2010/08/23/resque-how-to-requeue-failed-jobs
# retry all failed Resque jobs except the ones that have already been retried
# This is, for instance, useful if you have already retried some jobs via the web interface.
require 'rubygems'
require 'resque'
# Enable auto flush
STDOUT.sync = true
hostname = 'redisjobs.moprodus.cust.bulletproof.net'
Resque.redis = "redis://#{hostname}:6379/2"
ENV['RAILS_ENV']='production'
bs-github / Vagrantfile
Created November 12, 2013 09:16
buildbox
# -*- mode: ruby -*-
# vi: set ft=ruby :
VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = 'precise64'
  config.vm.box_url = 'http://files.vagrantup.com/precise64.box'
  #config.vm.box_url = '/space/CDs/precise64.box'
  config.vm.hostname = 'buildbox.example.org'
import sys
import subprocess
import tempfile
import urllib
text = sys.stdin.read()
chart_url_template = ('http://chart.apis.google.com/chart?'
'cht=qr&chs=300x300&chl={data}&chld=H|0')
chart_url = chart_url_template.format(data=urllib.quote(text))