
trends

most talks started with "this may not work for you".
kubernetes was a huge focus,
CI pipelines powered most of the speakers apps

day 1

keynote

Patrick Lencioni - 5 dysfunctions of teams (book)
"be respectful, helpful, collaborative and candid"
"result, accountability, commitment, conflict, trust", 5 parts of… a something pyramid
if a result has a collective outcome it is usually the best result. (Improving the whole over a part)
collaborate daily to build trust
FreeAgent do weekly demos of work in progress.
FreeAgent use Discourse for technical discussion and decision making
quarterly talk at the Engineering monthly to explain the DR (disaster recovery) process (so new people get caught up with principles and best practices)
what works for one company might not work for others. "Not everyone will benefit from a micro services architecture"

questions

"how can a remote worker manage a team in the office?"

All about having support, tools. Backing of the company.

Critical mass, 1 or 2 might not work.

micro service deployment techniques

"70% of outages are due to changes in a live system"
advice:
- quickly and accurately detecting problems
"control over speed of rollout" - (thought: how do you control rollout in CD?)
deployment should be without fear
rollback should be clear and up front to reduce fear
docker stop sends a SIGTERM to a container, then a SIGKILL 10s later (the default grace period)
kubernetes separates liveness and readiness checks
routing internal traffic to new versions
alpha/power users could be served canary versions
scientist by github is still a cool project
permanent canary, haproxy puts 5% of traffic to canaries (bitmovin blog)
shadowing - evaluate both code paths, result from old version, store new version results (speed, result), Scientist does this
flux by weaveworks
"sidecar container" - intelligent routing within pods
istio, linkerd (service mesh)
istio and linkerd are battling with kubernetes routing, as they essentially implement another layer
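
A minimal sketch of the permanent canary idea noted above, assuming routing is decided per user rather than per request. The function names, the `power_users` set and hashing on a user id are all assumptions for illustration; this is not how haproxy or the bitmovin setup does it.

```python
import hashlib

CANARY_PERCENT = 5  # share of traffic sent to canaries, as in the haproxy note


def bucket_for(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_backend(user_id: str, power_users: frozenset = frozenset(),
                   canary_percent: int = CANARY_PERCENT) -> str:
    """Return "canary" or "stable" for a request from this user."""
    if user_id in power_users:
        return "canary"  # alpha/power users are always served the canary
    return "canary" if bucket_for(user_id) < canary_percent else "stable"


if __name__ == "__main__":
    users = [f"user-{n}" for n in range(10_000)]
    canary = sum(choose_backend(u) == "canary" for u in users)
    print(f"{canary / len(users):.1%} of users routed to canary")
```

Hashing keeps a user on the same version between requests, which a random 5% split per request would not.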

using ml to optimise DevOps

collect data on CI, quality of deploys, deploys a day
predicting trends based on data
used rum, synthetics and tracing to identify trends

things to measure

commits per deployment could be an interesting metric, can you predict failure?
when a ticket is raised, how detailed was it, can that predict quality
week before and week after performance around deploys
are late day deploys more risky?
can deadlines be tracked, and can the time a deadline was declared be tracked (e.g. a deadline set 4 weeks into the project rather than at the start)?
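
A rough sketch of two of the measurements above: commits per deployment, and whether late day deploys fail more often. The `Deploy` record, the 16:00 cut-off and the idea of a `failed` flag are assumptions; real data would come from CI and incident tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Deploy:
    finished_at: datetime
    commits: int   # commits shipped in this deploy
    failed: bool   # hypothetical flag, e.g. rolled back or caused an incident


def commits_per_deploy(deploys):
    """Average number of commits shipped per deployment."""
    return sum(d.commits for d in deploys) / len(deploys)


def failure_rate_by_period(deploys, late_hour=16):
    """Failure rate for deploys before and after `late_hour` (an assumed cut-off)."""
    counts = defaultdict(lambda: [0, 0])  # period -> [failures, total]
    for d in deploys:
        period = "late" if d.finished_at.hour >= late_hour else "early"
        counts[period][0] += d.failed
        counts[period][1] += 1
    return {period: f / t for period, (f, t) in counts.items()}
```

With enough history, the same records could feed the "can you predict failure?" question, e.g. by comparing failure rate against commits per deploy.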

DevOps and Groupthink

groupthink: fear to disagree, so agree when you might think it's wrong (maybe the whole room disagrees)
illusion of unanimity
Being in a group can emphasise biases
Representativeness - "this is how we've always done it". Basing a decision on similar past decisions
failures in deliberation
- cascade effect - desire to conform, people will agree with the speaker before
- polarisation - ?
"if something bothers you, say it"
establish a growth mindset,
Withhold discussion until everyone has formed an opinion first. Reduces the influence of the previous point
have the leader or most opinionated speak last
appoint a devil's advocate - to facilitate viewing all possible solutions
Container, Difference, Exchange (CDE)
exchange - who provides info, how is it received
identify CDE, what, impact, what changes can be made
changing the difference - add a strongly opinionated team member to balance out another strong opinion
Medici effect -
create an environment where everyone is comfortable speaking their opinion

Managing Complex Kube Deployments

Helm
manage kubernetes on EC2
a kubernetes app has the same problems installing something like nginx would (config, dependencies, versions)
helm
- template based config for manifests
- reproducible installs
- cloud native pkg manager
helm-template - render templates locally
helm plugins can extend any part of the app, and can be any executable on the system. Can use for secret management etc
Personal thought: kubernetes access control?
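
An analogy only, to show what "template based config for manifests" buys you. Helm itself uses Go templates and a values file; this sketch uses Python's `string.Template` instead, and the abridged manifest and value names are made up for illustration.

```python
from string import Template

# Abridged Deployment manifest with placeholders (not a complete manifest).
MANIFEST = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
spec:
  replicas: $replicas
  template:
    spec:
      containers:
        - name: $name
          image: $image:$tag
""")


def render(values: dict) -> str:
    """Render the manifest; a missing value raises KeyError instead of guessing."""
    return MANIFEST.substitute(values)


if __name__ == "__main__":
    print(render({"name": "web", "replicas": 3, "image": "nginx", "tag": "1.13"}))
```

The same values always render the same manifest, which is the reproducible install point in the list above.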

From To DevOps

Non-traditional entry into industry (CodeClan)
Mercedes Coyle (Onboarding and Mentoring Apprentices with DevOps Culture) https://vimeo.com/115484860 (video no longer exists :( )
A Mind For Numbers - book about learning how to learn
craftsman mindset - find something valuable and rare, get really good at making something from it
finding a definitive definition of DevOps is difficult as a junior
"a crisis is a terrible thing to waste" - Japanese philosopher
Linux Academy, wide source of tutorials and lessons (£170 a year?)
The Open Guide to AWS - guide with simplified names for things
https://github.com/openzipkin/docker-zipkin
Documentation is important for onboarding
Tuckman's model of Team Development
as long as a junior is showing progression everything is great
be patient with mentees
Graduate Level Apprenticeship at Heriot Watt
good mentorship, weekly check ins, big helps for a junior
DevOps handbook (book)

providing and supporting docker images

elastic has a custom registry (one on dockerhub isn't supported)
the JVM adds a lot of bloat on top of alpine, so the small base image doesn't matter much
JDK has more tools than JRE, just always use JDK
alpine had too many bugs for elastic
centos7 released a buggy jdk, elastic had to fork
Cockroach ideas (ideas that never stay dead for long). Like running things as root
bootstrap checks (open file checks, catch things that will hurt)
Elastic use fat images, against modern docker principles. But Elasticsearch is a data store so you won't be deploying often.
Security updates are pushed by overwriting labels, probably not best
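
A sketch of what a bootstrap check like the open file check above could look like: inspect the environment at start-up and report anything that will hurt later. The threshold and function names are assumptions, not Elastic's implementation, and the `resource` module is Unix only.

```python
import resource

MIN_OPEN_FILES = 65536  # assumed threshold, purely for illustration


def check_open_files(soft_limit: int, minimum: int = MIN_OPEN_FILES) -> list:
    """Return a list of problems; an empty list means the check passed."""
    if soft_limit == resource.RLIM_INFINITY:
        return []  # no limit set
    if soft_limit < minimum:
        return [f"max open files {soft_limit} is below required {minimum}"]
    return []


def run_bootstrap_checks() -> list:
    """Read the current process limit and run the check against it."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return check_open_files(soft)


if __name__ == "__main__":
    for problem in run_bootstrap_checks():
        # A real service would probably refuse to start instead of printing.
        print("bootstrap check failed:", problem)
```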

Ignites

Using Swarms to improve ITSM workflow

Subject Matter experts end up answering all the questions, when they should be building.
swarming to break apart itsm levels
dispatch swarm cherry picks items, immediately fixes
weekly backlog swarms make sure things are moving into the right groups

nosql security

No real thing to add. Interesting points on nosql security issues.

chatops

queryable interface for services
Two way Comms are very valuable

Automating Home and at Work

vera smarter control

Open spaces

databases

Session was mainly people trying to find a common place to talk. Everyone had different DBs
Do DBAs exist, what does the modern DBA look like.
Some people were looking for someone to check SQL, probably better ways to do this
Apparently this session was supposed to be about versioning data, moving it through staging, versioning etc.

tools for large scale deployments

All about kubernetes

diversity

have the facilities and support (around people with disabilities)
requirements should state that you will be enabled to learn
reducing barriers to entry
whiteboard tests are biased towards wealthy white males comfortable in front of a whiteboard

Day 2

The Perfect DevOps Storm

increase "speed of execution"
built their own optimizely
getting a new service off the ground in a day
lessons learned over the years
- organisational structure isn't static, it must evolve and adapt
- "autonomy without accountability is just a vacation" - Kent Beck
- Turn the Ship Around - book
- Change Fatigue
- Journeys not Destinations for continuous improvement
- Just because you were the customer doesn't mean you know what they want (for enablement teams)
- measure standards, track adoption
- enable squads to make internal contributions cross-squad
mshell, template for microservices standards

Continuously Deploying at Ocado

travelling salesman problem at Ocado
Getting a big picture
- what version is it
- who owns it
- how much does it cost
templating doesn't work for every team, some teams won't like or want them
Post deployment visibility important
gitops
Operations by Pull Request - Weaveworks
documentation
- peer review
- Concise, but many examples. How you did it, more than why, leave out extraneous details
- Updated Regularly
minikube - small emulated kubernetes
Just have basic ground rules:
- Resource namespace
- what team uses it
- Don't need excessive templates
Workflow bot to write the bulk of PR content
- I think it did cross pr referencing and updates (as a change needs to be in 4 prs in this workflow)
rollback with reverts
"if users aren't finding success on their own, it's not their problem, it's ours"

Work Life Balance

waitbutwhy.com
saying yes leads to saying yes
saying no is ok
work-life balance is wrong. Lead with Life. Integrate the two. Life work Balance
focus on skills you need to work on
"my favourite part of being a manager is saying no for them (colleagues)"
Invest in your skills
don't live in a vacuum (when you're in a vacuum you're not hearing everything everyone around you is saying)

Repository driven Development

as people take on more responsibility they need support
when you arrive at a repository, you don't know all the invisible dependencies (logging, monitoring, etc)
Script hierarchies decoupled from ci. So can deploy to prod using the same commands
command hierarchy becomes living documentation
Consumer driven contracts
- a service contains contracts
- if a service is modified, all existing contracts can be executed against the service in pre-prod
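
A minimal sketch of the consumer driven contract idea above, assuming a contract is just a request plus the response fields a consumer relies on. `Contract`, `user_service` and the consumer names are hypothetical; a real check would call the service deployed in pre-prod.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Contract:
    consumer: str
    request: dict
    expected_fields: dict  # field name -> type the consumer relies on


def user_service(request: dict) -> dict:
    """Stand-in for the provider; replace with a call to the pre-prod service."""
    return {"id": request["id"], "name": "Ada", "email": "ada@example.com"}


CONTRACTS = [
    Contract("billing", {"id": 1}, {"id": int, "email": str}),
    Contract("profile-page", {"id": 2}, {"id": int, "name": str}),
]


def verify(service: Callable[[dict], dict], contracts) -> list:
    """Run every contract against the service and return the broken ones."""
    failures = []
    for c in contracts:
        response = service(c.request)
        for field, kind in c.expected_fields.items():
            if not isinstance(response.get(field), kind):
                failures.append(
                    f"{c.consumer}: field {field!r} missing or not {kind.__name__}"
                )
    return failures
```

An empty list from `verify(user_service, CONTRACTS)` means the modified service still satisfies every consumer.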
Prepush git hook check (trunk based development)
using consul template to watch changes to update Icinga checks

Continuous Security with Kube

"Only the paranoid Survive" - Andy Grove
Openshift
kubernetes signed images
don't use 'latest' tag in docker
OpenSCAP - tech
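
A small sketch of how the "don't use latest" rule could be checked before deploying. The function name is made up, and the parsing is a simplification of Docker's image reference format.

```python
def uses_mutable_tag(image: str) -> bool:
    """True when the reference has no tag, or the tag is 'latest'."""
    if "@" in image:
        return False  # pinned by digest, e.g. name@sha256:...
    last_part = image.rsplit("/", 1)[-1]  # skip a registry host:port prefix
    if ":" not in last_part:
        return True  # no tag means Docker falls back to 'latest'
    return last_part.rsplit(":", 1)[1] == "latest"


if __name__ == "__main__":
    for ref in ["nginx", "nginx:1.13", "localhost:5000/app:latest"]:
        print(ref, "-> mutable" if uses_mutable_tag(ref) else "-> pinned")
```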

Scrutinising the Scrutiny

The field guide to understanding human error - book
Locked into the idea of a linear sequence
Post-Incident Review (nicer term than Postmortem).
postmortem is the search for the cause of death.
Incident review is just about learning and improving.
Readiness phase - improve the system from your learnings
take note of direction. Finding someone to call takes you away from a solution
industry average cost for an outage is $6000 a minute
ensure responders have the access they need
internal status pages help internal users know what's up
Continuous Improvement (As a term instead of DevOps)
Bit.ly/PIR_book

Improving Culture through Workshops

external speaker didn't take in context of the world's to build workshop on
Running a [Security] Workshop
GDS "it's OK to… list" https://gds.blog.gov.uk/2016/05/25/its-ok-to-say-whats-ok/
workshops show it's OK to take time to learn
encourage others to just do stuff that will help culture
Using a workshop to scale impact, rather than teaching one person. People can learn from each other in sessions
workshops support the idea of learning together
building a workshop doesn't have to be perfect first time. Everything else is done with continuous improvement
can't cover everything in Security
experts aren't always the best teachers.
having hypotheses for an alpha workshop
OWASP top 10, a good start
Practical experiences are most valuable for learning.
source code is useful for a workshop. Attacking a blackbox isn't going to help everyone
make something others can build on
Practical tips
- about 15 people
- range of skills, juniors as well as experienced
- ask people to setup in advance
- Setup a local network (for a security talk). Maybe run an RPI with big files
- short retro
don't have to be an expert to run a [sec] workshop
be ruthless with an MVP. Anything you miss can be done again later
people get tired at different times, hard to gauge
reflect on each topic as a group to make sure everyone understands
shared the trello board for future organisers to use
workshops aren't the end of the journey

ignites day 2

CD

Value stream map - find waste in your processes
You need to understand how software gets to prod
need to measure change, get a baseline measurement
in the meeting, the people involved in the work need to be there
estimate the lead time, the time between steps
acknowledge your variables
where is the most pain
document the actual work, not what the document says it should be
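
A sketch of the arithmetic behind a value stream map, assuming each step is recorded with hands-on time and waiting time. The `Step` fields and function names are assumptions; the numbers you feed in should be the measured baseline, not what the process document claims.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    work_hours: float  # time someone is actively working on the change
    wait_hours: float  # time the change sits before the next step picks it up


def lead_time(steps):
    """Total time from the first step starting to the last one finishing."""
    return sum(s.work_hours + s.wait_hours for s in steps)


def biggest_wait(steps):
    """The step with the longest wait, i.e. where the most pain is."""
    return max(steps, key=lambda s: s.wait_hours)


def flow_efficiency(steps):
    """Share of the lead time spent on actual work rather than waiting."""
    return sum(s.work_hours for s in steps) / lead_time(steps)
```

Re-running the same calculation after a process change gives the "measure change" comparison against the baseline.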

Gain Visibility into your Apps

metric watch? (Don't know what I meant here)

docker tips

docker ps --format
the default format can go in ~/.docker/config.json (psFormat)
docker system prune
container shouldn't crash if something isn't there yet
use exec in any startup script (the app needs to be PID 1 to receive signals)
set a read-only fs
gosu is like sudo but doesn't fork processes
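
A sketch of why the exec tip matters: docker stop sends SIGTERM to PID 1 and follows up with SIGKILL after the grace period, so the app only gets a chance to shut down cleanly if it is PID 1 and handles the signal. The loop and names below are illustrative assumptions, not a recommended framework.

```python
import signal
import threading

shutdown = threading.Event()


def handle_sigterm(signum, frame):
    """Ask the main loop to stop instead of dying mid-request."""
    shutdown.set()


def install():
    """Register the handler; call this from the main thread at start-up."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run(max_ticks=None):
    """Work until SIGTERM arrives, then return so cleanup can happen."""
    ticks = 0
    while not shutdown.is_set():
        ticks += 1  # one unit of work would go here
        if max_ticks is not None and ticks >= max_ticks:
            break
        shutdown.wait(timeout=0.1)  # sleep, but wake early on shutdown
    # flush buffers and close connections here, before the SIGKILL arrives
    return ticks
```

Call `install()` and then `run()` from the container's entrypoint. If a shell wrapper starts the app without `exec`, the shell stays PID 1 and the app never sees the SIGTERM.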

open space notes

Anchore
cncf landscape
aws have restricted ports, that you can't use between instances? (Sounds like bullshit to me)
Azure apparently have traffic monitoring and will shut network access down if they see trends change???

Open Spaces Ideas

how is your devops team organised?
Podcasts/Books
Ownership (of services) in a DevOps world

klf ideas

shadowing - and maybe other deployment techniques
consul, as the time comes
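
A sketch of shadowing to start from: run the old and new code paths, return the old result, and record how the new one compared. This is a hand-rolled illustration, not the API of GitHub's Scientist (a Ruby library); `shadow` and the record fields are made up.

```python
import time


def shadow(old, new, log):
    """Wrap `old` so `new` runs alongside it; callers only ever get old's result."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = old(*args, **kwargs)
        record = {"args": args, "old_secs": time.perf_counter() - start}

        start = time.perf_counter()
        try:
            candidate = new(*args, **kwargs)
            record["match"] = candidate == result
        except Exception as exc:  # the new path must never break the caller
            record["match"] = False
            record["error"] = repr(exc)
        record["new_secs"] = time.perf_counter() - start

        log.append(record)  # store speed and result comparison for later
        return result
    return wrapper
```

Usage would be something like `price = shadow(old_price, new_price, results)`, then checking `results` for mismatches before switching over.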

homelinen/devops-days-edinburgh-notes.md