- security?
- same issue as docker images from dockerhub
- http://twitter.com/nixgeek/status/694103481909649409
- secret management
- “secret” vs “sensitive”
- secrets: creds, certs, keys, passwords
- sensitive: phone numbers, mother’s maiden name, emails, dc locations, PII
- we think correctly about secrets
- maybe not so much with sensitive data
- my definition: “anything which makes the news”
- this includes “secrets” and “sensitive” categories above
- vault is designed for this case
- certificates
- specific type of secret
- backed by RFCs
- used almost universally
- historically a pain to manage
- unencrypted secrets in files
- fs perms
- encrypted secrets with the key on the same fs
- eg chef databags
- also, how does the secret get on to the system?
- why don’t we use config management for secrets? because config management tools don’t have features we need:
- no access control
- no auditing
- no revocation
- no key rolling
- why not (online) databases?
- rdbms, consul, zookeeper, etc
- (one of the big motivations to create vault was we didn’t want people using consul to store secrets)
- not designed for secrets
- limited access controls
- typically plaintext storage
- (even if the filesystem is encrypted, which has its own issues)
- no auditing or revocation abilities
- how to handle secret sprawl?
- secret material is distributed
- don’t want one big secret with all the access to everything
- who has access?
- when were they used?
- what is the attack surface?
- in the event of a compromise, what happened? what was leaked?
- can we design a system that allows us to audit what was compromised?
- how to handle certs?
- openssl command line (ugh)
- where do you store the keys?
- if you have an internal CA, how do you manage that?
- how do you manage CRLs?
- vault goals:
- single source for secrets, certificates
- programmatic application access (automated)
- operator access (manual)
- practical security
- modern data center friendly
- (private or cloud, commodity hardware, highly available, etc)
- vault features
- secure secret storage
- full cert mgmt
- dynamic secrets
- leasing, renewal
- auditing
- acls
- secure secret storage
- encrypt data in transit and at rest
- at rest: 256-bit AES in GCM
- TLS 1.2 for clients
- no HSM required (though you can if you want)
- inspired by unix filesystem
- mount points, paths
$ vault write secret/foo bar=bacon
success!
$ vault read secret/foo
Key Value
lease_id ...
lease_duration ...
lease_renewable false
bar bacon
- dynamic secrets
- never provide “root” creds to clients
- secrets made on request per client
$ vault mount postgresql
$ vault write postgresql/config/connection value=...
# written once, can never be read back
$ vault read postgresql/creds/production
-> get back a freshly-created user
-> if you don't come back to vault within an hour, vault will drop the user
- auditing
- pluggable audit backends
- request and response logging
- prioritizes safety over availability
- secrets hashed in audits (salted HMAC)
- searchable but not reversible
- rich ACLs
- flexible auth
- pluggable backends
- machine-oriented vs operator-oriented
- tokens, github, appid, user/pass, TLS cert
- separate authentication from authorization
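- (my sketch, not from the talk) a minimal policy using the old dashed CLI commands; the policy name and path are made up:
$ cat > myapp-policy.hcl <<'EOF'
# let the app read its own secrets and nothing else
path "secret/myapp/*" {
  policy = "read"
}
EOF
$ vault policy-write myapp myapp-policy.hcl
$ vault token-create -policy=myapp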
- high availability
- leader election
- active/standby
- automatic failover
- (depending on backend: consul, etcd, zookeeper)
- “unsealing the vault”
- data in vault is encrypted
- vault requires encryption key, doesn’t store it
- must be provided to unseal the vault
- must be entered on every vault restart
- turtles problem!
- this secret is sometimes on a piece of paper in a physical safe
- alternative: shamir’s secret sharing
- we split the key
- a number of operators have a share of the key
- N shares, T required to recompute master
- default: N:5 T:3
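- (my sketch) roughly how that looks with the Vault 0.x CLI:
$ vault init -key-shares=5 -key-threshold=3
# prints 5 unseal key shares plus an initial root token; hand the shares to 5 operators
$ vault unseal
# run three times after every restart, each operator entering their own share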
- ways to access vault
- http + TLS API, JSON
- CLI
- consul-template (for software you won’t rewrite to talk directly to vault)
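- (my sketch) the HTTP API is just authenticated JSON over TLS; the hostname here is made up:
$ curl -H "X-Vault-Token: $VAULT_TOKEN" https://vault.example.com:8200/v1/secret/foo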
- application integration
- the best way to access vault
- vault-aware
- native client libraries
- secrets only in memory
- safest but high-touch
- consul-template
- secrets templatized into application configuration
- vault is transparent
- lease management is automatic
- non-secret configuration still via consul
- (then put your secrets on to a ramdisk and make sure the ramdisk can’t be swapped)
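- (my sketch, assuming a consul-template build with Vault support, and reusing the secret/foo bar=bacon example from above; file names are made up) rendering a secret onto a ramdisk and reloading the app:
$ cat > app.conf.ctmpl <<'EOF'
{{ with secret "secret/foo" }}password = "{{ .Data.bar }}"{{ end }}
EOF
$ consul-template -template "app.conf.ctmpl:/dev/shm/app.conf:service app reload"
# assumes VAULT_ADDR and a valid token are available in the environment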
- PII
- it’s everywhere
- “transit” backend for vault
- encrypt/decrypt data in transit
- avoid secret management in client apps
- builds on vault foundation
- web server has no encryption keys
- requires two-factor compromise (vault + datastore)
- decouples storage from encryption and access control
$ vault write -f transit/keys/foo
$ vault read transit/keys/foo
-> doesn't return the key itself, just metadata about the key
to send:
$ echo msg | vault write transit/encrypt/foo plaintext=-
to recv:
$ vault write transit/decrypt/foo ciphertext=-
- rich libraries to do this automatically: vault-rails
- disadvantage: everything round-trips through vault
- can increase performance with another trick (…?)
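- note on the transit commands above: transit expects base64-encoded plaintext and hands back base64 on decrypt; a fuller round trip (my sketch) looks like:
$ echo -n "ssn=123-45-6789" | base64
c3NuPTEyMy00NS02Nzg5
$ vault write transit/encrypt/foo plaintext=c3NuPTEyMy00NS02Nzg5
# response contains ciphertext of the form vault:v1:...
$ vault write transit/decrypt/foo ciphertext=vault:v1:...
# response contains the base64 plaintext; pipe it through base64 --decode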
- CA
- vault acts as internal CA
- vault stores root CA keys
- dynamic secrets - generates signed TLS keys
- no more tears
- mutual TLS
$ vault mount pki
$ vault write pki/root/generate/internal common_name=myvault.com ttl=87600h
...
...
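- (my sketch) after generating the root you create a role and issue leaf certs against it; role parameter names vary between Vault versions:
$ vault write pki/roles/internal allowed_domains=myvault.com allow_subdomains=true max_ttl=72h
$ vault write pki/issue/internal common_name=app1.myvault.com
# returns a freshly generated key, cert and CA chain; short TTLs + revocation instead of long-lived certs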
- vault documents their security and threat models! https://www.vaultproject.io/docs/internals/security.html
- certificates with vault
- can be revoked
- never exposes your CA private keys
- can manage intermediate CAs
- secured via ACLs like anything else
- audited like anything else
- do you support letsencrypt certs?
- just started planning this morning
- if I use vault with consul, can I use that consul for something else?
- in theory yes
- vault uses a subpath within consul, and encrypts everything
- we recommend you use an ACL to prevent access to other apps (it’s all encrypted garbage but if someone flips a bit you lose everything)
- what if there’s a rogue operator who’s keylogged?
- if you’re running it the right way it should be okay
- if the operator doesn’t have root on the machine running vault it should be okay
- if the operator has root then they can just coredump the vault process and get the key that way
- a rogue root user is not in our threat model
- puppet chef ansible salt juju
- deeply linked with the OS, from the start, until EOL
- vendors, epel, make install, gems etc
- regular/commodity users -> EPEL
- (gap)
- advanced users
- where is the srpm?
- where is the buildchain?
- they have bugs
- public build system
- everything needed to build software
- software interest group
- topic-focussed
- release RPMs
- can we make a CFGMGMT SIG?
- objectives:
- recent versions of cfg mgmt tools
- we were using CM but not winning
- what we had built with love
- automated tests
- monitoring
- but it was a total failure
- unmanageable rebuild times
- envs were starting to leak
- our systems are “eventually repeatable”
- darn it, test that change in prod
- docker docker docker docker
- solution: stop doing configuration management!
- artifacts and pipelines
- inputs are typically managed artifacts
- change
- feed input to packer which in turn runs a builder that applies change producing output
- output: a versioned artifact (rpm)
- repos, packages, images, containers
- abstraction is key to doing changes
- defns:
- an input-change-output chain is a project
- a project is versioned in git
- all artifacts are testable
- your new job is: describing state to produce artifacts and keeping that state from drifting
- http://nubis-docs.readthedocs.org/en/latest/MANIFESTO/
- change from stateful VMs to managing artifacts
- this worked really well
- packer with masterless puppet
- terraform and ansible
- masterless puppet to audit and correct drift
- yum upgrade considered harmful
- own the tools you run
- replace a bunch of shell scripts
- puppet infrastructure setup sucks
- tried to scale PuppetDB or PE?
- puppet kick (deprecated)
- mcollective is moribund
- SSH - why not?
- centralized secrets
ssh <node> -c 'facter -j'
puppet master --compile > catalog
ssh <node> -c 'puppet apply'
- problem: file server
- shipping files with ssh?
- fix the catalog on the fly – change puppet:// urls
- problem: pluginsync
- custom facts
- reports, exported resources
- they just work
- one node per catalog - why?
- I want partial catalogs, different schedules, different user accounts
- tasks on demand
- testing
- docker docker
- “war stories”
- “in the trenches”
- how is this so much of a thing?
- pop culture depictions of war v common
- mainstream media depictions of war
- yes, our jobs can be stressful
- we can be under pressure
- but it’s still not an actual war situation
- this is about culture
- we’re a corner of OSS community
- it’s not about making great code
- the way we talk to each other matters
- we should be inclusive
- alternatives to “war story”?
- anecdote
- story
- experience report
- I have working python code, how do I start now?
- a proper deployment artifact:
- python package
- (debian package? docker?)
- python package
- it should be uniquely versioned
- it should manage dependencies
- https://github.com/blue-yonder/pyscaffold
- CI:
- run tests, build package
- push to artifact repository
- http://doc.devpi.net
- automated deploy: ansible
- we use virtualenvs to isolate dependencies
- pip doesn’t do true dependency resolution
- maintain and refactor your deployment
- pypa/pip issue #988 (pip needs a dependency resolver)
- OS package managers v pip: the two worlds should unite
- pip is still optimized for a manual workflow (eg no --yes option)
- you can build your own CD pipeline!
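- (my sketch of such a pipeline; the index URL, inventory and playbook names are made up):
# CI job: test, build, publish
$ python setup.py test
$ python setup.py bdist_wheel
$ devpi use https://devpi.example.com/team/prod
$ devpi upload
# deploy job: ansible installs the pinned version into a virtualenv
$ ansible-playbook -i inventory/production deploy.yml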
- front-end load testing
- browser-based load testing tool with scenarios
- history:
- looked at Squish - selenium alternative
- uses chrome
- looked at phantomjs
- headless so it’s good
- missed some APIs
- looked at SlimerJS
- not headless but still performant
- use cases for felt
- quick load test
- FE apps (eg angularJS)
- simulate user
- cfgmgmt not just about files
- but also files
- puppet module: example42/tp
- http://tiny-puppet.com/
- tp::conf { 'nginx': … }
- tp::conf { 'nginx::example.com.conf': … }
- tp::dir { 'nginx::www.example42.com': … }
- tp::test { 'redis': }
- tp::install { 'redis': … }
- puppet 2.7
- dark, hacky features (eg dynamic scoping)
- puppet 3
- functional insanity with some pretty cool new tools and toys
- rspec-puppet, librarian-puppet etc came along
- we upped our game
- puppet 4
- language spec!
- type system!
- lambdas
- iterations
- all the things
- sanity
- ponies!
- as a module maintainer, it’s painful
- maintaining compatibility with 3 and 4 is frustrating
- step 1: breathe
- talk it through on your team
- step 2: get to puppet 3.8 first
- the last 3.x release
- it starts throwing deprecation warnings at you
- fix these
- scoping, templates, etc
- upgrade your modules
- Vox Pupuli and puppetlabs modules Just Work™
- step 3: enable the future parser
- (don’t do this on puppet < 3.7.4)
- types of defaults will matter
- eg where default is empty string but you can pass in an array
- that won’t work any more
- structured node data: the $facts hash - unshadowable, unmodifiable
- unlike the old $operating_system fact lookup style
- step 4: upgrade to puppet 4
- two options:
- distro packages
- move to puppetlabs’s omnibus packages
- recommend using omnibus, but changes some things:
- /var/lib/puppet moves to /opt/puppetlabs/puppet
- step 5: caek
- two choices for the actual upgrade:
- spin up a new master
- point agents to the new master and babysit them one at a time
- pre-compile and compare catalogs
- tools that help:
- less common approach, but tends to go more smoothly
- we did it in:
- 1-2 weeks of prep
- 1 week of rollout
- 2-3 days of cleanup
- 0 production incidents
- (over 10k nodes)
- but.. we cheated
- we migrated to the future parser over a year ago :)
- 60 modules & tooling
- 50 contributors
- basically everyone has commit access to everything
- join the revolution!
- bob: how many things broke when you enabled the future parser?
- weird scoping with templates
- the name: etcd = “/etc distributed”
- a clustered key-value store
- GET and SET ops
- a building block for higher order systems
- primitives for distributed systems
- distributed locks
- distributed scheduling
- 2013.8: alpha (v0.x)
- 2015.2: stable (v2.0+)
- stable replication engine (new Raft impl)
- stable v2 API
- 2016.? (v3.0+)
- efficient, powerful API
- some operations we wanted to support couldn’t be done in the existing API
- highly scalable backend
- (ed: what does this mean?)
- production ready
- coreos mission: “secure the internet”
- updating servers = rebooting servers
- move towards app container paradigm
- need a:
- shared config store (for service discovery)
- distributed lock manager (to coordinate reboots)
- existing solutions were inflexible
- (zookeeper undocumented binary API – expected to use C bindings)
- difficult to configure
- highly available
- highly reliable
- strong consistency
- simple, fast http API
- raft
- using a replicated log to model a state machine
- “In Search of an Understandable Consensus Algorithm” (Ongaro, 2014)
- response to paxos
- (zookeeper had its own consensus algorithm)
- raft is meant to be easier to understand and test
- three key concepts:
- leaders
- elections
- terms
- the cluster elects a leader for every term
- all log appends (…)
- implementation
- written in go, statically linked
- /bin/etcd
- daemon
- 2379 (client requests/HTTP + JSON api)
- 2380 (p2p/HTTP + protobuf)
- /bin/etcdctl
- CLI
- net/http, encoding/json
- eg: have 5 nodes
- can lose 2
- lose 3, lose quorum -> cluster unavailable
- prefer odd-numbers for cluster sizes
- the more nodes you have, the more failures you can tolerate
- but the lower throughput becomes because every operation needs to hit a majority of nodes
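- the quorum arithmetic behind that:
quorum = floor(N/2) + 1, tolerated failures = N - quorum
N=3 -> quorum 2, tolerates 1
N=5 -> quorum 3, tolerates 2
N=7 -> quorum 4, tolerates 3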
- GET /v2/keys/foo
- GET /v2/keys/foo?wait=true
- poll for changes, receive notifications
- PUT /v2/keys/foo -d value=bar
- DELETE /v2/keys/foo
- PUT /v2/keys/foo?prevValue=bar -d value=ok
- atomic compare-and-swap
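- the same calls as curl one-liners against a local member (my sketch):
$ curl http://127.0.0.1:2379/v2/keys/foo -XPUT -d value=bar
$ curl http://127.0.0.1:2379/v2/keys/foo
$ curl "http://127.0.0.1:2379/v2/keys/foo?wait=true"
$ curl "http://127.0.0.1:2379/v2/keys/foo?prevValue=bar" -XPUT -d value=ok
$ curl http://127.0.0.1:2379/v2/keys/foo -XDELETE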
- locksmith
- cluster wide reboot lock - “semaphore for reboots”
- CoreOS updates happen automatically
- prevent all machines restarting at once
- set key: Sem=1
- take a ticket by CASing and decrementing the number
- release by CASing and incrementing
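- simplified to a single-slot semaphore with the v2 API (my sketch; locksmith's real value is a JSON document and the key path here is made up):
# acquire: only succeeds if the semaphore still reads 1
$ curl "http://127.0.0.1:2379/v2/keys/reboot-semaphore?prevValue=1" -XPUT -d value=0
# ...reboot, rejoin the cluster...
# release: flip it back
$ curl "http://127.0.0.1:2379/v2/keys/reboot-semaphore?prevValue=0" -XPUT -d value=1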
- flannel
- virtual overlay network
- provide a subnet to each host
- handle all routing
- uses etcd to store network configuration, allocated subnets, etc
- skydns
- service discovery and DNS server
- backed by etcd for all configuration and records
- vulcand
- “programmatic, extendable proxy for microservices”
- HTTP load balancer
- config in etcd
- (though actual proxied requests don’t touch etcd)
- confd
- simple config templating
- for “dumb” applications
- watch etcd for changes, render templates with new values, reload
- (sounds like consul-template mentioned in the vault talk?)
- recent improvements (v2)
- asynchronous snapshotting
- append-only log-based system
- grows indefinitely
- snapshot, purge log
- safest: stop-the-world while you do this
- this is problematic because it blocks all writes
- now: in-memory copy, write copy to disk
- can continue serving while you purge the copy
- raft pipelining
- raft is based around a series of RPCs (eg AppendEntry)
- etcd previously used synchronous RPCs
- send next message only after receiving previous response
- now: optimistically send series of messages without waiting for replies
- (can these messages be reordered?)
- future improvements (v3)
- “scaling etcd to thousands of nodes”
- efficient and powerful API
- flat binary key space
- multi-object transaction
- extends CAS to allow conditions on multiple keys
- native leasing API
- native locking API
- gRPC (HTTP2 + protobuf)
- multiple streams sharing a single tcp connection
- compacted encoding format
- disk-backed storage
- historically: everything had to fit in memory
- keep cold historical data on disk
- keep hot data in memory
- support “entire history” watches
- user-facing compaction API
- incremental snapshots
- only save the delta instead of the full data set
- less I/O and CPU cost per snapshot
- no bursty resource usage, more stable performance
- upstream recipes for common usage patterns
- leases: attaching ownership to keys
- leader election
- locking resources
- client library to support these higher level use cases
- server density: monitoring
- the cost of uptime
- expect downtime
- prepare
- respond
- postmortem
- incident example:
- power failure to half our servers
- primary generator failed
- backup generator failed
- UPS failed
- automated failover unavailable
- (known failure condition)
- manual DNS switch required
- expected impact: 20 minutes
- actual impact: 43 minutes
- human factor
- unfamiliarity with process
- pressure of time sensitive event (panic)
- escalation introduces delays
- documented procedures
- checklists! ✓
- not to follow blindly – knowledge and experience still valuable
- independent system
- searchable
- list of known issues and documented workarounds/fixes
- checklists – why?
- humans have limitations
- memory and attention
- complexity
- stress and fatigue
- ego
- pilots, doctors, divers:
- Bruce Willis Ruins All Films
- checklists help humans
- increase confidence
- reduce panic
- realistic scenarios for your game day
- replica environment
- or mock command line
- record actions and timing
- multiple failures
- unexpected results
- simulation goals
- team and individual test of response
- run real commands
- training the people
- training the procedures
- training the tools
- postmortem
- failure sucks
- but it happens, and we should recognize this
- fearless, blameless
- significant learning
- restores confidence
- increases credibility
- timing
- short regular updates
- even “we’re still looking into it”
- ~1 week to publish full version
- follow-up incidents
- check with 3rd party providers
- timeline for required changes
- content
- root cause
- turn of events which led to failure
- steps to identify & isolate the cause
- http://www.slideshare.net/bobtfish/empowering-developers-to-deploy-their-own-data-stores
- https://github.com/bobtfish/AWSnycast
- puppet data in modules
- this is amazing. it changed our lives
- apply a regex to the hostname (eg search1-reviews-uswest1aprod) to parse out the cluster name: elasticsearch_cluster { 'reviews': }
- developers can create a new cluster by writing a yaml file
- pull the data out of the puppet hierarchy
- reuse the same YAML for service discovery and provisioning
- puppet ENC - external node classifier
- a script called by puppetmaster to generate node definition
- our ENC looks at AWS tags
- cluster name, role, etc
- puppet::role::elasticsearch_cluster => cluster_name = reviews
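- an ENC is just an executable that prints YAML for the node it’s given; a hypothetical sketch of ours (the AWS tag lookup is elided):
#!/bin/sh
# $1 is the node's certname; the real script resolves it to EC2 tags first
cat <<EOF
classes:
  puppet::role::elasticsearch_cluster:
    cluster_name: reviews
environment: production
EOF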
- stop needing individual hostnames!
- host naming schemes are evil!
- silly naming schemes (themed on planets)
- “sensible” naming schemes (based on descriptive role)
- do you identify mysql master in hostname?
- what happens when you failover?
- customize your monitoring system to actually tell you what’s wrong
- “the master db has crashed” v “a db has crashed”
- terraform has most of the pieces
- it’s awesome
- as long as you don’t use it like puppet
- roles/profiles => sadness
- treat it as a low level abstraction
- keep things in composable units
- add enough workflow to not run with scissors
- don’t put logic in your terraform code
- it’s a sharp tool
- can easily trash everything
- it’s the most generic abstraction possible
- map JSON (HCL) DSL => CRUD APIs
- it will do anything
- as a joke I wrote a softlayer terraform provider which used twilio to phone a man and request a server to be provisioned
- cannot do implicit mapping
- but puppet/ansible/whatever can?
- “Name” tag => namevar
- Only works in some cases - not everything has tags!
- implicit mapping is evil
- eg: puppet AWS
- in March 2014, I wanted to automate EC2 provisioning
- I could write a type and provider in puppet to generate VPCs
- @garethr stole it and it’s now puppet AWS
- BUG - prefetch method eats exceptions (fixed now)
- you ask AWS for all VPCs up front (in prefetch)
- if you throw an exception while prefetching, it was silently swallowed
- so it looks like there are no VPCs
- now you generate a whole bunch of duplicates
- workaround: an exception class with an overridden to_s method which would kill -9 itself
- works, but not pretty
- I wouldn’t recommend puppet-aws unless you’re on puppet 4 which fixed this bug
- terraform modules
- reusable abstraction (in theory)
- sharp edges abound if you have deep modules or complicated modules
- these are bugs and will be fixed
- you can’t treat terraform like puppet
- use modules, but don’t nest modules
- use version tags
- use other git repos
- split modules into git repos
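- (my sketch; the repo URL and tag are made up) consuming a module pinned to a release tag in its own repo:
$ cat > search.tf <<'EOF'
module "search_es" {
  source = "git::https://git.example.com/ops/tf-elasticsearch.git?ref=v1.2.0"
}
EOF
$ terraform get    # fetches the pinned module source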
- state
- why even is state?
- how do you cope with state?
- use hashicorp/atlas
- it will run terraform for you
- it solves these problems
- we.. reinvented atlas
- workflow (locking!) is your problem
- if two people run terraform concurrently, you’ll have a bad time
- state will diverge
- merging is not fun
- use hashicorp/atlas
- split state up by team
- search team owns search statefile
- S3 store
- many read, few write
- wrap it yourself (make, jenkins, etc)
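- (my sketch; bucket/key names are made up and the exact flags vary by terraform version) wiring up the S3 remote state looked roughly like:
$ terraform remote config -backend=s3 \
    -backend-config="bucket=acme-terraform-state" \
    -backend-config="key=search/terraform.tfstate" \
    -backend-config="region=us-west-1"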
- don’t install terraform in $PATH
- you don’t want people running terraform willy-nilly
- jenkins to own the workflow
- force people to generate a plan and okay it
- people aren’t evil, but they will take shortcuts
- if they can just run terraform apply without planning first, they will do so
- protecting me from myself
- “awsadmin” machine + IAM Role as slave
- Makefile based workflow
- jenkins job builder to template things
- you shouldn’t have shell scripts typed in to the jenkins text boxes
- split up the steps
- refresh state, and upload the refreshed state to S3
- plan + save as an artefact
- filter plan!
- things in AWS that terraform doesn’t know about
- lambda functions which tag instances based on who created them
- terraform doesn’t know these tags, so will remove them
- we filter this stuff out
- approve plan
- apply plan, save state
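- the core of that plan/approve/apply flow (my sketch):
$ terraform plan -out=search.tfplan    # the saved plan is the artefact that gets reviewed
$ terraform apply search.tfplan        # applies exactly the approved plan, nothing else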
- nirvana
- self service cluster provisioning
- devs define their own clusters
- 1 click from ops to approve
- usage gets accounted to the owning team
- aws metadata added as needed
- all metadata validated
- clusters built around best practices
- and when we update best practices, clusters get updated to match
- can abstract further in future
- opportunities to do clever things around accounting
- dev requested m4.xlarges, but we have m4.2xlarges as reserved instances
- slides: http://ow.ly/XPkvT
- InSpec: Infrastructure Specification
- v similar to server-spec
- started on top of the server-spec project
- code breaks
- normal accident theory
- why?
- reduce number of defects
- security and compliance testing
- test any target
- bare metal / VMs / containers
- linux / windows / other / legacy
- tiny howto
- install from rubygems, or clone git repo
- (see slides)
- test local node
- test remote via ssh
- (no ruby / agent on node)
- test remote via winRM (still no agent)
- test docker container
- example test
describe package('wget') do
it { should be_installed }
end
describe file('/fetch-all.sh') do
it { should be_file }
its('owner') { should eq 'root' }
its('mode') { should eq 0640 }
end
inspec exec dtest.rb -t docker://f02e
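- the other targets look much the same (my sketch; hosts and credentials are made up):
$ inspec exec test.rb                                      # local node
$ inspec exec test.rb -t ssh://root@search1.example.com    # remote over ssh, no agent
$ inspec exec test.rb -t winrm://Administrator@win1 --password '...'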
- run via test-kitchen (kitchen-inspec)
- demo: solaris box running within test kitchen (on vagrant)
- test-kitchen verify normally takes a long time because it installs a bunch of stuff on the box
- much faster to verify with inspec
- solaris test:
describe os do
it { should be_solaris }
end
describe package('network/ssh') do
it { should be_installed }
end
describe service('/network/ssh') do
it { should be_running }
end
describe file('/etc/ssh/sshd_config') do
its(:content) { should match /Protocol 2/ }
end
this regex is brittle. comment? prefix/suffix?
Better:
describe sshd_config do
its('Protocol') { should cmp 2 }
end
custom resources help with this.
- “containers suck too”
- “docker security is a mess!”
- physical separation?
- “images on docker hub are insecure!”
- just community contributions
- lots of docker images contain bash with shellshock vulnerability
- docker images are artefacts, treat them like vmdk/vhd/vdi/deb/rpm
- build your own lightweight base images
- use base images without lots of userland tools if possible (eg alpine linux)
- dockerfile is “return of the bash”
- over-engineered dockerfiles
- replace large shell scripts with CM running outside the container
- what we want is configuration management with a smaller footprint
- avoid requiring ruby/python/etc inside the container just to get your CM tool running
- scheduling/orchestration is a whole new area
- http://rexify.org - a perl-based CM tool
- “it doesn’t matter how many resources you have, if you don’t know how to use them, it will never be enough”
- use cases
- continuous integration & delivery
- AIX was first released in 1987 (?)
- I first came in as an engineering manager, but I knew nothing about this platform
- some quirks, some pains
- Test Kitchen support – rough and unreleased
- traditional management of AIX:
- manually
- SMIT - menu-driven config tool
- transforming old-school shops has two routes:
- migrate AIX to linux, then automate with chef
- manage AIX with chef, then migrate to linux
- second route is easier as it abstracts away the OS so there’s less to learn at each step
- challenges
- lack of familiarity with the platform, hardware, setup
- hypervisor-based but all in hardware
- XLC - proprietary compiler
- if you use XLC, output is guaranteed forward-compatible forever
- binaries from 1989 still run on AIX today
- can’t use GNU-isms
- no bash, it’s korn shell
- no less!
- no real package manager
- bff
- you can use rpm
- but no yum or anything
- two init systems (init and SRC)
- key features which are missing!
- virtualization features
- sometimes cool, sometimes not
- platform quirks & features
- all core chef resources work out of the box on AIX
- special resources in core
- bff_package
- service - need to specify init or SRC, some actions don’t work
- more specific AIX resources in the aix library cookbook
- manage inittab etc
- chef’s installer is sh-compatible which is necessary for AIX
- the file at https://www.chef.io/chef/install.sh doesn’t use bash-isms
- starts with this comment:
# WARNING: REQUIRES /bin/sh
#
# - must run on /bin/sh on solaris 9
# - must run on /bin/sh on AIX 6.x
#
- future work
- other POWER platform support
- chef server on POWER
- chef client for linux on System/z
- links
rkt and Kubernetes: What’s new with Container Runtimes and Orchestration, Jonathan Boulle @baronboulle
- appc pods ≅ kubernetes pods
- rkt
- simple cli tool
- no (mandatory) daemon
- a big daemon running as root doesn’t feel like the best default setup
- no (mandatory) API
- bash/systemd/kubelet -> rkt run -> application(s)
- stage0 (rkt binary)
- primary interface to rkt
- discover, fetch, manage app images
- set up pod filesystems
- manage pod lifecycle
- rkt run
- rkt image list
- stage1 (swappable execution engines)
- default impl
- systemd-nspawn+systemd
- linux namespaces + cgroups
- kvm impl
- based on lkvm+systemd
- hw virtualization for isolation
- others?
- rkt TPM measurement
- used to “measure” system state
- historically just used to verify the bootloader/OS
- CoreOS added support to GRUB
- rkt can now record information about running pods in the TPM
- tamper-proof audit log
- rkt API service
- optional gRPC-based API daemon
- exposes information on pods and image
- runs as unprivileged user
- read-only
- easier integration
- recap: why rkt?
- secure, standards, composable
- rktnetes
- using rkt as the kubelet’s container runtime
- a pod-native runtime
- first-class integration with systemd hosts
- self-contained pods process model -> no SPOF
- multiple-image compatibility (eg docker2aci)
- transparently swappable container engines
- possible topologies
- kubelet -> systemd -> rkt pod
- could remove systemd and run pod directly on kubelet (kubelet -> rkt pod)
- using rkt to run kubernetes
- kubernetes components are largely self-hosting, but not entirely
- need a way to bootstrap kubelet on the host
- on coreos, this means in a container (because that’s the only way to run things on coreos)..
- ..but kubelet has some unique requirements
- like mounting volumes on the host
- rkt “fly” feature (new in rkt 0.15.0)
- unlike rkt run, doesn’t run in pod; uncontained
- has full access to host mount (and pid..) namespace
- rkt networking
- plugin-based
- IP(s) per pod
- container networking interface (CNI)
- CNI was just another plugin type, but soon to be the kubernetes plugin model
- http://blog.kubernetes.io/2016/01/why-Kubernetes-doesnt-use-libnetwork.html
- aside: use letsencrypt, please, blog.kubernetes.io!
- future
- rkt v1.0.0
- soon....
- rktnetes 1.0 2016Q1
- fully supported, full feature parity, automated testing on coreos
- rktnetes 1.0+
- lkvm backend by default
- native support for ACIs
- tectonic trusted computing
- https://coreos.com/blog/coreos-trusted-computing.html
- kubelet upgrades
- mixed-version clusters don’t always work
- (eg api from 1.0.7 to 1.1.1: https://coreos.com/blog/coreos-trusted-computing.html )
- solution: API-driven upgrades
- summary
- use rkt
- use kubernetes
- get involved and help define future of app containers
- does use of KVM mean non-linux hosts can be run inside?
- currently no
- image format for a registry?
- we don’t have a centralized registry
- we want to get away from that model
- can rkt run docker images?
- yes
- the current kubernetes api only accepts docker images so it’s the only thing it can run
- what do I have to actually do to use rkt?
- there’s a using rkt with kubernetes guide