OpenStack as microservices
--------------------------

One of the more promising experiments I did in the devops space in the last couple of years was supporting
microservices architectures in the way OpenStack is deployed and operated. The core design is that we have
consul running across all nodes as the initial or 'seed' service to establish cluster membership, and from
that base we can build everything we need. When data changes in consul, this triggers puppet runs on the
nodes that are subscribing to that data via consul_watch, and the updated data is sent in via hiera to be
realised on each node. This creates a feedback loop whereby services can be arbitrarily distributed or
consolidated as needed depending on the deployment scenario.

Bootstrap
---------

We run this script via cloud-init on every node to get consul going:
https://github.com/michaeltchapman/puppet_openstack_builder/blob/master/provision/bootstrap.sh

This depends on a consul RPM I built, since consul isn't in any standard repos. It's just a Go binary,
so it's a super simple rpmspec.

We need to give consul a couple of pieces of data in order to get started:

- An integer for bootstrap_expect, which tells consul to bootstrap the k/v store after this many nodes
  have joined the serf cluster (serf sits below consul and handles membership + messaging)
- The IP address of another node in the cluster (the first node doesn't need this)
- The interface to use, if not eth0

This is sent in via cloud-init. I think I convert all the cloud-init data directly to yaml and put it in hiera.
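
The bootstrap itself is just that bash script, but purely to illustrate the data involved, here's a rough
sketch of the equivalent consul agent setup expressed as puppet, assuming the community puppet-consul
module (the class and parameter names below are that module's, not anything from my bootstrap script):

class profile::seed::consul(
  $bootstrap_expect = 3,       # bootstrap the k/v store once this many agents have joined
  $join_address     = undef,   # IP of any existing member; the first node leaves this unset
  $bind_interface   = 'eth0',
) {
  # work out the bind address from the chosen interface (getvar() is from stdlib)
  $bind_ip = getvar("ipaddress_${bind_interface}")

  class { '::consul':
    config_hash => {
      'server'           => true,
      'bootstrap_expect' => $bootstrap_expect,
      'bind_addr'        => $bind_ip,
      'data_dir'         => '/var/lib/consul',
      'retry_join'       => $join_address ? {
        undef   => [],
        default => [$join_address],
      },
    },
  }
}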

Puppet
------

A standard nodeless role/profile composition layer for puppet defines roles as lists of profiles mapped to hosts,
while profiles are simply classes that include the lower-level classes from modules. OpenStack has a long history
of getting this wrong, and I've been involved in plenty of those attempts. The best implementation I know of today
is not maintained any more, but it's here: https://github.com/Mylezeem/puppeels
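
For reference, the pattern itself is tiny; a minimal sketch (the class names here are illustrative, not
taken from that repo) looks something like this:

# site.pp -- nodeless: every node just includes whatever classes hiera maps to it,
# typically keyed off a role fact or cloud-init data
node default {
  hiera_include('classes')
}

# a role is nothing more than a list of profiles...
class role::controller {
  include profile::openstack::mysql
  include profile::openstack::keystone
}

# ...and a profile just wires up the lower-level module classes
class profile::openstack::mysql {
  include ::mysql::server
}
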
Simply using this pattern serves to organise the puppet code clearly, but it won't work for microservices because these
profiles have no concept of dependencies between them. For example, keystone requires mysql in order to function,
and therefore we want to model that dependency and deploy them in order. In reality, there's actually more to it,
because we want mysql to be deployed, then we want the load balancer to recognise the backends and create a frontend,
and then we want keystone to consume the load-balanced frontend for mysql. This comes later.

To create dependencies, we need a way for puppet to read the current system state. This is done using a hiera
backend for consul. The standard consul backend only reads the k/v store, so I wrote an extension to dump the service
catalog into a giant hash called service_hash. Then I realised I could do more cool things, like query for a particular
service by tag, or by whether it's up/down. I also needed to be able to grab an arbitrary IP from a list of nodes all
providing one service (so that I can pick a node from the list of backends). I added this here:
https://github.com/michaeltchapman/hiera-consul/commit/83fe92abb0d151933353659d1f633ddc2ec0f414
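
As a rough sketch of what that enables from a manifest (the key names here are hypothetical; the real
shape comes from the backend linked above):

# query one field of a service directly via the flattened key form, or pull a list
# of backend addresses and pick an arbitrary node out of it
$mysql_vip   = hiera('service_hash__haproxy::balanced__mysql__Address', false)
$mysql_nodes = hiera('service_hash__mysql__Addresses', [])   # hypothetical key

if !empty($mysql_nodes) {
  # e.g. pick one galera node for other nodes to bootstrap against
  $seed_node = $mysql_nodes[0]
}
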
Another thing you'll need to know is that there's an undocumented feature of hiera I added a year ago called alias(),
which preserves types, unlike the other hiera interpolation functions, so you can do hiera mappings of non-string types.
This is handy for mapping something like 'the list of all backends for this service' onto another key.
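
A hedged example of what I mean, with made-up key names: in hiera data, a mapping whose entire value is
an alias() interpolation keeps the aliased value's type, so a list stays a list.

# In hiera YAML (shown here as a comment), something like:
#
#   keystone_db_hosts: "%{alias('service_hash__mysql__Addresses')}"
#
# keeps the value an Array, where ordinary %{hiera('...')} interpolation would
# flatten it to a String. The lookup below therefore still gets a real list:
$db_hosts = hiera('keystone_db_hosts', [])
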
So now that we can do a hiera('service_hash') and look into that hash for the current state, we can create a
dependency on some part of that state. In my current code I tied myself in knots doing this with collector appends,
but I don't remember why I did that. The easiest way is to have your profile class and then a second-level profile,
like this:

class profile::openstack::keystone(
) {
  include keystone
  include keystone::roles::admin
}

class microservices::openstack::keystone(
) {
  # note the haproxy::balanced tag is used here
  if (!hiera('service_hash__haproxy::balanced__mysql__Address', false)) {
    runtime_fail { 'keystonemysqldep':
      fail    => true,
      message => "Keystone: requires mysql and it is not in the registry",
    }
  } else {
    include profile::openstack::keystone
    include microservices::discovery::consul::keystone
  }
}

runtime_fail is a special resource that will always fail:
https://github.com/JioCloud/puppet-orchestration_utils/blob/master/lib/puppet/provider/runtime_fail/default.rb

This makes the puppet run exit with a failure code and lets us know that the deployment hasn't succeeded yet. That's not
a bad thing, and it will happen hundreds of times before the cluster finishes deploying. We have a little bash script that
runs puppet and, if the exit code isn't clean, queues up another puppet run. Once the cluster finishes deploying, the exit
codes will be clean across all nodes and no more puppet runs are queued. I use a little tool called ts for the queueing,
but it's not really important.

The second included class needs to create the consul service resource. Here's an example:
https://github.com/michaeltchapman/consul_profile/blob/master/manifests/discovery/consul/image_registry.pp

I don't put this in the profile itself, so that the profiles can still be used without the orchestration layer.
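
As a sketch of the shape such a class takes (assuming the community puppet-consul module's
consul::service defined type; the port, tag and check values are made up):

class microservices::discovery::consul::keystone(
  $port = 5000,
) {
  # register keystone into the consul catalog with a health check and a tag,
  # so other profiles can discover it through the hiera backend
  ::consul::service { 'keystone':
    port   => $port,
    tags   => ['openstack'],
    checks => [
      {
        'http'     => "http://localhost:${port}/v3",
        'interval' => '10s',
      },
    ],
  }
}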

Load balancing
--------------

In the previous example keystone depends on mysql, but with the haproxy::balanced tag. The way this works
is a very customised haproxy puppet class. It's here, and it's a bit complicated:
https://github.com/michaeltchapman/consul_profile/blob/master/manifests/highavailability/loadbalancing/haproxy.pp

In the previous step, services registered themselves in consul. When they do that, they can also specify extra
data that the haproxy class uses to decide which options to add to the load balancing. This is done using the k/v store,
like this:
https://github.com/michaeltchapman/consul_profile/blob/master/manifests/discovery/consul/haproxy_service.pp

You can see mysql call this here:
https://github.com/michaeltchapman/consul_profile/blob/master/manifests/discovery/consul/mariadb.pp

When haproxy has created these resources it will create its own service entry in consul, so if there are 3 galera nodes
plus haproxy, the mysql service will have 4 nodes providing it in consul: 3 with the haproxy::balancemember tag,
and the haproxy one itself with haproxy::balanced. This lets me filter for the right address when I look it up
from hiera. Haproxy is set to watch the entire consul catalog and mirror what it sees at any given time, and also runs via
cron every 2 minutes.
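
Stripped right down, and using the stock puppetlabs-haproxy defined types rather than my customised
class (all key names and values here are hypothetical), the consumer side of that loop looks roughly
like this:

# read the consul-provided state out of hiera and mirror it as an haproxy frontend
# plus one balancermember per registered backend; the real class linked above also
# pulls per-service options out of the k/v store
$services = hiera('service_hash', {})
$backends = $services['mysql']   # hypothetical shape: { 'Addresses' => [...], ... }

haproxy::listen { 'mysql':
  collect_exported => false,
  ipaddress        => $::ipaddress,
  ports            => '3306',
}

$backends['Addresses'].each |$idx, $addr| {
  haproxy::balancermember { "mysql-${idx}":
    listening_service => 'mysql',
    ports             => '3306',
    server_names      => "mysql-${idx}",
    ipaddresses       => $addr,
    options           => 'check',
  }
}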