@dav1x
Created December 12, 2017 15:45
Keeping your OpenShift Container Platform HAproxy HA with Keepalived
A typical OpenShift Container Platform deployment has multiple master, app and infra nodes for high availability. In that case the cluster has no single point of failure, unless a single HAproxy server sits in front of it. This article discusses how to configure keepalived to maximize HAproxy uptime. In the vSphere on OCP reference architecture [https://access.redhat.com/documentation/en-us/reference_architectures/2017/html/deploying_a_red_hat_openshift_container_platform_3_on_vmware_vcenter_6/], two HAproxy virtual machines are configured and the Ansible playbooks set up keepalived with a virtual IP address managed by the Virtual Router Redundancy Protocol (VRRP).
Load Balancer Options
The load balancer distributes traffic across two different groups: HAproxy serves port 8443 for the masters and ports 80 and 443 for the routers running on the infra nodes. The reference architecture provides three deployment options.
| Playbook variables | Description |
| --- | --- |
| byo_lb = False | The first option creates a single HAproxy VM instance using the Ansible playbooks. |
| byo_lb = False, lb_ha_ip = (the floating VIP for keepalived) | The Ansible playbooks create two custom HAproxy VM instances and configure them as highly available using keepalived. |
| byo_lb = True, lb_host = (the FQDN of the existing load balancer) | The last option uses an existing on-premises load balancer. Define the variables in the INI file. |
The first option deploys a single virtual machine running HAproxy to load balance the cluster as above.
The second option deploys two virtual machines, configures HAproxy and lastly configures keepalived.
In some circumstances an existing on-premises load balancer is already available; the playbooks support using that external load balancer through the third option. The keepalived option is discussed below.
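As a rough illustration of how the three options map onto the playbook variables, here is a minimal Python sketch. The function name `load_balancer_mode`, the returned labels, and the check that `lb_host` must be set are assumptions made for this example; the actual playbooks may decide differently.

```python
def load_balancer_mode(inventory_vars):
    """Return which of the three load balancer options a set of variables selects.

    The labels returned here are hypothetical names for this sketch only.
    """
    # INI values arrive as strings, so compare case-insensitively.
    byo_lb = str(inventory_vars.get("byo_lb", "False")).lower() == "true"

    if byo_lb:
        # Assumption: an existing load balancer is only usable if its FQDN is given.
        if not inventory_vars.get("lb_host"):
            raise ValueError("byo_lb = True requires lb_host")
        return "existing-load-balancer"

    if inventory_vars.get("lb_ha_ip"):
        # A floating VIP was supplied: two HAproxy VMs plus keepalived.
        return "haproxy-with-keepalived"

    # Neither an external load balancer nor a VIP: one HAproxy VM.
    return "single-haproxy"


print(load_balancer_mode({"byo_lb": "False", "lb_ha_ip": "10.x.x.231"}))
```

The order of the checks mirrors the table: `byo_lb` decides whether the playbooks build a load balancer at all, and `lb_ha_ip` decides whether the one they build is highly available.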
Keepalived Automated Configuration
The keepalived ansible role accomplishes the following tasks:
* Install keepalived and psmisc
  * keepalived: This package turns haproxy into a highly available, multi-node load balancer.
  * psmisc: This package provides killall, which the VRRP check script uses.
* An interface is then queried for use with the services: external_interface
* Allow all connections from the interface for traffic use
* A fact is set for the load balancer's VRRP virtual IP address: openshift_master_cluster_public_ip
* Generate a random external password
* Start keepalived
* Configure keepalived with the provided jinja2 template and trigger the handler to restart the services
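The VRRP check relies on `killall -0 haproxy`, which sends signal 0 to every process named haproxy. Signal 0 delivers nothing; it only reports whether the target process exists, and keepalived treats a zero exit status as a passing check. The same probe can be sketched in Python for a single PID. The helper name `process_exists` is made up for this example, and a Linux host is assumed.

```python
import os


def process_exists(pid):
    """Probe a PID with signal 0: nothing is delivered, only existence is checked."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # No process with this PID.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


# Probing our own PID always succeeds.
print(process_exists(os.getpid()))
```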
Jinja2 Keepalived Template Configuration
The jinja2 template will be discussed below:
```
global_defs {
router_id ocp_vrrp
}
vrrp_script haproxy_check {
script "killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance OCP_EXT {
interface {{ external_interface }}
virtual_router_id 51
priority {% if groups.haproxy_group.index(inventory_hostname) == 0 %} {{ keepalived_priority_start }}{% else %} {{ keepalived_priority_start - 2 }}{% endif %}
state {% if groups.haproxy_group.index(inventory_hostname) == 0 %} {{ "MASTER" }}{% else %} {{ "BACKUP" }}{% endif %}
virtual_ipaddress {
{{ openshift_master_cluster_public_ip }} dev {{ external_interface }}
}
track_script {
haproxy_check
}
authentication {
auth_type PASS
auth_pass {{ keepalived_pass.stdout }}
}
}
```
The interesting bits are the priority and state lines. The logic pivots on the index of inventory_hostname within the group 'haproxy_group'. If the host's index is 0, that haproxy is the master and its priority is keepalived_priority_start, which is 100 here. Otherwise the server is set to backup with a priority of keepalived_priority_start - 2, which is 98.
The completed master configuration is shown below:
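The priority and state logic of the template can be mirrored in a few lines of Python. This is only a sketch: the function names are hypothetical, the group ordering with haproxy-1 first is assumed from the rendered master configuration, and `effective_priority` reflects keepalived's documented behavior for a positive `weight` (the weight is added to the priority while the tracked script succeeds).

```python
def vrrp_role(haproxy_group, inventory_hostname, keepalived_priority_start=100):
    """Mirror the template: the first host in the group is MASTER, the rest BACKUP."""
    index = haproxy_group.index(inventory_hostname)
    if index == 0:
        return {"state": "MASTER", "priority": keepalived_priority_start}
    return {"state": "BACKUP", "priority": keepalived_priority_start - 2}


def effective_priority(base_priority, haproxy_running, weight=2):
    """With a positive weight, the weight is added only while the check passes."""
    return base_priority + weight if haproxy_running else base_priority


# Assumed ordering: haproxy-1 is listed first in haproxy_group.
group = ["haproxy-1", "haproxy-0"]
for host in group:
    role = vrrp_role(group, host)
    print(host, role["state"], role["priority"],
          effective_priority(role["priority"], haproxy_running=True))
```

With both checks passing, the effective priorities are 102 for the master and 100 for the backup. If haproxy stops on the master while the host stays up, the master falls back to its base priority of 100, which equals the healthy backup's effective priority rather than dropping below it.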
```
[root@haproxy-1 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
router_id ocp_vrrp
}
vrrp_script haproxy_check {
script "killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance OCP_EXT {
interface ens192
virtual_router_id 51
priority 100
state MASTER
virtual_ipaddress {
10.x.x.231 dev ens192
}
track_script {
haproxy_check
}
authentication {
auth_type PASS
auth_pass 1cee4b6e-2cdc-48bf-83b2-01a96d1593e4
}
}
```
Verifying functionality and simulating a failure
After a successful deployment and install, the master haproxy node serves traffic via haproxy and also holds the VRRP VIP:
```
[root@haproxy-1 ~]# ss -tlpn | grep haproxy
LISTEN 0 128 *:80 *:* users:(("haproxy",pid=2606,fd=7))
LISTEN 0 128 *:8443 *:* users:(("haproxy",pid=2606,fd=9))
LISTEN 0 128 *:443 *:* users:(("haproxy",pid=2606,fd=8))
LISTEN 0 128 *:9000 *:* users:(("haproxy",pid=2606,fd=5))
[root@haproxy-1 ~]# ip addr show dev ens192
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:a5:18:73 brd ff:ff:ff:ff:ff:ff
inet 10.19.114.227/23 brd 10.19.115.255 scope global ens192
valid_lft forever preferred_lft forever
inet 10.19.114.231/32 scope global ens192
[root@haproxy-1 ~]# cat /etc/keepalived/keepalived.conf | grep MASTER
state MASTER
```
Note that haproxy-1 is the master in this deployment. Suppose haproxy-1 needs to be rebooted. Here is a simulation of that, pinging the load balancer hostname during the reboot:
```
dav1x-m:~ dphillip$ ping haproxy.example.com
PING haproxy.example.com (10.x.x.231): 56 data bytes
64 bytes from 10.19.114.231: icmp_seq=0 ttl=54 time=120.903 ms
64 bytes from 10.19.114.231: icmp_seq=1 ttl=54 time=119.683 ms
64 bytes from 10.19.114.231: icmp_seq=2 ttl=54 time=119.945 ms
64 bytes from 10.19.114.231: icmp_seq=3 ttl=54 time=119.907 ms
64 bytes from 10.19.114.231: icmp_seq=4 ttl=54 time=120.771 ms
64 bytes from 10.19.114.231: icmp_seq=5 ttl=54 time=119.627 ms
64 bytes from 10.19.114.231: icmp_seq=6 ttl=54 time=119.696 ms
64 bytes from 10.19.114.231: icmp_seq=7 ttl=54 time=120.184 ms
64 bytes from 10.19.114.231: icmp_seq=8 ttl=54 time=119.258 ms
Request timeout for icmp_seq 9
64 bytes from 10.19.114.231: icmp_seq=10 ttl=54 time=121.358 ms
64 bytes from 10.19.114.231: icmp_seq=11 ttl=54 time=120.285 ms
64 bytes from 10.19.114.231: icmp_seq=12 ttl=54 time=119.652 ms
```
Now haproxy-0 is the master until the reboot finishes:
```
[root@haproxy-0 ~]# ip addr show dev ens192
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether 00:50:56:a5:18:73 brd ff:ff:ff:ff:ff:ff
inet 10.19.114.227/23 brd 10.19.115.255 scope global ens192
valid_lft forever preferred_lft forever
inet 10.19.114.231/32 scope global ens192
```
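Checking which node currently holds the VIP comes down to looking for the address in the `ip addr show` output. A minimal Python sketch of that check follows; the helper names are hypothetical, and the sample text is the interface output shown earlier in this article.

```python
import re


def addresses_on_interface(ip_addr_output):
    """Extract the IPv4 addresses (without prefix length) from `ip addr show` text."""
    return re.findall(r"^\s*inet\s+(\d+\.\d+\.\d+\.\d+)/\d+",
                      ip_addr_output, flags=re.MULTILINE)


def holds_vip(ip_addr_output, vip):
    """True if the VIP is one of the addresses configured on the interface."""
    return vip in addresses_on_interface(ip_addr_output)


SAMPLE = """\
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:a5:18:73 brd ff:ff:ff:ff:ff:ff
    inet 10.19.114.227/23 brd 10.19.115.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet 10.19.114.231/32 scope global ens192
"""

print(holds_vip(SAMPLE, "10.19.114.231"))
```

Running the same check on both haproxy nodes before and after the reboot shows the VIP moving from one node to the other.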
Accompanying OCP Installation Vars
To complete the OCP deployment, installation variables need to be set so that OpenShift uses the load balancer VIP:
```
wildcard_zone: apps.example.com
osm_default_subdomain: "{{ wildcard_zone }}"
openshift_master_default_subdomain: "{{osm_default_subdomain}}"
deployment_type: openshift-enterprise
load_balancer_hostname: 10.x.x.231
openshift_master_cluster_hostname: "{{ load_balancer_hostname }}"
openshift_master_cluster_public_hostname: "{{ load_balancer_hostname }}"
```