dav1x · December 12, 2017 15:45
diff --git a/gistfile1.txt b/gistfile1.txt
 Keeping your OpenShift Container Platform HAproxy HA with Keepalived

 A typical OpenShift Container Platform deployment has multiple master, app, and infra nodes for high availability. In that case there is no single point of failure for the cluster, unless you have a single HAproxy server configured. This article discusses how to configure keepalived for maximum HAproxy uptime. In the OCP on vSphere reference architecture [https://access.redhat.com/documentation/en-us/reference_architectures/2017/html/deploying_a_red_hat_openshift_container_platform_3_on_vmware_vcenter_6/], two HAproxy virtual machines are configured, and the Ansible playbooks set up keepalived with a virtual IP address using the Virtual Router Redundancy Protocol (VRRP).


 Load Balancer Options

 The load balancer distributes traffic across two different groups. HAproxy serves port 8443 for the masters and ports 80 and 443 for the routers on the infra nodes. The reference architecture provides three options for deployment.

 | Playbook variables | Description |
 | --- | --- |
 | byo_lb = False | The first option creates a single HAproxy VM instance using the Ansible playbooks. |
 | byo_lb = False, lb_ha_ip = the floating VIP for keepalived | The Ansible playbooks create two custom HAproxy VM instances and configure them as highly available using keepalived. |
 | byo_lb = True, lb_host = the FQDN of the existing load balancer | The last option leverages an existing on-premise load balancer; define the variables in the INI file. |

 The first option deploys a single virtual machine running HAproxy to load balance the cluster as above. 

 The second option deploys two virtual machines, configures HAproxy and lastly configures keepalived. 

 In some circumstances an existing on-premise load balancer will be available; the playbooks allow for its use through the third option. The keepalived option is discussed below.
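
The three options can be summarized as a small decision function. This is only a sketch of how the variables relate to each other, not the playbooks' actual logic: the function name `load_balancer_mode`, the returned labels, and the example address and hostname are made up for illustration.

```python
from typing import Optional


def load_balancer_mode(byo_lb: bool,
                       lb_ha_ip: Optional[str] = None,
                       lb_host: Optional[str] = None) -> str:
    """Return which of the three load balancer options the variables select."""
    if byo_lb:
        # Option 3: reuse an existing load balancer, identified by its FQDN.
        if not lb_host:
            raise ValueError("byo_lb = True requires lb_host")
        return "existing"
    if lb_ha_ip:
        # Option 2: two HAproxy VMs sharing a keepalived floating VIP.
        return "haproxy-keepalived"
    # Option 1: a single HAproxy VM.
    return "haproxy-single"


# Placeholder values (192.0.2.0/24 is a documentation-only range).
print(load_balancer_mode(False))
print(load_balancer_mode(False, lb_ha_ip="192.0.2.231"))
print(load_balancer_mode(True, lb_host="lb.example.com"))
```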

 Keepalived Automated Configuration

 The keepalived Ansible role accomplishes the following tasks:
 * Install keepalived and psmisc
 	* keepalived: This package turns HAproxy into a highly available, multi-node load balancer.
 	* psmisc: This package provides killall, which the VRRP check script uses.
 * Query the interface to use with the services: external_interface
 * Allow all connections from that interface for traffic use
 * Set a fact for the load balancer's VRRP virtual IP: openshift_master_cluster_public_ip
 * Generate a random password for VRRP authentication
 * Start keepalived
 * Configure keepalived with the provided Jinja2 template and trigger the handler to restart the services
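
The password step can be sketched as follows. The sample `auth_pass` later in this article looks like a UUID, so this sketch assumes a random UUID is used; how the role really generates it is not shown here, and both function names are hypothetical. The keepalived.conf man page notes that only the first eight characters of `auth_pass` are used, which the second function illustrates.

```python
import uuid


def generate_vrrp_password() -> str:
    """Return a random UUID string to use as the shared VRRP password (assumed approach)."""
    return str(uuid.uuid4())


def effective_auth_pass(password: str) -> str:
    """Return the part of auth_pass that keepalived actually uses (first 8 characters)."""
    return password[:8]


password = generate_vrrp_password()
print(password)
print(effective_auth_pass(password))
```

Both load balancers must be configured with the same value, so it has to be generated once and templated onto every node.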

 Jinja2 Keepalived Template Configuration

 The Jinja2 template is shown below:
 ```
 global_defs {
   router_id ocp_vrrp
 }

 vrrp_script haproxy_check {
   script "killall -0 haproxy"
   interval 2
   weight 2 
 }

 vrrp_instance OCP_EXT {
   interface {{ external_interface }}

   virtual_router_id 51

   priority {% if groups.haproxy_group.index(inventory_hostname) == 0 %} {{ keepalived_priority_start }}{% else %} {{  keepalived_priority_start - 2 }}{% endif %}

   state {% if groups.haproxy_group.index(inventory_hostname) == 0 %} {{ "MASTER" }}{% else %} {{ "BACKUP" }}{% endif %}

   virtual_ipaddress {
       {{ openshift_master_cluster_public_ip }}  dev {{ external_interface }}

   }
   track_script {
       haproxy_check
   }
   authentication {
      auth_type PASS
      auth_pass {{ keepalived_pass.stdout }}
   }
 }
 ```
 The interesting bits are the priority and state sections. The logic pivots on the index of the inventory hostname within the group 'haproxy_group'. If the host's index is 0, that HAproxy is the MASTER and its priority is keepalived_priority_start (100 in this deployment). If the index is not 0, the server is set to BACKUP and its priority is keepalived_priority_start - 2 (98).
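
The same selection logic, written out in plain Python, may make the two conditionals easier to follow. This is a sketch that mirrors the template, not code from the role: the function name `vrrp_role` is hypothetical, and the order of hosts in the group is an assumption chosen so that haproxy-1 comes out as the master, as in this deployment.

```python
from typing import List, Tuple


def vrrp_role(haproxy_group: List[str], inventory_hostname: str,
              keepalived_priority_start: int = 100) -> Tuple[str, int]:
    """Mirror the template's state and priority conditionals for one host."""
    # list.index raises ValueError if the host is not in the group.
    if haproxy_group.index(inventory_hostname) == 0:
        return "MASTER", keepalived_priority_start
    return "BACKUP", keepalived_priority_start - 2


group = ["haproxy-1", "haproxy-0"]  # assumed inventory order
for host in group:
    state, priority = vrrp_role(group, host)
    print(host, state, priority)
```

Only the first host in the group becomes the MASTER; every other host gets the same BACKUP priority.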

 The completed master configuration is shown below:
 ```
 [root@haproxy-1 ~]# cat /etc/keepalived/keepalived.conf
 global_defs {
   router_id ocp_vrrp
 }

 vrrp_script haproxy_check {
   script "killall -0 haproxy"
   interval 2
   weight 2 
 }

 vrrp_instance OCP_EXT {
   interface ens192

   virtual_router_id 51

   priority  100
   state  MASTER
   virtual_ipaddress {
       10.x.x.231  dev ens192

   }
   track_script {
       haproxy_check
   }
   authentication {
      auth_type PASS
      auth_pass 1cee4b6e-2cdc-48bf-83b2-01a96d1593e4
   }
 }
 ```
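
The `weight 2` on the `vrrp_script` block also affects the priority each node advertises. In keepalived, a positive weight is added to the priority while the check succeeds, and a negative weight is subtracted while it fails. The sketch below only does that arithmetic for the values in this article; the function name `effective_priority` is hypothetical.

```python
def effective_priority(base: int, weight: int, check_ok: bool) -> int:
    """Priority after applying one tracked script's weight."""
    if weight > 0:
        # Positive weight: a bonus while the check succeeds.
        return base + weight if check_ok else base
    if weight < 0:
        # Negative weight: a penalty while the check fails.
        return base if check_ok else base + weight
    # Weight 0 never adjusts the priority.
    return base


print("master, haproxy running:", effective_priority(100, 2, True))
print("master, haproxy stopped:", effective_priority(100, 2, False))
print("backup, haproxy running:", effective_priority(98, 2, True))
```

With these values a healthy master advertises 102 and a healthy backup 100. If only the haproxy process on the master stops, both nodes end up at 100, so the outcome is no longer decided by priority alone; a full reboot of the master, as simulated below, does not depend on this.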

 Verifying functionality and simulating a failure

 After a successful deployment and install, the master HAproxy node serves traffic via haproxy and also holds the VRRP VIP:

 ```
 [root@haproxy-1 ~]# ss -tlpn | grep haproxy
 LISTEN     0      128          *:80                       *:*                   users:(("haproxy",pid=2606,fd=7))
 LISTEN     0      128          *:8443                     *:*                   users:(("haproxy",pid=2606,fd=9))
 LISTEN     0      128          *:443                      *:*                   users:(("haproxy",pid=2606,fd=8))
 LISTEN     0      128          *:9000                     *:*                   users:(("haproxy",pid=2606,fd=5))

 [root@haproxy-1 ~]# ip addr show dev ens192
 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:a5:18:73 brd ff:ff:ff:ff:ff:ff
    inet 10.19.114.227/23 brd 10.19.115.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet 10.19.114.231/32 scope global ens192

 [root@haproxy-1 ~]# cat /etc/keepalived/keepalived.conf | grep MASTER
   state  MASTER
 ```

 Note that in this deployment haproxy-1 is the master. Suppose haproxy-1 needs to be rebooted. Here is a simulation of that, pinging the VIP from a client:

 ```
 dav1x-m:~ dphillip$ ping haproxy.example.com
 PING  haproxy.example.com (10.x.x.231): 56 data bytes
 64 bytes from 10.19.114.231: icmp_seq=0 ttl=54 time=120.903 ms
 64 bytes from 10.19.114.231: icmp_seq=1 ttl=54 time=119.683 ms
 64 bytes from 10.19.114.231: icmp_seq=2 ttl=54 time=119.945 ms
 64 bytes from 10.19.114.231: icmp_seq=3 ttl=54 time=119.907 ms
 64 bytes from 10.19.114.231: icmp_seq=4 ttl=54 time=120.771 ms
 64 bytes from 10.19.114.231: icmp_seq=5 ttl=54 time=119.627 ms
 64 bytes from 10.19.114.231: icmp_seq=6 ttl=54 time=119.696 ms
 64 bytes from 10.19.114.231: icmp_seq=7 ttl=54 time=120.184 ms
 64 bytes from 10.19.114.231: icmp_seq=8 ttl=54 time=119.258 ms
 Request timeout for icmp_seq 9
 64 bytes from 10.19.114.231: icmp_seq=10 ttl=54 time=121.358 ms
 64 bytes from 10.19.114.231: icmp_seq=11 ttl=54 time=120.285 ms
 64 bytes from 10.19.114.231: icmp_seq=12 ttl=54 time=119.652 ms
 ``` 
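
To quantify the failover window from a capture like this one, the missing sequence numbers can be extracted from the ping output. The helper below is a hypothetical sketch, and the sample is a shortened copy of the output above.

```python
import re
from typing import List


def missed_sequences(ping_output: str) -> List[int]:
    """Return the icmp_seq numbers that have no reply line in the output."""
    # Reply lines contain "icmp_seq=N"; timeout lines use "icmp_seq N" and do not match.
    seen = {int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output)}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


sample = """\
64 bytes from 10.19.114.231: icmp_seq=7 ttl=54 time=120.184 ms
64 bytes from 10.19.114.231: icmp_seq=8 ttl=54 time=119.258 ms
Request timeout for icmp_seq 9
64 bytes from 10.19.114.231: icmp_seq=10 ttl=54 time=121.358 ms
"""
print(missed_sequences(sample))
```

With ping's default one-second interval, a single missed reply suggests the VIP was unreachable for roughly a second or two while the backup took over.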

 Now haproxy-0 is the master until the reboot finishes:

 ```
 [root@haproxy-0 ~]# ip addr show dev ens192
 2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:a5:18:73 brd ff:ff:ff:ff:ff:ff
    inet 10.19.114.227/23 brd 10.19.115.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet 10.19.114.231/32 scope global ens192
 ```
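
The same check can be scripted by looking for the VIP in the output of `ip addr show`. This sketch parses text that has already been captured, so it runs anywhere; the function names are hypothetical and the sample is taken from the output above.

```python
import re
from typing import List


def ipv4_addresses(ip_addr_output: str) -> List[str]:
    """Return every IPv4 address listed on 'inet' lines of `ip addr show` output."""
    return re.findall(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+",
                      ip_addr_output, flags=re.MULTILINE)


def holds_vip(ip_addr_output: str, vip: str) -> bool:
    """True if the VIP is currently assigned on the interface shown."""
    return vip in ipv4_addresses(ip_addr_output)


sample = """\
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:a5:18:73 brd ff:ff:ff:ff:ff:ff
    inet 10.19.114.227/23 brd 10.19.115.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet 10.19.114.231/32 scope global ens192
"""
print(holds_vip(sample, "10.19.114.231"))
```

Running the check against both load balancers should show the VIP on exactly one of them at any time.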

 Accompanying OCP Installation Vars

 To complete the OCP deployment, installation variables need to be set so that OpenShift uses the load balancer VIP:
 ```
    wildcard_zone: apps.example.com
    osm_default_subdomain: "{{ wildcard_zone }}"
    openshift_master_default_subdomain: "{{osm_default_subdomain}}"
    deployment_type: openshift-enterprise
    load_balancer_hostname: 10.x.x.231
    openshift_master_cluster_hostname: "{{ load_balancer_hostname }}"
    openshift_master_cluster_public_hostname: "{{ load_balancer_hostname }}"
 ```
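
These variables reference one another, so the value OpenShift finally sees depends on how the references resolve. The sketch below uses a simplified stand-in for Ansible's templating (a regex substitution of `{{ name }}` references, nothing more) to show the resolved values; the `resolve` function is hypothetical.

```python
import re
from typing import Dict

# Matches a simple "{{ name }}" reference, with or without inner spaces.
_REF = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve(variables: Dict[str, str], passes: int = 5) -> Dict[str, str]:
    """Substitute variable references; a few passes cover the shallow nesting here."""
    resolved = dict(variables)
    for _ in range(passes):
        for key, value in resolved.items():
            # An undefined reference raises KeyError.
            resolved[key] = _REF.sub(lambda m: resolved[m.group(1)], value)
    return resolved


install_vars = {
    "wildcard_zone": "apps.example.com",
    "osm_default_subdomain": "{{ wildcard_zone }}",
    "openshift_master_default_subdomain": "{{osm_default_subdomain}}",
    "load_balancer_hostname": "10.x.x.231",
    "openshift_master_cluster_hostname": "{{ load_balancer_hostname }}",
    "openshift_master_cluster_public_hostname": "{{ load_balancer_hostname }}",
}
for name, value in resolve(install_vars).items():
    print(f"{name}: {value}")
```

Both cluster hostnames end up as the keepalived VIP, which is why a failover between the HAproxy nodes is transparent to OpenShift.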