Pacemaker Cluster Resource Manager (CRM)
- https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Administration/html/pcs-crmsh.html
- https://people.redhat.com/kgaillot/pacemaker/doc/2.1/Pacemaker_Administration/pdf/Pacemaker_Administration.pdf
RHEL:
dnf install -y firewalld pcs pacemaker corosync resource-agents fence-agents-all
dnf install -y resource-agents-cloud
SLES: packages are installed with zypper and administered with crmsh (see the crmsh section further below); the pcs-based setup that follows applies to RHEL.
passwd hacluster
firewall-cmd --state
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --add-service=high-availability
systemctl enable --now pcsd.service
# pcs 0.9 (RHEL 7) syntax; the pcs 0.10+ equivalent is sketched below
pcs cluster auth <host01> <host02> <host03>
pcs cluster setup --name <cluster_name> <host01> <host02> <host03>
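On pcs 0.10 and later (RHEL 8+) the same bootstrap looks roughly like this; the cluster and host names are examples only:
pcs host auth host01 host02 host03 -u hacluster
pcs cluster setup mycluster host01 host02 host03
pcs cluster start --all
pcs cluster enable --all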
pcs status
pcs status --full
pcs status --debug
crm_mon -1
crm_mon --one-shot --inactive
strace -Tttvfs 1024 -o /tmp/strace.out /usr/sbin/crm_mon --one-shot --inactive
pcs config
pcs cluster cib
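The CIB can also be dumped to a file, edited offline, and pushed back; a sketch using a hypothetical /tmp path:
pcs cluster cib > /tmp/cib.xml
# stage changes against the file instead of the live CIB, e.g.:
pcs -f /tmp/cib.xml property set maintenance-mode=true
pcs cluster cib-push /tmp/cib.xml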
Upgrade CIB:
pcs cluster cib-upgrade
pcs property list --all
pcs property list --defaults
pcs property show --defaults
pcs property show cluster-infrastructure
pcs property show <property_name>
pcs resource standards
pcs resource agents ocf
pcs resource agents lsb
pcs resource agents service
pcs resource agents stonith
pcs resource agents
pcs resource agents ocf:pacemaker
pcs resource describe <resource_agent>
# example
pcs resource describe IPaddr2
pcs resource describe ocf:heartbeat:IPaddr2
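For reference, creating an IPaddr2 resource typically looks like the following; the resource name, address, and netmask are examples only:
pcs resource create vip_web ocf:heartbeat:IPaddr2 ip=192.168.1.100 cidr_netmask=24 op monitor interval=30s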
# 'pcs resource show' is deprecated/removed in newer pcs; use 'pcs resource status' or 'pcs resource config' instead
pcs resource show
pcs resource show --full
pcs resource show <resource_name>
pcs resource config
# 'pcs stonith show' is deprecated/removed in newer pcs; use 'pcs stonith status' or 'pcs stonith config' instead
pcs stonith show
pcs stonith show --full
List stonith resources:
pcs stonith list
ls -lA /usr/sbin/fence*
pcs stonith describe <stonith_agent>
pcs stonith describe <stonith_agent> --full
Show the options for fence_aws stonith script:
pcs stonith describe fence_aws --full
Update hosts and instance IDs for an existing stonith resource named clusterfence:
pcs stonith update clusterfence fence_aws pcmk_host_map="<hostname>:<instance_id>;<hostname>:<instance_id>"
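If the fence_aws resource does not exist yet, it has to be created first. A sketch, assuming credentials come from an attached IAM instance profile and using made-up region, host names, and instance IDs:
pcs stonith create clusterfence fence_aws region=us-east-1 pcmk_host_map="host01:i-0123456789abcdef0;host02:i-0fedcba9876543210" power_timeout=240 op monitor interval=180s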
pcs property set startup-fencing=true
pcs property set concurrent-fencing=true
pcs property set stonith-action=reboot
pcs property set stonith-timeout=300s
pcs property set stonith-max-attempts=10
pcs property set stonith-enabled=true
pcs property set stonith-enabled=false
pcs property set maintenance-mode=false
pcs property set no-quorum-policy=ignore
pcs property set symmetric-cluster=<true|false>
# Create an opt-in cluster, which prevents resources from running anywhere by default
pcs property set symmetric-cluster=false
# Create an opt-out cluster, which allows resources to run everywhere by default
pcs property set symmetric-cluster=true
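In an opt-in (symmetric-cluster=false) cluster, each resource then needs location constraints before it will run anywhere; a sketch with hypothetical resource and node names:
pcs constraint location webserver prefers node01=200
pcs constraint location webserver prefers node02=50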
pcs property set shutdown-escalation=20min
# syntax: pcs stonith create <stonith_name> <fencing_agent> [parameters] [op monitor ...]
pcs stonith create rhev-fence fence_rhevm ipaddr=engine.local.net ipport=443 ssl_insecure=1 ssl=1 inet4_only=1 login=admin@internal passwd=PASSWD pcmk_host_map="clu01:clu01.local.net;clu02:clu02.local.net;clu03:clu03.local.net" pcmk_host_check=static-list pcmk_host_list="clu01.local.net,clu02.local.net,clu03.local.net" power_wait=3 op monitor interval=90s
pcs stonith update rhev-fence fence_rhevm api_path=/ovirt-engine/api disable_http_filter=1 ipaddr=engine.local.net ipport=443 ssl_insecure=1 ssl=1 inet4_only=1 login=admin@internal passwd=PASSWORD pcmk_host_map="clu01.local.net:clu01.local.net;clu02.local.net:clu02.local.net;clu03.local.net:clu03.local.net" pcmk_host_check=static-list pcmk_host_list="clu01.local.net,clu02.local.net,clu03.local.net" power_wait=3 op monitor interval=90s
pcs stonith update rhev-fence port=nodeb
pcs stonith show rhev-fence
fence_rhevm -o status -a engine.local.net --username=admin@internal --password=PASSWORD --ipport=443 -n clu03.local.net -z --ssl-insecure --disable-http-filter
pcs stonith fence <hostname>
Note: When Pacemaker's policy engine creates a transition with a fencing request, the stonith daemon uses the timeout value passed by the transition engine, which matches the stonith-timeout cluster property.
When fencing is triggered manually via stonith_admin or pcs stonith fence, the default timeout implemented in stonith_admin (120s) is used instead.
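To run a manual fence with a longer timeout than the 120s default, pass it explicitly; a sketch with a hypothetical node name:
stonith_admin --reboot node02 --timeout 300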
pcs stonith delete fence_noded
systemctl disable --now sbd
pcs property set stonith-watchdog-timeout=0
pcs cluster stop --all
pcs cluster start --all
pcs property unset stonith-watchdog-timeout
pcs resource create webmail MailTo email=<email_address> subject="CLUSTER-NOTIFICATIONS" --group=firstweb
vim /usr/local/bin/crm_notify.sh
pcs resource create mailme ClusterMon extra_options="-e <email_address> -E /usr/local/bin/crm_notify.sh" --clone
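ClusterMon hands event details to the external script through CRM_notify_* environment variables; a minimal sketch of what /usr/local/bin/crm_notify.sh could do (the log path is an example):
#!/bin/sh
# Append each cluster event reported by ClusterMon to a local log file.
echo "$(date '+%F %T') node=${CRM_notify_node} rsc=${CRM_notify_rsc} task=${CRM_notify_task} desc=${CRM_notify_desc} rc=${CRM_notify_rc}" >> /var/log/crm_notify.log
Remember to make the script executable (chmod 755 /usr/local/bin/crm_notify.sh).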
crmsh configuration file:
/etc/crm/crm.conf
~/.config/crm/crm.conf
~/.crm.rc
crm options user hacluster
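A minimal sketch of the INI-style crm.conf; the [core] keys shown are common examples, adjust to taste:
[core]
user = hacluster
editor = vim
pager = less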
cat /etc/sudoers
ls -lA /etc/sudoers.d/
crm configure property enable-acl=true
Now all users for whom you want to modify access rights with ACLs must belong to the haclient group.
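Once enable-acl is set, roles and users can be defined; a sketch using the pcs ACL commands with a hypothetical read-only user (crmsh offers equivalent role and acl_target configure objects):
pcs acl role create read_only description="Read-only access" read xpath /cib
pcs acl user create alice read_only
pcs acl enable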
View cluster status using crmsh:
crm status
watch -n 1 -c crm status
crm_mon -1
crm_mon -1Ar
Troubleshooting commands:
crm cluster health
crm_mon --one-shot --inactive
strace -Tttvfs 1024 -o /tmp/strace.out /usr/sbin/crm_mon --one-shot --inactive
/usr/sbin/crm configure show
/usr/sbin/crm configure show | grep cli-
/usr/sbin/crm configure show obscure:passw* | grep cli-
cibadmin -Q
Show raw configuration:
/usr/sbin/crm configure show xml
Show options:
crm configure show cib-bootstrap-options
crm configure show rsc-options
crm configure show op-options
crm configure show SAPHanaSR
Sometimes new features are only available with the latest CIB syntax version. When you upgrade to a new product version, your CIB syntax version will not be upgraded by default.
Check your version with:
cibadmin -Q | grep validate-with
Upgrade to the latest CIB syntax version with:
cibadmin --upgrade --force
crm ra classes
crm ra list ocf pacemaker
crm configure property stonith-enabled=true
crm configure property stonith-enabled=false
SAPHanaSR-showAttr
SAPHanaSR-monitor
cs_clusterstate -i
cs_show_error_patterns -c | grep -v "=.0"
cs_show_memory
cs_sum_base_config
SAP HANA:
cs_show_hana_autofailover --all
cs_show_hana_info --info $SID $nr
/hana/shared (HANA shared file system; must be available on every HANA node)
For the SAP instances to work, the following shared file systems must be available on all cluster nodes:
/sapmnt
/usr/sap/trans
The shared file systems can either be managed by the cluster or they can be statically mounted by adding them to the /etc/fstab on each cluster node.
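If the cluster manages them, each shared file system becomes a (typically cloned) Filesystem resource; a crmsh sketch assuming a hypothetical NFS export nfsserver:/export/sapmnt:
crm configure primitive fs_sapmnt ocf:heartbeat:Filesystem params device="nfsserver:/export/sapmnt" directory="/sapmnt" fstype=nfs op monitor interval=20s timeout=40s
crm configure clone cl_fs_sapmnt fs_sapmnt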
ps aux | grep <SID>adm | grep sapstartsrv
sapcontrol -nr $nr -function GetSystemInstanceList
sapcontrol -nr $nr -function HAGetFailoverConfig
sapcontrol -nr $nr -function HACheckFailoverConfig
#sapcontrol -nr $nr -function StopService
#sapcontrol -nr $nr -function StartService <SID>
#sapcontrol -nr $nr -function StartSystem
#sapcontrol -nr $nr -function StopSystem ALL
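sapcontrol is normally run as the <sid>adm user; a worked example assuming a hypothetical SID HA1 and instance number 00:
su - ha1adm
sapcontrol -nr 00 -function GetProcessList
sapcontrol -nr 00 -function HAGetFailoverConfig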
HDBSettings.sh systemOverview.py
HDBSettings.sh systemReplicationStatus.py; echo RC:$?
HDBSettings.sh landscapeHostConfiguration.py; echo RC:$?
hdbnsutil -sr_state
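The HANA-side checks are likewise run as the <sid>adm user; a sketch assuming the hypothetical SID HA1:
su - ha1adm -c "HDBSettings.sh systemReplicationStatus.py; echo RC:\$?"
su - ha1adm -c "hdbnsutil -sr_state"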