@mshafiee
Created December 12, 2019 07:59
DRBD master/slave setup
DRBD (Distributed Replicated Block Device) is a really cool cluster solution for keeping redundant data on two or more nodes. Everybody knows RAID; DRBD is like RAID 1 over the network.
Install the DRBD utilities package (this will work on Debian and Ubuntu):
apt-get install drbd-utils
In my case DRBD runs under the control of my cluster software, meaning it will be managed by Corosync, so I remove DRBD from the boot-time init links:
insserv -v -r drbd
We need to load the kernel module now and tell Linux to load it again on every boot:
modprobe drbd
echo 'drbd' >> /etc/modules
lsmod | grep drbd
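Note that the echo line above appends blindly and will duplicate the entry if run twice. A small idempotent sketch (shown against a scratch file so it is safe to try anywhere; point MODULES_FILE at /etc/modules on the real node):

```shell
# Append 'drbd' to the modules file only if it is not already listed.
# MODULES_FILE is a scratch copy here; use /etc/modules on the node.
MODULES_FILE=$(mktemp)
grep -qxF 'drbd' "$MODULES_FILE" || echo 'drbd' >> "$MODULES_FILE"
# Running it a second time changes nothing:
grep -qxF 'drbd' "$MODULES_FILE" || echo 'drbd' >> "$MODULES_FILE"
```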
By the way, this could be your network configuration for the nodes – sync over a dedicated crossover cable:
# Cluster Sync IFACE on node 1
auto eth1
iface eth1 inet static
address 169.254.0.1
netmask 255.255.255.248
# Cluster Sync IFACE on node2
auto eth1
iface eth1 inet static
address 169.254.0.4
netmask 255.255.255.248
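Both addresses must fall inside the same /29 for the sync link to work. A quick sanity check (the net helper below is just an illustration; it only masks the last octet, which is enough under 255.255.255.248):

```shell
# Compute the network address of an IP under the /29 mask above by
# AND-ing the last octet with 248; both nodes must yield the same net.
net() {
  last=${1##*.}                   # last octet of the address
  echo "${1%.*}.$((last & 248))"  # 248 clears the low 3 host bits
}
net 169.254.0.1   # -> 169.254.0.0
net 169.254.0.4   # -> 169.254.0.0
```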
Configure DRBD
vim /etc/drbd.conf
# I comment this out and use my own resource file instead
#include "drbd.d/global_common.conf";
include "drbd.d/*.res";
Define my resource:
vim /etc/drbd.d/zeldor.res
# Content of zeldor.res
global {
    usage-count no;
}

common {
    syncer { rate 100M; }
}

resource drbd0 {
    protocol C;

    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }

    startup {
        degr-wfc-timeout 120;    # 2 minutes.
    }

    disk {
        on-io-error detach;
    }

    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }

    syncer {
        rate 100M;
        al-extents 257;
    }

    on kvm-1 {
        device /dev/drbd0;
        disk /dev/sda5;
        address 169.254.0.1:7788;
        flexible-meta-disk internal;
    }

    on kvm-2 {
        device /dev/drbd0;
        disk /dev/sda5;
        address 169.254.0.4:7788;
        flexible-meta-disk internal;
    }
}
A sync rate of 100M, as configured above, is about what it takes to fill a gigabit interlink connection.
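To see where that number comes from: gigabit Ethernet carries at most 1000 Mbit/s, while the syncer rate is given in bytes. A back-of-the-envelope check:

```shell
# Gigabit = 1000 Mbit/s; divide by 8 to get MB/s, then see what
# fraction of the raw link a 100M syncer rate would occupy.
link_mbit=1000
link_mbyte=$((link_mbit / 8))     # 125 MB/s theoretical ceiling
rate=100                          # the 100M from the config above
pct=$((rate * 100 / link_mbyte))  # 80 percent of the raw link
echo "${link_mbyte} MB/s max, rate uses ${pct}%"
```

In practice protocol overhead eats part of the link, so 100M is effectively a full-speed resync.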
First start of DRBD
Before you start DRBD, fix the permissions so that members of the haclient group can manage the devices:
chgrp haclient /sbin/drbdsetup
chmod o-x /sbin/drbdsetup
chmod u+s /sbin/drbdsetup
chgrp haclient /sbin/drbdmeta
chmod o-x /sbin/drbdmeta
chmod u+s /sbin/drbdmeta
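The net effect of the chmod calls can be checked on a scratch file (the chgrp haclient step is left out here, since that group only exists on a cluster node):

```shell
# Reproduce the mode changes on a temporary file: strip execute from
# "other", then add the setuid bit. Starting from the usual 755 this
# yields mode 4754 (rwsr-xr--).
f=$(mktemp)
chmod 755 "$f"     # typical starting mode of /sbin/drbdsetup
chmod o-x "$f"     # others may no longer execute -> 754
chmod u+s "$f"     # setuid (runs as root when owned by root) -> 4754
stat -c '%a' "$f"  # prints 4754 (GNU stat)
```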
Start DRBD:
/etc/init.d/drbd start
Now we are ready to create the metadata for the drbd0 device:
drbdadm create-md drbd0
Restart DRBD to pick up the newly created resource:
/etc/init.d/drbd restart
Execute this on the node which should be primary (master); all other nodes will automatically be set to slave mode:
drbdadm -- --overwrite-data-of-peer primary drbd0
That's all; the DRBD device is now syncing.
root@kvm-fw1:~> drbd-overview
0:drbd0 SyncSource Primary/Secondary UpToDate/Inconsistent C r----
[>....................] sync'ed: 1.0% (404800/408820)M
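The status line is easy to feed into monitoring. A sketch that pulls the connection state and roles out of the sample line above (field positions are those of drbd-overview on DRBD 8.x):

```shell
# Split a drbd-overview status line into connection state and roles.
line='0:drbd0 SyncSource Primary/Secondary UpToDate/Inconsistent C r----'
cstate=$(echo "$line" | awk '{print $2}')  # SyncSource
roles=$(echo "$line" | awk '{print $3}')   # Primary/Secondary
echo "state=$cstate roles=$roles"
```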
If the sync is too slow, tuning the syncer rate is the place to start optimizing.
Now you can choose between a static or dynamic usage of the DRBD device
Static:
You can create a filesystem:
mkfs.ext3 /dev/drbd0
# or
mkfs.ext4 /dev/drbd0
Dynamic:
You can use it as an LVM physical volume and stay free to do whatever you want with it. (my choice)
A little fix for lvm:
vim /etc/lvm/lvm.conf
# By default we accept every block device:
#filter = [ "a/.*/" ]
# We want only drbd0
filter = [ "a|drbd0|", "r|.*|" ]
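LVM evaluates the filter patterns in order and the first match wins: a|…| accepts, r|…| rejects. A tiny shell re-implementation of that rule for the filter above, just to illustrate the behavior:

```shell
# Mimic LVM's first-match-wins filter for [ "a|drbd0|", "r|.*|" ].
lvm_filter() {
  case "$1" in
    *drbd0*) echo accept ;;  # a|drbd0| matches anywhere in the path
    *)       echo reject ;;  # r|.*| rejects everything else
  esac
}
lvm_filter /dev/drbd0  # -> accept
lvm_filter /dev/sda5   # -> reject (keeps LVM off the backing disk)
```

Rejecting /dev/sda5 matters: without the filter LVM would see the same physical-volume signature on both the backing partition and the DRBD device.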
pvcreate /dev/drbd0
Create a volume group:
vgcreate zeldor /dev/drbd0
Create logical volume:
lvcreate -n data --size 100g zeldor
Interested? Read more about LVM.
Troubleshooting and Maintenance section
Split-Brain Solution: primary/unknown and secondary/unknown
1. First, unmount the DRBD device if you can.
2. On the primary node, issue:
drbdadm connect all
3. On the secondary (faulty) node, execute the following (this will discard all data on that node and resync it from the primary):
drbdadm -- --discard-my-data connect all
Some other commands
Will set to master:
drbdadm primary all
Will set to slave:
drbdadm secondary all
Human readable DRBD status:
drbdadm dstate drbd0
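The dstate output has the form local/peer, e.g. UpToDate/UpToDate. Splitting it is handy in health-check scripts (a sample value is assumed below):

```shell
# Split a dstate string into the local and peer disk states.
dstate='UpToDate/Inconsistent'  # example output during a resync
local_ds=${dstate%/*}           # part before the slash: UpToDate
peer_ds=${dstate#*/}            # part after the slash: Inconsistent
echo "local=$local_ds peer=$peer_ds"
```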
Start a manual resync (this invalidates all data on the local node):
drbdadm invalidate all
Start a manual resync from the other node:
drbdadm invalidate-remote all
If you have more than one resource, be careful and invalidate only the one you mean:
drbdadm invalidate resource-name