@mshafiee
Created December 12, 2019 07:59
DRBD master/slave setup
DRBD (Distributed Replicated Block Device) is a really cool cluster solution for keeping redundant data on two or more nodes. Everybody knows RAID; DRBD is like RAID 1 over the network.
Install the DRBD utilities package (this will work on Debian and Ubuntu):
apt-get install drbd-utils
In my case DRBD runs under the control of my cluster software, meaning it will be managed by Corosync, so I remove DRBD from the boot-time init links:
insserv -v -r drbd
We need to load the kernel module now and tell Linux to load it again on every boot:
modprobe drbd
echo 'drbd' >> /etc/modules
lsmod | grep drbd
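Note that the echo line above appends blindly and will duplicate the entry if run twice. A small idempotent sketch (shown against a scratch file so it is safe to try anywhere; point MODULES_FILE at /etc/modules on the real node):

```shell
# Append 'drbd' to the modules file only if it is not already listed.
# MODULES_FILE is a scratch copy here; use /etc/modules on the node.
MODULES_FILE=$(mktemp)
grep -qxF 'drbd' "$MODULES_FILE" || echo 'drbd' >> "$MODULES_FILE"
# Running it a second time changes nothing:
grep -qxF 'drbd' "$MODULES_FILE" || echo 'drbd' >> "$MODULES_FILE"
```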
By the way, this could be your network configuration for the nodes – sync over a dedicated crossover cable:
# Cluster Sync IFACE on node 1
auto eth1
iface eth1 inet static
address 169.254.0.1
netmask 255.255.255.248
# Cluster Sync IFACE on node2
auto eth1
iface eth1 inet static
address 169.254.0.4
netmask 255.255.255.248
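Both addresses must fall inside the same /29 for the sync link to work. A quick sanity check (the net helper below is just an illustration; it only masks the last octet, which is enough under 255.255.255.248):

```shell
# Compute the network address of an IP under the /29 mask above by
# AND-ing the last octet with 248; both nodes must yield the same net.
net() {
  last=${1##*.}                   # last octet of the address
  echo "${1%.*}.$((last & 248))"  # 248 clears the low 3 host bits
}
net 169.254.0.1   # -> 169.254.0.0
net 169.254.0.4   # -> 169.254.0.0
```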
Configure DRBD
vim /etc/drbd.conf
# I comment this out and use my own resource file instead
#include "drbd.d/global_common.conf";
include "drbd.d/*.res";
Define my resource:
vim /etc/drbd.d/zeldor.res
# Content of zeldor.res
global {
    usage-count no;
}

common {
    syncer { rate 100M; }
}

resource drbd0 {
    protocol C;

    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }

    startup {
        degr-wfc-timeout 120;    # 2 minutes.
    }

    disk {
        on-io-error detach;
    }

    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }

    syncer {
        rate 100M;
        al-extents 257;
    }

    on kvm-1 {
        device /dev/drbd0;
        disk /dev/sda5;
        address 169.254.0.1:7788;
        flexible-meta-disk internal;
    }

    on kvm-2 {
        device /dev/drbd0;
        disk /dev/sda5;
        address 169.254.0.4:7788;
        flexible-meta-disk internal;
    }
}
A sync rate of 100M, as configured above, is about what it takes to fill a gigabit interlink connection.
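To see where that number comes from: gigabit Ethernet carries at most 1000 Mbit/s, while the syncer rate is given in bytes. A back-of-the-envelope check:

```shell
# Gigabit = 1000 Mbit/s; divide by 8 to get MB/s, then see what
# fraction of the raw link a 100M syncer rate would occupy.
link_mbit=1000
link_mbyte=$((link_mbit / 8))     # 125 MB/s theoretical ceiling
rate=100                          # the 100M from the config above
pct=$((rate * 100 / link_mbyte))  # 80 percent of the raw link
echo "${link_mbyte} MB/s max, rate uses ${pct}%"
```

In practice protocol overhead eats part of the link, so 100M is effectively a full-speed resync.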
First start of DRBD
Before you start DRBD, fix the permissions so that members of the haclient group can manage the devices:
chgrp haclient /sbin/drbdsetup
chmod o-x /sbin/drbdsetup
chmod u+s /sbin/drbdsetup
chgrp haclient /sbin/drbdmeta
chmod o-x /sbin/drbdmeta
chmod u+s /sbin/drbdmeta
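The net effect of the chmod calls can be checked on a scratch file (the chgrp haclient step is left out here, since that group only exists on a cluster node):

```shell
# Reproduce the mode changes on a temporary file: strip execute from
# "other", then add the setuid bit. Starting from the usual 755 this
# yields mode 4754 (rwsr-xr--).
f=$(mktemp)
chmod 755 "$f"     # typical starting mode of /sbin/drbdsetup
chmod o-x "$f"     # others may no longer execute -> 754
chmod u+s "$f"     # setuid (runs as root when owned by root) -> 4754
stat -c '%a' "$f"  # prints 4754 (GNU stat)
```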
Start DRBD:
/etc/init.d/drbd start
Now we are ready to create the metadata for the drbd0 device:
drbdadm create-md drbd0
Restart DRBD to pick up the newly created resource:
/etc/init.d/drbd restart
Execute this on the node which should be primary (master); all other nodes will automatically be set to slave mode:
drbdadm -- --overwrite-data-of-peer primary drbd0
That's all; the DRBD device is now syncing.
root@kvm-fw1:~> drbd-overview
0:drbd0 SyncSource Primary/Secondary UpToDate/Inconsistent C r----
[>....................] sync'ed: 1.0% (404800/408820)M
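The status line is easy to feed into monitoring. A sketch that pulls the connection state and roles out of the sample line above (field positions are those of drbd-overview on DRBD 8.x):

```shell
# Split a drbd-overview status line into connection state and roles.
line='0:drbd0 SyncSource Primary/Secondary UpToDate/Inconsistent C r----'
cstate=$(echo "$line" | awk '{print $2}')  # SyncSource
roles=$(echo "$line" | awk '{print $3}')   # Primary/Secondary
echo "state=$cstate roles=$roles"
```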
If the sync is too slow, tuning the syncer rate is the place to start optimizing.
Now you can choose between a static or dynamic usage of the DRBD device
Static:
You can create a filesystem:
mkfs.ext3 /dev/drbd0
# or
mkfs.ext4 /dev/drbd0
Dynamic:
You can use it as an LVM physical volume and stay free to do whatever you want with it. (my choice)
A little fix for lvm:
vim /etc/lvm/lvm.conf
# By default we accept every block device:
#filter = [ "a/.*/" ]
# We want only drbd0
filter = [ "a|drbd0|", "r|.*|" ]
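LVM evaluates the filter patterns in order and the first match wins: a|…| accepts, r|…| rejects. A tiny shell re-implementation of that rule for the filter above, just to illustrate the behavior:

```shell
# Mimic LVM's first-match-wins filter for [ "a|drbd0|", "r|.*|" ].
lvm_filter() {
  case "$1" in
    *drbd0*) echo accept ;;  # a|drbd0| matches anywhere in the path
    *)       echo reject ;;  # r|.*| rejects everything else
  esac
}
lvm_filter /dev/drbd0  # -> accept
lvm_filter /dev/sda5   # -> reject (keeps LVM off the backing disk)
```

Rejecting /dev/sda5 matters: without the filter LVM would see the same physical-volume signature on both the backing partition and the DRBD device.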
pvcreate /dev/drbd0
Create a volume group:
vgcreate zeldor /dev/drbd0
Create logical volume:
lvcreate -n data --size 100g zeldor
Interested? Read more about LVM.
Troubleshooting and Maintenance section
Split-Brain Solution: primary/unknown and secondary/unknown
1. First, unmount the DRBD device if you can.
2. On the primary node, issue:
drbdadm connect all
3. On the secondary (faulty) node, execute the following (this will discard all data on that node and resync it from the primary):
drbdadm -- --discard-my-data connect all
Some other commands
Will set to master:
drbdadm primary all
Will set to slave:
drbdadm secondary all
Human readable DRBD status:
drbdadm dstate drbd0
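The dstate output has the form local/peer, e.g. UpToDate/UpToDate. Splitting it is handy in health-check scripts (a sample value is assumed below):

```shell
# Split a dstate string into the local and peer disk states.
dstate='UpToDate/Inconsistent'  # example output during a resync
local_ds=${dstate%/*}           # part before the slash: UpToDate
peer_ds=${dstate#*/}            # part after the slash: Inconsistent
echo "local=$local_ds peer=$peer_ds"
```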
Start a manual resync (this invalidates all data on the local node):
drbdadm invalidate all
Start a manual resync from the other node:
drbdadm invalidate-remote all
If you have more than one resource, be careful and invalidate only the one you mean:
drbdadm invalidate resource-name