Note: only do this once you are sure you have a reliable TB mesh network.
This is because the Proxmox UI seems fragile with respect to changing the underlying network after Ceph has been configured.
All installation is done via the command line because the GUI does not understand the mesh network.
This setup doesn't attempt to separate the Ceph public network and the Ceph cluster network (not the same as the Proxmox cluster network). The goal is to get an easy working setup.
This gist is part of this series.
- On all nodes execute the command
pveceph install --repository no-subscription
Accept all the packages and install.
- On node 1 execute the command
pveceph init --network 10.0.0.81/24
- On node 1 execute the command
pveceph mon create --mon-address 10.0.0.81
- On node 2 execute the command
pveceph mon create --mon-address 10.0.0.82
- On node 3 execute the command
pveceph mon create --mon-address 10.0.0.83
Now if you access the GUI at Datacenter > pve1 > ceph > monitor
you should have 3 running monitors (ignore any errors on the root ceph UI leaf for now).
If so, you can proceed to the next step. If not, you probably have something wrong in your network; check all settings.
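The same check can be sketched from the shell on any node. This is a minimal sketch, not part of the original steps: it assumes the three monitor addresses used above (10.0.0.81 to 10.0.0.83), and `mon_address` and `check_mesh` are hypothetical helper names. `ceph mon stat` and `ceph -s` are standard Ceph CLI commands.

```shell
# Sketch: confirm from the shell that the monitors are reachable and in quorum.
# Assumes the three-node addressing used in this gist (10.0.0.81 .. 10.0.0.83).
EXPECTED_MONS=3

# Print the mesh address used for a node's monitor (node 1 -> 10.0.0.81, ...).
mon_address() {
    echo "10.0.0.8$1"
}

# Ping each monitor address once over the mesh and report the result.
check_mesh() {
    local n
    for n in $(seq 1 "$EXPECTED_MONS"); do
        if ping -c 1 -W 2 "$(mon_address "$n")" >/dev/null 2>&1; then
            echo "node $n ($(mon_address "$n")): reachable"
        else
            echo "node $n ($(mon_address "$n")): NOT reachable"
        fi
    done
}

if command -v ceph >/dev/null 2>&1; then
    check_mesh
    ceph mon stat   # one-line summary of the monitors and the current quorum
    ceph -s         # overall cluster status
else
    echo "ceph CLI not found; run this on a cluster node"
fi
```

If a node shows as not reachable here, fix the mesh network first; the monitor will not join the quorum until it is.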
- On any node go to
Datacenter > nodename > ceph > monitor
and click create manager in the manager section.
- Select a node that doesn't have a manager from the drop-down and click
create
- Repeat the previous step as needed.
If this fails it probably means your networking is not working.
- On any node go to
Datacenter > nodename > ceph > OSD
- Click
create OSD
and select all the defaults (again, this is for a simple setup).
- Repeat until you have 3 nodes like this (note it can take 30 seconds for a new OSD to go green).
If you find there are no available disks when you try to add an OSD, it probably means your dedicated NVMe/SSD has some other filesystem or an old OSD on it. To wipe the disk use the following UI. Be careful not to wipe your OS disk.
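Before wiping anything, it can help to confirm from the shell that the target disk has nothing mounted on it. The following is a cautious sketch rather than the author's procedure: the device name `/dev/nvme0n1` is a placeholder, `has_mountpoint` is a hypothetical helper, and the script only prints the wipe commands (`ceph-volume lvm zap` and `pveceph osd create`) instead of running them.

```shell
# Sketch: inspect a disk before wiping it for use as an OSD.
# /dev/nvme0n1 is a placeholder; pass your own device as the first argument.
DISK="${1:-/dev/nvme0n1}"

# Succeeds if any line on stdin is non-empty, i.e. lsblk reported a mountpoint.
has_mountpoint() {
    grep -q '[^[:space:]]'
}

if [ -b "$DISK" ]; then
    # Show the disk, its partitions and any filesystems on them.
    lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT "$DISK"
    if lsblk -nro MOUNTPOINT "$DISK" | has_mountpoint; then
        echo "$DISK has mounted filesystems (OS disk?); refusing to suggest a wipe"
    else
        echo "Nothing is mounted on $DISK. To wipe it and create the OSD, run:"
        echo "  ceph-volume lvm zap $DISK --destroy"
        echo "  pveceph osd create $DISK"
    fi
else
    echo "$DISK is not a block device on this host"
fi
```

The mountpoint check is only a guard against the most obvious mistake; still compare the size and model shown by `lsblk` against the disk you intend to use.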
- On any node go to
Datacenter > nodename > ceph > pools
and click create
- Name the pool, e.g.
vm-disks
leave the defaults as they are and click create
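For reference, a rough command-line equivalent of the pool step might look like the sketch below. It assumes the pool name `vm-disks` from above and that you want the pool registered as VM/CT storage, which is what `--add_storages 1` requests. The `run` wrapper and the `APPLY` variable are hypothetical conveniences: by default the script only prints the commands.

```shell
# Sketch: create the pool from the shell instead of the GUI.
POOL="vm-disks"          # pool name used in the step above
APPLY="${APPLY:-0}"      # 0 = only print the commands, 1 = execute them

# Print a command, or execute it when APPLY=1.
run() {
    if [ "$APPLY" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

run pveceph pool create "$POOL" --add_storages 1   # create pool and add it as storage
run pveceph pool ls                                # list pools to confirm
```

Run it once as-is to review the commands, then again with `APPLY=1` on one node to execute them.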
- On any node go to
Datacenter > options
- Set
Cluster Resource Scheduling
to ha-rebalance-on-start=1
(this will rebalance nodes as needed).
- Set
HA Settings
to shutdown_policy=migrate
(this will migrate VMs and CTs if you gracefully shut down a node).
- Leave
migration settings
as default (a separate gist will talk about separating the migration network later).
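To double-check that the options were saved, one option is to look at the datacenter configuration file. This read-only sketch assumes the GUI writes these settings to `/etc/pve/datacenter.cfg` as `ha:` and `crs:` lines; `has_setting` is a hypothetical helper, and the exact line layout may differ between Proxmox versions.

```shell
# Sketch: verify the HA and CRS options in the datacenter config (read-only).
CFG="${CFG:-/etc/pve/datacenter.cfg}"

# Succeeds if the line starting with "KEY:" in FILE contains SETTING.
# Usage: has_setting FILE KEY SETTING
has_setting() {
    grep -E "^$2:" "$1" 2>/dev/null | grep -qF -- "$3"
}

for pair in "ha shutdown_policy=migrate" "crs ha-rebalance-on-start=1"; do
    key=${pair%% *}       # text before the first space, e.g. "ha"
    setting=${pair#* }    # text after the first space, e.g. "shutdown_policy=migrate"
    if has_setting "$CFG" "$key" "$setting"; then
        echo "ok: $key has $setting"
    else
        echo "missing: $key: $setting"
    fi
done
```

Because `/etc/pve` is shared across the cluster, checking on one node should be enough.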
The fix for clock skew warnings was to check the NTP settings: is time sync working? ;)
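A quick way to check time sync on each node is sketched below. It assumes chrony is the time daemon (the default on recent Proxmox releases, but verify on your install); `have` is a hypothetical helper, and each check is skipped if its tool is not installed.

```shell
# Sketch: check time synchronization on a node and Ceph's view of clock skew.

# Succeeds if the named command is available on this host.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Does systemd report the system clock as NTP-synchronized (yes/no)?
if have timedatectl; then
    echo "NTP synchronized: $(timedatectl show -p NTPSynchronized --value 2>/dev/null)"
fi

# chrony: current time source and offset, if chronyd is running.
if have chronyc; then
    chronyc tracking || echo "chronyd not reachable"
fi

# Ceph: per-monitor skew and any clock skew health warning.
if have ceph; then
    ceph time-sync-status
    ceph health detail
fi
```

Run it on all three nodes; a node that reports `no` for NTP synchronization is the likely source of the skew warning.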