Note: this should only be done once you are sure you have a reliable TB mesh network.
This is because the Proxmox UI seems fragile with respect to changing the underlying network after ceph has been configured.
All installation is done via the command line because the GUI doesn't understand the mesh network.
This setup doesn't attempt to separate the ceph public network and the ceph cluster network (not the same as the Proxmox cluster network). The goal is to get an easy working setup.
**2025.04.24 NOTE:** some folks had to switch to IPv6 for ceph due to IPv4 unreliability issues. We think that as of pve 8.4.1, with all the input the community has given to update this set of gists, IPv4 is now reliable even on MS-01. As such I am advising everyone to use IPv4 for ceph, because if you use IPv6 you will have issues with SDN at this time (if you don't use SDN this is not an issue).
this gist is part of this series
- On all nodes execute the command `pveceph install --repository no-subscription`, accept all the packages and install
- On node 1 execute the command `pveceph init --network 10.0.0.81/24`
- On node 1 execute the command `pveceph mon create --mon-address 10.0.0.81`
- On node 2 execute the command `pveceph mon create --mon-address 10.0.0.82`
- On node 3 execute the command `pveceph mon create --mon-address 10.0.0.83`
Now if you access the GUI at Datacenter > pve1 > ceph > monitor you should have 3 running monitors (ignore any errors on the root ceph UI leaf for now).
If so you can proceed to the next step. If not, you probably have something wrong in your network, so check all settings.
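The command-line steps above can be collected into a small dry-run helper. This is only a sketch: `ceph_bootstrap_cmds` is a hypothetical name, and it assumes node N uses the mesh address 10.0.0.8N as in the examples above. It prints the commands rather than running them, so you can review them before pasting.

```shell
#!/bin/sh
# Sketch: print (not run) the per-node ceph bootstrap commands from the steps above.
# Assumption: node N uses mesh address 10.0.0.8N, as in this gist's examples.

# Hypothetical helper: print the commands for node number $1 (1, 2 or 3).
ceph_bootstrap_cmds() {
    node="$1"
    addr="10.0.0.8${node}"
    echo "pveceph install --repository no-subscription"
    # Only the first node initialises the cluster-wide ceph config.
    if [ "$node" -eq 1 ]; then
        echo "pveceph init --network ${addr}/24"
    fi
    echo "pveceph mon create --mon-address ${addr}"
}

# Example: show what would be run on each node.
for n in 1 2 3; do
    echo "# node ${n}"
    ceph_bootstrap_cmds "$n"
done
```

Once the monitors are created, running `ceph -s` on any node is another way to see how many mons are in quorum, alongside the GUI check.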
- On any node go to Datacenter > nodename > ceph > monitor and click create manager in the manager section.
- Select a node that doesn't have a manager from the drop-down and click create
- repeat step 2 as needed

If this fails it probably means your networking is not working
- On any node go to Datacenter > nodename > ceph > OSD
- click create OSD and select all the defaults (again, this is for a simple setup)
- repeat until all 3 nodes have an OSD (note it can take 30 seconds for a new OSD to go green)
 
If you find there are no available disks when you try to add an OSD, it probably means your dedicated nvme/ssd has some other filesystem or an old OSD on it. To wipe the disk use the following UI. Be careful not to wipe your OS disk.
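If you prefer the shell for wiping, one common approach is `ceph-volume lvm zap`. The sketch below wraps it in a hypothetical guard, `zap_cmd`, that refuses the device you name as the OS disk and otherwise only prints the command. The device names are placeholders, not taken from this gist, so check yours with `lsblk` first.

```shell
#!/bin/sh
# Sketch: build a disk-wipe command, refusing the OS disk.
# Assumption: you pass the OS disk explicitly; nothing is detected automatically.

# Hypothetical helper: zap_cmd <disk-to-wipe> <os-disk>
# Prints the wipe command, or fails if the two disks are the same.
zap_cmd() {
    target="$1"
    os_disk="$2"
    if [ -z "$target" ] || [ -z "$os_disk" ]; then
        echo "usage: zap_cmd <disk-to-wipe> <os-disk>" >&2
        return 2
    fi
    if [ "$target" = "$os_disk" ]; then
        echo "refusing: $target is the OS disk" >&2
        return 1
    fi
    # --destroy also removes LVM volume groups and logical volumes left by an old OSD.
    echo "ceph-volume lvm zap $target --destroy"
}

# Example with placeholder device names (prints the command, does not run it).
zap_cmd /dev/nvme1n1 /dev/nvme0n1
```

The helper only compares the two names you give it, so it is a reminder to name the OS disk out loud rather than real protection.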

- On any node go to Datacenter > nodename > ceph > pools and click create
- name the volume, e.g. `vm-disks`, leave the defaults as is and click create
- On any node go to Datacenter > options
- Set Cluster Resource Scheduling to `ha-rebalance-on-start=1` (this will rebalance nodes as needed)
- Set HA Settings to `shutdown_policy=migrate` (this will migrate VMs and CTs if you gracefully shut down a node).
- Set migration settings: leave as default (a separate gist will talk about separating the migration network later)
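To double-check the options from the shell: these settings are stored in `/etc/pve/datacenter.cfg`, and the sketch below assumes they appear there as a `crs:` line and an `ha:` line. `check_ha_opts` is a hypothetical helper name, and the demo runs against a throwaway file rather than the real config.

```shell
#!/bin/sh
# Sketch: check that a datacenter.cfg contains the two options set above.
# Assumption: the GUI stores them as "crs:" and "ha:" lines in /etc/pve/datacenter.cfg.

# Hypothetical helper: check_ha_opts [path-to-datacenter.cfg]
check_ha_opts() {
    cfg="${1:-/etc/pve/datacenter.cfg}"
    grep -q '^crs:.*ha-rebalance-on-start=1' "$cfg" &&
        grep -q '^ha:.*shutdown_policy=migrate' "$cfg"
}

# Demo against a throwaway file that holds the expected lines.
demo="$(mktemp)"
printf '%s\n' 'crs: ha-rebalance-on-start=1' 'ha: shutdown_policy=migrate' > "$demo"
if check_ha_opts "$demo"; then
    echo "both options present"
else
    echo "options missing"
fi
rm -f "$demo"
```

On a node, calling `check_ha_opts` with no argument reads the real file.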
This is my blind attempt at ensuring ceph doesn't try to start until the frr service is up. I don't have any tests in the startup to make sure the interfaces are up, so it may not make much difference to MS-01 users, but anyhoo here it is...
edit /usr/lib/systemd/system/ceph.target to look like this
[Unit]
Description=ceph target allowing to start/stop all ceph*@.service instances at once
After=frr.service
Requires=frr.service
[Install]
WantedBy=multi-user.target
(note: i need to revise this, as this file could be overwritten on upgrade)
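One way around the upgrade problem might be a systemd drop-in under `/etc/systemd/system/ceph.target.d/` instead of editing the packaged unit. The sketch below is untested on a real cluster: `write_ceph_frr_dropin` and the file name `frr.conf` are made up for illustration, and the demo writes into a temporary directory so nothing on the system changes.

```shell
#!/bin/sh
# Sketch: put the frr dependency in a drop-in instead of editing the packaged unit.
# Assumptions: the drop-in file name "frr.conf" is arbitrary; the directory
# argument exists only so this can be tried outside /etc first.

# Hypothetical helper: write_ceph_frr_dropin [systemd-dir]
write_ceph_frr_dropin() {
    base="${1:-/etc/systemd/system}"
    mkdir -p "$base/ceph.target.d" || return 1
    cat > "$base/ceph.target.d/frr.conf" <<'EOF'
[Unit]
After=frr.service
Requires=frr.service
EOF
}

# Dry run into a temporary directory, then show the generated file.
tmp="$(mktemp -d)"
write_ceph_frr_dropin "$tmp"
cat "$tmp/ceph.target.d/frr.conf"
rm -rf "$tmp"
```

On a node you would call `write_ceph_frr_dropin` with no argument and then run `systemctl daemon-reload` so systemd picks up the new file.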
- make a directory: `mkdir /etc/systemd/system/pvestatd.service.d`
- create a file: `nano /etc/systemd/system/pvestatd.service.d/dependencies.conf`
- add the following to the file
[Unit]
After=pve-storage.target
- save
(note: i am unclear whether this currently works, despite this being the recommended answer)
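To at least see whether systemd has picked up the ordering, you can look at the unit's `After=` list with `systemctl show -p After pvestatd.service`. The sketch below parses one such line; `has_after_dep` is a hypothetical helper and the demo input is made up. It only shows that the dependency is registered, not that it fixes the underlying issue.

```shell
#!/bin/sh
# Sketch: check whether a unit's After= list contains a given dependency.
# Assumption: stdin is one line in the form printed by `systemctl show -p After <unit>`.

# Hypothetical helper: has_after_dep <dependency>   (reads the After= line on stdin)
has_after_dep() {
    dep="$1"
    # Strip the "After=" prefix, split on spaces, look for an exact fixed-string match.
    sed 's/^After=//' | tr ' ' '\n' | grep -qxF "$dep"
}

# Demo with a made-up line; on a node you would pipe in the real output:
#   systemctl show -p After pvestatd.service | has_after_dep pve-storage.target
if echo "After=basic.target pve-storage.target" | has_after_dep pve-storage.target; then
    echo "dependency present"
else
    echo "dependency missing"
fi
```

Remember to run `systemctl daemon-reload` after creating the drop-in, otherwise the real output will not include it.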

@turdf someone noted there was a typo in the post-up line in the mesh gist. It might be worth checking the old deprecated one for that change (i corrected it, the sleep was in the wrong place due to a bad cut and paste on my part) or moving to the new way i have mine set up. The new one has a new way of bouncing frr once the thunderbolt ports are up, and it may be more reliable...
also i added the ceph.target change above in an attempt to make sure ceph doesn't start until frr does. i am unclear whether this is actually needed, as multi-user.target.wants didn't start until frr was up... so this may be redundant. So on an ms-01 this may or may not help.
we really need to find a way for ceph to not start unless one of the thunderbolt ports is up........
(sorry for slow reply i have been awol for a few months due to brain surgery in dec)