Setting up the Ceph cluster

Ceph HA Setup

Note: this should only be done once you are sure you have a reliable TB mesh network, because the Proxmox UI seems fragile with respect to changing the underlying network after Ceph has been configured.

All installation is done via the command line because the GUI does not understand the mesh network.

This setup doesn't attempt to separate the Ceph public network and the Ceph cluster network (not the same as the Proxmox cluster network); the goal is an easy working setup, so both Ceph networks sit on the mesh subnet, as sketched below.
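For reference, a minimal sketch of what the relevant ceph.conf entries look like after the init step below, assuming the 10.0.0.0/24 mesh subnet used throughout this gist:

```sh
# Inspect the two Ceph network settings after `pveceph init` (next section).
grep -E 'cluster_network|public_network' /etc/pve/ceph.conf
# Expected output -- both on the Thunderbolt mesh, since this guide
# deliberately does not separate them:
#   cluster_network = 10.0.0.81/24
#   public_network = 10.0.0.81/24
```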

This gist is part of this series.

Ceph Initial Install & Monitor Creation

  1. On all nodes execute `pveceph install --repository no-subscription`, accepting all the packages it wants to install.
  2. On node 1 execute `pveceph init --network 10.0.0.81/24`
  3. On node 1 execute `pveceph mon create --mon-address 10.0.0.81`
  4. On node 2 execute `pveceph mon create --mon-address 10.0.0.82`
  5. On node 3 execute `pveceph mon create --mon-address 10.0.0.83`
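A quick way to confirm the monitors formed a quorum before moving on (a sketch using standard Proxmox/Ceph status commands):

```sh
# Overall cluster state; health may show warnings until OSDs exist.
pveceph status
# Monitor quorum -- should list all three mons (e.g. pve1, pve2, pve3).
ceph mon stat
```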

Now if you access the GUI at Datacenter > pve1 > ceph > monitor you should have 3 running monitors (ignore any errors on the root Ceph UI leaf for now).

If so, you can proceed to the next step. If not, you probably have something wrong in your network; check all settings.

Add Additional Managers

  1. On any node go to Datacenter > nodename > ceph > monitor and click create manager in the manager section.
  2. Select a node that doesn't have a manager from the drop-down and click create.
  3. Repeat step 2 as needed. If this fails it probably means your networking is not working. (A CLI equivalent is sketched below.)
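If you prefer the shell, the same thing can be done with `pveceph mgr create` (a sketch; run it on each node that should carry a manager):

```sh
# Run on each node that needs a manager; the node that created the
# first monitor may already have one.
pveceph mgr create
```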

Add OSDs

  1. On any node go to Datacenter > nodename > ceph > OSD.
  2. Click create OSD and accept all the defaults (again, this is for a simple setup).
  3. Repeat until each of the 3 nodes has an OSD (note it can take ~30 seconds for a new OSD to go green). A CLI sketch follows this list.
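The command-line equivalent is one command per node (a sketch; `/dev/nvme0n1` is a placeholder for your dedicated Ceph disk):

```sh
# Run on each node, pointing at the dedicated NVMe/SSD.
# WARNING: the whole device is handed over to Ceph.
pveceph osd create /dev/nvme0n1
```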

If you find there are no available disks when you try to add an OSD, it probably means your dedicated NVMe/SSD has some other filesystem or an old OSD on it. Wipe the disk using the UI's wipe-disk function (or from the shell, as sketched below). Be careful not to wipe your OS disk.
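A hedged shell equivalent for reclaiming the disk (assumes `/dev/nvme0n1` is the disk to wipe; verify the device name first):

```sh
# Identify the disk first -- wiping the wrong device destroys its data.
lsblk
# Clear old Ceph LVM/OSD metadata and filesystem signatures.
ceph-volume lvm zap /dev/nvme0n1 --destroy
```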

Create Pool

  1. On any node go to Datacenter > nodename > ceph > pools and click create.
  2. Name the pool, e.g. vm-disks, leave the defaults as-is, and click create.
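The CLI equivalent is roughly the following (a sketch; `--add_storages` also registers the pool as a Proxmox storage entry, which is what the GUI does by default):

```sh
# Create a replicated pool with the default size/min_size (3/2)
# and add it as Proxmox storage for VM disks.
pveceph pool create vm-disks --add_storages
```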

Configure HA

  1. On any node go to Datacenter > options.
  2. Set Cluster Resource Scheduling to ha-rebalance-on-start=1 (this will rebalance nodes as needed).
  3. Set HA Settings to shutdown_policy=migrate (this will migrate VMs and CTs if you gracefully shut down a node).
  4. Leave the migration settings at their defaults (a separate gist will cover separating the migration network later). These options map to two lines in /etc/pve/datacenter.cfg, sketched below.
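For reference, a sketch of where those GUI options land (key names per the Proxmox datacenter.cfg schema; verify against your own file):

```sh
# The two HA settings above correspond to these datacenter.cfg lines:
cat /etc/pve/datacenter.cfg
# crs: ha-rebalance-on-start=1
# ha: shutdown_policy=migrate
```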
@mrkhachaturov commented Jan 9, 2025

@IndianaJoe1216 check this guide

With 6 nodes I think I will use the Thunderbolt network only for migration and maybe the Ceph cluster network.
For the Ceph public network I think it is better to use the 10G interface.
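For reference, that split boils down to putting the two Ceph networks on different subnets (a sketch; 10.0.0.0/24 for the TB mesh and 192.168.10.0/24 as a placeholder 10G subnet):

```sh
# Hypothetical split: OSD replication on the TB mesh, clients on 10G.
grep -E 'cluster_network|public_network' /etc/pve/ceph.conf
#   cluster_network = 10.0.0.0/24      # Thunderbolt mesh (OSD <-> OSD)
#   public_network = 192.168.10.0/24   # 10G network (clients, monitors)
```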

@IndianaJoe1216
@mrkhachaturov reviewing this now. I am doing the same: Thunderbolt network only for the Ceph backend, and the public network on the 10G interface, because that is essentially what the VMs will have access to.

@taslabs-net commented Feb 28, 2025

After many nights, at least 4, I have this working with 10GbE SFP+ for my public network and TB4 for my Ceph cluster network. I got into it, blacked out, and now here I am. It comes up after a reboot. I feel like I'm late to this party.

[Screenshot from 2025-02-27]

I promise I'm being serious, but is this good? Or should I be able to move faster? Or am I reaching the limits of my drives?
