Build a Proxmox High Available cluster with Ceph

Part of collection: Hyper-converged Homelab with Proxmox

This is Part 2, focusing on building the Proxmox cluster and setting up Ceph.

See also Part 1 about Setup Networking for a High Available cluster with Ceph, Part 2 (this post) for how to set up the Proxmox and Ceph cluster itself, and Part 3 focusing on Managing and Troubleshooting Proxmox and Ceph.

If everything went well in part 1, setting up Proxmox and Ceph should be 'a walk in the park'!

Build Proxmox Cluster

  • Create cluster on PVE01 called “Homelab” using the Proxmox GUI
    • Cluster network Link #0 should be the fc00::1 interface
    • Add a second link (Link #1) with the IPv4 management IP
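
For reference, the same cluster can also be created from the shell on PVE01; a rough sketch, assuming example addresses that must be replaced with your own link addresses:

# Create the cluster with the IPv6 Ceph network as corosync link 0
# and the IPv4 management network as link 1 (example addresses)
pvecm create Homelab --link0 fc00::1 --link1 192.168.1.11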

Screenshot 2023-09-15 at 09 45 16

Screenshot 2023-08-30 at 12 07 13

Information to join other nodes to the cluster Screenshot 2023-09-15 at 09 46 21

Make sure to select the proper Cluster Network addresses. Screenshot 2023-09-15 at 09 47 28

Note: The screen might blank out; give it a minute, then check on the first node whether the joining node is present. Then refresh the browser on the joining node to log in again.
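
Joining can also be done from the shell of each additional node; a rough sketch, assuming example addresses for the joining node's own links:

# Run on the joining node, pointing at the first node,
# and passing this node's own addresses for both corosync links
pvecm add <PVE01-address> --link0 fc00::2 --link1 192.168.1.12

# Afterwards, verify membership and quorum
pvecm status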

Build Ceph Cluster

  • Install Ceph on PVE01 from the Proxmox GUI (No-Subscription)

Screenshot 2023-09-15 at 09 54 14

  • Don’t install it on the other nodes yet
  • When configuring, make sure to set the fc00::1/128 network as the public and cluster network!

Screenshot 2023-09-15 at 09 58 13

  • Finish the configuration wizard on the first node
  • Edit the Ceph config file on the first Proxmox node only: nano /etc/ceph/ceph.conf

Change these two lines:

cluster_network = fc00::1/128
public_network = fc00::1/128

To this:

cluster_network = fc00::/64
public_network = fc00::/64
  • Save the file and exit the editor.
  • Restart Ceph on the first node: systemctl restart ceph.target
  • Install/configure Ceph on the additional nodes.
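
The install and initial configuration can also be done from the shell with pveceph; a rough sketch, assuming the --repository flag available on current Proxmox releases:

# On the first node: install Ceph from the No-Subscription repository
pveceph install --repository no-subscription

# Initialise Ceph with the routed IPv6 range as public and cluster network
pveceph init --network fc00::/64 --cluster-network fc00::/64

# Verify the networks in the config, then restart Ceph
grep network /etc/ceph/ceph.conf
systemctl restart ceph.target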

Finish the Setup

Before the cluster can be used, monitors and managers need to be added, and OSDs (disks) need to be set up. It's recommended to set up the Ceph Dashboard, unless you intend to use the command line for most advanced tasks.

Monitors

The Ceph Monitor (MON) maintains a master copy of the cluster map. For high availability, at least 3 monitors are needed.

Cluster State: Monitors play a crucial role in tracking the status and configuration of the Ceph cluster.

Monitoring OSDs: Monitors oversee the status of the OSDs (Object Storage Daemons) within the cluster.

Distribution of Cluster Maps: They distribute cluster information maps to all nodes, ensuring consistency is maintained.

Decision-Making: Monitors are involved in quorum decision-making and assist in making significant decisions within the cluster.
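
From the shell, a monitor can be created on each of the three nodes like this (a sketch):

# Create a monitor on the local node
pveceph mon create

# Check that all three monitors have joined and that quorum is established
ceph mon stat
ceph quorum_status --format json-pretty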

Managers

The Ceph Manager daemon runs alongside the monitors. It provides an interface to monitor the cluster. Multiple Managers can be installed, but only one Manager is active at any given time.

Service Management: Ceph Managers provide an interface for managing services, including OSDs, Monitors, and MDSs.

Dashboard and Monitoring: They offer a dashboard for monitoring cluster performance, health, and usage.

Dynamic Configuration: Managers support updating configuration settings without requiring downtime.

Health Insights: They generate reports on the health and condition of the Ceph cluster.

Managers and OSDs can be created via the Proxmox WebGUI or the command line.
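
A rough command-line sketch for the managers:

# Create a manager daemon on the local node (one active, the rest standby)
pveceph mgr create

# Show which manager is active and which are standby
ceph mgr stat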

OSDs

In Ceph, each disk is an OSD (Object Storage Daemon). OSDs store data blocks, and every OSD is backed by a single raw disk (not a RAID set or a partition). Each data disk has its own daemon, which communicates with the Ceph monitors and the other OSDs, so there is roughly a 1:1 correspondence between daemons and physical data disks.

For Background Information see this paragraph on Object Storage Daemons.

OSDs can be created via the Proxmox WebGUI or the command line.

It's possible to set a 'Device Class' for each OSD; these are needed when using CRUSH rules to place certain data on fast NVMe drives and other data on slower SSD drives.
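
A hedged sketch of creating an OSD and assigning a device class from the shell; the device path and OSD id are examples, and the --crush-device-class option should be verified on your Proxmox version:

# Create an OSD on a raw, unused disk (example device path)
pveceph osd create /dev/nvme0n1 --crush-device-class nvme

# Or set/correct the class afterwards (example OSD id)
ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class nvme osd.0

# Verify classes and placement
ceph osd tree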

Setup the NVMe

Screenshot 2023-09-15 at 10 06 31

Screenshot 2023-09-15 at 10 09 48

Wait until they are green and in sync.

Setup the SSD

Screenshot 2023-09-15 at 10 10 59

Screenshot 2023-09-15 at 10 15 46

Check if everything is green

Screenshot 2023-09-15 at 10 17 13
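
The same check can be done from the shell (a quick sketch):

# Overall cluster health and per-OSD utilisation
ceph -s
ceph osd df tree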

Setup Ceph Dashboard

The Dashboard GUI makes it easier to manage more complex pool arrangements than the command line does.

The dashboard runs as part of the Ceph Manager, so it only needs to be installed on nodes that have a Ceph Manager daemon. If the active Ceph Manager changes, the Dashboard fails over to the node running the new active Manager.

The setup steps, from the shell of the Proxmox system:

  1. Install the manager package with apt install ceph-mgr-dashboard
  2. Enable the dashboard module with ceph mgr module enable dashboard
  3. Create a self-signed certificate with ceph dashboard create-self-signed-cert
  4. Create a password for the new admin user and store it to a file. Ceph is actually picky about password rules here. echo MyPassword1 > password.txt
  5. Create a new admin user in the Ceph dashboard with ceph dashboard ac-user-create <name> -i password.txt administrator - 'administrator' is a role that Ceph provides by default, so this user can then create more users through the dashboard
  6. Delete the password file - rm password.txt
  7. Restart the manager or disable and re-enable the dashboard (ceph mgr module disable dashboard and ceph mgr module enable dashboard). I rebooted the node here. The documentation suggests this shouldn’t be required.
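
Put together, the steps look roughly like this (the password is an example and <name> is a placeholder):

apt install ceph-mgr-dashboard
ceph mgr module enable dashboard
ceph dashboard create-self-signed-cert
echo MyPassword1 > password.txt
ceph dashboard ac-user-create <name> -i password.txt administrator
rm password.txt
ceph mgr module disable dashboard
ceph mgr module enable dashboard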

Access the Dashboard at https://<node-address>:8443/#/dashboard.

Add Pools

Pools can be either replicated or erasure-coded; you can configure this using the Ceph Dashboard. A CRUSH ruleset can specify where a pool stores its data: NVMe is used for fast storage, while SSDs are used for slower storage.

To keep it simple, I use lowercase pool names.
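
A rough shell sketch of tying a pool to a device class with a CRUSH rule; the rule and pool names are examples, and the pveceph options should be verified on your version:

# Replicated CRUSH rule that only uses OSDs with the 'nvme' device class
ceph osd crush rule create-replicated nvme-only default host nvme

# Create a replicated pool on that rule and add it as Proxmox storage
pveceph pool create vm-nvme --crush_rule nvme-only --add_storages 1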

Add CephFS

Just go to the Proxmox GUI and add it!
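
Or from the shell; a minimal sketch, assuming the default name cephfs:

# Requires at least one Metadata Server (see the next section)
pveceph fs create --name cephfs --add-storage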

Add Metadata Servers

If you are actively using CephFS and require high performance, it can be beneficial to start with 1 or 2 Metadata Servers to facilitate parallel processing and distribute the load.

CephFS Support: Metadata Servers support the Ceph File System (CephFS) by efficiently managing metadata operations.

File Attributes and Mappings: They keep track of file attributes and mappings, which are crucial for file systems.

Parallel Processing: Metadata Servers enable parallel processing of metadata mutations, improving performance.

Scalability: They contribute to the scalability of CephFS, allowing it to handle large numbers of files and directories.
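
From the shell, a Metadata Server can be created on the desired nodes like this (a sketch):

# Create a metadata server daemon on the local node
pveceph mds create

# Check active/standby MDS daemons and CephFS state
ceph fs status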


Now the cluster is built and ready for workloads. This series continues with Manage & Troubleshooting Ceph - Part 3.


pkess commented Apr 29, 2024

Hi, I tried your paragraph for setting up the Ceph dashboard on Proxmox 8.2.2. I was not able to use the command

ceph dashboard create-self-signed-certificate

but the error message displayed the steps that were necessary to create and import the certificate.

When I opened the dashboard and logged in, I had no permissions. Those were added with

ceph dashboard ac-user-add-roles <name> administrator

Valid roles can be listed with

ceph dashboard ac-role-show


Drallas commented May 5, 2024

@pkess Thanks for reporting; so far my Dashboard is still running fine, but good to know in case I have to set it up again.
