@ifeulner
Last active March 20, 2025 11:21
Longhorn hcloud best practices

Longhorn best practices

The following settings are provided as an example of how Longhorn should be configured in a production cluster, especially if it is deployed on Hetzner Cloud (hcloud) infrastructure.

Hetzner server nodes provide local storage and allow up to five attached volumes (with a size of up to 10 TiB each). The local storage is NVMe-based and therefore much faster than the attached volumes, but limited in size (max. 300 GiB usable).

It is assumed that the cluster creation is already done, e.g. via the Terraform scripts provided by the great kube-hetzner project.

Initial configuration

You also want to control which nodes are used for storage, so it is suggested to set the option Create Default Disk on Labeled Nodes to true. This is done via a setting in the Longhorn Helm chart:

defaultSettings:
  createDefaultDiskLabeledNodes: true
  kubernetesClusterAutoscalerEnabled: true # if autoscaler is active in the cluster
  defaultDataPath: /var/lib/longhorn
  # ensure pod is moved to an healthy node if current node is down:
  nodeDownPodDeletionPolicy: delete-both-statefulset-and-deployment-pod
persistence:
  defaultClass: true
  defaultFsType: ext4
  defaultClassReplicaCount: 3
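
Assuming the values above are stored in a file such as longhorn-values.yaml (the file name is just an example), the chart can then be installed or upgraded with the standard Helm commands:

helm repo add longhorn https://charts.longhorn.io
helm repo update
helm upgrade --install longhorn longhorn/longhorn \
  --namespace longhorn-system --create-namespace \
  --values longhorn-values.yaml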

Node preparation

Label the node

The following label signals Longhorn to use a dedicated configuration. The configuration itself must be provided via an annotation. If the label is present, Longhorn will not create a default disk but follows the configuration provided in the annotation.

add the 'config' label

kubectl label node <node> node.longhorn.io/create-default-disk='config'

remove the label

kubectl label node <node> node.longhorn.io/create-default-disk-
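
To quickly verify which nodes currently carry the label:

kubectl get nodes -l node.longhorn.io/create-default-disk=config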

Longhorn-relevant annotation

See also the Longhorn documentation on default disk configuration.

The default disk configuration is provided by annotating the node. The example below shows a configuration for a two-disk setup: the internal disk and one external hcloud volume (mounted at /var/longhorn).

storageReserved should be 25% of the local disk and 10% of attached dedicated hcloud volumes. So if the local disk space is 160 GiB, 40 GiB (42949672960 bytes) should be defined as reserved. The annotation example below assumes a smaller setup and reserves 20 GiB on the local disk and 10 GiB on the hcloud volume. Tags are also defined for the different disks: "nvme" for the fast internal disk and "ssd" for the slower hcloud volume.

kubectl annotate node <storagenode> node.longhorn.io/default-disks-config='[
  { "path":"/var/lib/longhorn", "allowScheduling":true, "storageReserved":21474836480, "tags":[ "nvme" ] },
  { "name":"hcloud-volume", "path":"/var/longhorn", "allowScheduling":true, "storageReserved":10737418240, "tags":[ "ssd" ] }
]'
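
As a small sketch of the calculation above (the 80 GiB local disk and 100 GiB hcloud volume are just assumed example sizes), the reserved byte values can be computed in a shell, and the annotation can be verified afterwards:

# 25% of an assumed 80 GiB local disk
echo $(( 80 * 1024 * 1024 * 1024 / 4 ))    # 21474836480

# 10% of an assumed 100 GiB hcloud volume
echo $(( 100 * 1024 * 1024 * 1024 / 10 ))  # 10737418240

# verify the annotation on the node
kubectl get node <storagenode> -o jsonpath='{.metadata.annotations.node\.longhorn\.io/default-disks-config}'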

StorageClasses

To ensure that the volume is using the right storage, the corresponding StorageClass needs to be defined:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  fromBackup: ""
  fsType: "ext4"
  diskSelector: "nvme"
  
  #  backingImage: "bi-test"
  #  backingImageDataSourceType: "download"
  #  backingImageDataSourceParameters: '{"url": "https://backing-image-example.s3-region.amazonaws.com/test-backing-image"}'
  #  backingImageChecksum: "SHA512 checksum of the backing image"
  #  diskSelector: "ssd,fast"
  #  nodeSelector: "storage,fast"
  #  recurringJobSelector: '[
  #   {
  #     "name":"snap",
  #     "isGroup":true,
  #   },
  #   {
  #     "name":"backup",
  #     "isGroup":false,
  #   }
  #  ]'
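
For reference, a minimal PVC sketch that consumes this class (the claim name and size are arbitrary examples):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example-fast-claim
spec:
  storageClassName: longhorn-fast
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi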

Benchmarks

Below are some results with different StorageClasses, using the benchmark utility dbench. Besides disk selection and replica count, the server type used for the storage nodes also has a huge impact (vCPU, RAM).

Dbench

# StorageClass longhorn (3 replicas, hcloud-volume)
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 4202/240. BW: 228MiB/s / 26.2MiB/s
Average Latency (usec) Read/Write: 2716.04/18.33
Sequential Read/Write: 305MiB/s / 66.3MiB/s
Mixed Random Read/Write IOPS: 823/272


# StorageClass longhorn-fast (3 replicas, internal disk)
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 7305/3737. BW: 250MiB/s / 62.1MiB/s
Average Latency (usec) Read/Write: 1789.55/
Sequential Read/Write: 299MiB/s / 67.8MiB/s
Mixed Random Read/Write IOPS: 4446/1482

Random Read/Write IOPS: 6914/3661. BW: 256MiB/s / 57.9MiB/s
Average Latency (usec) Read/Write: 1801.84/
Sequential Read/Write: 275MiB/s / 62.6MiB/s
Mixed Random Read/Write IOPS: 4908/1632

# StorageClass longhorn-fast-xfs (3 replicas, internal disk)
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 5887/4083. BW: 244MiB/s / 44.8MiB/s
Average Latency (usec) Read/Write: 2720.66/
Sequential Read/Write: 278MiB/s / 46.2MiB/s
Mixed Random Read/Write IOPS: 3880/1298
For reference, the StorageClass variants and the dbench PVC/Job used for the benchmarks:

---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  fromBackup: ""
  fsType: "ext4"
  diskSelector: "nvme"
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-fast-xfs
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  fromBackup: ""
  fsType: "xfs"
  diskSelector: "nvme"
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-fast-2
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  fromBackup: ""
  fsType: "ext4"
  diskSelector: "nvme"
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-fast-2-xfs
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "2880" # 48 hours in minutes
  fromBackup: ""
  fsType: "xfs"
  diskSelector: "nvme"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: dbench-pv-claim
spec:
  # storageClassName: longhorn
  storageClassName: longhorn-fast
  # storageClassName: local-path
  # storageClassName: gp2
  # storageClassName: local-storage
  # storageClassName: ibmc-block-bronze
  # storageClassName: ibmc-block-silver
  # storageClassName: ibmc-block-gold
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 25Gi
---
apiVersion: batch/v1
kind: Job
metadata:
  name: dbench
spec:
  template:
    spec:
      containers:
        - name: dbench
          image: storageos/dbench:latest
          imagePullPolicy: Always
          env:
            - name: DBENCH_MOUNTPOINT
              value: /data
            # - name: DBENCH_QUICK
            #   value: "yes"
            # - name: FIO_SIZE
            #   value: 1G
            # - name: FIO_OFFSET_INCREMENT
            #   value: 256M
            # - name: FIO_DIRECT
            #   value: "0"
          volumeMounts:
            - name: dbench-pv
              mountPath: /data
      restartPolicy: Never
      volumes:
        - name: dbench-pv
          persistentVolumeClaim:
            claimName: dbench-pv-claim
  backoffLimit: 4
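
To reproduce a run (assuming the manifests above are saved as dbench.yaml), apply them, wait for the job to finish, and read the summary from the job logs:

kubectl apply -f dbench.yaml
kubectl wait --for=condition=complete job/dbench --timeout=60m
kubectl logs job/dbench | tail -n 8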
@Nospamas commented Mar 7, 2025

Thanks for this. Cool little gist, and it helped me find dbench! I have a few results from my own bare-metal cluster. The nodes are currently very quiet in terms of IO, so these results are probably as close to a best-case scenario as you can get.

Hardware-wise, I run 3 Beelink 5800H nodes with a local NVMe drive and an attached SATA SSD. They're backed by a 2.5G network.

# openebs localpv-hostpath
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 178k/192k. BW: 2718MiB/s / 3167MiB/s
Average Latency (usec) Read/Write: 78.19/28.13
Sequential Read/Write: 3268MiB/s / 3190MiB/s
Mixed Random Read/Write IOPS: 130k/43.3k


# longhorn strict-local against nvme on the shared OS drive
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 21.9k/24.2k. BW: 1146MiB/s / 278MiB/s
Average Latency (usec) Read/Write: 284.80/296.29
Sequential Read/Write: 460MiB/s / 296MiB/s
Mixed Random Read/Write IOPS: 17.7k/5647


# longhorn strict-local against a sata attached SSD
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 20.1k/21.7k. BW: 268MiB/s / 265MiB/s
Average Latency (usec) Read/Write: 464.62/313.35
Sequential Read/Write: 437MiB/s / 282MiB/s
Mixed Random Read/Write IOPS: 15.7k/5228

# longhorn 2 replicas, best-effort against nvme on the shared OS drive
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 17.3k/10.7k. BW: 131MiB/s / 75.9MiB/s
Average Latency (usec) Read/Write: 554.70/773.96
Sequential Read/Write: 134MiB/s / 81.6MiB/s
Mixed Random Read/Write IOPS: 11.8k/3886

# longhorn 2 replicas, best-effort on sata attached SSD
==================
= Dbench Summary =
==================
Random Read/Write IOPS: 16.7k/9746. BW: 171MiB/s / 67.9MiB/s
Average Latency (usec) Read/Write: 650.77/769.99
Sequential Read/Write: 185MiB/s / 75.4MiB/s
Mixed Random Read/Write IOPS: 10.6k/3549

@Adrelien commented Mar 8, 2025

Here are my tests too:

local nvme rancher local provisioner

= Dbench Summary =

Random Read/Write IOPS: 46.9k/44.2k. BW: 2451MiB/s / 2221MiB/s
Average Latency (usec) Read/Write: 301.31/179.96
Sequential Read/Write: 3212MiB/s / 3264MiB/s
Mixed Random Read/Write IOPS: 29.9k/9947

hetzner Cloud nvme

= Dbench Summary =

Random Read/Write IOPS: 7495/7495. BW: 300MiB/s / 299MiB/s
Average Latency (usec) Read/Write: 822.06/1954.15
Sequential Read/Write: 300MiB/s / 299MiB/s
Mixed Random Read/Write IOPS: 7496/2498

longhorn locality strict nvme

= Dbench Summary =

Random Read/Write IOPS: 15.8k/5352. BW: 835MiB/s / 276MiB/s
Average Latency (usec) Read/Write: 624.10/722.11
Sequential Read/Write: 58.4MiB/s / 278MiB/s
Mixed Random Read/Write IOPS: 8056/2677

longhorn normal nvme

= Dbench Summary =

Random Read/Write IOPS: 11.5k/4599. BW: 193MiB/s / 202MiB/s
Average Latency (usec) Read/Write: 1207.49/1123.83
Sequential Read/Write: 133MiB/s / 205MiB/s
Mixed Random Read/Write IOPS: 5484/1840


@yaroslavTheO

Hi @ifeulner, thanks for this guide. I am following it to set up Longhorn on a new Hetzner cluster which I am creating using the kube-hetzner project. However, I am a bit stuck with adding the node.longhorn.io/default-disks-config annotation. I would prefer it to be added to the resource declarations rather than executed manually for each node. As kube-hetzner does not support annotating nodes by itself, the only way I have come up with so far to achieve this is to have a separate Terraform project, import kube-hetzner to create the cluster, and then add my own custom commands which apply the annotations, executed after the kube-hetzner Terraform run. It seems like a slight overcomplication, and maybe I am missing an easy way to declare multiple disks on a node from inside Terraform?

@Adrelien

> Hi @ifeulner, thanks for this guide. I am following it to set up Longhorn on a new Hetzner cluster which I am creating using the kube-hetzner project. However, I am a bit stuck with adding the node.longhorn.io/default-disks-config annotation. I would prefer it to be added to the resource declarations rather than executed manually for each node. As kube-hetzner does not support annotating nodes by itself, the only way I have come up with so far to achieve this is to have a separate Terraform project, import kube-hetzner to create the cluster, and then add my own custom commands which apply the annotations, executed after the kube-hetzner Terraform run. It seems like a slight overcomplication, and maybe I am missing an easy way to declare multiple disks on a node from inside Terraform?

https://github.com/MahdadGhasemian/kubernetes-hetzner-lac-template/blob/main/kube.tf

Check this out

@yaroslavTheO

> Hi @ifeulner, thanks for this guide. I am following it to set up Longhorn on a new Hetzner cluster which I am creating using the kube-hetzner project. However, I am a bit stuck with adding the node.longhorn.io/default-disks-config annotation. I would prefer it to be added to the resource declarations rather than executed manually for each node. As kube-hetzner does not support annotating nodes by itself, the only way I have come up with so far to achieve this is to have a separate Terraform project, import kube-hetzner to create the cluster, and then add my own custom commands which apply the annotations, executed after the kube-hetzner Terraform run. It seems like a slight overcomplication, and maybe I am missing an easy way to declare multiple disks on a node from inside Terraform?
>
> https://github.com/MahdadGhasemian/kubernetes-hetzner-lac-template/blob/main/kube.tf
>
> Check this out

Thanks for sharing. However, it's not really what I need to achieve. I want each node in the specific nodepool to have two disks, one using node storage and another using a Hetzner volume.
