NIC is running at 100 Mbit/s only :(
```
Connecting to host 192.168.100.101, port 5201
[  5] local 192.168.100.103 port 42392 connected to 192.168.100.101 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  11.9 MBytes  99.5 Mbits/sec    0   42.4 KBytes
[  5]   1.00-2.00   sec  11.1 MBytes  93.3 Mbits/sec    0   45.2 KBytes
[  5]   2.00-3.00   sec  11.2 MBytes  94.4 Mbits/sec    0   45.2 KBytes
[  5]   3.00-4.00   sec  11.1 MBytes  93.3 Mbits/sec    0   48.1 KBytes
[  5]   4.00-5.00   sec  10.6 MBytes  89.1 Mbits/sec    0   45.2 KBytes
[  5]   5.00-6.00   sec  10.6 MBytes  89.1 Mbits/sec    0   41.0 KBytes
[  5]   6.00-7.00   sec  11.1 MBytes  93.3 Mbits/sec    0   50.9 KBytes
[  5]   7.00-8.00   sec  11.2 MBytes  94.4 Mbits/sec    0   50.9 KBytes
[  5]   8.00-9.00   sec  11.1 MBytes  93.3 Mbits/sec    0   45.2 KBytes
[  5]   9.00-10.00  sec  11.2 MBytes  94.4 Mbits/sec    0   5.66 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   111 MBytes  93.4 Mbits/sec    0             sender
[  5]   0.00-10.01  sec   110 MBytes  92.0 Mbits/sec                  receiver
iperf Done.
```
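To confirm the negotiated link speed directly, a minimal sketch (the interface name and node IP are placeholders; the talosctl variant assumes Talos nodes, which have no local shell):

```
# On a regular Linux host: check what the NIC negotiated (interface name is a placeholder)
ethtool eth0 | grep -E 'Speed|Duplex'

# On a Talos node, read the link state via the API instead
talosctl -n 192.168.100.101 get links
```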
this was the reason for the low rook-ceph performance

after fix
```
bash-5.1$ rados bench -p ceph-blockpool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_talos01_1168
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
    0       2         2         0         0         0            -           0
    1      16        66        50   199.967       200     0.375406    0.201107
    2      16       103        87   173.977       148    0.0129707    0.310708
    3      16       144       128   170.642       164    0.0156619    0.337213
    4      16       187       171   170.975       172     0.715322     0.33933
    5      16       222       206   164.778       140     0.929181    0.351855
    6      16       266       250   166.645       176    0.0181203    0.359934
    7      16       307       291   166.263       164     0.508394     0.36839
    8      16       352       336   167.975       180     0.394201    0.366298
    9      16       393       377   167.531       164     0.512105    0.369183
   10      16       431       415   165.975       152     0.435443    0.372946
Total time run:         10.6541
Total writes made:      431
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     161.815
Stddev Bandwidth:       17.3077
Max bandwidth (MB/sec): 200
Min bandwidth (MB/sec): 140
Average IOPS:           40
Stddev IOPS:            4.32692
Max IOPS:               50
Min IOPS:               35
Average Latency(s):     0.385355
Stddev Latency(s):      0.315917
Max latency(s):         1.03777
Min latency(s):         0.0125966
```
with thunderbolt ring
```
bash-5.1$ rados bench -p ceph-blockpool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_talos01_136
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 64 48 191.973 192 0.00969311 0.228637
2 16 104 88 175.98 160 0.0093995 0.307278
3 16 152 136 181.311 192 0.0134512 0.316678
4 16 199 183 182.977 188 0.00883172 0.322588
5 16 238 222 177.575 156 0.767103 0.334924
6 16 280 264 175.976 168 0.99548 0.335624
7 16 327 311 177.687 188 0.00979297 0.33931
8 16 374 358 178.973 188 0.788427 0.342855
9 16 421 405 179.972 188 0.777222 0.343079
10 16 467 451 180.371 184 0.428006 0.344738
Total time run: 10.6793
Total writes made: 467
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 174.918
Stddev Bandwidth: 13.6561
Max bandwidth (MB/sec): 192
Min bandwidth (MB/sec): 156
Average IOPS: 43
Stddev IOPS: 3.41402
Max IOPS: 48
Min IOPS: 39
Average Latency(s): 0.356975
Stddev Latency(s): 0.328332
Max latency(s): 1.04079
Min latency(s): 0.00808189
```
with the following Ceph params
Ceph Global Configuration
These settings apply to all Ceph daemons (MONs, OSDs, MGRs, etc.):
bdev_enable_discard: "true"
- Purpose: Enables TRIM/discard support for block devices
- Effect: When data is deleted, Ceph sends TRIM commands to SSDs to mark blocks as unused
- Benefits:
  - Better SSD performance and longevity
  - Reclaims space at the hardware level
  - Reduces write amplification
- Requirements: Your NVMe/SSD must support TRIM
- Note: Quoted because YAML values should be strings for Ceph config
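Before enabling this, it is worth checking that the device actually advertises discard support; a quick sketch (the device name is a placeholder):

```
# Non-zero DISC-GRAN / DISC-MAX columns mean the device supports discard/TRIM
lsblk --discard /dev/nvme0n1
```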
bdev_async_discard_threads: "1"
- Purpose: Number of threads for asynchronous discard operations
- Effect: Processes TRIM commands in the background without blocking I/O
- Value: 1 thread is usually sufficient for most workloads
- Benefits: Prevents discard operations from blocking regular I/O
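Both discard options can also be applied through the cluster config database; note that the bdev_* values are read when BlueStore opens the device, so OSDs may need a restart to pick them up (a sketch):

```
# Set the discard options cluster-wide; restart OSDs for the bdev_* values to take effect
ceph config set global bdev_enable_discard true
ceph config set global bdev_async_discard_threads 1
```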
osd_class_update_on_start: "false"
- Purpose: Controls whether OSDs update their device class on startup
- Effect: Prevents OSDs from changing their crush device class (ssd/hdd/nvme)
- Why disabled:
  - Maintains a consistent CRUSH map
  - Prevents automatic reclassification that could affect placement
  - Useful when you want manual control over device classes
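With automatic reclassification disabled, the class is pinned by hand; a minimal sketch using the standard CRUSH commands:

```
# Manually assign a device class; any existing class must be removed first
ceph osd crush rm-device-class osd.1
ceph osd crush set-device-class nvme osd.1
ceph osd tree   # verify the CLASS column
```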
device_failure_prediction_mode: local
- Purpose: Enables local device health monitoring
- Effect: Uses local SMART data to predict device failures
- Requirements: Requires the diskprediction_local MGR module (commented out in your config)
- Benefits: Proactive failure detection and warnings
- Note: You should enable the corresponding MGR module:
```
mgr:
  modules:
    - name: diskprediction_local
      enabled: true
```
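Equivalently, the module can be switched on from the toolbox against the running cluster:

```
# Enable the local disk failure prediction module and confirm it is active
ceph mgr module enable diskprediction_local
ceph mgr module ls | grep diskprediction
```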
Configuration Example
```
cephConfig:
  global:
    bdev_enable_discard: "true"
    bdev_async_discard_threads: "1"
    osd_class_update_on_start: "false"
    device_failure_prediction_mode: local
```
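Once Rook has applied the override, the values should show up in the cluster configuration database; a quick check from the toolbox:

```
# All four options should appear with the values from the example above
ceph config dump | grep -E 'bdev_|osd_class_update_on_start|device_failure_prediction_mode'
```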
```
bash-5.1$ rados bench -p ceph-blockpool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_talos01_100
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 16 16 0 0 0 - 0
1 16 59 43 171.961 172 0.42745 0.248814
2 16 96 80 159.975 148 0.00822193 0.334767
3 16 134 118 157.312 152 0.499025 0.368772
4 16 176 160 159.98 168 0.712817 0.370929
5 16 209 193 154.382 132 0.895346 0.382279
6 16 255 239 159.316 184 0.53416 0.383384
7 16 296 280 159.983 164 0.593508 0.384546
8 16 333 317 158.484 148 0.813994 0.38577
9 16 371 355 157.762 152 0.00876288 0.387988
10 16 419 403 161.184 192 0.00834633 0.381558
Total time run: 10.6789
Total writes made: 419
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 156.945
Stddev Bandwidth: 18.2866
Max bandwidth (MB/sec): 192
Min bandwidth (MB/sec): 132
Average IOPS: 39
Stddev IOPS: 4.57165
Max IOPS: 48
Min IOPS: 33
Average Latency(s): 0.397735
Stddev Latency(s): 0.325796
Max latency(s): 1.12825
Min latency(s): 0.00732688
```
activating imageFeatures
```
imageFeatures: fast-diff,object-map,deep-flatten,exclusive-lock
```
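For images that already exist, the active feature set can be inspected and adjusted with rbd; a sketch (the image name is a placeholder):

```
# Show which features are enabled on an existing image (image name is a placeholder)
rbd info ceph-blockpool/csi-vol-example | grep features

# Features such as object-map/fast-diff can also be enabled after the fact
rbd feature enable ceph-blockpool/csi-vol-example object-map fast-diff
```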
```
bash-5.1$ rados -p ceph-blockpool cleanup
Removed 409 objects
bash-5.1$ rados bench -p ceph-blockpool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_talos01_150
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 16 16 0 0 0 - 0
1 16 63 47 187.957 188 0.0112248 0.235739
2 16 97 81 161.971 136 0.357594 0.326104
3 16 133 117 155.97 144 0.47757 0.366872
4 16 174 158 157.973 164 0.00957896 0.371427
5 16 220 204 163.174 184 0.00931906 0.361775
6 16 262 246 163.972 168 0.957549 0.362961
7 16 301 285 162.83 156 0.124948 0.367963
8 16 338 322 160.973 148 0.0813832 0.379436
9 16 377 361 160.416 156 0.231777 0.379363
10 16 411 395 157.971 136 0.963272 0.388629
Total time run: 10.6665
Total writes made: 411
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 154.128
Stddev Bandwidth: 18.2087
Max bandwidth (MB/sec): 188
Min bandwidth (MB/sec): 136
Average IOPS: 38
Stddev IOPS: 4.55217
Max IOPS: 47
Min IOPS: 34
Average Latency(s): 0.40438
Stddev Latency(s): 0.361115
Max latency(s): 1.07227
Min latency(s): 0.00752398
```
enabling 10G NIC for public
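In Ceph terms this means pointing the public network at the 10G subnet while the thunderbolt ring keeps carrying cluster traffic; a sketch (both subnets are placeholders, and in a Rook deployment these are normally managed through the CephCluster network spec rather than set by hand):

```
# Public traffic over the 10G subnet, replication over the thunderbolt ring
# (subnets are placeholders; daemons must be restarted to rebind)
ceph config set global public_network 192.168.100.0/24
ceph config set global cluster_network 169.254.255.0/24
```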
remove osd.* osd_mclock_max_capacity_iops_ssd values
```
ceph config rm osd.1 osd_mclock_max_capacity_iops_ssd
```
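To clear the override on every OSD rather than one at a time, a small sketch using ceph osd ls:

```
# Remove the stale mclock capacity override from all OSDs, then verify nothing is left
for osd in $(ceph osd ls); do
  ceph config rm "osd.${osd}" osd_mclock_max_capacity_iops_ssd
done
ceph config dump | grep mclock
```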
```
bash-5.1$ rados bench -p ceph-blockpool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_talos01_290
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 2 2 0 0 0 - 0
1 16 367 351 1403.72 1404 0.0306127 0.0449315
2 16 769 753 1505.77 1608 0.0133677 0.0399588
3 16 1225 1209 1611.79 1824 0.0116776 0.0373528
4 16 1660 1644 1643.79 1740 0.0116182 0.0382942
5 16 2035 2019 1614.96 1500 0.0147896 0.0381899
6 16 2508 2492 1661.1 1892 0.0227459 0.0383829
7 16 2763 2747 1569.5 1020 0.232462 0.0406331
8 16 3133 3117 1558.27 1480 0.0143704 0.0399879
9 16 3621 3605 1601.99 1952 0.0140347 0.0398062
10 16 4003 3987 1594.57 1528 0.0160011 0.0399998
Total time run: 10.1875
Total writes made: 4003
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1571.73
Stddev Bandwidth: 275.507
Max bandwidth (MB/sec): 1952
Min bandwidth (MB/sec): 1020
Average IOPS: 392
Stddev IOPS: 68.8768
Max IOPS: 488
Min IOPS: 255
Average Latency(s): 0.0406496
Stddev Latency(s): 0.0495828
Max latency(s): 0.251322
Min latency(s): 0.00817766
```
mixed public network thunderbolt + 10G NIC
```
bash-5.1$ rados bench -p ceph-blockpool 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_talos01_141
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 16 16 0 0 0 - 0
1 16 307 291 1163.73 1164 0.032906 0.0473561
2 16 576 560 1119.75 1076 0.0136398 0.0535368
3 16 791 775 1033.14 860 0.0345474 0.0617631
4 16 1128 1112 1111.81 1348 0.0138425 0.0563346
5 16 1530 1514 1211.01 1608 0.0197638 0.0526762
6 16 1764 1748 1165.16 936 0.0167593 0.0546279
7 16 2136 2120 1211.25 1488 0.228938 0.051471
8 16 2510 2494 1246.83 1496 0.0162949 0.0512529
9 16 2830 2814 1250.49 1280 0.0116342 0.0506672
10 16 3203 3187 1274.61 1492 0.0170389 0.0498197
Total time run: 10.1559
Total writes made: 3203
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1261.53
Stddev Bandwidth: 257.468
Max bandwidth (MB/sec): 1608
Min bandwidth (MB/sec): 860
Average IOPS: 315
Stddev IOPS: 64.3671
Max IOPS: 402
Min IOPS: 215
Average Latency(s): 0.0506837
Stddev Latency(s): 0.0632691
Max latency(s): 0.261037
Min latency(s): 0.0108483
```
rook-ceph with 10G NIC node01
```
TEST_FILE: /volume/test
TEST_OUTPUT_PREFIX: test_device
TEST_SIZE: 30G
Benchmarking iops.fio into test_device-iops.json
Benchmarking bandwidth.fio into test_device-bandwidth.json
Benchmarking latency.fio into test_device-latency.json
=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
Random: 69,040 / 5,700
Sequential: 7,825 / 6,705
Bandwidth in KiB/sec (Read/Write)
Random: 988,501 / 564,709
Sequential: 766,049 / 542,822
Latency in ns (Read/Write)
Random: 292,516 / 5,599,995
Sequential: 62,989 / 4,561,601
```
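The TEST_FILE/TEST_SIZE variables and the summary layout look like the kbench fio wrapper; a roughly equivalent standalone fio run for the random-read IOPS figure might be (a sketch, job parameters assumed):

```
# Approximate the random-read IOPS test against the mounted volume (parameters assumed)
fio --name=rand-read-iops --filename=/volume/test --size=30G \
    --rw=randread --bs=4k --ioengine=libaio --direct=1 \
    --iodepth=64 --runtime=30 --time_based --output-format=json
```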
rook-ceph with 10G NIC node02
```
TEST_FILE: /volume/test
TEST_OUTPUT_PREFIX: test_device
TEST_SIZE: 30G
Benchmarking iops.fio into test_device-iops.json
Benchmarking bandwidth.fio into test_device-bandwidth.json
Benchmarking latency.fio into test_device-latency.json
=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
Random: 70,899 / 5,680
Sequential: 36,895 / 6,668
Bandwidth in KiB/sec (Read/Write)
Random: 943,277 / 539,948
Sequential: 1,014,213 / 574,627
Latency in ns (Read/Write)
Random: 98,570 / 5,580,540
Sequential: 61,411 / 4,389,462
```
rook-ceph with 10G NIC node03
```
TEST_FILE: /volume/test
TEST_OUTPUT_PREFIX: test_device
TEST_SIZE: 30G
Benchmarking iops.fio into test_device-iops.json
Benchmarking bandwidth.fio into test_device-bandwidth.json
Benchmarking latency.fio into test_device-latency.json
=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
Random: 66,197 / 5,694
Sequential: 6,504 / 6,594
Bandwidth in KiB/sec (Read/Write)
Random: 939,746 / 545,931
Sequential: 739,534 / 417,548
Latency in ns (Read/Write)
Random: 301,277 / 9,419,883
Sequential: 111,137 / 8,106,560
```
rook-ceph thunderbolt (cluster) + 10G (public)
node01
```
TEST_FILE: /volume/test
TEST_OUTPUT_PREFIX: test_device
TEST_SIZE: 30G
Benchmarking iops.fio into test_device-iops.json
Benchmarking bandwidth.fio into test_device-bandwidth.json
Benchmarking latency.fio into test_device-latency.json
=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
Random: 71,967 / 5,736
Sequential: 7,534 / 6,774
Bandwidth in KiB/sec (Read/Write)
Random: 984,150 / 580,090
Sequential: 788,600 / 368,704
Latency in ns (Read/Write)
Random: 286,365 / 4,812,697
Sequential: 60,550 / 4,528,310
```
node02
```
TEST_FILE: /volume/test
TEST_OUTPUT_PREFIX: test_device
TEST_SIZE: 30G
Benchmarking iops.fio into test_device-iops.json
Benchmarking bandwidth.fio into test_device-bandwidth.json
Benchmarking latency.fio into test_device-latency.json
=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
Random: 74,968 / 5,695
Sequential: 36,010 / 6,543
Bandwidth in KiB/sec (Read/Write)
Random: 993,236 / 591,083
Sequential: 1,043,011 / 658,092
Latency in ns (Read/Write)
Random: 97,216 / 6,354,677
Sequential: 62,772 / 4,478,008
```
node03
```
TEST_FILE: /volume/test
TEST_OUTPUT_PREFIX: test_device
TEST_SIZE: 30G
Benchmarking iops.fio into test_device-iops.json
Benchmarking bandwidth.fio into test_device-bandwidth.json
Benchmarking latency.fio into test_device-latency.json
=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
Random: 70,447 / 5,523
Sequential: 6,457 / 5,771
Bandwidth in KiB/sec (Read/Write)
Random: 949,631 / 572,395
Sequential: 761,615 / 450,329
Latency in ns (Read/Write)
Random: 293,823 / 10,886,182
Sequential: 118,248 / 8,036,728
```
compared to old cluster