Created May 5, 2018 12:42
ceph recovery when FUBAR (mem / cpu looping crashing OSDs with OOM)
## Stop all OSDs
## Set OSD nodown
ceph osd set nodown
## Set OSD nobackfill
ceph osd set nobackfill
## Set OSD noup
ceph osd set noup
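Before going further it can help to confirm that the flags took effect. The following is a minimal sketch, assuming the `flags` line of `ceph osd dump` lists the active flags as a comma-separated string; `flags_set` is a hypothetical helper name, not part of Ceph.

```shell
# Hypothetical helper: succeeds only if every flag named as an argument
# appears in the "flags" line of `ceph osd dump` output read from stdin.
flags_set() {
    # Second field of the line whose first field is "flags".
    current=$(awk '$1 == "flags" { print $2; exit }')
    for flag in "$@"; do
        case ",$current," in
            *",$flag,"*) ;;                                  # flag present
            *) echo "missing flag: $flag" >&2; return 1 ;;   # flag absent
        esac
    done
}

# Usage on a node with an admin keyring (assumed):
#   ceph osd dump | flags_set nodown nobackfill noup && echo "flags in place"
```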
## Reduce the OSD map cache size to lower the overall memory footprint. In the [osd] section of ceph.conf on each OSD node, add:
[osd]
osd map cache size = 50
osd map max advance = 25
osd map share max epochs = 25
osd pg epoch persisted max stale = 25
## Start all OSDs, and let them catch up on their maps.
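One way to judge whether an OSD has caught up is to compare its newest map epoch with the cluster's current osdmap epoch. The sketch below assumes the JSON printed by `ceph daemon osd.<id> status` (run on the host that owns that OSD's admin socket) contains a `newest_map` field, and that the first line of `ceph osd dump` reads `epoch <N>`; `map_lag` is a hypothetical helper name.

```shell
# Hypothetical helper: print how many epochs an OSD is behind the cluster.
#   $1    = current cluster osdmap epoch
#   stdin = JSON from `ceph daemon osd.<id> status`
map_lag() {
    # Pull the number that follows "newest_map" out of the JSON text.
    newest=$(sed -n 's/.*"newest_map"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1)
    [ -n "$newest" ] || return 1     # field not found
    echo $(( $1 - newest ))
}

# Usage (assumed OSD id 0, run on the node hosting osd.0):
#   epoch=$(ceph osd dump | awk '$1 == "epoch" { print $2; exit }')
#   ceph daemon osd.0 status | map_lag "$epoch"
```

A lag that keeps shrinking toward zero suggests the OSD is still working through its backlog of maps.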
## Unset noup to trigger peering across all pgs at once.
ceph osd unset noup
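To judge when peering has finished, the PG state summary can be polled. This is a rough sketch, assuming `ceph pg stat` prints the PG state names in its summary; `peering_done` is a hypothetical helper and only looks for the word "peering", so other non-active states still need a manual look with `ceph -s`.

```shell
# Hypothetical helper: succeeds when the PG summary read from stdin
# no longer mentions any state containing "peering".
peering_done() {
    ! grep -q 'peering'
}

# Usage: poll every 10 seconds until no PG reports a peering state.
#   until ceph pg stat | peering_done; do sleep 10; done
```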
## Once peering has completed, unset noout, nodown and nobackfill.
## (noout is not set by the steps above; unsetting it only matters if it was set beforehand.)
ceph osd unset noout
ceph osd unset nodown
ceph osd unset nobackfill
This should allow the recovery to complete using a smaller memory footprint.
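## Commands to run on each OSD node for the stop and start steps above
## (ceph.target covers every Ceph daemon on the host, not only the OSDs)
systemctl stop ceph.target
## Confirm that no ceph processes are left running
ps aux | grep ceph
systemctl start ceph.target
## Watch CPU and memory of processes owned by the ceph user, with full command lines
top -cuceph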