Snapshots let you remember the state of a disk (block device/filesystem) and give you the ability to restore it or create new volumes from it in the future.
Volume managers that support snapshots: LVM thin and ZFS. Snapshots work almost the same way in both, with some nuances. When we create a snapshot, we create a new volume of zero size (metadata doesn't count). When you overwrite a block in the original volume, the volume manager copies the old block to the snapshot volume and writes the new block to the original volume. With that strategy you know which blocks were replaced after the snapshot was created, and those old blocks live in the snapshot volume. When you create a new volume from a snapshot, you copy the original volume and replace the changed blocks with the ones stored in the snapshot volume.
sudo mkdir /mnt/thin # create mount point
sudo lvcreate -L 100M -T ubuntu-vg/mythinpool # create thin pool
sudo lvcreate -V20M -T ubuntu-vg/mythinpool -n thinvolume # create thin volume
sudo mkfs.ext4 /dev/ubuntu-vg/thinvolume # create filesystem
sudo mount /dev/ubuntu-vg/thinvolume /mnt/thin
sudo bash -c 'date > /mnt/thin/date1'
sudo lvcreate -s -L 20M ubuntu-vg/thinvolume # create snapshot (no -n given, so LVM auto-names it lvol1)
sudo bash -c 'date > /mnt/thin/date2'
ls /mnt/thin
sudo umount /mnt/thin
sudo lvconvert --mergesnapshot ubuntu-vg/lvol1 # merge the snapshot back into its origin volume, reverting changes made after the snapshot
sudo mount /dev/ubuntu-vg/thinvolume /mnt/thin
ls /mnt/thin
# Delete created volumes
sudo umount /mnt/thin
sudo lvremove ubuntu-vg/thinvolume
sudo lvremove ubuntu-vg/mythinpool
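The same idea with ZFS, as a rough sketch (assumes a pool named tank, which is not part of the demo above):
sudo zfs create -V 20M tank/vol          # create a zvol
sudo zfs snapshot tank/vol@snap1         # take a snapshot
sudo zfs clone tank/vol@snap1 tank/vol2  # create a new volume from the snapshot
sudo zfs rollback tank/vol@snap1         # roll the original volume back to the snapshot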
Linstor has its own API for working with snapshots.
➜ ~ linstor snapshot -h
usage: linstor snapshot [-h]
{create, create-multiple, delete, list, resource,
rollback, ship, ship-list, volume-definition} ...
Snapshot subcommands
optional arguments:
-h, --help show this help message and exit
snapshot commands:
- create (c)
- create-multiple (cm)
- delete (d)
- list (l)
- resource (r)
- rollback (rb)
- ship (sh)
- ship-list (shl)
- volume-definition (vd)
{create, create-multiple, delete, list, resource, rollback, ship, ship-list, volume-definition}
create - create a snapshot of a resource
create-multiple - create snapshots on multiple resources with the same snapshot name
➜ ~ linstor snapshot r -h
usage: linstor snapshot resource [-h] ...
resource subcommands
optional arguments:
-h, --help show this help message and exit
resource subcommands:
- restore (rst)
linstor s r restore - creates a resource from a snapshot. The resource definition you restore into must already exist.
We have one resource in a storage pool backed by LVM thin.
➜ ~ linstor r list
+---------------------------------------------------------------------------------------------------------------+
| ResourceName | Node | Port | Usage | Conns | State | CreatedOn |
|===============================================================================================================|
| pvc-46b0153d-1057-4172-a154-a7a763ec07ad | worker-c-1 | 7001 | InUse | Ok | UpToDate | 2023-08-30 07:42:27 |
+---------------------------------------------------------------------------------------------------------------+
We can create a snapshot of this resource:
➜ ~ linstor s create pvc-46b0153d-1057-4172-a154-a7a763ec07ad inital_snapshot
SUCCESS:
Description:
New snapshot 'inital_snapshot' of resource 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad' registered.
Details:
Snapshot 'inital_snapshot' of resource 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad' UUID is: 33cdbf44-b7a9-478e-a120-32066e3edf1c
SUCCESS:
Suspended IO of '[pvc-46b0153d-1057-4172-a154-a7a763ec07ad]' on 'worker-c-1' for snapshot
SUCCESS:
Took snapshot of '[pvc-46b0153d-1057-4172-a154-a7a763ec07ad]' on 'worker-c-1'
SUCCESS:
Resumed IO of '[pvc-46b0153d-1057-4172-a154-a7a763ec07ad]' on 'worker-c-1' after snapshot
Time: 0h:00m:03s
Now we can see our new snapshot in linstor.
➜ ~ linstor s list
+-----------------------------------------------------------------------------------------------------------------------+
| ResourceName | SnapshotName | NodeNames | Volumes | CreatedOn | State |
|=======================================================================================================================|
| pvc-46b0153d-1057-4172-a154-a7a763ec07ad | inital_snapshot | worker-c-1 | 0: 1 GiB | 2023-08-30 10:08:59 | Successful |
+-----------------------------------------------------------------------------------------------------------------------+
Now we can delete the resource and create another one from the snapshot.
➜ ~ linstor r delete worker-c-1 pvc-46b0153d-1057-4172-a154-a7a763ec07ad
SUCCESS:
Description:
Node: worker-c-1, Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad preparing for deletion.
Details:
Node: worker-c-1, Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad UUID is: 73cfa58e-30ce-46ca-9668-66ecb7477fa1
SUCCESS:
Preparing deletion of resource on 'worker-c-1'
SUCCESS:
Description:
Node: worker-c-1, Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad marked for deletion.
Details:
Node: worker-c-1, Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad UUID is: 73cfa58e-30ce-46ca-9668-66ecb7477fa1
SUCCESS:
Cleaning up 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad' on 'worker-c-1'
SUCCESS:
Description:
Node: worker-c-1, Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad deletion complete.
Details:
Node: worker-c-1, Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad UUID was: 73cfa58e-30ce-46ca-9668-66ecb7477fa1
➜ ~ linstor s r restore --fr pvc-46b0153d-1057-4172-a154-a7a763ec07ad --fs inital_snapshot --tr pvc-46b0153d-1057-4172-a154-a7a763ec07ad
SUCCESS:
Description:
Resource 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad' restored from resource 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad', snapshot 'inital_snapshot'.
Details:
Resource UUIDs: c2c320b3-3ba2-46cb-a598-40b2aa339522
SUCCESS:
Created resource 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad' on 'worker-c-1'
SUCCESS:
Description:
Resource 'pvc-46b0153d-1057-4172-a154-a7a763ec07ad' on 'worker-c-1' ready
Details:
Resource: pvc-46b0153d-1057-4172-a154-a7a763ec07ad
In --tr we can pass the name of another resource definition, but it must be created first and have no resources deployed yet.
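A rough sketch of restoring into a different resource definition, following the LINSTOR user guide (pvc-restored-copy is an arbitrary name chosen here):
linstor resource-definition create pvc-restored-copy
linstor snapshot volume-definition restore --fr pvc-46b0153d-1057-4172-a154-a7a763ec07ad --fs inital_snapshot --tr pvc-restored-copy
linstor snapshot resource restore --fr pvc-46b0153d-1057-4172-a154-a7a763ec07ad --fs inital_snapshot --tr pvc-restored-copy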
The snapshot controller (operator) for Kubernetes lives at https://github.com/kubernetes-csi/external-snapshotter
Here we install the CRDs the controller needs (VolumeSnapshot, VolumeSnapshotContent, VolumeSnapshotClass) and then install the controller itself in the kube-system namespace. You can change the namespace to whatever you want.
https://kubernetes-csi.github.io/docs/snapshot-controller.html#deployment
git clone [email protected]:kubernetes-csi/external-snapshotter.git
cd external-snapshotter
git checkout release-6.2
kubectl kustomize client/config/crd | kubectl apply -f -
kubectl -n kube-system kustomize deploy/kubernetes/snapshot-controller | kubectl create -f -
A little analogy between snapshot resources and the volume resources we already know.
VolumeSnapshotClass -> StorageClass # What we have
| |
v v
VolumeSnapshot -> PersistentVolumeClaim # What we want
| |
v v
VolumeSnapshotContent -> PersistentVolume # What we get
The snapshot controller watches for VolumeSnapshot resources, looks up the referenced VolumeSnapshotClass to get the provisioner, and sends a request to the CSI controller; in our case that is linstor-csi-controller, where the csi-snapshotter container receives the request.
It is assumed that you still have the volume-logger deployment from the previous demo.
Create a VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Delete
driver: linstor.csi.linbit.com
kind: VolumeSnapshotClass
metadata:
name: piraeus-snapshots
Create a snapshot:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
namespace: volume-logger
name: volume-logger-thin-snapshot-1
spec:
volumeSnapshotClassName: piraeus-snapshots
source:
persistentVolumeClaimName: volume-logger-thin
Describe the VolumeSnapshot:
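The namespace and name come from the manifest above:
kubectl -n volume-logger describe volumesnapshot volume-logger-thin-snapshot-1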
...
Status:
Bound Volume Snapshot Content Name: snapcontent-ab9f76d6-6289-4a24-a011-7b4fe2c763a7
Creation Time: 2023-08-30T11:03:55Z
Ready To Use: true
Restore Size: 1Gi
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreatingSnapshot 10s snapshot-controller Waiting for a snapshot volume-logger/volume-logger-thin-snapshot-1 to be created by the CSI driver.
Normal SnapshotCreated 8s snapshot-controller Snapshot volume-logger/volume-logger-thin-snapshot-1 was successfully created by the CSI driver.
Normal SnapshotReady 8s snapshot-controller Snapshot volume-logger/volume-logger-thin-snapshot-1 is ready to use.
Let's look in linstor.
➜ ~ linstor s list
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| ResourceName | SnapshotName | NodeNames | Volumes | CreatedOn | State |
|=====================================================================================================================================================|
| pvc-46b0153d-1057-4172-a154-a7a763ec07ad | snapshot-ab9f76d6-6289-4a24-a011-7b4fe2c763a7 | worker-c-1 | 0: 1 GiB | 2023-08-30 11:03:55 | Successful |
+-----------------------------------------------------------------------------------------------------------------------------------------------------+
Let's look at the content of the volume-logger volume:
➜ ~ kubectl exec deploy/volume-logger -n volume-logger -- cat /volume/hello
Hello from volume-logger-5f978744b9-tqndw, running on worker-c-1, started at Wed Aug 30 07:42:44 UTC 2023
Hello from volume-logger-5f978744b9-6z8dl, running on worker-c-1, started at Wed Aug 30 09:38:26 UTC 2023
Hello from volume-logger-5f978744b9-s7jnz, running on worker-c-1, started at Wed Aug 30 11:02:32 UTC 2023
Now we can restart volume-logger to add a new line to the hello file.
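Any way of recreating the pod works; one option is a rollout restart of the deployment:
kubectl -n volume-logger rollout restart deploy/volume-logger
After the new pod starts, we have: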
➜ ~ kubectl exec deploy/volume-logger -n volume-logger -- cat /volume/hello
Hello from volume-logger-5f978744b9-tqndw, running on worker-c-1, started at Wed Aug 30 07:42:44 UTC 2023
Hello from volume-logger-5f978744b9-6z8dl, running on worker-c-1, started at Wed Aug 30 09:38:26 UTC 2023
Hello from volume-logger-5f978744b9-s7jnz, running on worker-c-1, started at Wed Aug 30 11:02:32 UTC 2023
Hello from volume-logger-5f978744b9-hzfvr, running on worker-a-1, started at Wed Aug 30 11:08:53 UTC 2023
Let's create a new PVC from the previously created snapshot and start volume-logger with it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: volume-logger-recreated
namespace: volume-logger
spec:
storageClassName: big-storage
resources:
requests:
storage: 1Gi
dataSource:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: volume-logger-thin-snapshot-1
accessModes:
- ReadWriteOnce
And here the last line differs from the previous check: the restored PVC contains only what was in the snapshot, so the line written after the snapshot (from pod volume-logger-5f978744b9-hzfvr) is gone, and the new pod appended its own line.
➜ ~ kubectl exec deploy/volume-logger -n volume-logger -- cat /volume/hello
Hello from volume-logger-5f978744b9-tqndw, running on worker-c-1, started at Wed Aug 30 07:42:44 UTC 2023
Hello from volume-logger-5f978744b9-6z8dl, running on worker-c-1, started at Wed Aug 30 09:38:26 UTC 2023
Hello from volume-logger-5f978744b9-s7jnz, running on worker-c-1, started at Wed Aug 30 11:02:32 UTC 2023
Hello from volume-logger-76f59ccb5f-zsc69, running on worker-c-1, started at Wed Aug 30 11:15:30 UTC 2023
Note: when creating a snapshot, you can pass the names of the nodes where it should be taken; if none are given, the snapshot is taken on all nodes where the given resource is present.
Besides restoring from a snapshot, we can also clone an existing PVC directly by using it as the dataSource:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: volume-logger-recreated-cloned
namespace: volume-logger
spec:
storageClassName: big-storage
resources:
requests:
storage: 1Gi
dataSource:
kind: PersistentVolumeClaim
name: volume-logger-recreated
accessModes:
- ReadWriteOnce
What does describe show?
Name: volume-logger-recreated-cloned
Namespace: volume-logger
StorageClass: big-storage
Status: Bound
Volume: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: linstor.csi.linbit.com
volume.kubernetes.io/selected-node: worker-c-1
volume.kubernetes.io/storage-provisioner: linstor.csi.linbit.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 1Gi
Access Modes: RWO
VolumeMode: Filesystem
DataSource:
Kind: PersistentVolumeClaim
Name: volume-logger-recreated
Used By: volume-logger-774bd4f9cf-96sph
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 95s (x2 over 101s) persistentvolume-controller waiting for first consumer to be created before binding
Normal Provisioning 88s linstor.csi.linbit.com_linstor-csi-controller-55f8d4646d-462qt_d3d16baa-4de4-4c16-9c8f-91d610f90a6b External provisioner is provisioning volume for claim "volume-logger/volume-logger-recreated-cloned"
Normal ExternalProvisioning 88s persistentvolume-controller waiting for a volume to be created, either by external provisioner "linstor.csi.linbit.com" or manually created by system administrator
Normal ProvisioningSucceeded 82s linstor.csi.linbit.com_linstor-csi-controller-55f8d4646d-462qt_d3d16baa-4de4-4c16-9c8f-91d610f90a6b Successfully provisioned volume pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209
And what linstor tells us:
➜ ~ linstor v list
+---------------------------------------------------------------------------------------------------------------------------------------+
| Node | Resource | StoragePool | VolNr | MinorNr | DeviceName | Allocated | InUse | State |
|=======================================================================================================================================|
...
| worker-c-1 | pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 | vg-storage | 0 | 1002 | /dev/drbd1002 | 49.34 MiB | InUse | UpToDate |
+---------------------------------------------------------------------------------------------------------------------------------------+
Let's check volume-logger state.
➜ ~ kubectl exec deploy/volume-logger -n volume-logger -- cat /volume/hello
Hello from volume-logger-5f978744b9-tqndw, running on worker-c-1, started at Wed Aug 30 07:42:44 UTC 2023
Hello from volume-logger-5f978744b9-6z8dl, running on worker-c-1, started at Wed Aug 30 09:38:26 UTC 2023
Hello from volume-logger-5f978744b9-s7jnz, running on worker-c-1, started at Wed Aug 30 11:02:32 UTC 2023
Hello from volume-logger-76f59ccb5f-zsc69, running on worker-c-1, started at Wed Aug 30 11:15:30 UTC 2023
Hello from volume-logger-774bd4f9cf-96sph, running on worker-c-1, started at Wed Aug 30 11:20:58 UTC 2023
And here we see all the previous lines plus a new one.
You must have a master passphrase configured in linstor to encrypt backups. Create a secret with the master passphrase:
apiVersion: v1
kind: Secret
metadata:
# in the same namespace as operator
namespace: piraeus-datastore
name: linstor-master-passphrase
immutable: true
stringData:
MASTER_PASSPHRASE: "verysecretstring"
Add a reference to this secret to the LinstorCluster resource:
apiVersion: piraeus.io/v1
kind: LinstorCluster
metadata:
name: linstorcluster
spec:
#...
linstorPassphraseSecret: linstor-master-passphrase
#...
Create the remote (the positional arguments are: remote name, endpoint URL, bucket, region, access key, secret key):
➜ ~ linstor remote create s3 --use-path-style very-safe-s3 https://storage.yandexcloud.net bulat-k8s ru-central1 <access_key> <secret_key>
SUCCESS:
Remote created
SUCCESS:
(worker-c-1) Node changes applied.
SUCCESS:
(worker-b-1) Node changes applied.
SUCCESS:
(worker-a-2) Node changes applied.
SUCCESS:
(worker-a-1) Node changes applied.
List remotes
➜ ~ linstor remote list
+-----------------------------------------------------------------------------+
| Name | Type | Info |
|=============================================================================|
| very-safe-s3 | S3 | ru-central1.https://storage.yandexcloud.net/bulat-k8s |
+-----------------------------------------------------------------------------+
List resources
➜ ~ linstor r list
+----------------------------------------------------------------------------------------------------------------+
| ResourceName | Node | Port | Usage | Conns | State | CreatedOn |
|================================================================================================================|
...
| pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 | worker-c-1 | 7002 | InUse | Ok | UpToDate | 2023-08-30 11:20:43 |
+----------------------------------------------------------------------------------------------------------------+
We have a resource in use; let's create a backup of it.
➜ ~ linstor backup create very-safe-s3 pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209
SUCCESS:
Suspended IO of '[back_20230831_105940]' on 'worker-c-1' for snapshot
SUCCESS:
Took snapshot of '[back_20230831_105940]' on 'worker-c-1'
SUCCESS:
Resumed IO of '[back_20230831_105940]' on 'worker-c-1' after snapshot
INFO:
Generated snapshot name for backup of resource pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 to remote very-safe-s3
INFO:
Shipping of resource pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 to remote very-safe-s3 in progress.
Time: 0h:00m:05s
Let's see our backups
➜ ~ linstor backup list very-safe-s3
+------------------------------------------------------------------------------------------------------------+
| Resource | Snapshot | Finished at | Based On | Status |
|============================================================================================================|
| pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 | back_20230831_105940 | 2023-08-31 10:59:45 | | Success |
+------------------------------------------------------------------------------------------------------------+
And how they are laid out in S3:
➜ ~ aws s3 --profile k8s-bulat --endpoint-url https://storage.yandexcloud.net ls --recursive --human-readable s3://bulat-k8s
2023-08-31 15:59:45 5.9 KiB pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_00000_back_20230831_105940
2023-08-31 15:59:45 2.5 KiB pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940.meta
We have a backup file and a meta file. Let's look at the content of the meta file:
~ aws s3 --profile k8s-bulat --endpoint-url https://storage.yandexcloud.net cp s3://bulat-k8s/pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940.meta /tmp/ && cat /tmp/pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940.meta | jq
download: s3://bulat-k8s/pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940.meta to ../../tmp/pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940.meta
{
"rscName": "pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209",
"nodeName": "worker-c-1",
"startTimestamp": 1693479580686,
"finishTimestamp": 1693479585127,
"layerData": {
"kind": "drbd",
"children": [
{
"kind": "storage",
"children": [],
"rscNameSuffix": "",
"volumeList": [
{
"kind": "lvmThin",
"vlmNr": 0,
"storPoolApi": {
"kind": "storPool",
"storPoolName": "vg-storage",
"deviceProviderKind": "LVM_THIN"
},
"usableSize": 1052672,
"allocatedSize": 1052672,
"snapshotUsableSize": 1052672,
"snapshotAllocatedSize": 50528
}
]
}
],
"rscNameSuffix": "",
"drbdRscDfn": {
"rscNameSuffix": "",
"peerSlots": 7,
"alStripes": 1,
"alStripeSize": 32,
"transportType": "IP"
},
"nodeId": 0,
"peerSlots": 7,
"alStripes": 1,
"alStripeSize": 32,
"flags": 128,
"volumeList": [
{
"kind": "drbd",
"vlmNr": 0,
"drbdVlmDfn": {
"rscNameSuffix": "",
"vlmNr": 0
}
}
]
},
"rscDfn": {
"props": {
"Aux/csi-provisioning-completed-by": "linstor-csi/v1.1.0-20969df70962927a06cbdf714e9ca8cc3912cb4d",
... and other props
"StorPoolName": "vg-storage"
},
"flags": 0,
"vlmDfns": {
"0": {
"props": {
... volume props
},
"flags": 0,
"size": 1048576
}
}
},
"rsc": {
"props": {
"BackupShipping/BackupTargetRemote": "very-safe-s3",
"BackupShipping/BackupNodeIdsToReset": "0",
"StorPoolName": "vg-storage"
},
"flags": 0,
"vlms": {
"0": {
"props": {
"Aux/csi-publish-readonly": "false"
},
"flags": 0
}
}
},
"backups": {
"0": [
{
"name": "pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_00000_back_20230831_105940",
"finishedTimestamp": 1693479585121,
"node": "worker-c-1"
}
]
},
"clusterId": "6b926199-d9b7-4df0-92f6-d77d1e739a25",
"snapDfnUuid": "40b7874e-5426-41e5-8b90-b71580fdf98c"
}
Let's write some data to the resource and make another backup.
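To get a shell inside the pod (a sketch; assumes the image has bash):
kubectl -n volume-logger exec -it deploy/volume-logger -- bash
Then, inside the pod, download a picture with a cute cat into /volume: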
root@volume-logger-774bd4f9cf-96sph:/volume# curl -OL https://images.wallpaperscraft.com/image/single/cat_surprise_look_96597_3840x2160.jpg
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1240k 100 1240k 0 0 1393k 0 --:--:-- --:--:-- --:--:-- 1393k
root@volume-logger-774bd4f9cf-96sph:/volume# ls -lh
total 1.3M
-rw-r--r-- 1 root root 1.3M Aug 31 11:16 cat_surprise_look_96597_3840x2160.jpg
-rw-r--r-- 1 root root 530 Aug 30 11:20 hello
drwx------ 2 root root 16K Aug 30 07:42 lost+found
Make another backup.
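The command is the same as before; judging by the Based On column below, linstor ships it as an incremental backup on top of the first one:
linstor backup create very-safe-s3 pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209
Now we have this state: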
➜ ~ linstor backup list very-safe-s3
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Resource | Snapshot | Finished at | Based On | Status |
|=================================================================================================================================================================|
| pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 | back_20230831_105940 | 2023-08-31 10:59:45 | | Success |
| pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 | back_20230831_111749 | 2023-08-31 11:17:52 | pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940 | Success |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
S3 files
➜ ~ aws s3 --profile k8s-bulat --endpoint-url https://storage.yandexcloud.net ls --recursive --human-readable s3://bulat-k8s
2023-08-31 15:59:45 5.9 KiB pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_00000_back_20230831_105940
2023-08-31 16:17:52 1.2 MiB pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_00000_back_20230831_111749
2023-08-31 15:59:45 2.5 KiB pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_105940.meta
2023-08-31 16:17:52 2.7 KiB pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_111749.meta
We have two new files.
Let's restore from these backups. Suppose we accidentally deleted important production files; what do we do to save our job?
Scale the resource consumers down to zero to release the resource, then delete the linstor resource so it can be restored from the backup.
kubectl scale -n volume-logger deploy/volume-logger --replicas=0
➜ ~ linstor r delete worker-c-1 pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209
SUCCESS:
Description:
Node: worker-c-1, Resource: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 preparing for deletion.
Details:
Node: worker-c-1, Resource: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 UUID is: 7af230bb-d5fe-4f40-92f6-558eb7de7cfb
SUCCESS:
Preparing deletion of resource on 'worker-c-1'
SUCCESS:
Description:
Node: worker-c-1, Resource: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 marked for deletion.
Details:
Node: worker-c-1, Resource: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 UUID is: 7af230bb-d5fe-4f40-92f6-558eb7de7cfb
SUCCESS:
Cleaning up 'pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209' on 'worker-c-1'
SUCCESS:
Description:
Node: worker-c-1, Resource: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 deletion complete.
Details:
Node: worker-c-1, Resource: pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 UUID was: 7af230bb-d5fe-4f40-92f6-558eb7de7cfb
Run the restore. Linstor might say that a snapshot with this name already exists; in that case you can either delete the local snapshots and download from S3 again, or simply restore from the local snapshot.
➜ ~ linstor backup restore very-safe-s3 worker-c-1 pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 --id pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209_back_20230831_111749
SUCCESS:
Suspended IO of '[back_20230831_105940]' on 'worker-c-1' for snapshot
SUCCESS:
Took snapshot of '[back_20230831_105940]' on 'worker-c-1'
SUCCESS:
Resumed IO of '[back_20230831_105940]' on 'worker-c-1' after snapshot
INFO:
Restoring backup of resource pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 from remote very-safe-s3 into resource pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 in progress.
Time: 0h:00m:05s
If you see the following error, recover from the local snapshot instead, or delete the snapshots and try again.
ERROR:
Snapshot back_20230831_111749 already exists, please use snapshot restore instead.
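For example, to drop the conflicting local snapshot and retry the restore (a sketch using the names from above):
linstor snapshot delete pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 back_20230831_111749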
If you see these warnings, linstor decided to only download the snapshot and not touch the existing resources.
WARNING:
The target resource-definition is already deployed on nodes. After downloading the Backup Linstor will NOT restore the data to prevent unintentional data-loss.
WARNING:
The target resource-definition is already deployed on nodes. After downloading the Backup Linstor will NOT restore the data to prevent unintentional data-loss.
Our resource is available again.
➜ ~ linstor r list
+----------------------------------------------------------------------------------------------------------------+
| ResourceName | Node | Port | Usage | Conns | State | CreatedOn |
|================================================================================================================|
| pvc-c693dcca-23e3-4b87-9b07-bd7952ce0209 | worker-c-1 | 7002 | Unused | Ok | UpToDate | 2023-08-31 11:41:39 |
➜ ~ kubectl scale -n volume-logger deploy/volume-logger --replicas=1
deployment.apps/volume-logger scaled
➜ ~ kubectl exec deploy/volume-logger -n volume-logger -- ls /volume/
cat_surprise_look_96597_3840x2160.jpg
hello
lost+found
And voilà, you've restored the vital production files.
Lightning doesn't strike twice, so you can delete your backups:
linstor backup delete all very-safe-s3
First of all, let's set up the remote in linstor through a VolumeSnapshotClass.
Create a secret with the access key and secret key for the S3 bucket:
apiVersion: v1
kind: Secret
metadata:
namespace: piraeus-datastore
name: s3-bulat-k8s
immutable: true
type: linstor.csi.linbit.com/s3-credentials.v1
stringData:
access-key: "YCAJEtSoKLr7CWzrnSsadkjfna"
secret-key: "YCKF99PU_whasfREiN2O61N6tmZi-5K2EDpyFpRK"
Set up the other remote parameters:
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1
metadata:
name: linstor-s3
driver: linstor.csi.linbit.com
deletionPolicy: Retain
parameters:
snap.linstor.csi.linbit.com/type: S3
# See https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-shipping_snapshots-linstor for details
# on each parameter.
snap.linstor.csi.linbit.com/remote-name: bulat-k8s
# Delete local copy of the snapshot after uploading completes
snap.linstor.csi.linbit.com/delete-local: "false"
snap.linstor.csi.linbit.com/allow-incremental: "true"
snap.linstor.csi.linbit.com/s3-bucket: bulat-k8s
snap.linstor.csi.linbit.com/s3-endpoint: storage.yandexcloud.net
snap.linstor.csi.linbit.com/s3-signing-region: ru-central1
snap.linstor.csi.linbit.com/s3-use-path-style: "true" # "true" = path style (host/<bucket>), "false" = virtual-hosted style (<bucket>.host)
# Refer here to the secret that holds access and secret key for the S3 endpoint. See below for an example.
csi.storage.k8s.io/snapshotter-secret-name: s3-bulat-k8s
csi.storage.k8s.io/snapshotter-secret-namespace: piraeus-datastore
Let's create a volume snapshot using this class:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
namespace: volume-logger
name: volume-logger-thin-backup-1
spec:
volumeSnapshotClassName: linstor-s3
source:
persistentVolumeClaimName: volume-logger-thin
Describe the VolumeSnapshot:
➜ ~ kubectl -n volume-logger describe volumesnapshot volume-logger-thin-backup-1
...
Status:
Bound Volume Snapshot Content Name: snapcontent-8cc0e770-7b8b-441d-8fb1-bd4d6d2f02ea
Creation Time: 2023-09-01T12:24:03Z
Ready To Use: true
Restore Size: 1Gi
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreatingSnapshot 81s snapshot-controller Waiting for a snapshot volume-logger/volume-logger-thin-backup-1 to be created by the CSI driver.
Normal SnapshotCreated 76s snapshot-controller Snapshot volume-logger/volume-logger-thin-backup-1 was successfully created by the CSI driver.
Normal SnapshotReady 74s snapshot-controller Snapshot volume-logger/volume-logger-thin-backup-1 is ready to use.
We can see the new remote appear in linstor.
➜ ~ linstor remote list -p
+------------------------------------------------------------------+
| Name | Type | Info |
|==================================================================|
| bulat-k8s | S3 | ru-central1.storage.yandexcloud.net/bulat-k8s |
+------------------------------------------------------------------+
We can see this backup both in linstor and with the aws CLI:
➜ ~ linstor backup list bulat-k8s
+-------------------------------------------------------------------------------------------------------------------------------------+
| Resource | Snapshot | Finished at | Based On | Status |
|=====================================================================================================================================|
| pvc-46b0153d-1057-4172-a154-a7a763ec07ad | snapshot-8cc0e770-7b8b-441d-8fb1-bd4d6d2f02ea | 2023-09-01 12:24:04 | | Success |
+-------------------------------------------------------------------------------------------------------------------------------------+
➜ ~ aws s3 --profile k8s-bulat --endpoint-url https://storage.yandexcloud.net ls --recursive --human-readable s3://bulat-k8s
2023-09-01 17:24:04 5.9 KiB pvc-46b0153d-1057-4172-a154-a7a763ec07ad_00000_back_20230901_122401^snapshot-8cc0e770-7b8b-441d-8fb1-bd4d6d2f02ea
2023-09-01 17:24:05 2.6 KiB pvc-46b0153d-1057-4172-a154-a7a763ec07ad_back_20230901_122401^snapshot-8cc0e770-7b8b-441d-8fb1-bd4d6d2f02ea.meta
The process of recovering from such a snapshot is the same as with local snapshots.
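For example, a PVC that uses the S3-backed VolumeSnapshot from above as its dataSource (a sketch; the PVC name is arbitrary):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: volume-logger-restored-from-s3
  namespace: volume-logger
spec:
  storageClassName: big-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: volume-logger-thin-backup-1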
https://operatorhub.io/operator/snapscheduler - an operator to schedule volume snapshots
git clone [email protected]:backube/snapscheduler.git
helm upgrade --install --create-namespace snapscheduler ./helm
First, label the PVC for which we want to schedule snapshots; otherwise (with no claim selector) the scheduler would create snapshots for all PVCs in the namespace.
➜ ~ kubectl -n volume-logger label pvc volume-logger-thin scheduleSnapshot=true # You are free to choose a label for PVC selection.
persistentvolumeclaim/volume-logger-thin labeled
➜ ~ kubectl -n volume-logger describe pvc volume-logger-thin
...
Labels: scheduleSnapshot=true
...
Create snapshot schedule
---
apiVersion: snapscheduler.backube/v1
kind: SnapshotSchedule
metadata:
name: volume-logger-thin
namespace: volume-logger
spec:
claimSelector:
matchLabels:
scheduleSnapshot: "true"
disabled: false
retention:
maxCount: 7
schedule: "*/2 * * * *" # every 2 minutes
snapshotTemplate:
# labels can be added to VolumeSnapshot
labels:
vital: "false"
snapshotClassName: "piraeus-snapshots"
Let's see our schedule.
➜ ~ kubectl -n volume-logger get snapshotschedule
NAME SCHEDULE MAX AGE MAX NUM DISABLED NEXT SNAPSHOT
volume-logger-thin */2 * * * * 7 false 2023-09-05T14:14:00Z
You can set a quota on how many snapshots can be created in a namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
name: snapshots
namespace: volume-logger
spec:
hard:
count/volumesnapshots.snapshot.storage.k8s.io: "50"
Let's see how our snapshots are created.
➜ ~ kubectl -n volume-logger get volumesnapshot
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
volume-logger-thin-volume-logger-thin-202309051412 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-95b25799-f013-4473-9905-0f819f7f2f76 45s 48s
After a while the controller has created 7 snapshots; since maxCount is 7, on the next run it should delete the oldest one.
➜ ~ kubectl -n volume-logger get volumesnapshot
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
volume-logger-thin-volume-logger-thin-202309051412 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-95b25799-f013-4473-9905-0f819f7f2f76 12m 12m
volume-logger-thin-volume-logger-thin-202309051414 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-cd77ddd8-1908-470e-ad8b-a22cc01e6ffb 10m 10m
volume-logger-thin-volume-logger-thin-202309051416 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-93a1e4e4-d328-49d1-b1e2-deebcabc95a1 8m7s 8m10s
volume-logger-thin-volume-logger-thin-202309051418 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-7213d936-3148-493d-b18f-f7452a25c221 6m8s 6m10s
volume-logger-thin-volume-logger-thin-202309051420 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-11e26657-724a-42f6-996b-8a77720d5371 4m7s 4m10s
volume-logger-thin-volume-logger-thin-202309051422 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-c1bbec4c-3e93-4929-8f43-9d701d91684b 2m7s 2m9s
volume-logger-thin-volume-logger-thin-202309051424 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-27313642-890d-48de-83f7-0b633b3b3ddd 7s 10s
Indeed, the controller deleted the oldest snapshot.
➜ ~ kubectl -n volume-logger get volumesnapshot
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
volume-logger-thin-volume-logger-thin-202309051414 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-cd77ddd8-1908-470e-ad8b-a22cc01e6ffb 13m 13m
volume-logger-thin-volume-logger-thin-202309051416 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-93a1e4e4-d328-49d1-b1e2-deebcabc95a1 11m 11m
volume-logger-thin-volume-logger-thin-202309051418 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-7213d936-3148-493d-b18f-f7452a25c221 9m15s 9m17s
volume-logger-thin-volume-logger-thin-202309051420 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-11e26657-724a-42f6-996b-8a77720d5371 7m14s 7m17s
volume-logger-thin-volume-logger-thin-202309051422 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-c1bbec4c-3e93-4929-8f43-9d701d91684b 5m14s 5m16s
volume-logger-thin-volume-logger-thin-202309051424 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-27313642-890d-48de-83f7-0b633b3b3ddd 3m14s 3m17s
volume-logger-thin-volume-logger-thin-202309051426 true volume-logger-thin 1Gi piraeus-snapshots snapcontent-f118d82c-03fc-4e58-96c0-98951bf06791 73s 76s
- You cannot create snapshots with plain (non-thin) LVM, as it is not supported.
- Volume clones are implemented through snapshots, so you cannot create volume clones with plain (non-thin) LVM either. piraeusdatastore/linstor-csi#210
- Because of how ZFS pools work, volume cloning doesn't work there: you cannot delete a snapshot while it is still in use by a clone. https://github.com/piraeusdatastore/linstor-csi
- You cannot clone volumes or restore from snapshots into a different storage class; these resources are allocated in the same storage pool as the original volume or snapshot.
Device-mapper snapshots are described in the kernel docs:
https://docs.kernel.org/admin-guide/device-mapper/snapshot.html