# longhorn-storage
l
Hello @acceptable-engineer-69679. You can start by looking at the Velero logs. Also, are you using the CSI Snapshot Data Mover feature of Velero 1.12+? And when you restore, are you using
velero restore ...
?
a
this is my backup command
v create backup csi-app --include-namespaces csi-app  --wait
l
Ahh, so you’re not actually backing up PVCs
a
and then
k delete ns csi-app
and then
v restore create --from-backup csi-app --restore-volumes=true --include-namespaces csi-app
▶ v get backup csi-app
NAME      STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
csi-app   Completed   0        0          2023-12-20 09:15:31 +0100 CET   29d       default            <none>
▶ v describe backups csi-app --details
Name:         csi-app
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.22.10
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=22

Phase:  Completed


Namespaces:
  Included:  csi-app
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        <none>
  Cluster-scoped:  auto

Label selector:  <none>

Storage Location:  default

Velero-Native Snapshot PVs:  auto

TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2023-12-20 09:56:57 +0100 CET
Completed:  2023-12-20 09:57:22 +0100 CET

Expiration:  2024-01-19 09:56:57 +0100 CET

Total items to be backed up:  12
Items backed up:              12

Resource List:
  snapshot.storage.k8s.io/v1/VolumeSnapshot:
    - csi-app/velero-data-test-qmflb
  snapshot.storage.k8s.io/v1/VolumeSnapshotClass:
    - longhorn-backup
  snapshot.storage.k8s.io/v1/VolumeSnapshotContent:
    - snapcontent-d5bdd6bd-e20f-41d7-80e8-d9e63d3bd58f
  v1/ConfigMap:
    - csi-app/kube-root-ca.crt
  v1/Event:
    - csi-app/data-test.17a27e04126a8ac2
    - csi-app/data-test.17a27e0412ab892f
    - csi-app/data-test.17a27e0494461731
  v1/Namespace:
    - csi-app
  v1/PersistentVolume:
    - pvc-edbb643c-ec8b-4421-98c1-d3b4fdb3b875
  v1/PersistentVolumeClaim:
    - csi-app/data-test
  v1/Secret:
    - csi-app/default-token-jcpvl
  v1/ServiceAccount:
    - csi-app/default

Velero-Native Snapshots: <none included>

CSI Volume Snapshots:
Snapshot Content Name: snapcontent-d5bdd6bd-e20f-41d7-80e8-d9e63d3bd58f
  Storage Snapshot ID: bak://pvc-edbb643c-ec8b-4421-98c1-d3b4fdb3b875/backup-526a5d85b3bb4eb2
  Snapshot Size (bytes): 209715200
  Ready to use: true
l
Read: https://velero.io/docs/v1.12/csi-snapshot-data-movement/#to-back-up and you would want to use:
--features=EnableCSI
when installing Velero.
a
you mean helm values --set configuration.features?
l
Use the CSI Snapshot Data Mover going forward …
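Yes, in Helm-chart terms that maps to values roughly like this (a sketch only; the exact key names vary between velero chart versions):

```yaml
# Sketch of the relevant Helm values for the CSI Snapshot Data Mover
deployNodeAgent: true        # the node agent runs the data-mover uploads
configuration:
  features: EnableCSI        # enables the CSI plugin feature flag
```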
a
I've set it up like this
backupsEnabled: true
snapshotsEnabled: true
deployNodeAgent: true

initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.8.0
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
  - name: velero-plugin-for-csi
    image: velero/velero-plugin-for-csi:v0.5.0
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins

configuration:
  backupStorageLocation:
    - name: default
      provider: aws
      bucket: rancher-isa-dev
      default: true
      accessMode: ReadWrite
      config:
        region: minio
        s3ForcePathStyle: true
        s3Url: http://10.97.10.57:9091
        publicUrl: http://10.97.10.57:9091
        insecureSkipTLSVerify: true
  volumeSnapshotLocation:
    - name: longhorn
      provider: csi
  defaultVolumeSnapshotLocations: csi:longhorn
  features: EnableCSI
l
I’m pretty sure you need to disable the built-in Velero backup feature in order to use the CSI Snapshot controller .. and thereby make it work natively with Longhorn.
a
ok. how would i do that? and will restic still be working?
l
what’s your use-case for restic?
a
i have some volumes bound via nfs-client. i may need restic, not sure. trying to switch from restic to CSI
l
aaah good ol’ NFS … you hardcore need that in your K8s setup?
Most definitely not a first-class citizen in the K8s world.
a
yeah, sadly. i know 😞 in this case it is an SFTP server which is mounted into the container.
at least i switched from the longhorn-nfs provider to longhorn RWX volumes (in the end also an NFS provider). but there are cases when i need RWX volumes.
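As an aside, for volumes that CSI snapshots cannot cover (such as NFS mounts), Velero's file-system backup can be opted in per pod via an annotation; a sketch with hypothetical pod, image, and volume names:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sftp-consumer                          # hypothetical name
  annotations:
    backup.velero.io/backup-volumes: nfs-data  # opt this volume into file-system backup
spec:
  containers:
    - name: app
      image: alpine                            # placeholder image
      volumeMounts:
        - name: nfs-data
          mountPath: /data
  volumes:
    - name: nfs-data
      persistentVolumeClaim:
        claimName: sftp-data                   # hypothetical claim
```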
l
We chose the CSI route for persistent storage backup. For e.g. these reasons > https://velero.io/docs/v1.12/file-system-backup/#limitations
a
so in summary - for me to understand - the data mover will transfer the longhorn backup to a velero backup, right?
l
the data mover transfers to a backup repository … which is likely an S3 bucket … that of course should exist outside your K8s cluster/s
a
yes. it does.
i have two buckets, one for longhorn and one for velero
l
in the CSI Snapshot data mover scenario you now no longer need the longhorn one … as the mover moves Volume snapshots to a designated bucket
Making your setup simpler …
a
hm ... is it my client?
v version
Client:
        Version: v1.11.1
        Git commit: bdbe7eb242b0f64d5b04a7fea86d1edbb3a3587c
Server:
        Version: v1.12.2
# WARNING: the client version does not match the server version. Please update client
v create backup csi-app --snapshot-move-data  --include-namespaces csi-app  --wait

An error occurred: unknown flag: --snapshot-move-data
l
oh yeah .. bump your velero cli on local
a
ok, now it runs
output looks the same
TTL:  720h0m0s

CSISnapshotTimeout:    10m0s
ItemOperationTimeout:  4h0m0s

Hooks:  <none>

Backup Format Version:  1.1.0

Started:    2023-12-20 10:15:27 +0100 CET
Completed:  2023-12-20 10:15:51 +0100 CET

Expiration:  2024-01-19 10:15:27 +0100 CET

Total items to be backed up:  15
Items backed up:              15

Resource List:
  snapshot.storage.k8s.io/v1/VolumeSnapshot:
    - csi-app/velero-data-test-fsr6s
  snapshot.storage.k8s.io/v1/VolumeSnapshotClass:
    - longhorn-backup
  snapshot.storage.k8s.io/v1/VolumeSnapshotContent:
    - snapcontent-44a286bd-0ce6-4311-beb1-9edb7a0cb0eb
  v1/ConfigMap:
    - csi-app/kube-root-ca.crt
  v1/Event:
    - csi-app/data-test.17a27e04126a8ac2
    - csi-app/data-test.17a27e0412ab892f
    - csi-app/data-test.17a27e0494461731
    - csi-app/velero-data-test-qmflb.17a27e6f8d412362
    - csi-app/velero-data-test-qmflb.17a27e71f430ab4e
    - csi-app/velero-data-test-qmflb.17a27e7246d48af9
  v1/Namespace:
    - csi-app
  v1/PersistentVolume:
    - pvc-edbb643c-ec8b-4421-98c1-d3b4fdb3b875
  v1/PersistentVolumeClaim:
    - csi-app/data-test
  v1/Secret:
    - csi-app/default-token-jcpvl
  v1/ServiceAccount:
    - csi-app/default

Velero-Native Snapshots: <none included>

CSI Volume Snapshots:
Snapshot Content Name: snapcontent-44a286bd-0ce6-4311-beb1-9edb7a0cb0eb
  Storage Snapshot ID: bak://pvc-edbb643c-ec8b-4421-98c1-d3b4fdb3b875/backup-2614f92a3fcb4c3e
  Snapshot Size (bytes): 209715200
  Ready to use: true
l
But, have you actually followed the guides and docs I sent you? 😉
a
i'm reading it as we speak
l
You need to ensure that all prerequisites are in place and that Velero is configured to use the CSI Snapshot Data Mover ..
a
ok, no DataUpload object, so i missed something
l
It might take a minute … I would think you can see something in the Velero logs …
do you have the snapshot-controller running?
a
yes.
l
good …
it might take a minute for it to complete the snapshot depending on size and so on
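If in doubt, Velero picks the VolumeSnapshotClass that carries its label; something like this (a sketch; the Longhorn driver parameters here are assumptions, check the Longhorn CSI snapshot docs):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-backup
  labels:
    velero.io/csi-volumesnapshot-class: "true"  # tells Velero to use this class
driver: driver.longhorn.io
deletionPolicy: Retain
parameters:
  type: bak   # Longhorn: create a backup, not just an in-cluster snapshot
```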
a
i see a lot of metadata errors in velero logs
2023-12-20T10:19:02+01:00 time="2023-12-20T09:19:02Z" level=info msg="Attempting to sync backup into cluster" backup=hourly-20231219002114 backupLocation=velero/default controller=backup-sync logSource="pkg/controller/backup_sync_controller.go:143"
2023-12-20T10:19:02+01:00 time="2023-12-20T09:19:02Z" level=error msg="Error getting backup metadata from backup store" backup=hourly-20231219002114 backupLocation=velero/default controller=backup-sync error="rpc error: code = Unknown desc = error getting object backups/hourly-20231219002114/velero-backup.json: NoSuchKey: The specified key does not exist.\n\tstatus code: 404, request id: 17A27FA1FE178346, host id: " error.file="/go/src/github.com/vmware-tanzu/velero/pkg/persistence/object_store.go:302" error.function="github.com/vmware-tanzu/velero/pkg/persistence.(*objectBackupStore).GetBackupMetadata" logSource="pkg/controller/backup_sync_controller.go:147"
l
you apparently have some schedule … executed hourly ….
a
yes
l
we should concentrate on getting the manually triggered flow to work first … the one you’re triggering via the velero backup … command
a
all info messages
l
But, ensure that Velero “knows” where to store data … S3 bucket name and user and creds are correct
a
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="Skipping resource customresourcedefinitions.apiextensions.k8s.io, because it's cluster-scoped and only specific namespaces or namespace scope types are included in the backup." backup=velero/csi-app logSource="pkg/util/collections/includes_excludes.go:155"
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="Summary for skipped PVs: []" backup=velero/csi-app logSource="pkg/backup/backup.go:434"
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="Backed up a total of 18 items" backup=velero/csi-app logSource="pkg/backup/backup.go:436" progress=
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="backup SnapshotMoveData is set to true, skip VolumeSnapshot resource persistence." backup=velero/csi-app logSource="pkg/backup/snapshots.go:22"
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="Setting up backup store to persist the backup" backup=velero/csi-app logSource="pkg/controller/backup_controller.go:724"
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="Backup completed" backup=velero/csi-app logSource="pkg/controller/backup_controller.go:738"
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="Updating backup's final status" backuprequest=velero/csi-app controller=backup logSource="pkg/controller/backup_controller.go:309"
2023-12-20T10:21:15+01:00 time="2023-12-20T09:21:15Z" level=info msg="backup SnapshotMoveData is set to true, skip VolumeSnapshot resource persistence." backup=velero/csi-app controller=backup-finalizer logSource="pkg/backup/snapshots.go:22"
backup SnapshotMoveData is set to true
s3 must be working, i can see objects in the s3 bucket
l
that’s good
yup looks good.
I don’t think you should configure these two:
backupsEnabled: true
snapshotsEnabled: true
When using the CSI snapshotter …
a
maybe i should use the newer 0.6 csi plugin instead of 0.5?
l
and the CSI Snapshot Data Mover … otherwise different feature paths of Velero are stepping on each other's toes
yes bump the plugin versions
Always read the upgrade notes > https://velero.io/docs/v1.12/upgrade-to-1.12/
if you’re upgrading
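Concretely, bumping the plugins would mean updating the initContainer tags in the chart values, roughly like this (the exact tags are assumptions; check the plugin compatibility matrix for Velero 1.12):

```yaml
initContainers:
  - name: velero-plugin-for-aws
    image: velero/velero-plugin-for-aws:v1.8.0   # assumed compatible tag
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
  - name: velero-plugin-for-csi
    image: velero/velero-plugin-for-csi:v0.6.0   # v0.6.x pairs with Velero 1.12
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - mountPath: /target
        name: plugins
```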
a
ok
hm, now i can't delete my backup
2023-12-20T10:31:14+01:00 time="2023-12-20T09:31:14Z" level=info msg="Removing existing deletion requests for backup" backup=csi-app controller=backup-deletion deletebackuprequest=velero/csi-app-cph25 logSource="pkg/controller/backup_deletion_controller.go:483"
2023-12-20T10:31:14+01:00 time="2023-12-20T09:31:14Z" level=info msg="The request has status 'Processed', skip." controller=backup-deletion deletebackuprequest=velero/csi-app-cph25 logSource="pkg/controller/backup_deletion_controller.go:145"
2023-12-20T10:31:36+01:00 time="2023-12-20T09:31:36Z" level=info msg="Removing existing deletion requests for backup" backup=csi-app controller=backup-deletion deletebackuprequest=velero/csi-app-6lc6p logSource="pkg/controller/backup_deletion_controller.go:483"
2023-12-20T10:31:36+01:00 time="2023-12-20T09:31:36Z" level=info msg="deletion request 'csi-app-cph25' removed." backup=csi-app controller=backup-deletion deletebackuprequest=velero/csi-app-6lc6p logSource="pkg/controller/backup_deletion_controller.go:500"
2023-12-20T10:31:36+01:00 time="2023-12-20T09:31:36Z" level=info msg="The request has status 'Processed', skip." controller=backup-deletion deletebackuprequest=velero/csi-app-6lc6p logSource="pkg/controller/backup_deletion_controller.go:145"
l
Of course we’re working on production right now right right right 😉
a
no, it's a test cluster
production is still using restic
l
aight
a
now at least i got an error
v create backup csi-app2 --snapshot-move-data  --include-namespaces csi-app  --wait
Backup request "csi-app2" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.

Backup completed with status: FailedValidation. You may check for more information using the commands `velero backup describe csi-app2` and `velero backup logs csi-app2`.
velero backup describe csi-app2 --details
Name:         csi-app2
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/resource-timeout=10m0s
              velero.io/source-cluster-k8s-gitversion=v1.22.10
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=22

Phase:  FailedValidation

Validation errors:  an existing backup storage location wasn't specified at backup creation time and the server default 'default' doesn't exist. Please address this issue (see `velero backup-location -h` for options) and create a new backup. Error: BackupStorageLocation.velero.io "default" not found
l
yeah that’s an issue … you never set a backup storage location?
a
i guess the helm chart has changed
configuration:
  backupStorageLocation:
    - name: default
      provider: aws
      bucket: rancher-isa-dev
      default: true
      accessMode: ReadWrite
      config:
        region: minio
        s3ForcePathStyle: true
        s3Url: http://10.97.10.57:9091
        publicUrl: http://10.97.10.57:9091
        insecureSkipTLSVerify: true
  volumeSnapshotLocation:
    - name: longhorn
      provider: csi
  defaultVolumeSnapshotLocations: csi:longhorn
  features: EnableCSI
l
or did the creds change … as they were re-initialized
Yeah, a lot has happened to the Helm chart
a
or maybe my backup location has disappeared, because backups are no longer enabled?
let me reverse that change to double check
yeah, backup location is back
my bad on that one
a
so, snap gets created
csi-app                    0s          Normal   CreatingSnapshot        volumesnapshot/velero-data-test-t2f27                                            Waiting for a snapshot csi-app/velero-data-test-t2f27 to be created by the CSI driver.
longhorn-system            1s          Normal   SnapshotCreate          snapshot/snapshot-b01a4b2e-ac3b-43b4-a488-b94516af17c1                           successfully provisioned the snapshot
longhorn-system            1s          Normal   SnapshotUpdate          snapshot/snapshot-b01a4b2e-ac3b-43b4-a488-b94516af17c1                           snapshot becomes ready to use
wow
look at this. seems to be the plugin version
k get datauploads -A
NAMESPACE   NAME            STATUS      STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE    NODE
velero      csi-app-bnzb2   Completed   103s                                 default            103s   node5-isadev
if i'm reading this correctly, the option
--snapshot-move-data
should not be needed, right? https://velero.io/docs/v1.12/csi-snapshot-data-movement/#to-back-up
CSI plugin checks if a data movement is required, if so it creates a DataUpload CR and then returns to Velero backup.
but - no. if i don't specify it, i get a plain CSI snapshot
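For reference, the CLI flag maps to a field on the Backup spec, so a Backup or Schedule CR can request data movement explicitly; a sketch based on the Velero 1.12 API:

```yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: csi-app
  namespace: velero
spec:
  includedNamespaces:
    - csi-app
  snapshotMoveData: true   # same effect as --snapshot-move-data on the CLI
```

The data-movement docs also describe a server-level default for this behaviour, which would explain why the per-backup flag is otherwise required.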
cool, restore from the data-mover-enabled backup worked
and now the restore using csi snapshot seems to work as well.
so it was the plugin version. Thx!!!
can i ask something else?
l
Hit me
a
i'm trying to use zsh autocompletion for velero, but it won't complete the backup name. and i don't know if i'm too dumb or if it is not implemented
so this works
but then no backup names
l
Hmm, I don’t know; I don’t use this feature myself. I would ask in the Velero channel over on Kubernetes Slack.
a
ok, nevermind. thx
b
velero docs could use a whole lot of work - keep in mind they use a static restic repository password as well, so there’s no real off-cloud encryption
a
does this apply to kopia as well?
l
I don’t know … but one can set a random password when using the CSI Snapshot Data Mover
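For example, the repository password can be set before the first backup by pre-creating the credentials secret Velero looks for; a sketch (check the Velero backup-repository docs for the exact secret name and key):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: velero-repo-credentials   # name Velero expects, per its docs
  namespace: velero
type: Opaque
stringData:
  repository-password: "change-me-to-something-random"  # placeholder value
```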