# longhorn-storage
Attempting GCP backups, but lh-engine still returns errors that look like they're from AWS
my google bucket is
bigasterisk-longhorn
and when I edit the backup target, I get this in the json reply: (newlines added here)
'failed to list backup volumes in s3://bigasterisk-longhorn@US/:
error listing backup volume names: failed to execute:
/var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-v1.8.0/longhorn
[/var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-v1.8.0/longhorn backup ls --volume-only s3://bigasterisk-longhorn@US/],
output failed to list objects with param:
{\n  Bucket: "bigasterisk-longhorn",\n  Delimiter: "/",\n  Prefix: "/"\n}
error: AWS Error:  AccessDenied Access denied. <nil>
\n403 \n\n,
stderr time="2025-02-16T05:18:42Z"
level=error
msg="Failed to list s3"
func="s3.(*BackupStoreDriver).List"
file="s3.go:116"
error="failed to list objects with param: {\\n
Bucket: \\"bigasterisk-longhorn\\",\\n
Delimiter: \\"/\\",\\n
Prefix: \\"/\\"\\n}
error: AWS Error:  AccessDenied Access denied. <nil>\\n
403 \\n"
pkg=s3\n
time="2025-02-16T05:18:42Z"
level=error msg="failed to list objects with param: {\\n  Bucket: \\"bigasterisk-longhorn\\",\\n  Delimiter: \\"/\\",\\n  Prefix: \\"/\\"\\n} error: AWS Error:  AccessDenied Access denied. <nil>\\n403 \\n" func=main.ResponseLogAndError file="main.go:47"\n: exit status 1'
1. I only see that message in Chrome devtools; the UI just says 'error'. I filed https://github.com/longhorn/longhorn/issues/10428 about that.
2. I believe that the string "AWS Error:" means we were talking to actual Amazon AWS, and my endpoint is not being respected. But this could be wrong.
3. My credential secret has this:
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-gcp-backups
  namespace: longhorn-system
data:
  AWS_ACCESS_KEY_ID: ...
  AWS_ENDPOINTS: https://storage.googleapis.com
  AWS_SECRET_ACCESS_KEY: ...
  VIRTUAL_HOSTED_STYLE: "true"
4. I'd love to test the command that's within the error message, but when I go to a longhorn-manager pod and run this, it fails:
longhorn-manager-2kqp9:/ # /var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-v1.8.0/longhorn info
FATA[2025-02-16T05:50:29Z]volume.go:16 main.longhornCli.InfoCmd.func17 Error running info command: failed to get volume localhost:9501: rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: read tcp 127.0.0.1:48230->127.0.0.1:9501: read: connection reset by peer"
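One way to reproduce the failing list request without going through the longhorn binary is to send a similar request from any machine that has the same keys. This is only a sketch, and it makes several assumptions: boto3 is installed, the HMAC keys are exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, and the helper names make_client and list_top_level are made up for illustration. The Bucket, Delimiter, and Prefix values are copied from the error message, and the endpoint is the AWS_ENDPOINTS value from the secret. It is not how Longhorn itself issues the call.

```python
import os

ENDPOINT = "https://storage.googleapis.com"  # AWS_ENDPOINTS value from the secret
BUCKET = "bigasterisk-longhorn"              # bucket name from the error message


def make_client(endpoint=ENDPOINT):
    """Build an S3 client pointed at a non-AWS endpoint (assumes boto3 is installed)."""
    import boto3  # imported lazily so list_top_level can be tried with a stub client
    return boto3.client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    )


def list_top_level(client, bucket=BUCKET, prefix="/"):
    """Send a list request with the same parameters that appear in the error."""
    reply = client.list_objects(Bucket=bucket, Delimiter="/", Prefix=prefix)
    prefixes = [p["Prefix"] for p in reply.get("CommonPrefixes", [])]
    keys = [o["Key"] for o in reply.get("Contents", [])]
    return prefixes, keys


if __name__ == "__main__":
    try:
        print(list_top_level(make_client()))
    except Exception as err:  # a 403 from the bucket surfaces here as a client error
        print("list failed:", err)
```

If this also fails with AccessDenied, the problem is more likely the bucket permissions than the Longhorn endpoint setting.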
well, the docs say "Note: Consider creating an IAM condition to reduce how many buckets this serviceaccount has object admin access to." so I tried to do that. But when I went to the bucket view, clicked 'grant access', and made my service account an object admin (of just that bucket, hopefully), the LH Backup Target status went from error to available.
so, we learn that 'error' is hiding a lot of useful help, and google returns "AWS Error:" strings, and IAM conditions are tricksy
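Since the UI only shows 'error', a small helper can make the message copied from devtools readable instead of adding newlines by hand. This is a rough sketch: the function names readable and aws_error_code are made up here, and it assumes every backslash run followed by n or a double quote is an escape, which holds for the reply above but not for arbitrary text.

```python
import re


def readable(msg: str) -> str:
    """Turn escaped newlines and quotes, at any escaping depth, back into real ones."""
    msg = re.sub(r'\\+n', '\n', msg)   # \n, \\n, ... become a real newline
    msg = re.sub(r'\\+"', '"', msg)    # \", \\", ... become a plain quote
    return msg


def aws_error_code(msg: str):
    """Return the word after 'AWS Error:' (for example AccessDenied), or None."""
    match = re.search(r'AWS Error:\s+(\w+)', msg)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = r'error: AWS Error:  AccessDenied Access denied. <nil>\\n403 \\n'
    print(readable(sample))
    print(aws_error_code(sample))
```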