q

quaint-alarm-7893

01/05/2023, 1:08 AM
anyone having issues exporting 100gb volumes? i keep getting to 28% and then the manager pod for longhorn craps out.
Node harvester-01 is ready	2.2 mins ago 
Node harvester-01 is ready	2.2 mins ago 
Node harvester-01 is ready	2.2 mins ago 
Node harvester-01 is ready	2.2 mins ago 
Node harvester-01 is down: the manager pod longhorn-manager-fdc28 is not running	2.2 mins ago 
Node harvester-01 is down: the manager pod longhorn-manager-fdc28 is not running	2.2 mins ago 
Node harvester-01 is down: the manager pod longhorn-manager-fdc28 is not running	2.2 mins ago
in the longhorn ui, the backing image it's trying to create gets to 28%, then dies, and starts over
b

billowy-painting-56466

01/05/2023, 2:28 AM
What is your Longhorn version? Can you provide the manager log? cc @famous-shampoo-18483
f

famous-shampoo-18483

01/05/2023, 2:29 AM
And what’s your backing image size?
q

quaint-alarm-7893

01/05/2023, 2:29 AM
1.3.2. it's running with harvester 1.1.2. as for the log, how can i pull it?
it's 100gb
actual size is like 40gb
check-that, 28gb
f

famous-shampoo-18483

01/05/2023, 2:30 AM
OK. I guess you are uploading the file as a backing image. There is a bug in the Longhorn UI: https://github.com/longhorn/longhorn/issues/4902
q

quaint-alarm-7893

01/05/2023, 2:30 AM
i ran: k logs longhorn-manager-9d78v -n longhorn-system earlier, and it shows the logs, but it seems like it resets whenever it resets the pod
i'm not uploading, it's an export from a volume.
harvester generates something like this to create it:
apiVersion: longhorn.io/v1beta2
kind: BackingImage
metadata:
  annotations:
    harvesterhci.io/imageId: default/image-fs65q
  name: default-image-fs65q
  namespace: longhorn-system
spec:
  sourceParameters:
    export-type: raw
    volume-name: pvc-7ba4e01b-8be4-4a3c-9f43-1564eaac9027
  sourceType: export-from-volume
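For reference, the state of a BackingImage created from a manifest like the one above can also be watched from the CLI; this is a minimal sketch, assuming kubectl access to the cluster and reusing the resource name from the manifest:
# List the BackingImage custom resources and their current state
kubectl get backingimages -n longhorn-system
# Inspect progress, per-disk state, and any error message for the one being exported
kubectl describe backingimage default-image-fs65q -n longhorn-system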
b

billowy-painting-56466

01/05/2023, 2:35 AM
i ran: k logs longhorn-manager-9d78v -n longhorn-system earlier, and it shows the logs, but it seems like it resets whenever it resets the pod
Try
k logs longhorn-manager-9d78v -n longhorn-system --previous
f

famous-shampoo-18483

01/05/2023, 2:36 AM
For the backing image part, you can check the log of pod backing-image-ds-<backing image name> in namespace longhorn-system
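A concrete form of that check, as a sketch (the pod name below is an assumption, built from the backing image name in the manifest earlier in the thread):
# Stream the log of the data-source pod that performs the export
kubectl logs -n longhorn-system backing-image-ds-default-image-fs65q -f
# If that pod already restarted, pull the previous container's log instead
kubectl logs -n longhorn-system backing-image-ds-default-image-fs65q --previous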
q

quaint-alarm-7893

01/05/2023, 2:44 AM
okay, give me a few... running it now so i can get fresh/clean logs
i streamed the log for the backing image
but because the manager changes depending on which node it fails on (it seems to do it with all of them), i had to grab it with the --previous option you gave me. (that's nice to know btw)
f

famous-shampoo-18483

01/05/2023, 2:54 AM
The error log in the backing-image-ds pod:
It seems that somehow the HTTP connection between this pod and the replica process of the running volume is closed.
q

quaint-alarm-7893

01/05/2023, 2:56 AM
the line prior to that is the last line before the pod terminates. i checked it with --previous as well
any ideas on how to troubleshoot why? should i just down a bunch of vms to see if it's a load issue or something?
f

famous-shampoo-18483

01/05/2023, 3:03 AM
1. Make sure there is no network issue or other kind of resource exhaustion issue.
2. Figure out which replica of the volume is responsible for exporting the data to the backing image, then check its log. The replica processes run inside the instance-manager-r pods, so you need to check those pod logs.
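A rough sketch of step 2 with plain kubectl (the grep patterns are assumptions; the volume name is taken from the manifest earlier in the thread):
# List the replica instance-manager pods
kubectl get pods -n longhorn-system | grep instance-manager-r
# Search each one's log for entries that mention the volume being exported
for pod in $(kubectl get pods -n longhorn-system -o name | grep instance-manager-r); do
  echo "== $pod =="
  kubectl logs -n longhorn-system "$pod" | grep pvc-7ba4e01b-8be4-4a3c-9f43-1564eaac9027
done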
b

billowy-painting-56466

01/05/2023, 3:06 AM
I've created an issue for this: https://github.com/longhorn/longhorn/issues/5209
q

quaint-alarm-7893

01/05/2023, 3:09 AM
okay, i'll mess around with it, and keep an eye on the issue. thanks!
f

famous-shampoo-18483

01/05/2023, 3:12 AM
BTW, you can directly generate the support bundle and upload it in this GitHub ticket.
👍 1