# harvester
a
Is there a limit on how big uploaded images can be? I'm trying to migrate some libvirt virtual machines over to Harvester, and the qcow images are around ~100 GB in size (Windows/Linux). Trying to upload the image on v1.4.2 results in a “network error”.
I'm uploading the file from my laptop to the cluster; latency between me and the cluster is less than 40 ms, and I don't believe I have bandwidth issues.
f
I’d try running a simple HTTP server and downloading the image via HTTP in Harvester. You can set one up easily with Python or something along those lines.
👍 1
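A minimal sketch of that suggestion, assuming Python 3 on the machine holding the images (the directory path and port here are placeholders):

```
# Serve the libvirt image directory over plain HTTP on port 8000.
# Harvester can then download http://<host>:8000/<image>.qcow2 when an
# image is created from a URL instead of a file upload.
python3 -m http.server 8000 --directory /var/lib/libvirt/images
```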
a
I will try serving the libvirt image directory and downloading the images into Harvester.
f
Worth a shot. I'm not actually aware of any size limitations, but I know I have used HTTP to send images to Harvester successfully in the past.
a
So far that suggestion has worked! The image is successfully transferring into Harvester.
I guess the image upload from the Harvester UI (and from an external Rancher UI that has this Harvester cluster registered) just can't handle the large files for some reason.
g
I think this is a known issue... I will try to find it.
👍 2
m
I've checked with images up to 70 GB. Can you provide the SB (support bundle)?
a
Should I try the upload again and then generate the SB? It's been a few hours since the issue; how far back do support bundles cover?
g
It would be great if you trigger the failure scenario and then generate a support bundle; that way we will have all the objects included.
I would suggest importing the image into the default namespace, since that is included in the bundle.
For non-default namespaces, please update this setting: https://docs.harvesterhci.io/v1.5/advanced/index#support-bundle-namespaces
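A sketch of one way to change that setting from the CLI, assuming kubectl access to the cluster (it can also be edited in the Harvester UI under Advanced > Settings):

```
# Harvester settings are cluster-scoped custom resources; this opens the
# support-bundle-namespaces setting so additional namespaces can be
# included in generated support bundles.
kubectl edit settings.harvesterhci.io support-bundle-namespaces
```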
a
Reproduced, generating a support bundle shortly
supportbundle_8e1dc8b4-54f0-47b7-9b8d-bbfbce594473_2025-04-30T03-39-04Z.zip
g
Are you using a storage network?
a
Not for this lab cluster, no. I believe storage is off of the default management bond currently.
I can easily span a VLAN across the 10 Gb ports and add the storage network, however.
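For reference, the storage network is also driven by a Harvester setting; a sketch assuming kubectl access, with placeholder VLAN and CIDR values (see the Harvester storage network docs for the exact format):

```
# Inspect or edit the storage-network setting; its value is a JSON object
# roughly like {"vlan":100,"clusterNetwork":"mgmt","range":"10.0.100.0/24"}
# (all values here are placeholders).
kubectl edit settings.harvesterhci.io storage-network
```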
g
```
- backing-image-manager-5ae6-ce9a › backing-image-manager
backing-image-manager-5ae6-f3ff backing-image-manager time="2025-04-29T21:05:54Z" level=debug msg="failed to get IP from lhnet1 interface, fallback to use the default pod IP 10.52.1.79" func=util.GetIPForPod file="http_util.go:65" error="route ip+net: no such network interface"
- backing-image-manager-5ae6-6569 › backing-image-manager
backing-image-manager-5ae6-f3ff backing-image-manager time="2025-04-29T21:18:36Z" level=debug msg="failed to get IP from lhnet1 interface, fallback to use the default pod IP 10.52.1.79" func=util.GetIPForPod file="http_util.go:65" error="route ip+net: no such network interface"
```
I only asked because I saw these errors.
Could be a red herring; I will check the logs.
a
That might be it. I'm not seeing any pod communication errors, nor am I aware of any pod-to-pod networking issues.
m
Can your backing-image-manager pods talk to the instance-manager pods and the backing-image datasource pod?
a
Checking.
Yes, I am able to communicate between the backing-image-manager and instance-manager pods.
Pod to pod on the same node:
```
backing-image-manager-5ae6-6569                     1/1     Running   0              20h   10.52.2.240   saclab-harvester-3   <none>           <none>
instance-manager-38959b8b7aa74f877f9b5e25352cc4ad   1/1     Running   0              20h   10.52.2.241   saclab-harvester-3   <none>           <none>


backing-image-manager-5ae6-6569:/ # tracepath 10.52.2.241
 1?: [LOCALHOST]                      pmtu 1450
 1:  10.8.129.50                                           0.059ms
 1:  10.8.129.50                                           0.054ms
 2:  10.52.2.241                                           0.120ms reached
     Resume: pmtu 1450 hops 2 back 2
backing-image-manager-5ae6-6569:/ #
```
Pod to pod across two different nodes:
```
backing-image-manager-5ae6-6569:/ # tracepath 10.52.0.133
 1?: [LOCALHOST]                      pmtu 1450
 1:  10.8.129.50                                           0.085ms
 1:  10.8.129.50                                           0.072ms
 2:  10.52.0.0                                             0.485ms
 3:  10.52.0.133                                           0.468ms reached
     Resume: pmtu 1450 hops 3 back 3
```
I do not see a backing-image datasource pod, though.
```
saclab-harvester-1:~ # kubectl get pods -o wide -n longhorn-system
NAME                                                READY   STATUS    RESTARTS       AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
backing-image-manager-5ae6-6569                     1/1     Running   0              20h   10.52.2.240   saclab-harvester-3   <none>           <none>
backing-image-manager-5ae6-ce9a                     1/1     Running   0              24h   10.52.0.134   saclab-harvester-2   <none>           <none>
backing-image-manager-5ae6-f3ff                     1/1     Running   0              20h   10.52.1.79    saclab-harvester-1   <none>           <none>
csi-attacher-b9799988f-4xjcp                        1/1     Running   1 (20h ago)    21h   10.52.0.205   saclab-harvester-2   <none>           <none>
csi-attacher-b9799988f-564sf                        1/1     Running   0              21h   10.52.0.204   saclab-harvester-2   <none>           <none>
csi-attacher-b9799988f-qqtlk                        1/1     Running   0              20h   10.52.1.95    saclab-harvester-1   <none>           <none>
csi-provisioner-86b8584475-2tpcw                    1/1     Running   0              21h   10.52.0.208   saclab-harvester-2   <none>           <none>
csi-provisioner-86b8584475-l5q4f                    1/1     Running   0              20h   10.52.1.105   saclab-harvester-1   <none>           <none>
csi-provisioner-86b8584475-rtvz2                    1/1     Running   0              21h   10.52.0.210   saclab-harvester-2   <none>           <none>
csi-resizer-7d8b96d567-bbcz5                        1/1     Running   0              20h   10.52.1.100   saclab-harvester-1   <none>           <none>
csi-resizer-7d8b96d567-k72vp                        1/1     Running   0              21h   10.52.0.207   saclab-harvester-2   <none>           <none>
csi-resizer-7d8b96d567-pjd95                        1/1     Running   0              20h   10.52.1.106   saclab-harvester-1   <none>           <none>
csi-snapshotter-6495bc948d-fhl6v                    1/1     Running   2 (20h ago)    21h   10.52.0.203   saclab-harvester-2   <none>           <none>
csi-snapshotter-6495bc948d-s9879                    1/1     Running   0              20h   10.52.1.120   saclab-harvester-1   <none>           <none>
csi-snapshotter-6495bc948d-x46w7                    1/1     Running   0              20h   10.52.1.121   saclab-harvester-1   <none>           <none>
engine-image-ei-51cc7b9c-ngkr6                      1/1     Running   4 (20h ago)    39d   10.52.1.69    saclab-harvester-1   <none>           <none>
engine-image-ei-51cc7b9c-z22cm                      1/1     Running   2 (20h ago)    82d   10.52.2.237   saclab-harvester-3   <none>           <none>
engine-image-ei-51cc7b9c-zbzls                      1/1     Running   2 (25h ago)    83d   10.52.0.128   saclab-harvester-2   <none>           <none>
instance-manager-079bddaa8d9cf91219928a266fcf5cc6   1/1     Running   0              24h   10.52.0.133   saclab-harvester-2   <none>           <none>
instance-manager-38959b8b7aa74f877f9b5e25352cc4ad   1/1     Running   0              20h   10.52.2.241   saclab-harvester-3   <none>           <none>
instance-manager-55179165c67e7fedbca8aac79f56b50c   1/1     Running   0              20h   10.52.1.80    saclab-harvester-1   <none>           <none>
longhorn-csi-plugin-5qp7z                           3/3     Running   12 (20h ago)   39d   10.52.1.72    saclab-harvester-1   <none>           <none>
longhorn-csi-plugin-l2sxh                           3/3     Running   9 (20h ago)    82d   10.52.2.238   saclab-harvester-3   <none>           <none>
longhorn-csi-plugin-v5qrm                           3/3     Running   7 (25h ago)    83d   10.52.0.122   saclab-harvester-2   <none>           <none>
longhorn-driver-deployer-d5cfcbcfc-5n6pk            1/1     Running   0              20h   10.52.1.110   saclab-harvester-1   <none>           <none>
longhorn-loop-device-cleaner-6qm2x                  1/1     Running   3 (20h ago)    82d   10.8.129.50   saclab-harvester-3   <none>           <none>
longhorn-loop-device-cleaner-6xvt8                  1/1     Running   4 (20h ago)    39d   10.8.129.48   saclab-harvester-1   <none>           <none>
longhorn-loop-device-cleaner-dpwdw                  1/1     Running   2 (25h ago)    83d   10.8.129.49   saclab-harvester-2   <none>           <none>
longhorn-manager-7mf2t                              2/2     Running   4 (25h ago)    83d   10.52.0.131   saclab-harvester-2   <none>           <none>
longhorn-manager-dn45w                              2/2     Running   8 (20h ago)    39d   10.52.1.77    saclab-harvester-1   <none>           <none>
longhorn-manager-fs4lv                              2/2     Running   5 (20h ago)    82d   10.52.2.239   saclab-harvester-3   <none>           <none>
longhorn-ui-6884f7f499-8nr5t                        1/1     Running   0              21h   10.52.0.202   saclab-harvester-2   <none>           <none>
longhorn-ui-6884f7f499-f5gpb                        1/1     Running   0              21h   10.52.1.92    saclab-harvester-1   <none>           <none>
```
Huh. That log error gauravm posted might actually be an issue.
Oh wait, never mind. That pod's address is the address it's trying to get to.
```
backing-image-manager-5ae6-f3ff:/ # tracepath 10.52.1.79
 1:  backing-image-manager-5ae6-f3ff                       0.045ms reached
     Resume: pmtu 65535 hops 1 back 1
```
m
The second API call for uploading can't reach the Harvester deployment API server. For example, consider the VM image image-ccgl8, which uses the storage class longhorn-image-ccgl8, resulting in this BackingImage resource:
```
apiVersion: longhorn.io/v1beta2
kind: BackingImage
metadata:
  name: vmi-0cf8dbd1-fb38-4a5f-ab94-9fa63c96707b
  namespace: longhorn-system
spec:
  diskFileSpecMap:
    6569f723-c441-4a23-ac1d-75359ccc5dc4:
      evictionRequested: false
  disks: {}
  minNumberOfCopies: 3
  sourceParameters: {}
  sourceType: upload
status:
  diskFileStatusMap:
    6569f723-c441-4a23-ac1d-75359ccc5dc4:
      lastStateTransitionTime: "2025-04-29T16:26:18Z"
      message: timeout waiting for the datasource file processing begin
      progress: 0
      state: failed-and-cleanup
```
The message "timeout waiting for the datasource file processing begin" indicates that Longhorn waited up to 60 seconds for the upload request, but it never arrived. This behavior is controlled by the logic in this part of the Longhorn Backing Image Manager code. This issue affects all images with sourceType: upload. As far as I understand, it's similar to the known limitations described in the Harvester documentation:
• HTTP 413 error in Rancher multi-cluster management
• Prolonged uploading of large images
Both cases stem from proxy behavior, which is outside of Harvester's direct control.
🙌 1
👀 1
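For anyone hitting the same failure, one way to watch the state Longhorn reports is to query the BackingImage status directly; the resource name below comes from the example above and will differ per image:

```
# Shows the per-disk upload state for the backing image, e.g.
# "failed-and-cleanup" with the datasource timeout message.
kubectl -n longhorn-system get backingimage \
  vmi-0cf8dbd1-fb38-4a5f-ab94-9fa63c96707b \
  -o jsonpath='{.status.diskFileStatusMap}'
```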
a
I see! I can confirm that uploading it via the Harvester UI worked as expected. I also tinkered with my Rancher cluster and changed the ingress proxy-body-size/proxy-request-buffering settings to make the multi-cluster management side of it work. Thank you for the explanation and documentation! I will make sure to specify next time whether I'm doing things via MCM/Rancher or the Harvester UI.
👍 2
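A sketch of the Rancher-side workaround described above, assuming an nginx ingress controller in front of Rancher; the ingress name and namespace here are common defaults and may differ per install:

```
# Lift the request-body size limit and disable request buffering on the
# Rancher ingress so large uploads proxied through Rancher MCM are not
# rejected with HTTP 413 or stalled while the proxy buffers the body.
kubectl -n cattle-system annotate ingress rancher \
  nginx.ingress.kubernetes.io/proxy-body-size=0 \
  nginx.ingress.kubernetes.io/proxy-request-buffering=off \
  --overwrite
```

Note that proxy-body-size=0 disables the size check entirely; a large finite value such as 200g also works.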