# harvester
a
r
```
error: deployment "harvester-cluster-repo" exceeded its progress deadline
Waiting for harvester-cluster-repo deployment ready...
Back-off pulling image "rancher/harvester-cluster-repo:v1.1.2"
```
a
If your cluster has limited CPU, memory, or storage resources, it is possible that the upgrade repo VM did not start correctly, which caused the above error. You can check its status with `kubectl get vm -A` and `kubectl get vmi -A`.
r
```
harvester-system   upgrade-repo-hvst-upgrade-6h7l4   5h47m   Running   True
harvester-system   upgrade-repo-hvst-upgrade-6h7l4   5h47m   Running   10.52.0.205     canary     True
```
Resources aren’t the constraint. The deployment, however, has this event:
```
Events:
  Type    Reason   Age                       From     Message
  ----    ------   ----                      ----     -------
  Normal  BackOff  3m13s (x1483 over 5h38m)  kubelet  Back-off pulling image "rancher/harvester-cluster-repo:v1.1.2"
```
a
Could you check with `crictl image ls | grep "cluster-repo"`? I also suspect it may hit https://github.com/harvester/harvester/issues/4210: the `harvester-cluster-repo` image was garbage collected due to disk space usage. cc @prehistoric-balloon-31801 @red-king-19196, please double-confirm, thanks.
r
```
canary:/home/rancher # crictl image ls | grep "cluster-repo"
docker.io/rancher/harvester-cluster-repo                                    v1.1.1                         172a09728709f       183MB
```
There’s no v1.1.2 locally, and the tag does not exist on Docker Hub. It is, however, the tag being pulled: `Back-off pulling image "rancher/harvester-cluster-repo:v1.1.2"`
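To make the mismatch explicit, here is a minimal sketch (the `has_tag` helper is hypothetical, not a Harvester or crictl feature) that checks which tags appear in the captured listing:

```shell
# Hypothetical helper: does the given tag appear in a crictl image listing?
# crictl's columns are: IMAGE TAG IMAGE-ID SIZE, so the tag is field 2.
has_tag() {
  printf '%s\n' "$1" | awk -v tag="$2" '$2 == tag { found = 1 } END { exit !found }'
}

# The listing captured above: only v1.1.1 is present.
listing='docker.io/rancher/harvester-cluster-repo  v1.1.1  172a09728709f  183MB'

has_tag "$listing" v1.1.1 && echo "v1.1.1 present"
has_tag "$listing" v1.1.2 || echo "v1.1.2 missing"
```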
a
The cluster-repo v1.1.2 image was never published to Docker Hub. When you select v1.1.2 to upgrade, the upgrade downloads the Harvester v1.1.2 ISO, extracts the `cluster-repo` image from the ISO into the containerd image store, and then uses it to start the pod. Since the `crictl image ls` output shows no v1.1.2 cluster-repo image, it is possible the image was garbage collected after it was added.
In that case, you can try to build the `cluster-repo` v1.1.2 image locally and add it to the node, per: https://github.com/harvester/harvester/issues/4537#issuecomment-1730230548
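As a rough sketch of the final import step (the tarball filename here is an assumption, not something from that issue; `ctr -n k8s.io images import` is the standard containerd command for loading an image archive into the namespace the kubelet uses), guarded so it only runs when the pieces are actually present:

```shell
# Assumed name of the locally built image archive (e.g. from `docker save`).
TARBALL=harvester-cluster-repo-v1.1.2.tar

if [ -f "$TARBALL" ] && command -v ctr >/dev/null; then
  # Import into containerd's k8s.io namespace so the kubelet can see the image.
  ctr -n k8s.io images import "$TARBALL"
else
  echo "tarball or ctr not available; nothing imported"
fi
```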
r
Yes, it’s likely the cluster-repo image was garbage collected due to insufficient disk space, specifically in the container image store. You can look at the node-related events for any recent disk-pressure events with `kubectl describe node <node-name>`, or check the kubelet’s log for image GC log lines.
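For example (a sketch: the sample line below is illustrative of the kubelet’s image GC messages about disk thresholds, not a verbatim capture from this cluster), you could grep the kubelet log like this:

```shell
# Illustrative kubelet image-GC log line (not captured from this cluster).
kubelet_log='kubelet: Disk usage on image filesystem is at 92% which is over the high threshold (85%). Trying to free bytes down to the low threshold'

if printf '%s\n' "$kubelet_log" | grep -Eq 'image filesystem.*high threshold'; then
  echo "image GC likely triggered by disk pressure"
fi
```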
r
Right, I finally got around to trying this and I did get the repo pod to work. I’m running into a different issue now:
```
hvst-upgrade-6h7l4-apply-manifests-fq8fz deployment "harvester-cluster-repo" successfully rolled out
hvst-upgrade-6h7l4-apply-manifests-fq8fz clusterrepo.catalog.cattle.io/harvester-charts patched
hvst-upgrade-6h7l4-apply-manifests-fq8fz Upgrading Harvester
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester-crd patched (no change)
hvst-upgrade-6h7l4-apply-manifests-fq8fz + CHART_NAME=harvester
hvst-upgrade-6h7l4-apply-manifests-fq8fz + CHART_MANIFEST=harvester.yaml
hvst-upgrade-6h7l4-apply-manifests-fq8fz + case $CHART_NAME in
hvst-upgrade-6h7l4-apply-manifests-fq8fz + patch_harvester_ignore_default_sc
hvst-upgrade-6h7l4-apply-manifests-fq8fz + yq e '.spec.diff.comparePatches += [{"apiVersion": "storage.k8s.io/v1", "kind": "StorageClass", "name": "harvester-longhorn", "jsonPointers":["/metadata/annotations"]}]' harvester.yaml -i
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester configured
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester patched
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester-crd patched
hvst-upgrade-6h7l4-apply-manifests-fq8fz Waiting for ManagedChart fleet-local/harvester from generation 2
hvst-upgrade-6h7l4-apply-manifests-fq8fz Target version: 1.1.2, Target state: ready
hvst-upgrade-6h7l4-apply-manifests-fq8fz Current version: 1.1.2, Current state: null, Current generation: 2
hvst-upgrade-6h7l4-apply-manifests-fq8fz Sleep for 5 seconds to retry
hvst-upgrade-6h7l4-apply-manifests-fq8fz Current version: 1.1.2, Current state: null, Current generation: 2
hvst-upgrade-6h7l4-apply-manifests-fq8fz Sleep for 5 seconds to retry
hvst-upgrade-6h7l4-apply-manifests-fq8fz Current version: 1.1.2, Current state: null, Current generation: 2
```
The last couple of lines then loop infinitely. To me it seems like it can’t read the ManagedChart state on a retry for some reason?