# harvester
a
r
```
error: deployment "harvester-cluster-repo" exceeded its progress deadline
Waiting for harvester-cluster-repo deployment ready...
Back-off pulling image "rancher/harvester-cluster-repo:v1.1.2"
```
a
If your cluster has limited CPU, memory, or storage resources, it is possible that the upgrade repo VM did not start correctly, which caused the above error. You can check its status with `kubectl get vm -A` and `kubectl get vmi -A`.
r
```
harvester-system   upgrade-repo-hvst-upgrade-6h7l4   5h47m   Running   True
harvester-system   upgrade-repo-hvst-upgrade-6h7l4   5h47m   Running   10.52.0.205     canary     True
```
Resources aren’t the constraint. The deployment, however, has this event:
```
Events:
  Type    Reason   Age                       From     Message
  ----    ------   ----                      ----     -------
  Normal  BackOff  3m13s (x1483 over 5h38m)  kubelet  Back-off pulling image "rancher/harvester-cluster-repo:v1.1.2"
```
a
Could you check with `crictl image ls | grep "cluster-repo"`? I also suspect it may hit https://github.com/harvester/harvester/issues/4210: the `harvester-cluster-repo` image was garbage collected due to disk space usage. cc @prehistoric-balloon-31801 @red-king-19196, please double-confirm, thanks.
r
```
canary:/home/rancher # crictl image ls | grep "cluster-repo"
docker.io/rancher/harvester-cluster-repo                                    v1.1.1                         172a09728709f       183MB
```
There’s no v1.1.2 locally, and the tag does not exist on Docker Hub. It is, however, the tag being pulled: `Back-off pulling image "rancher/harvester-cluster-repo:v1.1.2"`
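To make the mismatch explicit, here is a minimal sketch (the `has_tag` helper is hypothetical, not a Harvester or crictl feature) that checks which tags appear in the captured listing:

```shell
# Hypothetical helper: does the given tag appear in a crictl image listing?
# crictl's columns are: IMAGE TAG IMAGE-ID SIZE, so the tag is field 2.
has_tag() {
  printf '%s\n' "$1" | awk -v tag="$2" '$2 == tag { found = 1 } END { exit !found }'
}

# The listing captured above: only v1.1.1 is present.
listing='docker.io/rancher/harvester-cluster-repo  v1.1.1  172a09728709f  183MB'

has_tag "$listing" v1.1.1 && echo "v1.1.1 present"
has_tag "$listing" v1.1.2 || echo "v1.1.2 missing"
```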
a
The cluster-repo v1.1.2 image was never published to Docker Hub. When you select v1.1.2 to upgrade, the upgrade downloads the Harvester v1.1.2 ISO, extracts the `cluster-repo` image from the ISO into the containerd image store, and then uses it to start the pod. Since the `crictl image ls` output shows no v1.1.2 cluster-repo image, it is possible the image was garbage collected after it was added.
In that case, you can try to build the `cluster-repo` v1.1.2 image locally and add it to the node, per: https://github.com/harvester/harvester/issues/4537#issuecomment-1730230548
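As a rough sketch of the final import step (the tarball filename here is an assumption, not something from that issue; `ctr -n k8s.io images import` is the standard containerd command for loading an image archive into the namespace the kubelet uses), guarded so it only runs when the pieces are actually present:

```shell
# Assumed name of the locally built image archive (e.g. from `docker save`).
TARBALL=harvester-cluster-repo-v1.1.2.tar

if [ -f "$TARBALL" ] && command -v ctr >/dev/null; then
  # Import into containerd's k8s.io namespace so the kubelet can see the image.
  ctr -n k8s.io images import "$TARBALL"
else
  echo "tarball or ctr not available; nothing imported"
fi
```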
r
Yes, it’s likely the cluster-repo image was garbage collected due to insufficient disk space, specifically in the container image store. You can look at the node-related events for any recent disk-pressure events with `kubectl describe node <node-name>`, or check the kubelet’s log for image GC log lines.
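For example (a sketch: the sample line below is illustrative of the kubelet’s image GC messages about disk thresholds, not a verbatim capture from this cluster), you could grep the kubelet log like this:

```shell
# Illustrative kubelet image-GC log line (not captured from this cluster).
kubelet_log='kubelet: Disk usage on image filesystem is at 92% which is over the high threshold (85%). Trying to free bytes down to the low threshold'

if printf '%s\n' "$kubelet_log" | grep -Eq 'image filesystem.*high threshold'; then
  echo "image GC likely triggered by disk pressure"
fi
```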
r
Right, I finally got around to trying this and I did get the repo pod to work. I’m running into a different issue now:
```
hvst-upgrade-6h7l4-apply-manifests-fq8fz deployment "harvester-cluster-repo" successfully rolled out
hvst-upgrade-6h7l4-apply-manifests-fq8fz clusterrepo.catalog.cattle.io/harvester-charts patched
hvst-upgrade-6h7l4-apply-manifests-fq8fz Upgrading Harvester
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester-crd patched (no change)
hvst-upgrade-6h7l4-apply-manifests-fq8fz + CHART_NAME=harvester
hvst-upgrade-6h7l4-apply-manifests-fq8fz + CHART_MANIFEST=harvester.yaml
hvst-upgrade-6h7l4-apply-manifests-fq8fz + case $CHART_NAME in
hvst-upgrade-6h7l4-apply-manifests-fq8fz + patch_harvester_ignore_default_sc
hvst-upgrade-6h7l4-apply-manifests-fq8fz + yq e '.spec.diff.comparePatches += [{"apiVersion": "storage.k8s.io/v1", "kind": "StorageClass", "name": "harvester-longhorn", "jsonPointers":["/metadata/annotations"]}]' harvester.yaml -i
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester configured
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester patched
hvst-upgrade-6h7l4-apply-manifests-fq8fz managedchart.management.cattle.io/harvester-crd patched
hvst-upgrade-6h7l4-apply-manifests-fq8fz Waiting for ManagedChart fleet-local/harvester from generation 2
hvst-upgrade-6h7l4-apply-manifests-fq8fz Target version: 1.1.2, Target state: ready
hvst-upgrade-6h7l4-apply-manifests-fq8fz Current version: 1.1.2, Current state: null, Current generation: 2
hvst-upgrade-6h7l4-apply-manifests-fq8fz Sleep for 5 seconds to retry
hvst-upgrade-6h7l4-apply-manifests-fq8fz Current version: 1.1.2, Current state: null, Current generation: 2
hvst-upgrade-6h7l4-apply-manifests-fq8fz Sleep for 5 seconds to retry
hvst-upgrade-6h7l4-apply-manifests-fq8fz Current version: 1.1.2, Current state: null, Current generation: 2
```
The last couple of lines then loop infinitely. To me it seems like it can’t read the ManagedChart state on a retry for some reason?