# harvester
a
This message was deleted.
p
A support bundle will help!
a
p
Did you delete the upgrade? There is no upgrade resource in the bundle. What was the last state?
a
@prehistoric-balloon-31801 yes, I initially hit the bug where it stalled out at 50%. I found a few similar issues. It was suggested elsewhere (I forget where now) to delete the bundle and try again. But when I deleted it, the UI showed it was at 1.2.2, so I figured it was solved, until I tried to activate the monitoring plugins. It's up and running right now, with guest VMs running, though I can't enable addons. I attempted a rollback a couple of days ago, hoping I could restart the upgrade, but that didn't seem to do anything.
p
This might be off-topic, but is this a cluster of low-spec machines? I saw a lot of complaints in the etcd log. The upgrade has been deleted, so it's hard to tell where it got stuck. To me, it looks like the system software has been upgraded, though perhaps not completely, and the nodes/RKE2 are still waiting for an upgrade.
I'd say there's probably no harm in triggering an upgrade again and seeing how it goes.
a
It’s very low spec: three 4-core mini PCs. It’s just for my own education.
The problem is the UI says it’s 1.2.2 and the upgrade button is gone.
p
Can you try
kubectl create -f https://releases.rancher.com/harvester/v1.2.2/version.yaml
and see if the button comes up?
a
it did, I'll try that and see what happens
So it shows 1.3.1 as the available version, but when I select it I get:
admission webhook "validator.harvesterhci.io" denied the request: managed chart harvester is not ready, please wait for it to be ready
kubectl get bundledeployment -A
NAMESPACE                                NAME                         DEPLOYED   MONITORED   STATUS
cluster-fleet-local-local-1a3d67d0a899   fleet-agent-local            True       True        
cluster-fleet-local-local-1a3d67d0a899   local-managed-system-agent   True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-harvester                True       True        deployment.apps harvester-system/harvester-webhook [progressing] Pending termination: 1
cluster-fleet-local-local-1a3d67d0a899   mcc-harvester-crd            True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-rancher-logging-crd      True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-rancher-monitoring-crd   True       True
p
There are paused managed charts from the previous upgrade. You can unpause them first to let Fleet reconcile, e.g.
kubectl edit managedchart harvester -n fleet-local
and set the field spec.paused to false:
paused: false
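The same change can probably be made without opening an editor. The following is a minimal sketch, assuming `kubectl patch` with a merge patch is acceptable for this resource; the function name `unpause_chart` and the overridable `KUBECTL` variable are illustrative and not part of Harvester.

```shell
# Sketch only: set spec.paused=false on one ManagedChart in fleet-local.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command
# instead of running it against the cluster.
KUBECTL="${KUBECTL:-kubectl}"

unpause_chart() {
  # $1 = managed chart name, e.g. harvester
  "$KUBECTL" patch managedchart "$1" -n fleet-local \
    --type=merge -p '{"spec":{"paused":false}}'
}
```

For example, `unpause_chart harvester` would apply the edit described above; running it with `KUBECTL=echo` first only prints the command.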
a
That fixed that error, and now it's changed to a different one, lol:
admission webhook "validator.harvesterhci.io" denied the request: managed chart harvester-crd is not ready, please wait for it to be ready
kubectl get bundledeployment -A
NAMESPACE                                NAME                         DEPLOYED   MONITORED   STATUS
cluster-fleet-local-local-1a3d67d0a899   fleet-agent-local            True       True        
cluster-fleet-local-local-1a3d67d0a899   local-managed-system-agent   True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-harvester                True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-harvester-crd            True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-rancher-logging-crd      True       True        
cluster-fleet-local-local-1a3d67d0a899   mcc-rancher-monitoring-crd   True       True
p
Hi Bryan, sorry for the late reply. Can you paste a new support bundle? You can also check the bundles rather than the bundle deployments:
kubectl get bundles -A
a
kubectl get bundles -A
NAMESPACE     NAME                         BUNDLEDEPLOYMENTS-READY   STATUS
fleet-local   fleet-agent-local            1/1                       
fleet-local   local-managed-system-agent   1/1                       
fleet-local   mcc-harvester                1/1                       
fleet-local   mcc-harvester-crd            0/1                       OutOfSync(1) [Cluster fleet-local/local]
fleet-local   mcc-rancher-logging-crd      0/1                       OutOfSync(1) [Cluster fleet-local/local]
fleet-local   mcc-rancher-monitoring-crd   0/1                       OutOfSync(1) [Cluster fleet-local/local]
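To pick out only the bundles that are not fully ready from output like the above, a small filter can help. This is a sketch that assumes the default four-column table layout shown here; `not_ready_bundles` is a hypothetical helper name.

```shell
# Sketch only: read `kubectl get bundles -A` table output on stdin and
# print the NAME of every bundle whose BUNDLEDEPLOYMENTS-READY column
# (third field, "ready/total") has ready != total.
not_ready_bundles() {
  awk 'NR > 1 { split($3, r, "/"); if (r[1] != r[2]) print $2 }'
}
```

Usage would be `kubectl get bundles -A | not_ready_bundles`.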
p
Hi Bryan, please repeat the unpause operation on the remaining charts:
harvester-crd
rancher-logging-crd
rancher-monitoring-crd
a
ok current status:
kubectl get bundles -A
NAMESPACE     NAME                                         BUNDLEDEPLOYMENTS-READY   STATUS
fleet-local   fleet-agent-local                            1/1                       
fleet-local   local-managed-system-agent                   1/1                       
fleet-local   mcc-harvester                                1/1                       
fleet-local   mcc-harvester-crd                            1/1                       
fleet-local   mcc-hvst-upgrade-dqqhz-upgradelog-operator   1/1                       
fleet-local   mcc-rancher-logging-crd                      0/1                       OutOfSync(1) [Cluster fleet-local/local]
fleet-local   mcc-rancher-monitoring-crd                   0/1                       OutOfSync(1) [Cluster fleet-local/local]
when I look at
kubectl edit managedchart rancher-logging-crd -n fleet-local
I see this
lastUpdateTime: "2024-06-05T16:45:44Z"
    message: no chart version found for rancher-logging-crd-102.0.0+up3.17.10
p
@ancient-pizza-13099 can you help Bryan with this? I think the previous upgrade somehow failed in the middle of apply_manifest. cluster-repo, rancher, and harvester are already upgraded. We need to check where the manifest apply stopped and move the logging/monitoring charts to the correct versions. Then he can re-initiate the upgrade.
a
Ya I was trying to figure out if there was a way to manually pull the right image versions down and try again
p
@alert-london-41705 can you edit the rancher-monitoring-crd managedchart and set its spec.version to 103.0.3+up45.31.1? Then do the same for the rancher-logging-crd managedchart, setting its spec.version to 103.0.0+up3.17.10.
Similar commands:
kubectl edit managedchart rancher-monitoring-crd -n fleet-local
kubectl edit managedchart rancher-logging-crd -n fleet-local
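As a non-interactive alternative to the two edits, the version field could be set with a merge patch. This is a sketch under the assumption that patching spec.version has the same effect as editing it by hand; `set_chart_version` and the `KUBECTL` override are hypothetical names used for illustration.

```shell
# Sketch only: set spec.version on a ManagedChart in fleet-local.
# Override KUBECTL (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

set_chart_version() {
  # $1 = managed chart name, $2 = chart version string
  "$KUBECTL" patch managedchart "$1" -n fleet-local \
    --type=merge -p "{\"spec\":{\"version\":\"$2\"}}"
}
```

With the versions given above, the calls would be `set_chart_version rancher-monitoring-crd 103.0.3+up45.31.1` and `set_chart_version rancher-logging-crd 103.0.0+up3.17.10`.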
a
kubectl get bundles -A
NAMESPACE     NAME                                         BUNDLEDEPLOYMENTS-READY   STATUS
fleet-local   fleet-agent-local                            1/1                       
fleet-local   local-managed-system-agent                   1/1                       
fleet-local   mcc-harvester                                1/1                       
fleet-local   mcc-harvester-crd                            1/1                       
fleet-local   mcc-hvst-upgrade-dqqhz-upgradelog-operator   1/1                       
fleet-local   mcc-rancher-logging-crd                      1/1                       
fleet-local   mcc-rancher-monitoring-crd                   1/1
So that looks good now. The upgrade to 1.3.1 is still stuck on pending.
p
Hi Bryan, wasn't the previous failed upgrade from v1.2.1 to v1.2.2? The upgrade currently running in the system is to v1.3.1. It's not recommended to jump to v1.3.1 while the previous upgrade is unfinished, because Kubernetes would jump two minor versions. Can you try these steps:
• Delete the current upgrade: hvst-upgrade-dqqhz
• Ensure the plan hvst-upgrade-dqqhz-prepare is gone:
kubectl get plans hvst-upgrade-dqqhz-prepare -n cattle-system
# nothing should display
• patch some fields:
kubectl label -n fleet-local cluster.provisioning local "provisioning.cattle.io/management-cluster-name=local" --overwrite=true
kubectl patch -n fleet-local cluster.provisioning local --subresource=status --type=merge --patch '{"status":{"fleetWorkspaceName": "fleet-local"}}'
• Ensure SUC (system-upgrade-controller) is running; the deployment should be created after a while:
$ kubectl get deployment system-upgrade-controller -n cattle-system
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
system-upgrade-controller   1/1     1            1           14d
• start a 1.2.2 upgrade again
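The two "ensure" checks in the steps above could be combined into one helper that is run before re-triggering the upgrade. This is only a sketch: `ready_for_retry` and the `KUBECTL` override are hypothetical, and it assumes that `kubectl get` exits non-zero when the plan no longer exists and that `kubectl rollout status` succeeds once the deployment is available.

```shell
# Sketch only: succeed when the prepare plan is gone and the
# system-upgrade-controller deployment has rolled out.
KUBECTL="${KUBECTL:-kubectl}"

ready_for_retry() {
  # $1 = upgrade name, e.g. hvst-upgrade-dqqhz
  if "$KUBECTL" get plans "$1-prepare" -n cattle-system >/dev/null 2>&1; then
    echo "plan $1-prepare still exists"
    return 1
  fi
  if ! "$KUBECTL" rollout status deployment/system-upgrade-controller \
      -n cattle-system --timeout=10s >/dev/null 2>&1; then
    echo "system-upgrade-controller is not ready yet"
    return 1
  fi
  echo "ready to start the upgrade again"
}
```

Since the deployment is only created after a while, the check may need to be repeated, for instance `until ready_for_retry hvst-upgrade-dqqhz; do sleep 10; done`.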
a
Yeah, I actually wasn't trying to get to 1.3.1; that showed up after we did this step: https://rancher-users.slack.com/archives/C01GKHKAG0K/p1719542486858259?thread_ts=1719439845.734179&cid=C01GKHKAG0K I'm totally fine to just get to a stable 1.2.2. Trying the above right now.
seems like I might be back on my way...
kubectl get deployment system-upgrade-controller -n cattle-system
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
system-upgrade-controller   1/1     1            1           9s
The UI still shows it's at 1.2.2, and 1.3.1 is the only upgrade that's available... hrmm.
a
seems to have
says it's downloading the image now
Was able to successfully get through the 1.2.2 and the 1.3.1 upgrades today! The addons are all working again as well! Thanks so much @prehistoric-balloon-31801 🙏 🎆
p
Good to hear. Do you mind sharing an SB so I can check whether everything went well? Unfortunately, since the first upgrade attempt was deleted, I can't figure out from the SBs so far why it failed.
a