This message was deleted Rancher Users #harvester

# harvester

adamant-kite-43734

01/08/2024, 9:46 AM

This message was deleted.

fast-plumber-26155

01/08/2024, 9:48 AM

I don't understand the WaitApplied state:

fast-plumber-26155

01/08/2024, 9:48 AM

Looks applied to me?

fast-plumber-26155

01/08/2024, 9:51 AM

Is there any trick to get this rolling? Perhaps manual helm rollback?

fast-plumber-26155

01/08/2024, 6:42 PM

@average-energy-22726 Can you help here? 🙂 perhaps.. would be much appreciated

average-energy-22726

01/08/2024, 6:48 PM

Is the current upgrade stuck at "Upgrading System Service" on that phase? https://docs.harvesterhci.io/v1.2/upgrade/troubleshooting#phase-3-upgrade-system-services This also sort of looks similar to: https://docs.harvesterhci.io/v1.2/upgrade/v1-1-2-to-v1-2-0/#10-an-upgrade-is-stuck-in-the-upgrading-sy[…]ificate-without-ip-san-error-in-fleet-agent

fast-plumber-26155

01/08/2024, 6:50 PM

Is the current upgrade stuck at "Upgrading System Service" on that phase?

Yes. The apply-manifests job just continuously logs that it is waiting for manifests to be applied, which is what led me to look at bundles.
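For anyone following along, a minimal sketch of how bundles that have not reached their desired state could be listed. The `.status.summary.ready` and `.status.summary.desiredReady` field names are assumed from the Fleet Bundle CRD and may differ between Fleet versions, and `not_ready` is a hypothetical helper, not part of Harvester or Fleet.

```shell
# List Fleet bundles with ready vs. desired counts (requires cluster access).
# ASSUMPTION: the .status.summary field names below match your Fleet version.
list_bundles() {
  kubectl get bundles.fleet.cattle.io -A --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,READY:.status.summary.ready,DESIRED:.status.summary.desiredReady'
}

# Hypothetical helper: print rows whose READY count differs from DESIRED.
not_ready() {
  awk '$3 != $4 { print $1 "/" $2 " ready=" $3 " desired=" $4 }'
}

# Usage: list_bundles | not_ready
```

Any bundle printed by this is a candidate for a closer look with `kubectl describe`.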

fast-plumber-26155

01/08/2024, 6:51 PM

Rancher is accessed through DNS name, not ip. There are no x509-related errors in the fleet-agent log. So I don't think that's it.

average-energy-22726

01/08/2024, 7:09 PM

Oh, so there are no x509 errors or "failed to register agent" messages present in the fleet-agent pod? Are there some other noticeable errors present in the fleet-agent pod? ( https://github.com/harvester/harvester/issues/4519#issuecomment-1727132383 ) None of the bundles' statuses seem to have the "ErrApplied" state. For the fleet-agent-local, what is the full status message on the bundle? Does the controller echo out something like:

time="2023-10-06T18:57:52Z" level=error msg="error syncing 'fleet-local/local': handler import-cluster: missing apiServerURL in fleet config for cluster auto registration, requeuing"

With something like:

kubectl logs deployments/fleet-controller -n cattle-fleet-system --follow
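Since the controller log is mostly info-level noise, a small filter can help surface the lines worth reading. This is only a sketch: `filter_errors` is a hypothetical helper, and it assumes the logrus-style `level=...` format seen in the log line quoted above.

```shell
# Keep only error- and fatal-level lines from logrus-style output on stdin.
# "|| true" keeps the pipeline from failing when nothing matches.
filter_errors() {
  grep -E 'level=(error|fatal)' || true
}

# Usage (requires cluster access):
# kubectl logs deployments/fleet-controller -n cattle-fleet-system --tail=500 | filter_errors
```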

fast-plumber-26155

01/08/2024, 9:42 PM

fleet-controller logs:

time="2024-01-07T22:31:52Z" level=info msg="While calculating status.ResourceKey, error running helm template for bundle mcc-local-managed-system-upgrade-controller with target options from : chart requires kubeVersion: >= 1.23.0-0 which is incompatible with Kubernetes v1.20.0"

that's strange... but maybe snafu... More interesting is probably:

time="2024-01-08T10:32:06Z" level=error msg="error syncing 'fleet-local/import-token-local': handler cluster-group-token: failed to delete fleet-local/import-token-local-5c6d15ce-9d17-4956-805c-42054f4de721-to-role rbac.authorization.k8s.io/v1, Kind=RoleBinding for cluster-group-token fleet-local/import-token-local: rolebindings.rbac.authorization.k8s.io \"import-token-local-5c6d15ce-9d17-4956-805c-42054f4de721-to-role\" not found, requeuing"

fast-plumber-26155

01/08/2024, 9:46 PM

Are there some other noticeable errors present in the fleet-agent pod

Oh and the fleet-agent shows dns errors... Didn't see those before

average-energy-22726

01/08/2024, 9:48 PM

When you set up the Harvester cluster, did all your nodes know about your DNS server? https://docs.harvesterhci.io/v1.2/install/harvester-configuration#osdns_nameservers https://docs.harvesterhci.io/v1.2/install/update-harvester-configuration#dns-servers

fast-plumber-26155

01/08/2024, 9:59 PM

it was coredns on harvester misbehaving.. Because it got i/o timeout on upstream forwards.. Because canal was sick.. But.... I don't think that was the thing holding back the update, since it was "just" a dead watch on bundles. I think I need to dig some more into these logs.

fast-plumber-26155

01/08/2024, 10:58 PM

How can I recreate bundles or force resync of managed charts?

fast-plumber-26155

01/09/2024, 10:13 AM

I'm kinda tired of messing around with failing updates 🙂 Considering just reinstalling Harvester 1.2.1 from scratch.

average-energy-22726

01/17/2024, 12:18 AM

There has been a known issue of an mcc chart being in an "ErrApplied" state "after upgrade": https://docs.harvesterhci.io/v1.2/upgrade/v1-1-to-v1-1-2/#4-after-an-upgrade-a-fleet-bundles-status-i[…]ration-installupgraderollback-is-in-progress Where helm rollback is utilized as a workaround: https://github.com/harvester/harvester/issues/3616#issuecomment-1489892688 You could open up an issue and upload your support bundle 😄 cc: @dry-dress-24460
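A rough sketch of what the helm rollback workaround might look like. The release name, namespace, and revision are placeholders that have to be looked up on the affected cluster; `rollback_release` is a hypothetical wrapper that only prints the commands unless `RUN=1` is set, so they can be reviewed first. The linked issue comment remains the authoritative description of the workaround.

```shell
# Print (or, with RUN=1, execute) the Helm commands for rolling a release back.
# ASSUMPTION: release, namespace and revision are supplied by you; inspect the
# "helm history" output to pick the last successfully deployed revision.
rollback_release() {
  release=$1; namespace=$2; revision=$3
  for cmd in "helm history $release -n $namespace" \
             "helm rollback $release $revision -n $namespace"; do
    if [ "${RUN:-0}" = 1 ]; then $cmd; else echo "$cmd"; fi
  done
}

# Usage (dry run): rollback_release <release> <namespace> <revision>
```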

average-energy-22726

02/07/2024, 12:21 AM

@fast-plumber-26155 I was just curious -> have you been able to upgrade successfully 😄? I did just want to mention that you could open up an issue if some errors are still present.
