adamant-kite-43734
09/11/2023, 4:54 PMwitty-jelly-95845
09/11/2023, 5:19 PMwitty-jelly-95845
09/11/2023, 5:20 PMsticky-summer-13450
09/11/2023, 7:51 PMcurl <https://releases.rancher.com/harvester/v1.2.0/version.yaml> | kubectl apply --context=harvester-cluster -f -
Now I have everything crossed 🙂sticky-summer-13450
09/12/2023, 7:54 AM$ kubectl --context harvester003 -n harvester-system logs hvst-upgrade-6hp8q-apply-manifests-9j9m6 --tail=10
instance-manager-r pod count is not 1 on node harvester001, will retry...
instance-manager-r pod count is not 1 on node harvester001, will retry...
instance-manager-r pod count is not 1 on node harvester001, will retry...
instance-manager-r pod count is not 1 on node harvester001, will retry...
And it's true - there are two instance-manager-r
pods on that node - one 11 hours old running longhorn-instance-manager:v1.4.3
and the other 12 days old running longhorn-instance-manager:v1_20221003.
I suppose I could delete the old one - but I would like a little confidence that this would be the correct remedy.salmon-city-57654
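For reference, a minimal sketch of how one might list the duplicate pods and their images before deciding (the pod name in the second command is a placeholder):
$ kubectl -n longhorn-system get pods -o wide --field-selector spec.nodeName=harvester001 | grep instance-manager-r
$ kubectl -n longhorn-system get pod <old-instance-manager-r-pod> -o jsonpath='{.spec.containers[0].image}'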
09/12/2023, 9:08 AMsticky-summer-13450
09/12/2023, 9:19 AMsalmon-city-57654
09/12/2023, 9:22 AMsalmon-city-57654
09/12/2023, 10:37 AM$ kubectl get instancemanager instance-manager-r-1503169c -n longhorn-system -o yaml |yq -e ".status.instances" |grep name: > replica-list.txt
$ cat replica-list.txt |awk '{print $2}' |xargs -I {} kubectl get replicas {} -n longhorn-system
Error from server (NotFound): replicas.longhorn.io "pvc-0ca5a4f3-d641-4b31-b33d-96b925d9af04-r-b0367b94" not found
Error from server (NotFound): replicas.longhorn.io "pvc-0ca5a4f3-d641-4b31-b33d-96b925d9af04-r-e861fab9" not found
Error from server (NotFound): replicas.longhorn.io "pvc-3f9a22e4-df30-45fc-b4c7-baed0c4ff217-r-894e8723" not found
Error from server (NotFound): replicas.longhorn.io "pvc-3f9a22e4-df30-45fc-b4c7-baed0c4ff217-r-29032c1b" not found
Error from server (NotFound): replicas.longhorn.io "pvc-3f9a22e4-df30-45fc-b4c7-baed0c4ff217-r-b6ada661" not found
Error from server (NotFound): replicas.longhorn.io "pvc-3f9a22e4-df30-45fc-b4c7-baed0c4ff217-r-cb42c033" not found
Error from server (NotFound): replicas.longhorn.io "pvc-3f9a22e4-df30-45fc-b4c7-baed0c4ff217-r-eb83b435" not found
Error from server (NotFound): replicas.longhorn.io "pvc-67e7a314-7384-469f-9268-bdcd8728e526-r-eaeeb01b" not found
Error from server (NotFound): replicas.longhorn.io "pvc-67e7a314-7384-469f-9268-bdcd8728e526-r-f4fcc792" not found
Error from server (NotFound): replicas.longhorn.io "pvc-160d8d70-01d1-4a13-abd5-11cff2be6071-r-a2afb620" not found
Error from server (NotFound): replicas.longhorn.io "pvc-160d8d70-01d1-4a13-abd5-11cff2be6071-r-e056af48" not found
Error from server (NotFound): replicas.longhorn.io "pvc-a5b5fe4c-eca4-4c97-a3db-f9490980c044-r-a8d11e24" not found
Error from server (NotFound): replicas.longhorn.io "pvc-a5b5fe4c-eca4-4c97-a3db-f9490980c044-r-d66c99b6" not found
Error from server (NotFound): replicas.longhorn.io "pvc-a5b5fe4c-eca4-4c97-a3db-f9490980c044-r-eab71d39" not found
Error from server (NotFound): replicas.longhorn.io "pvc-b5885e18-cc31-4ee1-8c91-afe881e09930-r-df556704" not found
Error from server (NotFound): replicas.longhorn.io "pvc-c4dfa684-2e3a-496f-9396-0e137a8f85e7-r-f1dd92d2" not found
Error from server (NotFound): replicas.longhorn.io "pvc-c108c3a1-bf5c-4d93-bb2b-99f1db4cc11c-r-1e8b6fa3" not found
Error from server (NotFound): replicas.longhorn.io "pvc-d01443ee-14fa-42c0-8721-b08935d5eaae-r-36c82c20" not found
Error from server (NotFound): replicas.longhorn.io "pvc-d01443ee-14fa-42c0-8721-b08935d5eaae-r-79da0593" not found
But somehow, they all still exist on the instancemanager. That's why this im-r could not be deleted.
I checked all of the attached volumes, and they all look healthy. So you can directly remove this im-r to let the upgrade continue.sticky-summer-13450
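A quick sketch of how the volume health could be double-checked before removing the pod, assuming the standard Longhorn CRDs (status.robustness should read healthy):
$ kubectl -n longhorn-system get volumes.longhorn.io
$ kubectl -n longhorn-system get volumes.longhorn.io -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.robustness}{"\n"}{end}'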
09/12/2023, 11:15 AMinstance-manager-r
pod.salmon-city-57654
09/12/2023, 11:27 AMsticky-summer-13450
09/12/2023, 11:30 AMsticky-summer-13450
09/12/2023, 11:40 AM$ kubectl delete pod instance-manager-r-1503169c --context harvester003 -n longhorn-system
pod "instance-manager-r-1503169c" deleted
the hvst-upgrade apply-manifests job has moved on:
2023-09-12T12:28:34+01:00 instance-manager-r pod count is not 1 on node harvester001, will retry...
2023-09-12T12:28:39+01:00 instance-manager-r pod count is not 1 on node harvester001, will retry...
2023-09-12T12:28:45+01:00 instance-manager-r pod image is not longhornio/longhorn-instance-manager:v1.4.3, will retry...
2023-09-12T12:28:50+01:00 Checking instance-manager-r pod on node harvester001 OK.
2023-09-12T12:28:50+01:00 Checking instance-manager-r pod on node harvester002...
2023-09-12T12:28:51+01:00 Checking instance-manager-r pod on node harvester002 OK.
2023-09-12T12:28:51+01:00 Checking instance-manager-r pod on node harvester003...
2023-09-12T12:28:51+01:00 Checking instance-manager-r pod on node harvester003 OK.
2023-09-12T12:28:51+01:00 Upgrading Managedchart rancher-monitoring-crd to 102.0.0+up40.1.2
2023-09-12T12:28:54+01:00 managedchart.management.cattle.io/rancher-monitoring-crd patched
2023-09-12T12:28:55+01:00 managedchart.management.cattle.io/rancher-monitoring-crd patched
2023-09-12T12:28:55+01:00 Waiting for ManagedChart fleet-local/rancher-monitoring-crd from generation 15
2023-09-12T12:28:55+01:00 Target version: 102.0.0+up40.1.2, Target state: ready
2023-09-12T12:28:56+01:00 Current version: 102.0.0+up40.1.2, Current state: OutOfSync, Current generation: 15
2023-09-12T12:29:01+01:00 Sleep for 5 seconds to retry
2023-09-12T12:29:02+01:00 Current version: 102.0.0+up40.1.2, Current state: WaitApplied, Current generation: 17
2023-09-12T12:29:07+01:00 Sleep for 5 seconds to retry
2023-09-12T12:29:08+01:00 Current version: 102.0.0+up40.1.2, Current state: WaitApplied, Current generation: 17
2023-09-12T12:29:13+01:00 Sleep for 5 seconds to retry
but appears to be stuck again.salmon-city-57654
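For reference, the state the upgrade is waiting on can be inspected directly; both objects live in the fleet-local namespace, as seen later in this thread:
$ kubectl -n fleet-local get bundles
$ kubectl -n fleet-local get managedchart rancher-monitoring-crd -o yaml | grep -A10 'status:'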
09/12/2023, 11:46 AMprehistoric-balloon-31801
09/12/2023, 12:17 PMrancher-monitoring-crd
chart? Thanks:
helm history rancher-monitoring-crd -n cattle-monitoring-system
sticky-summer-13450
09/12/2023, 12:41 PM$ helm history rancher-monitoring-crd --kube-context harvester003 -n cattle-monitoring-system
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
1174 Sun Jun 26 07:17:18 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1175 Sun Jun 26 17:17:18 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1176 Sun Jun 26 17:17:33 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1177 Sun Jun 26 17:19:08 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1178 Sun Jun 26 17:19:24 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1179 Mon Jun 27 03:17:18 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1180 Mon Jun 27 03:17:34 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1181 Mon Jun 27 03:20:29 2022 superseded rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1182 Mon Jun 27 03:20:44 2022 deployed rancher-monitoring-crd-100.1.0+up19.0.3 Upgrade complete
1183 Mon Jun 27 04:03:08 2022 pending-upgrade rancher-monitoring-crd-100.1.0+up19.0.3 Preparing upgrade
prehistoric-balloon-31801
09/12/2023, 12:58 PMsticky-summer-13450
09/12/2023, 1:02 PMsticky-summer-13450
09/12/2023, 1:15 PMsticky-summer-13450
09/22/2023, 9:29 AMprehistoric-balloon-31801
09/22/2023, 9:30 AMprehistoric-balloon-31801
09/22/2023, 9:53 AMprehistoric-balloon-31801
09/22/2023, 9:54 AMsticky-summer-13450
09/22/2023, 9:54 AMprehistoric-balloon-31801
09/22/2023, 9:56 AMsticky-summer-13450
09/22/2023, 9:57 AMred-king-19196
09/25/2023, 11:41 AMW0922 09:01:43.872649 1 reflector.go:442] pkg/mod/github.com/rancher/client-go@v0.24.0-fleet1/tools/cache/reflector.go:167: watch of *v1alpha1.BundleDeployment ended with: an error on the server ("unable to decode an event from the watch stream: stream error: stream ID 51; INTERNAL_ERROR; received from peer") has prevented the request from succeeding
Since we don’t collect all of the secret objects on the users’ cluster with support-bundle-kit, could you help us check the content of the fleet-agent
secret?
kubectl -n cattle-fleet-local-system get secret fleet-agent -o jsonpath='{.data.kubeconfig}' | base64 -
There might be an inconsistency between the CA and the URL. Thank you for being with us!sticky-summer-13450
09/25/2023, 11:59 AM... | base64 -d -
to decode the data rather than encode it twice.
I think I can share that data - there is a token but I don't know how secure it needs to stay ...sticky-summer-13450
09/25/2023, 12:01 PM$ kubectl -n cattle-fleet-local-system --context=harvester003 get secret fleet-agent -o jsonpath='{.data.kubeconfig}' | base64 -d -
apiVersion: v1
clusters:
- cluster:
server: https://harvester-cluster.lan.lxiv.uk
name: cluster
contexts:
- context:
cluster: cluster
namespace: cluster-fleet-local-local-1a3d67d0a899
user: user
name: default
current-context: default
kind: Config
preferences: {}
users:
- name: user
user:
token: eyJhbGciOiJSUzI1NiIsImtpZCI6Im1ZeEFtYXppNnVjbzBoV3BxNFE0YmdKWHFQa1c4STVtVE5aeDdUNFplQ0UifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJjbHVzdGVyLWZsZWV0LWxvY2FsLWxvY2FsLTFhM2Q2N2QwYTg5OSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJyZXF1ZXN0LXptNTI0LWIzZWFmY2Q5LTIxNWItNDU1Zi04YjQ3LTFlN2Q1ZDhmNzk0Ny10b2tlbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJyZXF1ZXN0LXptNTI0LWIzZWFmY2Q5LTIxNWItNDU1Zi04YjQ3LTFlN2Q1ZDhmNzk0NyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImJmZDlkZjI3LWUxYzctNDUyMC1hMTc1LWY4NzI1OTE1ZmZjZCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpjbHVzdGVyLWZsZWV0LWxvY2FsLWxvY2FsLTFhM2Q2N2QwYTg5OTpyZXF1ZXN0LXptNTI0LWIzZWFmY2Q5LTIxNWItNDU1Zi04YjQ3LTFlN2Q1ZDhmNzk0NyJ9.mwEmXPlhkZbn_g0XCVwBOjO34dw0MkWw0-fLtCanJWZlbPi-cQUdDoMP3kUTqNu6KYfew-kTgA2THyVtpTVtnnYWe1gXnz4GXCqXrCNT7qLHg7zJzV0y4-2eaiM_1hJ9XToLodIMsHq7tNObDvc12fLLm91fnf17KkCuTdfYEbq9DlQi3_h2BEFCZLfN2R5T2VjBqKMJAujqZGlTmLAIYuPe4ITCk5F8dGbWfJyIOySsns9iEd8URQtSz3x44aLL37YhyMDfq-9sDiVTiw0dcG9IF2OZBRdy4vnj7ipYJyTYLxB-m7F4J7y9gS5Xb_sn8_cUS21sQ7idHds_I8Mfkw
red-king-19196
09/25/2023, 12:02 PMred-king-19196
09/25/2023, 12:03 PMcertificate-authority-data
field under the cluster
section, just server
red-king-19196
09/25/2023, 12:05 PM# kubectl -n cattle-fleet-local-system get secret fleet-agent -o jsonpath='{.data.kubeconfig}' | base64 -d
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJlVENDQVIrZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQWtNU0l3SUFZRFZRUUREQmx5YTJVeUxYTmwKY25abGNpMWpZVUF4TmpVMk5qQTVOalUyTUI0WERUSXNRFl6TURFM01qQTFObG9YRFRNeU1EWXlOekUzTWpBMQpObG93SkRFaU1DQUdBMVVFQXd3WmNtdGxNaTF6WlhKMlpYSXRZMkZBTVRZMU5qWXdPVFkxTmpCWk1CTUdCeXFHClNNNDlBZ0VHQ0NxR1NNNDlBd0VIQTBJQUJOeHEvdU9NMS8wTlJ2eTlFM0Y0eis3NXZaVXFSNXI3REFTb3hwTWIKYzR6STZuV3VoU2grZVNqTG85SzYzOHN1TE9tZDhhM2tvMTZUT0dOZWNueEJCZE9qUWpCQU1BNEdBMVVkRHdFQgovd1FFQXdJQ3BEQVBCZ05WSFJNQkFmOEVCVEFEQVFIL01CMEdBMVVkRGdRV0JCU1Q1ckE1TXhxZnkwc21kaG1zCjZRbmN0d3RwQpBS0JnZ3Foa2pPUFFRREFnTklBREJGQWlBUTZqWUorQkFwMVh2RnRLQ0llVkVXaEc2akZiZmcKcUQ3U2J4UkQwd2tNVXdJaEFLV0RaVnZuKzF4d3IwRTliTnNqVERlVDdINEFYTGhlbHZxeCtmTTREeDl4Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
server: https://10.53.0.1:443
name: cluster
<redacted>
red-king-19196
09/25/2023, 12:08 PMsticky-summer-13450
09/25/2023, 12:56 PMCA
into ssl-certificates
in harvesterhci.io/v1beta1 setting, I have only needed to push the publicCertificate
(the fullchain.pem
from LE) and the privateKey
(the privkey.pem
from LE).
Is that a problem / assumption somewhere? My browser already trusts the certificate so I haven't needed to push a CA into Harvester.
Do the rest of the components in Harvester also trust the certificate?red-king-19196
09/25/2023, 4:09 PMssl-certificates
setting since it’s a well-known CA.
In fact, I’d suggest removing the FQDN from the server-url
setting, because all of this fleet-agent traffic is meant to be internal communication from Harvester's point of view. User-designated domain names and certificates are not supposed to interfere with that internal communication by design. And we're reviewing a fix that removes the updating of server-url
so that it won’t change the value to the VIP address during the Harvester upgrade.
In addition to that, please also change the apiServerURL
to point to your internal IP address of the rancher
Service in the fleet-controller
ConfigMap as a workaround. I think, in your case, it’s 10.64.0.19
. Also, fill in apiServerCA
with the value of internal-cacerts
setting. And finally restart the fleet-controller
deployment after applying the changes.
$ kubectl -n cattle-fleet-system get cm fleet-controller -o yaml
apiVersion: v1
data:
config: |
{
"systemDefaultRegistry": "",
"agentImage": "rancher/fleet-agent:v0.7.0",
"agentImagePullPolicy": "IfNotPresent",
"apiServerURL": "<https://10.53.138.254>", # <-- please update this field with the value from rancher service's ip address
"apiServerCA": "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUJ2VENDQVdPZ0F3SUJBZ0lCQURBS0JnZ3Foa2pPUFFRREFqQkdNUnd3R2dZRFZRUUtFeE5rZVc1aGJXbGoKYkdsemRHVnVaWE
l0YjNKbk1TWXdKQVlEVlFRRERCMWtlVzVoYldsamJHbHpkR1Z1WlhJdFkyRkFNVFk1TlRZeQpNekUxTlRBZUZ3MHlNekE1TWpVd05qSTFOVFZhRncwek16QTVNakl3TmpJMU5UVmFNRVl4SERBYUJnTlZCQW9UCk
UyUjVibUZ0YVdOc2FYTjBaVzVsY2kxdmNtY3hKakFrQmdOVkJBTU1IV1I1Ym1GdGFXTnNhWE4wWlc1bGNpMWoKWVVBeE5qazFOakl6TVRVMU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRU1QVE
lFUDV6TjBISwpVWmtwbkNXN0xwN0JoOC9TRlEwbzU3UFFQNUdzQ0l1RlhhaENpekxKWHpKbkZhRi9qTmpTSEhXUmFkaGV5YXlBCks1TzlERTZVcUtOQ01FQXdEZ1lEVlIwUEFRSC9CQVFEQWdLa01BOEdBMVVkRX
dFQi93UUZNQU1CQWY4d0hRWUQKVlIwT0JCWUVGQmY3cVpMNlhQcEFkUjJSaWd4OUNoOVh5ejJBTUFvR0NDcUdTTTQ5QkFNQ0EwZ0FNRVVDSURGWgp2MzdzaW9wUElwR2tBcjJ2MzR0bDB0Q0g3S0d0cjhEZkttUT
BTNWVnQWlFQTlxSHE4M0RNZlh3YzdObURWd3U3Cm80enJQNUJBUVU2MEpLVFFxR3piRVNVPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==", # <-- please update this field with the value from intenral-cacerts setting
"agentCheckinInterval": "15m",
"ignoreClusterRegistrationLabels": false,
sticky-summer-13450
09/25/2023, 5:26 PM$ kubectl -n cattle-fleet-local-system --context=harvester003 get <http://setting.management.cattle.io|setting.management.cattle.io> internal-cacerts -o jsonpath='{.value}' | base64 -w0 -
to get the internal-cacerts
(took a while to get that to work 😵💫)
I updated the fleet-controller
ConfigMap with the apiServerURL
as https://10.64.0.19
which is the HA Harvester API (this IP address is available on my LAN, not just internal to Harvester) and apiServerCA
as the base64 I got above. Then I deleted the current fleet-controller-*
pod. The deployment created a new pod which logged some startup stuff ending with msg="Cluster import for 'fleet-local/local'. Deployed new agent".
Um - what now? 🙂red-king-19196
09/26/2023, 12:05 AMsticky-summer-13450
09/26/2023, 7:15 AMtime="2023-09-26T07:14:31Z" level=error msg="Failed to register agent: looking up secret cattle-fleet-local-system/fleet-agent-bootstrap: Post \"<https://10.64.0.19/apis/fleet.cattle.io/v1alpha1/namespaces/fleet-local/clusterregistrations>\": tls: failed to verify certificate: x509: cannot validate certificate for 10.64.0.19 because it doesn't contain any IP SANs"
red-king-19196
09/26/2023, 7:24 AMfleet-agent
and fleet-agent-bootstrap
secrets in your cluster? We could check the content of them.sticky-summer-13450
09/26/2023, 7:28 AMred-king-19196
09/26/2023, 7:30 AMsticky-summer-13450
09/26/2023, 7:34 AMfleet-agent-bootstrap
secret is the same age as the fleet-agent-*
and fleet-controller-*
pods.
The CA in fleet-agent-bootstrap
secret is the same as the value from $ kubectl -n cattle-fleet-local-system --context=harvester003 get setting.management.cattle.io internal-cacerts -o jsonpath='{.value}'
There is no fleet-agent
secret.red-king-19196
09/26/2023, 7:37 AMserver-url
setting already?sticky-summer-13450
09/26/2023, 7:41 AM''
now?red-king-19196
09/26/2023, 7:41 AMvalue: ""
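For reference, the setting could be blanked from the CLI with something like the following, assuming server-url here is the Rancher settings.management.cattle.io object:
$ kubectl patch settings.management.cattle.io server-url --type merge -p '{"value": ""}'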
sticky-summer-13450
09/26/2023, 7:42 AMred-king-19196
09/26/2023, 7:42 AMsticky-summer-13450
09/26/2023, 7:44 AMred-king-19196
09/26/2023, 7:46 AMsticky-summer-13450
09/26/2023, 7:48 AMsticky-summer-13450
09/26/2023, 7:50 AMfleet-controller
ConfigMap apiServerURL
be the same as the management
internal-server-url
value? Might that have the internal CA instead of the external certificate I've pushed in?red-king-19196
09/26/2023, 7:52 AMhttps://10.64.0.19
sticky-summer-13450
09/26/2023, 7:53 AMinternal-server-url
is https://10.53.15.158
whereas apiServerURL
is the external IP address on my LAN, which is https://10.64.0.19
sticky-summer-13450
09/26/2023, 7:54 AMred-king-19196
09/26/2023, 7:54 AM10.64.0.19
is the internal VIP 🤯red-king-19196
09/26/2023, 7:54 AMred-king-19196
09/26/2023, 7:55 AM10.64.0.19
is the management address (the IP address you filled during Harvester installation), right?sticky-summer-13450
09/26/2023, 7:55 AMred-king-19196
09/26/2023, 7:56 AM10.53.15.158
is the internal cluster-ip of rancher
service objectred-king-19196
09/26/2023, 7:56 AMsticky-summer-13450
09/26/2023, 7:56 AMsticky-summer-13450
09/26/2023, 7:56 AMred-king-19196
09/26/2023, 7:56 AMred-king-19196
09/26/2023, 7:57 AM10.53.15.158
and try again!sticky-summer-13450
09/26/2023, 7:57 AMsticky-summer-13450
09/26/2023, 8:08 AMfleet-controller
ConfigMap, updated, fleet-controller-*
pod deleted, fleet-agent-*
pod logged this:
I0926 08:05:39.425969 1 leaderelection.go:248] attempting to acquire leader lease cattle-fleet-local-system/fleet-agent-lock...
I0926 08:05:42.421333 1 leaderelection.go:258] successfully acquired lease cattle-fleet-local-system/fleet-agent-lock
time="2023-09-26T08:05:44Z" level=info msg="Starting /v1, Kind=ConfigMap controller"
time="2023-09-26T08:05:44Z" level=info msg="Starting /v1, Kind=ServiceAccount controller"
time="2023-09-26T08:05:44Z" level=info msg="Starting /v1, Kind=Node controller"
time="2023-09-26T08:05:44Z" level=info msg="Starting /v1, Kind=Secret controller"
E0926 08:05:44.512487 1 memcache.go:206] couldn't get resource list for management.cattle.io/v3:
time="2023-09-26T08:05:44Z" level=info msg="Starting fleet.cattle.io/v1alpha1, Kind=BundleDeployment controller"
time="2023-09-26T08:05:44Z" level=info msg="getting history for release mcc-local-managed-system-upgrade-controller"
time="2023-09-26T08:05:44Z" level=info msg="getting history for release fleet-agent-local"
time="2023-09-26T08:05:44Z" level=info msg="getting history for release local-managed-system-agent"
W0926 08:05:44.797582 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0926 08:05:44.884917 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0926 08:05:46.942805 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0926 08:05:46.953756 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0926 08:05:47.017596 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0926 08:05:47.069599 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0926 08:05:47.109470 1 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
time="2023-09-26T08:05:49Z" level=info msg="Deleting orphan bundle ID rke2, release kube-system/rke2-canal"
red-king-19196
09/26/2023, 8:09 AMkubectl -n fleet-local get bundles
red-king-19196
09/26/2023, 8:10 AMsticky-summer-13450
09/26/2023, 8:10 AMsticky-summer-13450
09/26/2023, 8:10 AM$ kubectl -n fleet-local --context=harvester003 get bundles
NAME BUNDLEDEPLOYMENTS-READY STATUS
fleet-agent-local 1/1
local-managed-system-agent 1/1
mcc-harvester 0/1 NotReady(1) [Cluster fleet-local/local]; daemonset.apps kube-system/harvester-whereabouts [progressing] Updated: 2/3
mcc-harvester-crd 1/1
mcc-local-managed-system-upgrade-controller 1/1
mcc-rancher-logging 0/1 OutOfSync(1) [Cluster fleet-local/local]
mcc-rancher-logging-crd 0/1 OutOfSync(1) [Cluster fleet-local/local]
mcc-rancher-monitoring 0/1 OutOfSync(1) [Cluster fleet-local/local]
mcc-rancher-monitoring-crd 0/1 WaitApplied(1) [Cluster fleet-local/local]
red-king-19196
09/26/2023, 8:11 AMsticky-summer-13450
09/26/2023, 8:12 AMred-king-19196
09/26/2023, 8:13 AMsticky-summer-13450
09/26/2023, 8:25 AMfleet-agent-*
pod logs are just ticking along logging this:
time="2023-09-26T08:11:23Z" level=info msg="getting history for release local-managed-system-agent"
time="2023-09-26T08:16:25Z" level=info msg="getting history for release local-managed-system-agent"
time="2023-09-26T08:21:33Z" level=info msg="getting history for release local-managed-system-agent"
and so far the bundles have not changed.red-king-19196
09/26/2023, 9:10 AMhelm
command at your disposal?
helm history local-managed-system-agent -n cattle-system
red-king-19196
09/26/2023, 9:38 AMsticky-summer-13450
09/26/2023, 1:10 PM$ helm history local-managed-system-agent -n cattle-system --kube-context harvester003
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
4953 Mon Sep 11 07:25:32 2023 superseded local-managed-system-agent-v0.0.0+s-e6e150e25f6da0b545e400e61d9ee74f561acf20cb9ba33fcbdc3352724f1 Upgrade complete
4954 Mon Sep 11 07:25:39 2023 superseded local-managed-system-agent-v0.0.0+s-e6e150e25f6da0b545e400e61d9ee74f561acf20cb9ba33fcbdc3352724f1 Upgrade complete
4955 Mon Sep 11 17:18:56 2023 superseded local-managed-system-agent-v0.0.0+s-e6e150e25f6da0b545e400e61d9ee74f561acf20cb9ba33fcbdc3352724f1 Upgrade complete
4956 Mon Sep 11 17:19:01 2023 superseded local-managed-system-agent-v0.0.0+s-e6e150e25f6da0b545e400e61d9ee74f561acf20cb9ba33fcbdc3352724f1 Upgrade complete
4957 Mon Sep 11 17:27:18 2023 superseded local-managed-system-agent-v0.0.0+s-e6e150e25f6da0b545e400e61d9ee74f561acf20cb9ba33fcbdc3352724f1 Upgrade complete
4958 Mon Sep 11 20:44:10 2023 superseded local-managed-system-agent-v0.0.0+s-e6e150e25f6da0b545e400e61d9ee74f561acf20cb9ba33fcbdc3352724f1 Upgrade complete
4959 Mon Sep 11 20:44:16 2023 superseded local-managed-system-agent-v0.0.0+s-d3cb9a953dd679240b86c15757006baeaa3a5072a70879194e5abbb003513 Upgrade complete
4960 Mon Sep 11 20:44:27 2023 superseded local-managed-system-agent-v0.0.0+s-d3cb9a953dd679240b86c15757006baeaa3a5072a70879194e5abbb003513 Upgrade complete
4961 Mon Sep 11 20:44:31 2023 superseded local-managed-system-agent-v0.0.0+s-d3cb9a953dd679240b86c15757006baeaa3a5072a70879194e5abbb003513 Upgrade complete
4962 Mon Sep 11 20:44:33 2023 deployed local-managed-system-agent-v0.0.0+s-d3cb9a953dd679240b86c15757006baeaa3a5072a70879194e5abbb003513 Upgrade complete
sticky-summer-13450
09/26/2023, 1:10 PMred-king-19196
09/26/2023, 3:20 PM2023-09-26T13:05:13.972634730Z 2023/09/26 13:05:13 [ERROR] error syncing 'fleet-local/rancher-logging-crd': handler mcc-bundle: no chart version found for rancher-logging-crd-100.1.3+up3.17.7, requeuing
2023-09-26T13:05:19.226766849Z 2023/09/26 13:05:19 [ERROR] error syncing 'fleet-local/rancher-logging': handler mcc-bundle: no chart version found for rancher-logging-100.1.3+up3.17.7, requeuing
2023-09-26T13:05:21.936231014Z 2023/09/26 13:05:21 [ERROR] error syncing 'fleet-local/rancher-monitoring': handler mcc-bundle: no chart version found for rancher-monitoring-100.1.0+up19.0.3, requeuing
2023-09-26T13:06:01.406798090Z 2023/09/26 13:06:01 [INFO] Downloading repo index from http://harvester-cluster-repo.cattle-system/charts/index.yaml
2023-09-26T13:06:09.134873483Z 2023/09/26 13:06:09 [ERROR] rkecluster fleet-local/local: error while retrieving management cluster from cache: management cluster cache was nil
Rancher couldn't find the charts for those bundles because the Harvester cluster repo was already upgraded and now only contains the new versions of the charts. The cluster is in an intermediate state because of the previous unsuccessful upgrade.
I have an idea: since harvester-cluster-repo
pod is just an http server serving chart files, maybe we could temporarily swap the container image tag from v1.2.0
to v1.1.2
for Rancher and Fleet to complete their jobs. Once the status of the bundles is sorted out, we could change the image back and kick-start the upgrade again.red-king-19196
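A rough sketch of the image swap being described; the deployment/container names and the image repository are assumptions and should be checked against the live object first:
# see what the repo pod currently runs
$ kubectl -n cattle-system get deployment harvester-cluster-repo -o jsonpath='{.spec.template.spec.containers[0].image}'
# temporarily point it at the v1.1.2 chart repo image, then revert to v1.2.0 once the bundles settle
$ kubectl -n cattle-system set image deployment/harvester-cluster-repo harvester-cluster-repo=rancher/harvester-cluster-repo:v1.1.2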
09/27/2023, 4:47 AMcat <<EOF | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
name: v1.2.0
namespace: harvester-system
spec:
1. Create a customized Upgrade objectred-king-19196
09/27/2023, 4:53 AMcat <<EOF | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
name: v1.2.0
namespace: harvester-system
spec:
isoChecksum: '267d65117f6d9601383150b4e513065e673cccba86db9a8c6e7d3cb36a04f6202162f1b95c3c545a7389c4f20f82f5fff6c6e498ff74fcb61e8513992b83e1fb'
isoURL: https://releases.rancher.com/harvester/v1.2.0/harvester-v1.2.0-amd64.iso
releaseDate: '20230908'
EOF
Then create a customized Upgrade object
cat <<EOF | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Upgrade
metadata:
annotations:
harvesterhci.io/skipWebhook: "true"
name: v1-2-0-skip-check
namespace: harvester-system
spec:
version: v1.2.0
EOF
sticky-summer-13450
09/27/2023, 7:51 AMsticky-summer-13450
09/28/2023, 7:42 AMfleet-agent-*
and fleet-controller-*
47 hours ago.red-king-19196
09/28/2023, 9:09 AMkubectl -n harvester-system get upgrade v1-2-0-skip-check -o yaml
The upgrade log might have difficulty spinning up because of the errored bundle; we need to confirm that.sticky-summer-13450
09/28/2023, 4:11 PM$ kubectl -n harvester-system --context=harvester003 get upgrade v1-2-0-skip-check -o yaml
Error from server (NotFound): plans.upgrade.cattle.io "v1-2-0-skip-check" not found
sticky-summer-13450
09/28/2023, 4:33 PM$ kubectl --all-namespaces --context=harvester003 get upgrade
NAMESPACE NAME IMAGE CHANNEL VERSION
cattle-system hvst-upgrade-6hp8q-skip-restart-rancher-system-agent registry.suse.com/bci/bci-base:15.4 23a54be8
cattle-system sync-additional-ca registry.suse.com/bci/bci-base:15.4 v1.1.0
cattle-system system-agent-upgrader rancher/system-agent v0.3.3-suc
cattle-system system-agent-upgrader-windows rancher/wins v0.4.11
red-king-19196
09/30/2023, 10:08 AMkubectl -n harvester-system get upgrades.harvesterhci.io v1-2-0-skip-check -o yaml
sticky-summer-13450
09/30/2023, 11:30 AM$ kubectl -n harvester-system --context=harvester003 get upgrades.harvesterhci.io v1-2-0-skip-check -o yaml
apiVersion: harvesterhci.io/v1beta1
kind: Upgrade
metadata:
annotations:
harvesterhci.io/skipWebhook: "true"
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"harvesterhci.io/v1beta1","kind":"Upgrade","metadata":{"annotations":{"harvesterhci.io/skipWebhook":"true"},"name":"v1-2-0-skip-check","namespace":"harvester-system"},"spec":{"version":"v1.2.0"}}
creationTimestamp: "2023-09-28T07:21:09Z"
finalizers:
- wrangler.cattle.io/harvester-upgrade-controller
generation: 2
labels:
harvesterhci.io/latestUpgrade: "true"
harvesterhci.io/upgradeState: PreparingLoggingInfra
name: v1-2-0-skip-check
namespace: harvester-system
resourceVersion: "939012383"
uid: ef543f21-01c4-4256-9be4-76589b878b4d
spec:
image: ""
logEnabled: true
version: v1.2.0
status:
conditions:
- status: Unknown
type: Completed
- status: Unknown
type: LogReady
previousVersion: v1.2.0
upgradeLog: v1-2-0-skip-check-upgradelog
red-king-19196
10/02/2023, 2:27 AMlogEnabled: false
to prevent this from happening:
# remove the stuck upgrade resource
kubectl -n harvester-system delete upgrades v1-2-0-skip-check
# create the version resource if it's missing
cat <<EOF | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Version
metadata:
name: v1.2.0
namespace: harvester-system
spec:
isoChecksum: '267d65117f6d9601383150b4e513065e673cccba86db9a8c6e7d3cb36a04f6202162f1b95c3c545a7389c4f20f82f5fff6c6e498ff74fcb61e8513992b83e1fb'
isoURL: https://releases.rancher.com/harvester/v1.2.0/harvester-v1.2.0-amd64.iso
releaseDate: '20230908'
EOF
# create the upgrade resource w/ the skip-webhook annotation and log-disable toggle
cat <<EOF | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Upgrade
metadata:
annotations:
harvesterhci.io/skipWebhook: "true"
name: v1-2-0-skip-check
namespace: harvester-system
spec:
logEnabled: false
version: v1.2.0
EOF
And see if the upgrade proceeds.sticky-summer-13450
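To see whether it proceeds, something like the following could be used (the apply-manifests pod name is a placeholder):
$ kubectl -n harvester-system get upgrades.harvesterhci.io -w
$ kubectl -n harvester-system get pods | grep apply-manifests
$ kubectl -n harvester-system logs <hvst-upgrade-xxxxx-apply-manifests-pod> -f --tail=20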
10/02/2023, 10:50 AMsticky-summer-13450
10/02/2023, 2:51 PMPre-drained
state.
I don't know why - I could not find any pod logging that it was waiting for something specific - although there was a Longhorn volume which had been trying to attach in order to do a backup (which I started a week ago) but never managed to complete. I think that might have been me trying to back up my gparted-live
machine, which is actually just a CD-ROM image. It never managed to attach, so it never got backed up. I don't really care - I can always remake it when I next need to check out a volume.
Anyway - all has completed and I'm a very happy person. I hope that this has also helped make Harvester better for the future - which is really all that matters 🙂red-king-19196
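For reference, a sketch of how the stuck backup volume mentioned above could have been spotted, assuming the standard Longhorn CRDs:
$ kubectl -n longhorn-system get volumes.longhorn.io
$ kubectl -n longhorn-system get backups.longhorn.io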
10/03/2023, 2:46 AMkubectl -n fleet-local get bundles
See if there are any errors. We would like to ensure everything is fine since we applied many workarounds.
Thank you for being so supportive! I’m sure we found many issues and sorted out the workarounds and solutions during the journey 🙌sticky-summer-13450
10/03/2023, 6:39 AM$ kubectl -n fleet-local --context=harvester003 get bundles
NAME BUNDLEDEPLOYMENTS-READY STATUS
fleet-agent-local 1/1
local-managed-system-agent 1/1
mcc-harvester 1/1
mcc-harvester-crd 1/1
mcc-local-managed-system-upgrade-controller 1/1
mcc-rancher-logging 0/1 OutOfSync(1) [Cluster fleet-local/local]
mcc-rancher-logging-crd 1/1
mcc-rancher-monitoring 0/1 OutOfSync(1) [Cluster fleet-local/local]
mcc-rancher-monitoring-crd 1/1
red-king-19196
10/03/2023, 6:41 AMsticky-summer-13450
10/03/2023, 6:52 AMred-king-19196
10/03/2023, 7:50 AMrancher-logging
and rancher-monitoring
were incomplete. Later rounds of upgrades just skipped the conversions because the Harvester version was already v1.2.0. However, the functionality of both charts seems fine; they are still running the previous versions.
cc @ancient-pizza-13099 Do you know if doing a manual conversion for the two charts is possible?ancient-pizza-13099
10/04/2023, 6:37 AMancient-pizza-13099
10/04/2023, 6:53 AMrancher-monitoring
and rancher-logging
(2) delete those 2 managedcharts, wait until all pods are removed
(3) create the addon; note to replace some fields of the rancher-monitoring
addon, e.g. VIPsticky-summer-13450
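A hedged sketch of those three steps on the CLI; the Addon manifests themselves are cluster-specific (VIP and so on) and are not shown here:
# (1) keep a copy of the current ManagedCharts
$ kubectl -n fleet-local get managedchart rancher-monitoring -o yaml > rancher-monitoring-managedchart.yaml
$ kubectl -n fleet-local get managedchart rancher-logging -o yaml > rancher-logging-managedchart.yaml
# (2) delete them and wait for the workloads to drain
$ kubectl -n fleet-local delete managedchart rancher-monitoring rancher-logging
$ kubectl -n cattle-monitoring-system get pods -w
# (3) check the addon objects before re-creating/enabling them
$ kubectl get addons.harvesterhci.io -A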
10/04/2023, 7:29 AMancient-pizza-13099
10/04/2023, 7:32 AMsticky-summer-13450
10/05/2023, 7:49 AMsticky-summer-13450
10/05/2023, 7:49 AMancient-pizza-13099
10/05/2023, 8:16 AMancient-pizza-13099
10/05/2023, 8:16 AMsticky-summer-13450
10/05/2023, 1:03 PMprometheus
is at the top of top
most of the time on one node.
I haven't done this yet because I haven't had time to get my head around exactly what I need to do.sticky-summer-13450
10/05/2023, 1:03 PMtop
is:
top - 12:56:03 up 2 days, 23:53, 1 user, load average: 4.87, 3.61, 3.22
Tasks: 447 total, 1 running, 446 sleeping, 0 stopped, 0 zombie
%Cpu(s): 44.2 us, 5.7 sy, 0.0 ni, 48.9 id, 0.2 wa, 0.0 hi, 1.0 si, 0.0 st
MiB Mem : 63703.08+total, 26951.77+free, 20200.48+used, 17208.80+buff/cache
MiB Swap: 0.000 total, 0.000 free, 0.000 used. 43502.59+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8824 10001 20 0 1586468 664860 37160 S 107.9 1.019 1655:05 adapter
6671 1000 20 0 6993456 2.180g 189692 S 101.7 3.505 1269:25 prometheus
3803 root 20 0 2417388 846736 77560 S 100.3 1.298 726:05.37 rancher
5066 root 20 0 877820 141512 55484 S 22.85 0.217 281:45.03 containerd
15088 107 20 0 13.054g 4.259g 22036 S 11.59 6.846 194:04.76 qemu-system-x86
10765 root 20 0 4950184 2.477g 78820 S 8.278 3.982 914:47.21 kube-apiserver
2372 root 20 0 11.283g 440184 283816 S 7.616 0.675 474:24.04 etcd
5430 root 20 0 906188 169212 65172 S 6.623 0.259 275:38.97 kubelet
7661 root 20 0 1263936 417124 49976 S 5.298 0.639 264:06.55 harvester
7816 root 20 0 968436 203888 41424 S 2.649 0.313 96:55.79 longhorn-manage
3544 root 20 0 1040556 280356 65708 S 1.656 0.430 93:23.73 kube-controller
9524 root 20 0 2271644 37524 13880 S 1.325 0.058 98:39.46 longhorn-instan
1 root 20 0 205468 17684 9632 S 0.993 0.027 46:17.48 systemd
10135 root 20 0 1903568 42744 13508 S 0.993 0.066 48:40.50 longhorn
14177 root 20 0 1829900 37756 13572 S 0.993 0.058 8:55.57 longhorn
16269 root 20 0 2181364 84836 46736 S 0.993 0.130 65:37.15 calico-node
6897 root 20 0 723560 18196 9420 S 0.662 0.028 1:49.64 containerd-shim
7161 1001 20 0 1794104 209180 37784 S 0.662 0.321 23:58.47 virt-controller
9559 root 20 0 755312 60740 32916 S 0.662 0.093 3:33.47 harvester-netwo
21 root 20 0 0 0 0 S 0.331 0.000 1:03.77 ksoftirqd/1
27 root 20 0 0 0 0 S 0.331 0.000 62:42.22 ksoftirqd/2
2807 root 20 0 723560 16936 9328 S 0.331 0.026 1:39.28 containerd-shim
2865 root 20 0 723700 18396 9612 S 0.331 0.028 1:38.47 containerd-shim
3306 root 20 0 765904 76648 37272 S 0.331 0.118 12:31.75 kube-scheduler
sticky-summer-13450
10/05/2023, 1:03 PM$ kubectl top pods --sort-by='cpu' --context=harvester003 --all-namespaces
NAMESPACE NAME CPU(cores) MEMORY(bytes)
harvester-system harvester-77c7bdd669-c8cxb 727m 1411Mi
kube-system kube-apiserver-harvester002 515m 3914Mi
kube-system kube-apiserver-harvester003 332m 3619Mi
cattle-monitoring-system prometheus-rancher-monitoring-prometheus-0 235m 2063Mi
default virt-launcher-kube002-zpg6s 206m 7777Mi
kube-system kube-apiserver-harvester001 204m 1757Mi
cattle-fleet-local-system fleet-agent-75f5945649-8f6fp 199m 450Mi
default virt-launcher-nagioskube002-th6kz 183m 2631Mi
default virt-launcher-kube003-286s2 139m 5331Mi
kube-system etcd-harvester002 125m 567Mi
default virt-launcher-kube004-kw769 114m 4530Mi
default virt-launcher-home-assistant-jwmw5 112m 2579Mi
longhorn-system instance-manager-e-1041bf96596625fc7adf7838a77ad238 91m 234Mi
kube-system etcd-harvester003 86m 582Mi
kube-system etcd-harvester001 83m 522Mi
harvester-system harvester-77c7bdd669-jwfvl 63m 522Mi
harvester-system harvester-77c7bdd669-fb2wd 60m 592Mi
kube-system rke2-canal-gl52z 52m 243Mi
kube-system rke2-canal-v2vpf 52m 190Mi
cattle-monitoring-system rancher-monitoring-operator-559767d69b-lxkkp 39m 228Mi
longhorn-system longhorn-manager-8pkmn 39m 261Mi
kube-system rke2-canal-gkkw4 38m 191Mi
cattle-monitoring-system rancher-monitoring-prometheus-adapter-8846d4757-bp2gj 37m 737Mi
longhorn-system instance-manager-e-1e123034ec30fd9c07f37ce7446d272b 37m 117Mi
longhorn-system instance-manager-e-65edf6e430281d7f0bb5498a3eac3469 33m 82Mi
longhorn-system longhorn-manager-cbshl 30m 236Mi
default virt-launcher-wstunnel-wireguard-95pxq 29m 1172Mi
cattle-system rancher-576cf5cc45-j6kfn 26m 1649Mi
kube-system kube-controller-manager-harvester003 26m 264Mi
longhorn-system instance-manager-r-65edf6e430281d7f0bb5498a3eac3469 26m 774Mi
longhorn-system instance-manager-r-1041bf96596625fc7adf7838a77ad238 23m 814Mi
longhorn-system longhorn-manager-7ft4z 22m 216Mi
longhorn-system instance-manager-r-1e123034ec30fd9c07f37ce7446d272b 21m 838Mi
cattle-system rancher-576cf5cc45-5vvmr 19m 1012Mi
longhorn-system engine-image-ei-1d169b76-z8tds 18m 19Mi
cattle-monitoring-system rancher-monitoring-prometheus-node-exporter-7c7xf 17m 28Mi
longhorn-system engine-image-ei-1d169b76-mlkrd 15m 24Mi
longhorn-system engine-image-ei-1d169b76-dczbc 14m 20Mi
cattle-monitoring-system rancher-monitoring-prometheus-node-exporter-4sghm 11m 27Mi
kube-system rke2-metrics-server-74f878b999-gknt2 10m 40Mi
harvester-system harvester-webhook-7df6c7df75-44t7r 9m 250Mi
harvester-system harvester-webhook-7df6c7df75-n92ht 8m 182Mi
longhorn-system longhorn-recovery-backend-fb89c6ddd-6tvz6 8m 195Mi
harvester-system virt-api-6dc9cc7654-sxclk 7m 257Mi
harvester-system virt-controller-7468cc6d9-vq4gc 7m 133Mi
longhorn-system longhorn-admission-webhook-57cf4f4689-9gkpg 7m 300Mi
harvester-system virt-api-6dc9cc7654-wfdsf 7m 198Mi
kube-system kube-scheduler-harvester003 6m 50Mi
kube-system kube-scheduler-harvester002 6m 53Mi
kube-system rke2-ingress-nginx-controller-x579x 5m 201Mi
cattle-fleet-system fleet-controller-56786984f4-tctds 5m 157Mi
kube-system kube-scheduler-harvester001 5m 72Mi
cattle-logging-system rancher-logging-root-fluentbit-xkgq6 5m 43Mi
harvester-system virt-operator-77c86586f6-m8sss 5m 210Mi
harvester-system harvester-network-controller-manager-68fd49b88f-gkpz4 5m 49Mi
longhorn-system longhorn-admission-webhook-57cf4f4689-kp7fm 5m 256Mi
harvester-system harvester-load-balancer-6d89b964bb-ts8sp 5m 55Mi
kube-system rke2-ingress-nginx-controller-2dv6m 5m 260Mi
cattle-logging-system rancher-logging-root-fluentbit-jmmlp 5m 42Mi
harvester-system virt-controller-7468cc6d9-qw6nh 5m 211Mi
kube-system rke2-ingress-nginx-controller-2rvsr 4m 233Mi
kube-system kube-controller-manager-harvester001 4m 32Mi
kube-system kube-controller-manager-harvester002 4m 32Mi
harvester-system kube-vip-mhld7 4m 24Mi
cattle-logging-system rancher-logging-574448c578-sx2l2 4m 126Mi
longhorn-system longhorn-recovery-backend-fb89c6ddd-mdprq 4m 246Mi
kube-system cloud-controller-manager-harvester002 4m 32Mi
harvester-system harvester-webhook-7df6c7df75-kj66q 3m 153Mi
harvester-system harvester-network-webhook-697c754ffb-dn8x6 3m 154Mi
kube-system cloud-controller-manager-harvester003 3m 22Mi
harvester-system virt-handler-x4m7x 3m 252Mi
cattle-monitoring-system rancher-monitoring-kube-state-metrics-5bc8bb48bd-df45p 3m 44Mi
cattle-fleet-system gitjob-845b9dcc47-jzkvt 3m 99Mi
cattle-logging-system rancher-logging-root-fluentd-0 3m 319Mi
longhorn-system longhorn-loop-device-cleaner-7twf6 3m 3Mi
harvester-system harvester-load-balancer-webhook-6dd77c56bf-k4fgn 3m 153Mi
harvester-system harvester-network-controller-8kgnm 2m 67Mi
longhorn-system csi-provisioner-9674b9b-rc4tk 2m 17Mi
kube-system rke2-coredns-rke2-coredns-7f75564ff4-b4gmb 2m 37Mi
longhorn-system csi-provisioner-9674b9b-c8q8p 2m 22Mi
kube-system cloud-controller-manager-harvester001 2m 25Mi
harvester-system harvester-network-controller-2fvvc 2m 42Mi
longhorn-system longhorn-conversion-webhook-678ddcc967-kwrxg 2m 202Mi
harvester-system kube-vip-jc5cj 2m 19Mi
longhorn-system longhorn-conversion-webhook-678ddcc967-prg2x 2m 146Mi
longhorn-system backing-image-manager-36c7-45eb 2m 24Mi
kube-system rke2-coredns-rke2-coredns-7f75564ff4-b7hlj 2m 33Mi
cattle-logging-system rancher-logging-root-fluentbit-xjgwr 2m 51Mi
longhorn-system csi-resizer-76f769988f-kdmlb 2m 20Mi
harvester-system virt-handler-4hnzl 2m 225Mi
cattle-system system-upgrade-controller-5685d568ff-f76b8 2m 77Mi
cattle-system rancher-webhook-67bd6cf65d-6zd6s 2m 171Mi
harvester-system virt-handler-4tzdl 2m 238Mi
harvester-system harvester-node-disk-manager-xnzwp 1m 31Mi
harvester-system kube-vip-q5jbn 1m 19Mi
kube-system rke2-coredns-rke2-coredns-autoscaler-84d67b7c48-g79nd 1m 18Mi
kube-system kube-proxy-harvester002 1m 31Mi
kube-system kube-proxy-harvester001 1m 28Mi
kube-system harvester-whereabouts-bsc27 1m 28Mi
cattle-logging-system harvester-default-event-tailer-0 1m 14Mi
(limited by the amount I can post in a message)ancient-pizza-13099
10/06/2023, 6:52 AMsticky-summer-13450
10/06/2023, 7:53 AMrancher-monitoring
and rancher-logging
> (2) delete those 2 managedcharts, wait until all pods are removed
> (3) create the addon; note to replace some fields of the rancher-monitoring
addon, e.g. VIP
Could you help me a bit more with this, please?
I've copied the existing yaml for the ManagedCharts - here's one example.
kubectl get managedchart rancher-monitoring -n fleet-local --context=harvester003 -o yaml > tmp/managedchart_rancher-monitoring.yaml
I've removed the managed chart - but I don't know which pods would be removed.
For the rancher-monitoring I guessed that prometheus-rancher-monitoring-prometheus-0
would go, but it hasn't. So my guess is wrong 😞ancient-pizza-13099
10/06/2023, 8:00 AMkubectl get managedchart -A
kubectl get addons.harvesterhci.io -A
, check if rancher-monitoring
addons is enabledancient-pizza-13099
10/06/2023, 8:01 AMrancher-monitoring
is not there, then kubectl get deployment -n cattle-monitoring-system
, and get replicaset, then kubectl delete themancient-pizza-13099
10/06/2023, 8:01 AMancient-pizza-13099
10/06/2023, 8:02 AMkubectl get addons.harvesterhci.io -A
, check if rancher-monitoring
addons is enabled