adamant-kite-43734
11/11/2023, 11:47 AM
witty-jelly-95845
11/11/2023, 8:56 PM
worried-state-78253
11/11/2023, 10:12 PM
kubectl get jobs -n harvester-system -l harvesterhci.io/upgradeComponent=node
and got no response, so I assumed the jobs had completed. I may have made a mistake there, though. Fingers crossed we can get this sorted 😕
worried-state-78253
11/11/2023, 11:33 PM
prehistoric-balloon-31801
11/13/2023, 6:53 AM
worried-state-78253
11/13/2023, 8:01 AM
prehistoric-balloon-31801
11/14/2023, 9:39 AM
"Is the simplest thing to decommission the nodes that were pending, reinstall them, and enrol them as new nodes?"
No, that would make it worse, because the nodes would then be missing from Rancher's perspective.
prehistoric-balloon-31801
11/14/2023, 9:39 AM
worried-state-78253
11/14/2023, 9:40 AM
worried-state-78253
11/14/2023, 9:40 AM
worried-state-78253
11/14/2023, 9:42 AM
worried-state-78253
11/14/2023, 9:44 AM
worried-state-78253
11/14/2023, 9:45 AM
prehistoric-balloon-31801
11/14/2023, 9:45 AM
worried-state-78253
11/14/2023, 9:45 AM
prehistoric-balloon-31801
11/14/2023, 9:45 AM
prehistoric-balloon-31801
11/14/2023, 9:45 AM
worried-state-78253
11/14/2023, 9:48 AM
busy-photographer-15014
11/14/2023, 8:29 PM
worried-state-78253
11/15/2023, 10:10 AM
worried-state-78253
11/15/2023, 10:12 AM
worried-state-78253
11/15/2023, 10:13 AM
bland-baker-70724
11/15/2023, 10:13 AM
prehistoric-balloon-31801
11/17/2023, 6:42 AM
n2 is being drained. Right now the built-in Rancher is still trying to evict pods from n2, and we need to finish this first.
Can you first check the log of the pod rancher-67c56bb6b4-pk5cq and see if it's still outputting this:
evicting pod longhorn-system/instance-manager-r-ee5ad69319e908adc763f7b6833839a4
error when evicting pods/"instance-manager-r-ee5ad69319e908adc763f7b6833839a4" -n "longhorn-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
If yes, I think the reason is that you have a VM called g*****-r****r-d11 on n4, but its volume has only 1 replica, and that replica lives on n2. Can you confirm this first?
worried-state-78253
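The two checks requested above can be sketched in shell. This is a minimal sketch, not one of the helper scripts from this thread: the function names `count_pdb_blocks` and `single_replica_volumes` are hypothetical, the `cattle-system` namespace in the usage comment is an assumption about where the embedded Rancher pod runs, and the Longhorn replica fields `.spec.volumeName` and `.spec.nodeID` are assumed from the Longhorn CRDs and should be verified against the installed version.

```shell
#!/bin/sh
# Hypothetical helper: count log lines on stdin that report an eviction
# blocked by a PodDisruptionBudget (message text taken from the log above).
count_pdb_blocks() {
  grep -c "Cannot evict pod as it would violate the pod's disruption budget" || true
}

# Hypothetical helper: read "VOLUME NODE" pairs (one per replica) on stdin and
# print the volumes that have exactly one replica, with the node hosting it.
single_replica_volumes() {
  awk '{ count[$1]++; node[$1] = $2 }
       END { for (v in count) if (count[v] == 1) print v, node[v] }' | sort
}

# Usage (namespace and field names are assumptions, see above):
#   kubectl logs -n cattle-system rancher-67c56bb6b4-pk5cq --since=5m | count_pdb_blocks
#   kubectl get replicas.longhorn.io -n longhorn-system --no-headers \
#     -o custom-columns=VOLUME:.spec.volumeName,NODE:.spec.nodeID | single_replica_volumes
```

A non-zero count from the first command means the drain is still blocked. Any volume listed by the second command with node n2 would be a candidate for the cause described above.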
11/17/2023, 8:17 AM
bland-baker-70724
11/17/2023, 8:31 AM
worried-state-78253
11/17/2023, 8:35 AM
prehistoric-balloon-31801
11/17/2023, 8:39 AM
bland-baker-70724
11/17/2023, 11:23 AM
prehistoric-balloon-31801
11/17/2023, 11:34 AM
./drain-status.sh here: https://github.com/bk201/misc/tree/main/rancher
prehistoric-balloon-31801
11/17/2023, 11:35 AM
bland-baker-70724
11/17/2023, 11:43 AM
prehistoric-balloon-31801
11/17/2023, 12:09 PM
./post-drain.sh n2
bland-baker-70724
11/17/2023, 12:11 PM
kian@Kians-MacBook-Air ~ % ./post-drain.sh n2
+ NODE=n2
++ kubectl get node n2 -o yaml
++ yq -e e '.metadata.annotations."cluster.x-k8s.io/machine"' -
+ MACHINE=custom-607ddde4fbb8
++ kubectl get secret -n fleet-local custom-607ddde4fbb8-machine-plan -o yaml
++ yq -e e '.metadata.annotations."rke.cattle.io/post-drain"' -
+ data='{"deleteEmptyDirData":true,"disableEviction":false,"enabled":true,"force":true,"gracePeriod":0,"ignoreDaemonSets":true,"ignoreErrors":false,"postDrainHooks":[{"annotation":"harvesterhci.io/post-hook"}],"preDrainHooks":[{"annotation":"harvesterhci.io/pre-hook"}],"skipWaitForDeleteTimeoutSeconds":0,"timeout":0}'
+ echo harvester.cattle.io/post-hook: ''\''{"deleteEmptyDirData":true,"disableEviction":false,"enabled":true,"force":true,"gracePeriod":0,"ignoreDaemonSets":true,"ignoreErrors":false,"postDrainHooks":[{"annotation":"harvesterhci.io/post-hook"}],"preDrainHooks":[{"annotation":"harvesterhci.io/pre-hook"}],"skipWaitForDeleteTimeoutSeconds":0,"timeout":0}'\'''
harvester.cattle.io/post-hook: '{"deleteEmptyDirData":true,"disableEviction":false,"enabled":true,"force":true,"gracePeriod":0,"ignoreDaemonSets":true,"ignoreErrors":false,"postDrainHooks":[{"annotation":"harvesterhci.io/post-hook"}],"preDrainHooks":[{"annotation":"harvesterhci.io/pre-hook"}],"skipWaitForDeleteTimeoutSeconds":0,"timeout":0}'
+ kubectl annotate secret -n fleet-local custom-607ddde4fbb8-machine-plan 'harvesterhci.io/post-hook={"deleteEmptyDirData":true,"disableEviction":false,"enabled":true,"force":true,"gracePeriod":0,"ignoreDaemonSets":true,"ignoreErrors":false,"postDrainHooks":[{"annotation":"harvesterhci.io/post-hook"}],"preDrainHooks":[{"annotation":"harvesterhci.io/pre-hook"}],"skipWaitForDeleteTimeoutSeconds":0,"timeout":0}'
secret/custom-607ddde4fbb8-machine-plan annotated
kian@Kians-MacBook-Air ~ %
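The trace above echoes the key as harvester.cattle.io/post-hook but annotates the secret with harvesterhci.io/post-hook, so it may be worth reading the annotation back. The following is a minimal sketch of such a check, assuming only the secret name and annotation keys visible in the trace; the function names `plan_annotation` and `post_hook_applied` are hypothetical and are not part of the scripts used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print one annotation of a machine-plan secret.
# $1 = secret name, $2 = annotation key with dots escaped for kubectl jsonpath.
plan_annotation() {
  kubectl get secret -n fleet-local "$1" \
    -o jsonpath="{.metadata.annotations.$2}"
}

# Succeeds only when the post-hook annotation is non-empty and identical
# to the post-drain annotation it was copied from.
post_hook_applied() {
  want=$(plan_annotation "$1" 'rke\.cattle\.io/post-drain')
  got=$(plan_annotation "$1" 'harvesterhci\.io/post-hook')
  [ -n "$want" ] && [ "$want" = "$got" ]
}

# Usage:
#   post_hook_applied custom-607ddde4fbb8-machine-plan && echo "post-hook matches post-drain"
```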
bland-baker-70724
11/17/2023, 12:11 PM
bland-baker-70724
11/17/2023, 12:12 PM
prehistoric-balloon-31801
11/17/2023, 12:12 PM
kubectl get clusters.provisioning.cattle.io local -n fleet-local -o yaml
Can you share the output, so we can check the cluster status?
bland-baker-70724
11/17/2023, 12:13 PM
kian@Kians-MacBook-Air ~ % kubectl get clusters.provisioning.cattle.io local -n fleet-local -o yaml
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"provisioning.cattle.io/v1","kind":"Cluster","metadata":{"annotations":{},"name":"local","namespace":"fleet-local"},"spec":{"kubernetesVersion":"v1.25.9+rke2r1","rkeConfig":{"controlPlaneConfig":{"disable":["rke2-snapshot-controller","rke2-snapshot-controller-crd","rke2-snapshot-validation-webhook"]}}}}
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/4yQzU7DMBCEXwXt2Slt079Y4oAQ4sCVF9jYS2Ow15G9CYfK746SVqJC4udo78xovjlBIEGLgqBPgMxRUFzkPD1j+0ZGMskiubgwKOJp4eKts6ChT3F02UV2fKyMH7JQqkwiFAL1ozV+MKXqOL6DhoCMRwrEciUYa3Xz7NjePZwj/8xiDAQafDTo/yXOPZrJAUXB3NdFfnGBsmDoQfPgvQKPLflfR+gwd6Bhu9ztt3XdUGNwc7Crdr9u6jW1y/pg91vb2LXdbHarA6jzYpbSVwho6DCNNIMWBd9Yrtu+eiKpzpeiIPdkpnbzx2Wq+0G6R7Z9dCygT2WSCcpwwciURrJPxJRmZtDLUj4DAAD//5CVWGcAAgAA
    objectset.rio.cattle.io/id: provisioning-cluster-create
    objectset.rio.cattle.io/owner-gvk: management.cattle.io/v3, Kind=Cluster
    objectset.rio.cattle.io/owner-name: local
    objectset.rio.cattle.io/owner-namespace: ""
  creationTimestamp: "2023-10-27T14:18:54Z"
  finalizers:
  - wrangler.cattle.io/provisioning-cluster-remove
  - wrangler.cattle.io/rke-cluster-remove
  generation: 4
  labels:
    objectset.rio.cattle.io/hash: 50675339e9ca48d1b72932eb038d75d9d2d44618
    provider.cattle.io: harvester
  name: local
  namespace: fleet-local
  resourceVersion: "29740662"
  uid: 7aec5041-e2e1-4469-909e-314a62837976
spec:
  kubernetesVersion: v1.25.9+rke2r1
  localClusterAuthEndpoint: {}
  rkeConfig:
    chartValues: null
    machineGlobalConfig: null
    provisionGeneration: 1
    upgradeStrategy:
      controlPlaneDrainOptions:
        deleteEmptyDirData: false
        disableEviction: false
        enabled: false
        force: false
        gracePeriod: 0
        ignoreDaemonSets: null
        postDrainHooks: null
        preDrainHooks: null
        skipWaitForDeleteTimeoutSeconds: 0
        timeout: 0
      workerDrainOptions:
        deleteEmptyDirData: false
        disableEviction: false
        enabled: false
        force: false
        gracePeriod: 0
        ignoreDaemonSets: null
        postDrainHooks: null
        preDrainHooks: null
        skipWaitForDeleteTimeoutSeconds: 0
        timeout: 0
status:
  clientSecretName: local-kubeconfig
  clusterName: local
  conditions:
  - lastUpdateTime: "2023-10-27T14:20:36Z"
    message: marking control plane as initialized and ready
    reason: Waiting
    status: Unknown
    type: Ready
  - lastUpdateTime: "2023-10-27T14:18:54Z"
    status: "False"
    type: Reconciling
  - lastUpdateTime: "2023-10-27T14:18:54Z"
    status: "False"
    type: Stalled
  - lastUpdateTime: "2023-11-11T11:30:32Z"
    status: "True"
    type: Created
  - lastUpdateTime: "2023-11-17T12:11:39Z"
    status: "True"
    type: RKECluster
  - status: Unknown
    type: DefaultProjectCreated
  - status: Unknown
    type: SystemProjectCreated
  - lastUpdateTime: "2023-10-27T14:19:09Z"
    status: "True"
    type: Connected
  - lastUpdateTime: "2023-11-17T12:11:39Z"
    message: configuring worker node(s) custom-318894c86e3c,custom-3d5f2a9df91d,custom-5f62c2f5e52e
    reason: Waiting
    status: Unknown
    type: Updated
  - lastUpdateTime: "2023-11-17T12:11:39Z"
    message: configuring worker node(s) custom-318894c86e3c,custom-3d5f2a9df91d,custom-5f62c2f5e52e
    reason: Waiting
    status: Unknown
    type: Provisioned
  observedGeneration: 4
  ready: true
kian@Kians-MacBook-Air ~ %
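The full YAML is long, and the part that matters for the cluster status is the list of conditions that are still Unknown. The following is a minimal sketch for pulling only those out, assuming the same resource and namespace as the command above; the function name `unknown_conditions` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "type<TAB>status<TAB>message" lines on stdin and
# print only the conditions whose status is Unknown, with their message.
unknown_conditions() {
  awk -F '\t' '$2 == "Unknown" {
    printf "%s: %s\n", $1, ($3 == "" ? "(no message)" : $3)
  }'
}

# Usage:
#   kubectl get clusters.provisioning.cattle.io local -n fleet-local \
#     -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}' \
#     | unknown_conditions
```

Filtering on Unknown rather than on anything that is not True keeps Reconciling and Stalled out of the result, since False is the healthy value for those two.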
prehistoric-balloon-31801
11/17/2023, 12:17 PM
prehistoric-balloon-31801
11/17/2023, 12:18 PM
n2 and n4 are still on 1.2.0.
bland-baker-70724
11/17/2023, 12:19 PM
prehistoric-balloon-31801
11/17/2023, 12:20 PM
worried-state-78253
11/17/2023, 12:28 PM