# harvester
s
I made the node schedulable - something else made it not schedulable
g
what version of harvester?
did you try and place the node in maintenance mode?
and last but not least, a support bundle would be appreciated
s
v1.1.2
I just tried maintenance mode - currently the node will not go into maintenance mode
I hoped there'd be something I could look at for myself, instead of sending a support bundle, which makes the investigation and learning a bit opaque from a user's POV.
Anyway - here's a support bundle after just trying to uncordon (with the node being cordoned again), and also trying to set maintenance mode.
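For reference, the uncordon attempt was basically this (the node name below is just a placeholder):
```
# uncordon the node, then watch its status - it flips back to SchedulingDisabled shortly afterwards
kubectl uncordon <node-name> --context harvester003
kubectl get node <node-name> -w --context harvester003
```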
g
If you want to check for yourself, the best place may be the harvester pod logs. When putting the node in maintenance mode the controller will try to drain the node; something is blocking the node from being drained, and that info would be in the harvester pod logs.
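Roughly something like this should surface the drain errors (the label selector is based on the standard harvester labels; add your kubeconfig context if you need it):
```
# tail the harvester controller logs and grep for drain/eviction failures
kubectl logs -n harvester-system -l app.kubernetes.io/name=harvester --tail=200 | grep -iE 'drain|evict|error'
```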
To trigger this action the UI sets an annotation on the node; the controller kicks in and removes this annotation when it's successful.
Because the operation has likely failed, the controller will keep trying. The disable-maintenance action is not yet available in this state, so you can't remove this annotation from the API.
You would need to edit the node object and remove it manually - but again, the answer is in the harvester pod logs.
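A rough sketch of that - the annotation key below is from memory (I think it's something like harvesterhci.io/maintain-status), so confirm it against what you actually see on the node, and the node name is a placeholder:
```
# find the maintenance-related annotation on the node
kubectl get node <node-name> -o yaml | grep -A 15 ' annotations:'

# either edit the node object and delete that line...
kubectl edit node <node-name>

# ...or drop it directly - a trailing dash removes an annotation
kubectl annotate node <node-name> harvesterhci.io/maintain-status-
```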
s
the harvester pod on the node in question is in a pending state.
```
$ kubectl get pods --namespace harvester-system --context harvester003 --field-selector=status.phase=Pending
NAME                                 READY   STATUS    RESTARTS   AGE
harvester-76967d7ddb-52rz7           0/1     Pending   0          2d
harvester-webhook-7f9ffc77dd-qwf88   0/1     Pending   0          2d
```
```
kubectl describe pod harvester-76967d7ddb-52rz7 --namespace harvester-system --context harvester003
Name:             harvester-76967d7ddb-52rz7
Namespace:        harvester-system
Priority:         0
Service Account:  harvester
Node:             <none>
Labels:           app.kubernetes.io/component=apiserver
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=harvester
                  app.kubernetes.io/part-of=harvester
                  app.kubernetes.io/version=v1.1.2
                  helm.sh/chart=harvester-1.1.2
                  helm.sh/release=harvester
                  pod-template-hash=76967d7ddb
Annotations:      container.apparmor.security.beta.kubernetes.io/apiserver: unconfined
                  kubernetes.io/psp: global-unrestricted-psp
Status:           Pending
IP:               
IPs:              <none>
Controlled By:    ReplicaSet/harvester-76967d7ddb
Containers:
  apiserver:
    Image:       rancher/harvester:v1.1.2
    Ports:       8443/TCP, 6060/TCP
    Host Ports:  0/TCP, 0/TCP
    Requests:
      cpu:     250m
      memory:  256Mi
    Environment:
      HARVESTER_SERVER_HTTPS_PORT:  8443
      HARVESTER_DEBUG:              false
      HARVESTER_SERVER_HTTP_PORT:   0
      HCI_MODE:                     true
      RANCHER_EMBEDDED:             true
      NAMESPACE:                    harvester-system (v1:metadata.namespace)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wkw8p (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  kube-api-access-wkw8p:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  29m (x757 over 2d)  default-scheduler  0/3 nodes are available: 1 node(s) were unschedulable, 2 node(s) didn't match pod anti-affinity rules. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod.
```
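That event adds up if the other two harvester replicas are already running on the other nodes (the pod anti-affinity), leaving only the cordoned node as a candidate; this should confirm where they're running:
```
# show which node each harvester pod is scheduled on
kubectl get pods -n harvester-system -l app.kubernetes.io/name=harvester -o wide --context harvester003
```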
I think I'll take the coward's way out and restart that node to see whether that helps in any way...