#rke2
f

faint-airport-83518

05/04/2022, 2:39 PM
is there any way to use the automated upgrade controller in an airgap?
oh, I think I understand: we just need to grab this container and add it to a plan every time we want to update
g

gray-lawyer-73831

05/04/2022, 3:10 PM
Yep exactly! Or you can upgrade based on a channel, but if you do that you'd want an automated solution to pull the rke2-upgrade containers into your private registry as well, otherwise the upgrade will of course fail because the image can't be pulled
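(For reference, a channel-based Plan might look roughly like the sketch below. This is only a sketch against the upgrade.cattle.io/v1 Plan CRD: the channel URL is the public RKE2 release channel, which an airgapped cluster would have to reach or mirror, and the naming and selector are illustrative.)

apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  # the controller resolves the channel to a concrete version instead of pinning spec.version
  channel: https://update.rke2.io/v1-release/channels/stable
  nodeSelector:
    matchExpressions:
      - {key: node-role.kubernetes.io/control-plane, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
  upgrade:
    image: rancher/rke2-upgrade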
f

faint-airport-83518

05/04/2022, 3:35 PM
got it, thanks. Do you know if nodes need to reboot after being "upgraded"? We're running in an Azure VM scale set - essentially the equivalent of an AWS autoscaling group. Just trying to weigh which is least painful: maintaining some kind of ansible/bash script to drain + kill VMs, or implementing this.
g

gray-lawyer-73831

05/04/2022, 3:36 PM
Nope, just the rke2 process needs to be restarted (which, if you are using automated upgrade, it will handle all that for you).
f

faint-airport-83518

05/04/2022, 3:37 PM
How about any complications with SELinux? Or we should be good to go with the rke2-selinux package?
g

gray-lawyer-73831

05/04/2022, 3:38 PM
should be good to go with the rke2-selinux package
f

faint-airport-83518

05/04/2022, 3:38 PM
alright, I'll give it a shot. appreciate it.
👍 1
@gray-lawyer-73831 Should I be pulling these images from Docker Hub, or are you hosting them somewhere else? It doesn't look like the latest release is at https://hub.docker.com/r/rancher/rke2-upgrade/tags?page=1 ... nvm, it's just a different naming convention: v1.23.6-rc4+rke2r1 vs v1.23.6-rc4-rke2r1
f

faint-airport-83518

05/04/2022, 5:43 PM
Kind of a stupid side question.. what's the difference between v1.21, 1.22, and 1.23?
correlation to k8s version, I'm guessing?
yeah, that was a dumb question heh
g

gray-lawyer-73831

05/04/2022, 5:46 PM
correlation to k8s version, I’m guessing?
Yep! 😄
f

faint-airport-83518

05/05/2022, 5:19 PM
Hey - so I got this thing deployed in my airgap and all my images mirrored... plans deployed etc. but it doesn't look like it's doing anything?
I see this env var that I kustomized in - SYSTEM_UPGRADE_JOB_KUBECTL_IMAGE - and I'm using a private registry.. I'm wondering if I need to put any imagePullSecrets somewhere?
g

gray-lawyer-73831

05/05/2022, 5:20 PM
what’s your current rke2 version, and what version or channel do you have in your plan?
f

faint-airport-83518

05/05/2022, 5:22 PM
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: rke2-system-upgrade-controller
  namespace: bigbang
spec:
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: rke2-system-upgrade-controller-repo
  path: .
  prune: true
  images:
  - name: rancher/system-upgrade-controller
    newName: private.registry.internal/rancher/system-upgrade-controller
    newTag: v0.9.1
  patches:
    - patch: |-
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: default-controller-env
        data:
          SYSTEM_UPGRADE_JOB_KUBECTL_IMAGE: private.registry.internal/rancher/kubectl:v1.23.6
      target:
        kind: ConfigMap
    - patch: |-
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: system-upgrade-controller
          namespace: system-upgrade
        spec:
          template:
            spec:
              imagePullSecrets:
                - name: private-registry 
      target:
        kind: Deployment
I have the images airgapped with those tags.. and it's mirrored correctly, etc., the deployment is deployed
my plan:
my plan is also kustomized in -
patches:
  - target:
      kind: Plan
    patch: |-
      apiVersion: upgrade.cattle.io/v1
      kind: Plan
      metadata:
        name: whatever
      spec:
        version: v1.22.9-rke2r1
I'm just using the default and patching in the versions
g

gray-lawyer-73831

05/05/2022, 5:24 PM
so, from the initial thing you posted, do you have a namespace system-upgrade, and do you see the system-upgrade-controller deployed in that namespace?
f

faint-airport-83518

05/05/2022, 5:24 PM
yeah
g

gray-lawyer-73831

05/05/2022, 5:24 PM
and what’s the current rke2 version running in the cluster?
f

faint-airport-83518

05/05/2022, 5:25 PM
v1.22.6+rke2r1
I may need to update the image paths in my plans now that I'm looking at it..
👍 1
I'm guessing I'll need image pull secrets too?
g

gray-lawyer-73831

05/05/2022, 5:26 PM
Yeah so you have the prereqs, next is just making sure the Plan is up to snuff
f

faint-airport-83518

05/05/2022, 5:26 PM
# Server plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
  labels:
    rke2-upgrade: server
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: rke2-upgrade, operator: Exists}
      - {key: rke2-upgrade, operator: NotIn, values: ["disabled", "false"]}
      # When using k8s version 1.19 or older, swap control-plane with master
      - {key: node-role.kubernetes.io/control-plane, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
#  drain:
#    force: true
  upgrade:
    image: rancher/rke2-upgrade
  version: v1.23.1+rke2r2
---
# Agent plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: agent-plan
  namespace: system-upgrade
  labels:
    rke2-upgrade: agent
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: rke2-upgrade, operator: Exists}
      - {key: rke2-upgrade, operator: NotIn, values: ["disabled", "false"]}
      # When using k8s version 1.19 or older, swap control-plane with master
      - {key: node-role.kubernetes.io/control-plane, operator: NotIn, values: ["true"]}
  prepare:
    args:
    - prepare
    - server-plan
    image: rancher/rke2-upgrade
  serviceAccountName: system-upgrade
  cordon: true
  drain:
    force: true
  upgrade:
    image: rancher/rke2-upgrade
  version: v1.23.1+rke2r2
I have this, I'm just kustomizing over it
g

gray-lawyer-73831

05/05/2022, 5:26 PM
You might yeah, but if all the images necessary for v1.22.9 are present in your private registry then you might not
f

faint-airport-83518

05/05/2022, 5:27 PM
well, uhh
is there anywhere in here I can fit in some secrets?
( if needed)
not totally sure what a plan does tbh
g

gray-lawyer-73831

05/05/2022, 5:30 PM
I don’t think there’s anywhere to put in secrets actually
f

faint-airport-83518

05/05/2022, 5:30 PM
fudge
g

gray-lawyer-73831

05/05/2022, 5:30 PM
but that shouldn’t block anything
the Plans tell system-upgrade-controller what to do
f

faint-airport-83518

05/05/2022, 5:31 PM
alright, cool I wasn't sure if it was trying to do some other pod standup or something
how about this other rancher/kubectl container?
g

gray-lawyer-73831

05/05/2022, 5:31 PM
When it works, it will create Jobs (which will create pods) to upgrade
in my plans, I don’t mess with the kubectl version at all, but you should be able to adjust it as you’ve done. I think that is used in the jobs that get deployed
but my knowledge in this area is a little bit fuzzy
here are known working plans though (not necessarily for airgap, but should be the same):
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: rke2-server
  namespace: system-upgrade
  labels:
    rke2-upgrade: server
spec:
  concurrency: 1
  version: v1.22.8-rke2r1
  nodeSelector:
    matchExpressions:
      - {key: node-role.kubernetes.io/master, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
  #drain:
  #  force: true
  upgrade:
    image: rancher/rke2-upgrade
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: rke2-agent
  namespace: system-upgrade
  labels:
    rke2-upgrade: agent
spec:
  concurrency: 2
  version: v1.22.8-rke2r1
  nodeSelector:
    matchExpressions:
      - {key: node-role.kubernetes.io/master, operator: NotIn, values: ["true"]}
  serviceAccountName: system-upgrade
  prepare:
    image: rancher/rke2-upgrade
    args: ["prepare", "rke2-server"]
  drain:
    force: true
  upgrade:
    image: rancher/rke2-upgrade
f

faint-airport-83518

05/05/2022, 5:32 PM
so I kustomized our image path over this ->
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: default-controller-env
  namespace: system-upgrade
data:
  SYSTEM_UPGRADE_CONTROLLER_DEBUG: "false"
  SYSTEM_UPGRADE_CONTROLLER_THREADS: "2"
  SYSTEM_UPGRADE_JOB_ACTIVE_DEADLINE_SECONDS: "900"
  SYSTEM_UPGRADE_JOB_BACKOFF_LIMIT: "99"
  SYSTEM_UPGRADE_JOB_IMAGE_PULL_POLICY: "Always"
  SYSTEM_UPGRADE_JOB_KUBECTL_IMAGE: "rancher/kubectl:v1.21.9"
  SYSTEM_UPGRADE_JOB_PRIVILEGED: "true"
  SYSTEM_UPGRADE_JOB_TTL_SECONDS_AFTER_FINISH: "900"
  SYSTEM_UPGRADE_PLAN_POLLING_INTERVAL: "15m"
just all of our images usually need a secret to be pulled
g

gray-lawyer-73831

05/05/2022, 5:33 PM
Your best bet is checking throughout https://github.com/rancher/system-upgrade-controller to see if there is some imagepullsecret support
f

faint-airport-83518

05/05/2022, 5:36 PM
ahh so that's like a uh, host volume secret mount or something
f

faint-airport-83518

05/05/2022, 5:37 PM
g

gray-lawyer-73831

05/05/2022, 5:37 PM
airgap of course always makes things more complicated, and to be honest, it’s probably safer in airgap to do what I call a “manual upgrade”
because there could be cases where you don’t have the images necessary for an upgrade, and then these automated upgrades try to pull anyway, and end up breaking your cluster
f

faint-airport-83518

05/05/2022, 5:38 PM
haven't had time to learn go yet heh
g

gray-lawyer-73831

05/05/2022, 5:39 PM
yeah that configmap controls the system-upgrade-controller deployment, which is the brains (the controller) behind how it applies the plans to upgrade your cluster
faint-airport-83518

I THINK.. I need a secret in here somewhere so something can spin up a pod with this container? KubectlImage - this guy. yeah, I'm not seeing anywhere to add imagePullSecrets to whatever pod spec
g

gray-lawyer-73831

05/05/2022, 5:48 PM
It’s possible that doesn’t exist and would need an enhancement to system-upgrade-controller in general. It wouldn’t hurt if you want to create an issue in that repo with the details of what you think you’d need, and if it does exist, someone with more knowledge of this than me can respond and hopefully point you in the right direction. Or if it doesn’t, it can get added and supported eventually 🙂
f

faint-airport-83518

05/05/2022, 5:50 PM
cool - thanks for poking around for me. I'll try to put one in. Just didn't know what I was looking at/for tbh :^)
g

gray-lawyer-73831

05/05/2022, 5:50 PM
Sorry I’m not much help here! It’s fun debugging this stuff though. Thank you!
f

faint-airport-83518

05/05/2022, 5:56 PM
it's weird, I think everything is deployed.. it's just not doing anything lol
from the rancher/system-upgrade-controller:v0.9.1 pod:
system-upgrade-controller-5bd59b74fc-hnqn6 W0505 17:56:39.022311       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
system-upgrade-controller-5bd59b74fc-hnqn6 time="2022-05-05T17:56:39Z" level=info msg="Applying CRD plans.upgrade.cattle.io"
system-upgrade-controller-5bd59b74fc-hnqn6 time="2022-05-05T17:56:40Z" level=info msg="Starting /v1, Kind=Node controller"
system-upgrade-controller-5bd59b74fc-hnqn6 time="2022-05-05T17:56:40Z" level=info msg="Starting /v1, Kind=Secret controller"
system-upgrade-controller-5bd59b74fc-hnqn6 time="2022-05-05T17:56:40Z" level=info msg="Starting batch/v1, Kind=Job controller"
system-upgrade-controller-5bd59b74fc-hnqn6 time="2022-05-05T17:56:40Z" level=info msg="Starting upgrade.cattle.io/v1, Kind=Plan controller"
g

gray-lawyer-73831

05/05/2022, 5:57 PM
I’d try to minimize your plans, maybe only give it a server plan for now. When those are applied and it is a new version and working, you should see jobs/pods created
specifically minimize the nodeSelector in there
f

faint-airport-83518

05/05/2022, 6:01 PM
okay - I'll try to pull some off. those two rke2-upgrade labels? what should put them there though (normally)?
# Server plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
  labels:
    rke2-upgrade: server
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: rke2-upgrade, operator: Exists}
      - {key: rke2-upgrade, operator: NotIn, values: ["disabled", "false"]}
      # When using k8s version 1.19 or older, swap control-plane with master
      - {key: node-role.kubernetes.io/control-plane, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
#  drain:
#    force: true
  upgrade:
    image: rancher/rke2-upgrade
  version: v1.23.1+rke2r2
g

gray-lawyer-73831

05/05/2022, 6:02 PM
# Server plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
  labels:
    rke2-upgrade: server
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: node-role.kubernetes.io/control-plane, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  cordon: true
#  drain:
#    force: true
  upgrade:
    image: rancher/rke2-upgrade
  version: v1.23.1-rke2r2
Try that
I think I see the issue
the version is confusing with these Plans
it should be a dash instead of a plus
f

faint-airport-83518

05/05/2022, 6:03 PM
🤪
hey I'm getting some errors from OPA gatekeeper blocking some stuff, that's progress
🎉 1
time to go create some new exceptions..weee
g

gray-lawyer-73831

05/05/2022, 6:06 PM
😄
f

faint-airport-83518

05/05/2022, 6:12 PM
I see a job
🦜 1
yeah, in the job container spec it's referencing my airgapped image:
- name: SYSTEM_UPGRADE_PLAN_LATEST_VERSION
  value: v1.22.9-rke2r1
image: private.registry/rancher/rke2-upgrade:v1.22.9-rke2r1
imagePullPolicy: Always
name: upgrade
it just looks like it's stuck maybe
g

gray-lawyer-73831

05/05/2022, 6:48 PM
you don’t see any pods created? yeah probably stuck from something.. maybe those secrets here too 🤔
f

faint-airport-83518

05/05/2022, 6:56 PM
yeah I look at the job and there's no pods in there
hold on... OPA gatekeeper was blocking privileged containers even through my wildcard exceptions...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  113s  default-scheduler  0/7 nodes are available: 3 node(s) had taint {CriticalAddonsOnly: true}, that the pod didn't tolerate, 4 node(s) didn't match Pod's node affinity/selector.
  Warning  FailedScheduling  52s   default-scheduler  0/7 nodes are available: 3 node(s) had taint {CriticalAddonsOnly: true}, that the pod didn't tolerate, 4 node(s) didn't match Pod's node affinity/selector.
ahhh super close
🙏 1
I think those taints are part of our RKE2 deployment... I don't remember why..
alright, hopefully I can add a toleration to this thing
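(A sketch of the toleration that would let the upgrade pod schedule onto the tainted servers, assuming the Plan CRD in use exposes a spec.tolerations field; the key matches the CriticalAddonsOnly taint from the events above, and operator: Exists tolerates it regardless of effect.)

spec:
  tolerations:
    # tolerate the CriticalAddonsOnly taint seen in the FailedScheduling events
    - key: CriticalAddonsOnly
      operator: Exists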
ahhh!!!!
I got past gatekeeper + taints and tolerations
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  66s                default-scheduler  Successfully assigned system-upgrade/apply-server-plan-on-vm-gvzonecil2zackrke2server000002--1-6qvnr to vm-gvzonecil2zackrke2server000002
  Normal   Pulling    25s (x3 over 67s)  kubelet            Pulling image "private.registry/rancher/kubectl:v1.23.6"
  Warning  Failed     24s (x3 over 66s)  kubelet            Failed to pull image "private.registry/rancher/kubectl:v1.23.6": rpc error: code = Unknown desc = failed to pull and unpack image "private.registry/rancher/kubectl:v1.23.6": failed to resolve reference "private.registry/rancher/kubectl:v1.23.6": pulling from host zarf.c1.internal failed with status code [manifests v1.23.6]: 401 Unauthorized
  Warning  Failed     24s (x3 over 66s)  kubelet            Error: ErrImagePull
  Normal   BackOff    13s (x3 over 66s)  kubelet            Back-off pulling image "private.registry/rancher/kubectl:v1.23.6"
  Warning  Failed     13s (x3 over 66s)  kubelet            Error: ImagePullBackOff
I'm mirroring -> private.registry to zarf.c1.internal
just need those imagePullSecrets
g

gray-lawyer-73831

05/05/2022, 7:29 PM
You might be able to edit the job directly and add those! Also, maybe putting the credentials in registries.yaml to avoid messing around with imagepullsecrets entirely
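(A rough sketch of what that /etc/rancher/rke2/registries.yaml could look like, using the registry hostname from this thread and placeholder credentials; rke2 reads this file on each node and hands the registry auth to containerd, so image pulls don't need per-pod imagePullSecrets.)

# optional: redirect upstream image names to the private registry
mirrors:
  docker.io:
    endpoint:
      - "https://private.registry"
# credentials containerd should use when pulling from the private registry
configs:
  "private.registry":
    auth:
      username: <registry-user>      # placeholder
      password: <registry-password>  # placeholder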
f

faint-airport-83518

05/05/2022, 8:23 PM
I added the credentials to registries.yaml on one of the nodes and it upgrades
only problem is it's a pain because the registry creds are randomly generated after the rke2 cluster is up and running heh
we need to figure out a way to generate the secrets and have our private registry use them..
g

gray-lawyer-73831

05/05/2022, 8:26 PM
Ahh interesting setup! So you bring the cluster up using the tarball then I assume, and then run a private registry within the cluster itself?
f

faint-airport-83518

05/05/2022, 8:27 PM
using a tool called zarf which packages up all of our artifacts and hosts them in a docker registry and gitea server
so it's a fat tarball with everything that we can bring into an airgap
g

gray-lawyer-73831

05/05/2022, 8:28 PM
makes sense. So it's much easier to just use imagePullSecrets for everything then, and make sure that's included in the manifests
okay for one of the nodes that hasn’t upgraded yet, are you able to edit the job directly and add the imagepullsecrets?
or maybe just the pod
f

faint-airport-83518

05/05/2022, 8:30 PM
on the job? let me try
I can't change it on the pod because you can't change a pod spec for imagePullSecrets
# pods "apply-server-plan-on-vm-gvzonecil2zackrke2server000000--1-h4pnm" was not valid:
# * spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)
#   core.PodSpec{
#       ... // 11 identical fields
#       NodeName:         "vm-gvzonecil2zackrke2server000000",
#       SecurityContext:  &{HostNetwork: true, HostPID: true, HostIPC: true},
# -     ImagePullSecrets: []core.LocalObjectReference{{Name: "private-registry"}},
# +     ImagePullSecrets: nil,
#       Hostname:         "",
#       Subdomain:        "",
#       ... // 14 identical fields
#   }

🤦 1
g

gray-lawyer-73831

05/05/2022, 8:43 PM
right
f

faint-airport-83518

05/05/2022, 8:43 PM
and I can't change the job
and I can't have the plan put it in the job
g

gray-lawyer-73831

05/05/2022, 8:44 PM
darn, okay. I’m going to do some internal snooping around to see if there’s a way. Would you create an issue in system-upgrade-controller repo as well? I think this is something that we don’t currently have but clearly it could be nice to include
f

faint-airport-83518

05/05/2022, 8:45 PM
yeah I don't think it should be a heavy lift to add it to the job templating for the pod spec
💯 1
not that I'm a go developer
g

gray-lawyer-73831

05/05/2022, 8:46 PM
It shouldn’t be, but I can’t guarantee that it’ll get completed at all or at least anytime soon since there are always a lot of other priorities going on, but let’s see what we can do! 💪 Thank you for debugging on this too, and I’m glad we found something that works even if it’s a pain right now
hey
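(Judging from the kustomization that ends up working later in the thread, the suggestion here is attaching imagePullSecrets to the ServiceAccount the Plan runs under, so the job pods it spawns inherit the pull secret; a minimal sketch, assuming the private-registry secret already exists in the system-upgrade namespace.)

apiVersion: v1
kind: ServiceAccount
metadata:
  name: system-upgrade
  namespace: system-upgrade
imagePullSecrets:
  # pods created with this ServiceAccount inherit this pull secret
  - name: private-registry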
f

faint-airport-83518

05/05/2022, 9:05 PM
huh
interesting
can I kustomize that in?
I'm afk a bit - will try later for surs
g

gray-lawyer-73831

05/05/2022, 9:19 PM
I think you probably can, but I’ve never done that before so I’m not sure! I asked someone a lot smarter than me and he pointed me there 🙂
f

faint-airport-83518

05/05/2022, 9:42 PM
damn, that worked
nice
g

gray-lawyer-73831

05/05/2022, 9:42 PM
beautiful
f

faint-airport-83518

05/05/2022, 9:42 PM
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: rke2-system-upgrade-controller
  namespace: bigbang
spec:
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: rke2-system-upgrade-controller-repo
  path: .
  prune: true
  images:
  - name: rancher/system-upgrade-controller
    newName: private.registry/rancher/system-upgrade-controller
    newTag: v0.9.1
  patches:
    - patch: |-
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: default-controller-env
        data:
          SYSTEM_UPGRADE_JOB_KUBECTL_IMAGE: private.registry/rancher/kubectl:v1.22.6
      target:
        kind: ConfigMap
    - patch: |-
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: system-upgrade-controller
          namespace: system-upgrade
        spec:
          template:
            spec:
              imagePullSecrets:
                - name: private-registry 
      target:
        kind: Deployment
    - patch: |-
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: system-upgrade
          namespace: system-upgrade
        imagePullSecrets:
        - name: private-registry
      target:
        kind: ServiceAccount
thanks for helping me out man
still going to keep the issue up, but this is totally workable compared to shoving it into registries.yaml for us
g

gray-lawyer-73831

05/05/2022, 9:47 PM
Yeah I’m glad we got something going! I’ll comment on the issue that doing this works too 🙂
f

faint-airport-83518

05/06/2022, 2:29 PM
so now that I got this thing working.. I'm getting a lot of errors when my worker nodes upgrade about not being able to evict pods.. so looking into that now
drain evicting pod logging/logging-ek-es-master-0
drain evicting pod istio-system/passthrough-ingressgateway-7879ff64db-kh86m
drain evicting pod gatekeeper-system/gatekeeper-controller-manager-5bd878c895-4sbrp
drain error when evicting pods/"passthrough-ingressgateway-7879ff64db-kh86m" -n "istio-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
drain error when evicting pods/"gatekeeper-controller-manager-5bd878c895-4sbrp" -n "gatekeeper-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
drain error when evicting pods/"logging-ek-es-master-0" -n "logging" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
a bunch of my operators/helm charts had pod disruption budgets - so we are scaling up 🙂
💪 1
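(For context, a hypothetical PodDisruptionBudget like the one below, combined with a single running replica, can never be satisfied during a drain, so the eviction retries forever; scaling the workload to two or more replicas, or relaxing the budget, lets the drain proceed. The names and selector here are illustrative, not taken from the actual charts.)

# Hypothetical PDB illustrating the eviction errors above: with minAvailable: 1
# and only one replica running, no voluntary eviction can ever be allowed.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gatekeeper-controller-manager
  namespace: gatekeeper-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      control-plane: controller-manager   # illustrative label selector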