#rke2
  • billions-easter-91774 (08/18/2022, 9:52 PM)
    hm, it might be cilium
  • hundreds-evening-84071 (08/19/2022, 1:58 PM)
    Hello, I have an RKE2 cluster,
    v1.23.7
    with the following config: 1 control-plane/etcd node (RHEL 8) and 2 Windows worker nodes, one Server 2022 and one Server 2019. For some reason, one coredns pod stays in Pending; the log for that pod contains no entries, so I am trying to figure out why.
    # kubectl get pods -n kube-system | grep dns
    helm-install-rke2-coredns-mn6h6 0/1 Completed 0 31d
    rke2-coredns-rke2-coredns-86b4fbd678-brpcm 0/1 Pending 0 4s
    rke2-coredns-rke2-coredns-86b4fbd678-csqxh 1/1 Running 0 22m
    rke2-coredns-rke2-coredns-autoscaler-65c9bb465d-tcv27 1/1 Running 0 31d
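    A pod stuck in Pending usually has the reason recorded on the pod itself rather than in its (empty) log. A reasonable first step, using the pod name from the output above (standard kubectl, nothing RKE2-specific):
    kubectl -n kube-system describe pod rke2-coredns-rke2-coredns-86b4fbd678-brpcm
    # the Events section at the bottom reports the scheduler's reason,
    # e.g. unsatisfied node selectors, taints, or pod anti-affinity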
  • billions-easter-91774 (08/20/2022, 12:24 PM)
    Is there any date for when RKE2 will support Ubuntu 22.04 LTS?
  • ambitious-plastic-3551 (08/20/2022, 9:53 PM)
    I broke my nginx ingress controller; is there a way to restore it?
  • ambitious-plastic-3551 (08/20/2022, 9:53 PM)
    the hostPorts are not opened
  • important-umbrella-22006 (08/22/2022, 11:50 AM)
    Hello, I have an RKE2 cluster and I want to enable an alpha feature,
    TTLAfterFinished=true
    But when I change the kube-controller-manager and kube-apiserver pod configuration, the pods get restarted and the configuration file reverts to the original. Is this controlled by fleet? Can someone help me with the process of modifying the kube-controller-manager and kube-apiserver pod configuration?
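    For context: RKE2 generates the control-plane static pod manifests itself, so hand edits under /var/lib/rancher/rke2/agent/pod-manifests get rewritten; extra flags are meant to go through /etc/rancher/rke2/config.yaml. A minimal sketch (feature-gates is the standard Kubernetes flag; restart the rke2-server service to apply):
    # /etc/rancher/rke2/config.yaml
    kube-apiserver-arg:
      - "feature-gates=TTLAfterFinished=true"
    kube-controller-manager-arg:
      - "feature-gates=TTLAfterFinished=true"
    # then: systemctl restart rke2-server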
  • great-photographer-94826 (08/22/2022, 1:14 PM)
    Hi! I would like to restore an etcd snapshot, but the restore fails. Command used:
    rke2 server \
      --cluster-reset \
      --cluster-reset-restore-path=/tmp/etcd-snapshot-2022-08-22 \
      --token=mytoken
    Versions: OS "Ubuntu 20.04"; rke2 "1.21.5-rke2r2"; etcd image "etcd:v3.4.13-k3s1-build20210223". etcd does not start due to the following error:
    2022-08-22 12:40:13.947166 I | etcdmain: Loading server configuration from "/var/lib/rancher/rke2/server/db/etcd/config". Other configuration command line flags and environment variables will be ignored if provided.
    2022-08-22 12:40:13.947196 E | etcdmain: error verifying flags, open /var/lib/rancher/rke2/server/db/etcd/config: permission denied. See 'etcd --help'.
    ETCD config permissions
    -rw------- 1 etcd etcd 1043 Aug 22 15:03 /var/lib/rancher/rke2/server/db/etcd/config
    ETCD pod
    /var/lib/rancher/rke2/agent/pod-manifests/etcd.yaml
    ...  
      securityContext:
        runAsGroup: 1001
        runAsUser: 1001
    ...
    etcd user/groups
    id etcd
    uid=1001(etcd) gid=1001(etcd) groups=1001(etcd)
    RKE2 config.yaml
    profile: cis-1.6
    tls-san:
      - ******
      - ******
    disable-cloud-controller: true
    etcd-snapshot-schedule-cron: "0 */12 * * *"
    etcd-snapshot-retention: 5
    secrets-encryption: true
  • stale-painting-80203 (08/23/2022, 5:23 AM)
    I recently configured a new Harvester node and tried to deploy an RKE2 downstream cluster, but the cluster remains in the "waiting for cluster agent to connect" state. I have verified connectivity between the VM and the Rancher server (URL) on which RKE2 gets deployed. I do see an error in the etcd pod:
    kubectl logs etcd-harbor1-pool1-114b4517-k4lh7  -n kube-system | grep rejected
    {"level":"warn","ts":"2022-08-23T04:08:54.054Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"127.0.0.1:51538","server-name":"","error":"EOF"}
    A number of pods are in Pending state:
    NAMESPACE         NAME                                                        READY   STATUS      RESTARTS   AGE
    calico-system     pod/calico-kube-controllers-677d488b5f-stpbg                0/1     Pending     0          55m
    calico-system     pod/calico-node-7hntn                                       0/1     Running     0          55m
    calico-system     pod/calico-typha-66d8ff6684-rnlx6                           0/1     Pending     0          55m
    cattle-system     pod/cattle-cluster-agent-8df9b48fd-2dkdm                    0/1     Pending     0          56m
    kube-system       pod/etcd-harbor1-pool1-114b4517-k4lh7                       1/1     Running     0          56m
    kube-system       pod/harvester-cloud-provider-748f954ffb-ml8bd               1/1     Running     0          56m
    kube-system       pod/harvester-csi-driver-controllers-779c557d47-8h7qb       0/3     Pending     0          56m
    kube-system       pod/harvester-csi-driver-controllers-779c557d47-gvb9d       0/3     Pending     0          56m
    kube-system       pod/harvester-csi-driver-controllers-779c557d47-rjvwt       0/3     Pending     0          56m
    kube-system       pod/helm-install-harvester-cloud-provider-z87qt             0/1     Completed   0          56m
    kube-system       pod/helm-install-harvester-csi-driver-m9pkj                 0/1     Completed   0          56m
    kube-system       pod/helm-install-rke2-calico-crd-scvcm                      0/1     Completed   0          56m
    kube-system       pod/helm-install-rke2-calico-kmppr                          0/1     Completed   1          56m
    kube-system       pod/helm-install-rke2-coredns-plrx2                         0/1     Completed   0          56m
    kube-system       pod/helm-install-rke2-ingress-nginx-dvrd4                   0/1     Pending     0          56m
    kube-system       pod/helm-install-rke2-metrics-server-bwvwv                  0/1     Pending     0          56m
    kube-system       pod/kube-apiserver-harbor1-pool1-114b4517-k4lh7             1/1     Running     0          56m
    kube-system       pod/kube-controller-manager-harbor1-pool1-114b4517-k4lh7    1/1     Running     0          55m
    kube-system       pod/kube-proxy-harbor1-pool1-114b4517-k4lh7                 1/1     Running     0          56m
    kube-system       pod/kube-scheduler-harbor1-pool1-114b4517-k4lh7             1/1     Running     0          55m
    kube-system       pod/rke2-coredns-rke2-coredns-76cb76d66-vl2wg               0/1     Pending     0          56m
    kube-system       pod/rke2-coredns-rke2-coredns-autoscaler-58867f8fc5-whhwc   0/1     Pending     0          56m
    tigera-operator   pod/tigera-operator-6457fc8c7c-s97d9                        1/1     Running     0          56m
    Any suggestion on how to debug the issue or possible root cause?
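    One generic way to see why that many pods sit in Pending is to pull the recent scheduling events across all namespaces (plain kubectl; the grep is just a convenience):
    kubectl get events -A --sort-by=.lastTimestamp | grep -i failedscheduling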
  • future-monitor-61871 (08/23/2022, 9:41 PM)
    The elastic operator is deploying elasticsearch w/ an init container that has to run as root. It's setting the vm.max_map_count to 262144. RKE2 is barfing on the StatefulSet creation: "create Pod elasticsearch-master-0 in StatefulSet elasticsearch-master failed error: pods "elasticsearch-master-0" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.initContainers[0].securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden spec.initContainers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]" What's the proper way to get elastic to run inside RKE2?
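    A common way around this, rather than loosening the PodSecurityPolicy, is to set the sysctl on the hosts themselves so the chart's privileged init container can be disabled (vm.max_map_count=262144 is the documented Elasticsearch prerequisite; the sysctl.d file name below is arbitrary):
    # on every node that may run an elasticsearch pod
    sysctl -w vm.max_map_count=262144
    echo 'vm.max_map_count=262144' > /etc/sysctl.d/90-elasticsearch.conf   # persist across reboots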
  • echoing-oxygen-99290 (08/23/2022, 9:52 PM)
    Hi all, I am attempting to install RKE2 on GCE (testing before installing on bare metal). When running the install script, it is not able to schedule the nginx pod due to the single node having a disk-pressure taint. I do see that my
    /dev/root
    filesystem is at 98% utilization when running
    df -h
    . I attached a new disk and created a new filesystem on a 30 GB SSD, but I am not seeing how to get RKE2 to use this new filesystem. Could someone give me any insight into what I am not understanding? Thanks
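    RKE2 keeps its state (images, etcd, container storage) under /var/lib/rancher/rke2 by default, so one option is to mount the new filesystem there, or to point data-dir at it, before installing. A sketch, assuming the new filesystem is /dev/sdb1 (the device name and mount point are placeholders):
    mount /dev/sdb1 /mnt/rke2-data   # plus a matching /etc/fstab entry
    # /etc/rancher/rke2/config.yaml
    data-dir: /mnt/rke2-data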
  • narrow-noon-75604 (08/24/2022, 5:04 AM)
    Hi, I am trying to deploy an HA RKE2 cluster on bare-metal machines with Ubuntu 20.04, using kube-vip. The first server node came up and I was able to see the VIP listed, so I added 2 more server nodes and a worker node. The installation was successful without any issue, and the VIP appeared after the overall RKE2 installation. Then I deployed a sample nginx application to test whether the service is accessible through the VIP address. I am able to access the application through all the server node IP addresses, but I cannot access it using the VIP. I then listed the IP addresses on the interface I am using and found that the VIP has disappeared. Right after the installation finished:
    ip a list $INTERFACE
    Output:
    2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:9b:3a:cb brd ff:ff:ff:ff:ff:ff
        inet 192.168.10.71/24 brd 192.168.10.255 scope global ens160
          valid_lft forever preferred_lft forever
        inet 192.168.10.74/32 scope global ens160
          valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fe9b:3acb/64 scope link
          valid_lft forever preferred_lft forever
    After some time the VIP disappears:
    ip a list $INTERFACE
    Output:
    2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:9b:3a:cb brd ff:ff:ff:ff:ff:ff
        inet 192.168.10.71/24 brd 192.168.10.255 scope global ens160
          valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:fe9b:3acb/64 scope link
          valid_lft forever preferred_lft forever
    Please let me know if I am missing anything.
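    The kube-vip leader-election logs usually show why a VIP is withdrawn. A sketch, assuming kube-vip was deployed as the stock DaemonSet named kube-vip-ds in kube-system (adjust the name to your manifest):
    kubectl -n kube-system logs ds/kube-vip-ds --since=1h | grep -i -e leader -e vip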
  • best-actor-8484 (08/24/2022, 4:48 PM)
    Is anyone using encryption-provider-config to provide custom secrets encryption for RKE2?
  • able-engineer-22050 (08/24/2022, 5:02 PM)
    Hi everyone, I stumbled into the following problem. I have an RKE2 cluster installed on openSUSE MicroOS. I deployed Kafka on it with Cruise Control. Kafka fired up properly, but Cruise Control cannot monitor the Kafka cluster. Upon closer inspection it turned out that Cruise Control expects a unified cgroup hierarchy in the underlying operating system. Google tells me that specifying systemd.unified_cgroup_hierarchy=1 as a kernel parameter should fix the problem, but it actually does not: /sys/fs/cgroup still shows the original hybrid setup. Has anyone here come across the same problem? If so, what was the solution?
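    On MicroOS the bootloader configuration lives in the transactional snapshot system, so a kernel parameter added by hand may never reach the actual boot entry. A hedged sketch of persisting it the MicroOS way (paths per openSUSE convention):
    # append systemd.unified_cgroup_hierarchy=1 to GRUB_CMDLINE_LINUX_DEFAULT
    # in /etc/default/grub, then rebuild the bootloader config and reboot:
    transactional-update grub.cfg
    reboot
    # on a unified hierarchy this shows a single cgroup2 mount:
    mount | grep cgroup2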
  • bored-rain-98291 (08/25/2022, 5:49 PM)
    Is etcdctl included with RKE2? I can't seem to find the binary.
  • bored-rain-98291 (08/25/2022, 5:59 PM)
    We are trying to verify encryption at rest. Another member of the team already has etcdctl, but every time he attempts it he gets the following:
  • bored-rain-98291 (08/25/2022, 5:59 PM)
    /tmp/etcd-download-test/etcdctl get secret1
    {"level":"warn","ts":"2022-08-25T09:59:39.170-0700","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"<endpoint://client-ddc2ced7-320f-4be9-a25d-68c07645e407/127.0.0.1:2379>","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: all SubConns are in TransientFailure, latest connection error: connection closed"}
  • billions-easter-91774 (08/26/2022, 3:50 PM)
    When I change a setting in a HelmChartConfig, what's the right way to retrigger Helm? Rerunning the RKE2 helm job?
  • billions-easter-91774 (08/26/2022, 4:15 PM)
    When updating the Cilium HelmChartConfig, how do I trigger a Cilium update? I'm only seeing the helm jobs for nginx, coredns, and metrics, but not Cilium.
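    For reference, RKE2's packaged charts are overridden through a HelmChartConfig in kube-system, and applying a change to it is what makes the helm controller re-run the corresponding job. A minimal sketch for Cilium (assumes the cluster was installed with cni: cilium, so the chart is named rke2-cilium; the hubble value is only an example override):
    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: rke2-cilium
      namespace: kube-system
    spec:
      valuesContent: |-
        hubble:
          enabled: true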
  • future-monitor-61871 (08/26/2022, 7:03 PM)
    I'm trying to migrate a mixed Windows/Linux app that uses Consul into RKE2. When I run the Consul chart, its daemonset fails because it tries to use 9500 and 9502 as node ports. I don't want to change the --service-node-port-range setting to 9500-32767. Can I provide a series of ranges to that parameter, maybe 9500,9502,30000-32767?
  • ambitious-plastic-3551 (08/26/2022, 9:18 PM)
    Did you try?
  • ambitious-plastic-3551 (08/26/2022, 9:27 PM)
    I am trying to join a node to a cluster and I get this: "Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
  • ambitious-plastic-3551 (08/26/2022, 9:28 PM)
    but I can't tell what the issue is
  • ambitious-plastic-3551 (08/27/2022, 8:42 AM)
    API server doesn't become available 😮
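    When a joining node loops on that readyz 500, the underlying cause is usually in the systemd journal on both ends (standard RKE2 unit names; use rke2-agent if the joining node is a worker):
    journalctl -u rke2-server -f    # on the existing server, and on a joining server node
    journalctl -u rke2-agent -f     # on a joining agent node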
  • billions-easter-91774 (08/28/2022, 6:08 AM)
    I'm running into an issue with Cilium 1.11.5 on Ubuntu 22.04 that is fixed in 1.11.7: https://github.com/cilium/cilium/issues/20125 Any chance for me to get the new version installed without doing the workaround? Is it possible to flag this as an issue in RKE2 so you might consider updating Cilium sooner?
  • narrow-cpu-6472 (08/29/2022, 11:17 AM)
    Has anyone successfully integrated Keycloak and RKE2?
  • narrow-cpu-6472 (08/29/2022, 11:32 AM)
    oidc authenticator: initializing plugin: x509: certificate is valid for ingress.local, not keycloak.gc.svc.cluster.local
  • narrow-cpu-6472 (08/29/2022, 11:32 AM)
    This is what I am hitting. Any help in this regard would be much appreciated.
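    That x509 error suggests the apiserver resolves the issuer to the in-cluster service, whose certificate only covers ingress.local. One hedged approach is to point the apiserver at the externally issued Keycloak URL and hand it the matching CA; the flags are standard kube-apiserver OIDC options, while the URL and file paths below are placeholders:
    # /etc/rancher/rke2/config.yaml
    kube-apiserver-arg:
      - "oidc-issuer-url=https://keycloak.example.com/realms/myrealm"
      - "oidc-client-id=kubernetes"
      - "oidc-username-claim=preferred_username"
      - "oidc-ca-file=/etc/rancher/rke2/keycloak-ca.pem"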
  • bored-rain-98291 (08/30/2022, 6:57 PM)
    what is a good backup/restore solution for RKE2?
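    For cluster state specifically, RKE2 ships its own etcd snapshot mechanism (the same one the restore thread above uses); workload-level tools such as Velero are a common complement. A sketch of an on-demand snapshot, with subcommand names as in recent releases (older versions take the flags directly on rke2 etcd-snapshot):
    rke2 etcd-snapshot save --name pre-upgrade
    rke2 etcd-snapshot ls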
  • broad-farmer-70498 (09/01/2022, 1:52 PM)
    Am I supposed to put manifests only on the bootstrap node? What happens if the server nodes don't all have exactly the same content in the manifests dir?
  • broad-farmer-70498 (09/01/2022, 1:53 PM)
    Ditto for the config.yaml on each server node: what happens if they don't match exactly on things like addons, kubelet args, etc.?
• victorious-analyst-3332 (09/01/2022, 4:09 PM)
they can technically be different, but it will lead to the nodes behaving differently 🤷
• broad-farmer-70498 (09/01/2022, 4:11 PM)
kubelet-arg was probably a bad example, as that should be tweaked/set on the agents as well
• victorious-analyst-3332 (09/01/2022, 4:13 PM)
Examples of what I've tweaked in the past, for things like metrics scraping from outside the cluster:
etcd-expose-metrics: true
kube-controller-manager-arg:
  - "bind-address=0.0.0.0"
kube-proxy-arg:
  - "metrics-bind-address=0.0.0.0:10249"
• broad-farmer-70498 (09/01/2022, 4:13 PM)
yeah, I'm mostly referring to the 'cluster' options like
cni: none
What happens if one of the server nodes doesn't have that and another does, or if they have different values? Or should I only set that on the bootstrap node and then have quite minimal configs for all the other servers, similar to the agent configs?
• victorious-analyst-3332 (09/01/2022, 4:15 PM)
those values aren't propagated that I know of, so you'll need them on all nodes that you plan to interact with in that way. Things like
cni: none
are part of the Rancher cluster config that gets automatically pushed down to the clusters, I believe
• broad-farmer-70498 (09/01/2022, 4:16 PM)
so that's a 'last man wins' scenario as well?
• victorious-analyst-3332 (09/01/2022, 4:17 PM)
no idea in that scenario. I think best practice is to make sure they match, to prevent issues if something goes offline