rke2
  • m

    magnificent-vr-88571

    06/30/2022, 12:24 PM
    Good Morning, Is it possible to create RKE2 kubernetes cluster from Rancher dashboard?
    c
    • 2
    • 10
  • a

    ambitious-plastic-3551

    06/30/2022, 7:18 PM
    I find this interesting... I have 3 nodes, I disabled rke2-ingress-nginx on two nodes, but when I restart one node, they either kill all nginx processes, or have all of them 😕
    f
    c
    • 3
    • 4
  • m

    magnificent-vr-88571

    07/01/2022, 12:15 AM
    Hi, Any recommendations using https://github.com/rancherfederal/rke2-ansible or https://github.com/lablabs/ansible-role-rke2 ?
    c
    • 2
    • 2
  • m

    magnificent-vr-88571

    07/02/2022, 8:03 PM
    Hi, I have an RKE2 cluster with 1 server + 6 agent nodes. I always see the error below on the server node, and in the Rancher dashboard the status is displayed as "Removing". Any recommendations?
    Jul 2 04:20:01 svr-node1 rke2[10011]: {"level":"warn","ts":"2022-07-02T04:20:01.751+0900","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-c6d7298a-2a65/127.0.0.1:2379","attempt":0,"error":"rpc error: code = Unknown desc = etcdserver: re-configuration failed due to not enough started members"}
    Jul 2 04:20:01 svr-node1 rke2[10011]: E0702 04:20:01.751299 10011 controller.go:135] error syncing 'svr-node1 ': handler managed-etcd-controller: etcdserver: re-configuration failed due to not enough started members, requeuing
    b
    • 2
    • 4
  • c

    curved-caravan-26314

    07/02/2022, 8:15 PM
    I had read somewhere that RKE2 can be installed via the Rancher GUI. Does that mean it will install HA as well, or just a single rancher node with multiple agents?
    r
    • 2
    • 1
  • k

    kind-air-74358

    07/04/2022, 9:37 AM
    Hi all, I’ve a question; I am trying to join a third master node to my RKE2 cluster. When starting rke2-server it reports the following error:
    level=warning msg="Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation."
    level=fatal msg="starting kubernetes: preparing server: failed to validate server configuration: critical configuration value mismatch"
    Any idea how to fix this?
    g
    • 2
    • 3
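The warning quoted in the message above says the token "does not include a CA hash", which suggests the joining node was given a short token rather than the full one from the server's node-token file. Below is a minimal sketch of how one might check a token before using it. It assumes the full token follows the K10<CA hash>::<username>:<password> layout documented for K3s, on which RKE2 builds; parse_join_token, has_ca_hash and the sample values are hypothetical names for illustration, not part of RKE2.

```python
import re

# Assumed layout of a full token: K10<CA hash>::<username>:<password>
FULL_TOKEN = re.compile(
    r"^K10(?P<ca_hash>[0-9a-f]+)::(?P<user>[^:]+):(?P<password>.+)$"
)


def parse_join_token(token: str) -> dict:
    """Split a join token into its parts; ca_hash is None for a short token."""
    token = token.strip()
    match = FULL_TOKEN.match(token)
    if match is None:
        # No K10 prefix and hash: treat the whole string as a bare password.
        return {"ca_hash": None, "user": None, "password": token}
    return match.groupdict()


def has_ca_hash(token: str) -> bool:
    """True when the token carries a CA hash for Cluster CA validation."""
    return parse_join_token(token)["ca_hash"] is not None
```

If has_ca_hash returns False for the value in the new node's config, copying the full contents of the node-token file from the first server would be the first thing to try. The separate "critical configuration value mismatch" error may have a different cause, so this check is only a starting point.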
  • a

    ambitious-plastic-3551

    07/04/2022, 7:26 PM
    hostnamectl set-hostname x.y
    b
    • 2
    • 1
  • a

    ambitious-plastic-3551

    07/04/2022, 7:28 PM
    editing files isn't enough sadly
    b
    • 2
    • 20
  • b

    bored-rain-98291

    07/05/2022, 7:37 PM
    Greetings. We have 3 servers. I set up RKE2 by installing rke2-server on 1, then rke2-agent on 2 and 3. This is a dev cluster. Is this a reasonable configuration?
    n
    • 2
    • 2
  • b

    bored-rain-98291

    07/05/2022, 7:44 PM
    Are there a high number of customers that use the agent? I'm not used to the 'agent' concept, coming from vanilla Kubernetes.
    n
    • 2
    • 2
  • b

    bored-rain-98291

    07/07/2022, 8:59 AM
    Is it possible to set up an RKE2 cluster in Azure AKS?
    a
    c
    • 3
    • 7
  • m

    modern-dress-80156

    07/08/2022, 11:14 AM
    Hi, is there a way to figure out the order in which RKE2 restores objects from a snapshot? For context, we want to move from RKE2-based backup and restore to Velero. We were having some issues with the default order in which Velero restores resources, which we can override via a parameter that Velero exposes.
    rke2 server \
      --cluster-reset \
      --cluster-reset-restore-path=<PATH-TO-SNAPSHOT>
    c
    • 2
    • 4
  • f

    faint-airport-83518

    07/08/2022, 8:23 PM
    @creamy-pencil-82913 I see you a lot in the github handling issues, is there any written guidance on debugging an unhealthy RKE2 installation? Using ctr, crictl, any useful directories that logs get stored in? I’m able to pull some decent info from this issue: https://github.com/rancher/rke2/issues/2080, but was wondering if rancher provides anything else aside from the quickstart documentation (or if you had any notes handy 🙂)
    c
    • 2
    • 8
  • b

    bored-rain-98291

    07/11/2022, 3:30 PM
    Is longhorn part of rke2 or has anyone used longhorn on rke2?
    h
    • 2
    • 2
  • f

    faint-airport-83518

    07/12/2022, 3:41 PM
    Sorry if this has been asked before: does RKE2 use the in-tree cloud provider (which I think is in maintenance mode?) for k8s when I pass in cloud-provider-name as an argument?
    g
    c
    • 3
    • 14
  • g

    great-flag-38820

    07/12/2022, 11:58 PM
    Does Rancher have any plans on the roadmap to move away from the deprecated PodSecurityPolicy for enforcing the CIS-1.6 hardening guide?
    c
    • 2
    • 2
  • b

    bored-rain-98291

    07/13/2022, 1:23 PM
    I have an RKE2 cluster set up on 3 nodes (bare metal) and am having a very difficult time configuring persistent local storage. I've been looking online and see all these very complicated configurations of Longhorn + MetalLB etc. Additionally, we have Kubernetes in the cloud, where we don't run into these storage issues. I am kindly asking for any help on a solution that will work for a bare metal install. I have a dev team trying to use this RKE2 cluster, but we are held up on this storage issue. Thanks for any help.
    f
    r
    a
    • 4
    • 5
  • r

    rapid-helmet-86074

    07/19/2022, 7:34 PM
    Trying to experiment with deploying RKE2 through the UI now that it's officially supported, and to turn on all the security knobs available. Using the Rancher 2.6.6 UI, I tried to deploy RKE2 v1.22.10+rke2r2 to a custom downstream cluster with: Cloud Provider (None), Container Network calico, Default Pod Security Policy restricted-noroot, Worker CIS Profile cis-1.6, Project Network Isolation checked, and CoreDNS, NGINX Ingress, & Metrics Server left checked. I didn't touch anything under the other tabs.
    It failed with rancher-system-agent spinning on no certificate files existing and no RKE2 listening on any TCP or UDP ports. I tried creating the etcd user on the control plane/etcd nodes, put the rke2-cis-sysctl.conf from a different install in /etc/sysctl.d/60-rke2-cis.conf, and rebooted all downstream nodes. Still the same rancher-system-agent error, but I belatedly remembered to check that the rke2-server service is running, so I checked its logs. Aside from it saying there's no nm-cloud-setup.service for systemctl to enable (which I assume is harmless), it appears to error out on the fatal error "--protect-kernel-defaults must be true when using --profile=cis-1.6", which I thought got set automatically when the profile was set to cis-1.6.
    So my question is: did I mess something up with a conflict like the default pod security policy? Is CIS-1.6 not supported? Something else? Should I file a bug?
    c
    • 2
    • 12
  • s

    straight-nest-8449

    07/20/2022, 7:58 AM
    Hi! I'm trying to get RKE2 running through the Rancher UI with a private registry and am having issues with nodes not joining. Checking the logs with journalctl -eu rancher-system-agent after using the registration command shows the following:
    msg="[Applyinator] Applying one-time instructions for plan with checksum 297f183b2277d90415cace1b3222523e893fe83ca0918af7cf3f58f73377f387"
    msg="[Applyinator] Extracting image registry.[REDACTED]-educ.local/rancher/system-agent-installer-rke2:v1.23.6-rke2r2 to directory /var/lib/rancher/agent/work/20220720-095240/297f183b2277d90415cace1b3222523e893fe83ca0918af7cf3f58f73377f387_0"
    msg="Using private registry config file at /etc/rancher/agent/registries.yaml"
    msg="Pulling image registry.[REDACTED]-educ.local/rancher/system-agent-installer-rke2:v1.23.6-rke2r2"
    msg="[Applyinator] Running command: sh [-c run.sh]"
    msg="[297f183b2277d90415cace1b3222523e893fe83ca0918af7cf3f58f73377f387_0:stderr]: sh: run.sh: command not found"
    msg="[Applyinator] Command sh [-c run.sh] finished with err: <nil> and exit code: 127"
    msg="error executing instruction 0: <nil>"
    I'm running out of ideas. I'm not sure if the image is being pulled, since a watch on the filesystem only shows the work directory being created but not a file being written.
    a
    • 2
    • 4
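A note on the exit code in the log above: 127 is the status a POSIX shell returns when it cannot find the command it was asked to run, which would fit run.sh never having been extracted into the work directory. The sketch below reproduces that status locally. It is only an illustration of the shell behaviour, not of how rancher-system-agent invokes the script; run_in_empty_path is a hypothetical helper, and /bin/sh is assumed to exist.

```python
import subprocess
import tempfile


def run_in_empty_path(command: str) -> subprocess.CompletedProcess:
    """Run `sh -c <command>` with PATH and cwd set to an empty directory."""
    with tempfile.TemporaryDirectory() as empty_dir:
        return subprocess.run(
            ["/bin/sh", "-c", command],
            env={"PATH": empty_dir},  # nothing on PATH, so run.sh cannot be found
            cwd=empty_dir,
            capture_output=True,
            text=True,
        )


result = run_in_empty_path("run.sh")
print(result.returncode)
print(result.stderr.strip())
```

The return code should be 127 with a "not found" message on stderr, matching the pattern in the agent log. If that reading is right, the thing to investigate is why the installer image contents never reach the work directory (for example the registry mirror or its credentials), rather than the script itself.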
  • b

    bored-rain-98291

    07/20/2022, 8:39 PM
    Greetings fellas (and gals), I have an RKE2 cluster running on-prem. I want to install Rancher as the management interface. We don't have a load balancer etc. Can I use kube-proxy or something similar to allow me to view the Rancher GUI?
    r
    • 2
    • 3
  • a

    ambitious-motherboard-40337

    07/21/2022, 9:11 AM
    Is there any way I can pick a different version to install?
    m
    r
    • 3
    • 11
  • b

    bored-rain-98291

    07/21/2022, 1:39 PM
    Good morning all. RKE2 seems to be a bit different from what I'm used to. We have it running in a VMware cluster, aka bare(ish) metal. Everything I've deployed has had issues, but I think it's primarily due to being metal and not cloud. I seem to have most of it working correctly except a few pods: helm-install-rke2-core-dns-xxxx and helm-install-rke2-canal-xxxx. Not sure where these came from. Anyone know what these are? Many thanks.
    r
    • 2
    • 6
  • r

    ripe-journalist-3231

    07/22/2022, 2:28 PM
    Hey, quick question: looking at the Rancher System Upgrade Controller, I was able to run through the hello world example and upgrade the RKE2 version. Very cool, by the way, and I like the CRD design. Would this be something we could use to update the AMI of the nodes, or is it more scoped to managing the K8s install?
  • f

    faint-airport-83518

    07/22/2022, 4:28 PM
    Hey, are the Rancher Federal guys in this slack?
    • 1
    • 2
  • f

    faint-traffic-61866

    07/22/2022, 4:38 PM
    Hi, see the new video "Protecting RKE2 workloads with Dell PowerProtect Data Manager" 😉
  • s

    stale-painting-80203

    07/26/2022, 5:56 AM
    I am trying to set up a Rancher HA server with an nginx LB, but it seems I may not have set up the LB correctly. When I try to join other nodes to the cluster I get the following error:
    failed to get CA certs: Get \"https://rancher.mydomain.com:9345/cacerts\": dial tcp... connect: connection refused"
    The following page suggests setting up listeners: https://rancher.com/docs/rancher/v2.6/en/installation/resources/k8s-tutorials/ha-rke2/ It says: "Note that in order for RKE2 to work correctly with the load balancer, you need to set up two listeners: one for the supervisor on port 9345, and one for the Kubernetes API on port 6443." How should these be set up? Is it an nginx configuration?
    a
    m
    • 3
    • 10
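Before changing the load balancer configuration, it can help to confirm which of the two listeners mentioned above actually accepts connections. The following is a minimal sketch using only the Python standard library; check_listeners is a hypothetical helper, and the default ports are the two quoted from the Rancher documentation (9345 for the supervisor, 6443 for the Kubernetes API).

```python
import socket


def check_listeners(host: str, ports=(9345, 6443), timeout: float = 3.0) -> dict:
    """Return {port: bool}, True when a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and DNS failures
            results[port] = False
    return results
```

Running check_listeners("rancher.mydomain.com") from a node that fails to join would show False for 9345 if the "connection refused" error comes from a missing listener on the load balancer. A True result only proves that something accepts TCP on that port, not that it forwards to a healthy RKE2 server.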
  • v

    victorious-kite-35099

    07/26/2022, 5:07 PM
    Hi there! I am running RKE2 1.23.7 with Rancher 2.6.6 (Rancher-managed Kubernetes). My question is how I can set up the kube-proxy in my RKE2 to use strictARP mode. I am desperately searching for how to do this from Rancher without manually overwriting systemd files on the nodes with the proper --kube-proxy-arg settings. Any hints for me?
  • f

    future-monitor-61871

    07/26/2022, 8:36 PM
    Anyone had any issues w/ using the local static provisioner? Doesn't seem to be working for me.
  • f

    future-monitor-61871

    07/26/2022, 9:46 PM
    Got the storage issues figured out. But now I'm trying to run ScyllaDB in the rke2 cluster. Is there a way to let scylla break the rules and run as root but not turn off / remove the validating webhook that is stopping the containers from starting?
  • b

    bland-jackal-22983

    07/27/2022, 6:01 AM
    hi, was wondering if pause processes get started by rke2? i was investigating a "too many file descriptor open" issue; after bootstrapping rke2, i see there are 17868 of "lsof: no pwd entry for UID 65535". checking which processes are owned by uid 65535:
    # ps -U 65535
        PID TTY          TIME CMD
       3247 ?        00:00:00 pause
       3372 ?        00:00:00 pause
       3472 ?        00:00:00 pause
       3895 ?        00:00:00 pause
       3934 ?        00:00:00 pause
       3974 ?        00:00:00 pause
       4427 ?        00:00:00 pause
       4767 ?        00:00:00 pause
       5519 ?        00:00:00 pause
    the lsof shows a lot of
    pause     3247                           65535  cwd       DIR               0,44        51   10910627 /
    pause     3247                           65535  rtd       DIR               0,44        51   10910627 /
    pause     3247                           65535  txt       REG               0,44    682696     170978 /pause
    pause     3247                           65535  mem       REG              253,0               170978 /pause (stat: No such file or directory)
    pause     3247                           65535    0u      CHR                1,3       0t0          5 /dev/null
    pause     3247                           65535    1u      CHR                1,3       0t0          5 /dev/null
    pause     3247                           65535    2u      CHR                1,3       0t0          5 /dev/null
    pause     3372                           65535  cwd       DIR               0,63        51   10915996 /
    pause     3372                           65535  rtd       DIR               0,63        51   10915996 /
    pause     3372                           65535  txt       REG               0,63    682696     170978 /pause
    pause     3372                           65535  mem       REG              253,0               170978 /pause (stat: No such file or directory)
    pause     3372                           65535    0u      CHR                1,3       0t0          5 /dev/null
    pause     3372                           65535    1u      CHR                1,3       0t0          5 /dev/null
    pause     3372                           65535    2u      CHR                1,3       0t0          5 /dev/null
    ...
    • 1
    • 1
    sorry, there are 63 of them (no pwd entry for UID 65535)
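To judge whether the pause processes contribute to the descriptor problem, one option is to count rows per process in the lsof output. The sketch below assumes the column layout of the paste above (COMMAND, PID, USER, FD, ...) with no thread ID column filled in; count_lsof_entries is a hypothetical helper, and the sample rows are copied from the message.

```python
from collections import Counter


def count_lsof_entries(lines):
    """Count lsof rows and numeric file descriptors per (command, pid)."""
    rows = Counter()
    descriptors = Counter()
    for line in lines:
        fields = line.split()
        # Need at least COMMAND PID USER FD; skips headers, warnings and "..."
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        key = (fields[0], int(fields[1]))
        rows[key] += 1
        # cwd, rtd, txt and mem rows are not descriptors; 0u, 1u, 2u are
        if fields[3][0].isdigit():
            descriptors[key] += 1
    return rows, descriptors


sample = """\
pause     3247                           65535  cwd       DIR               0,44        51   10910627 /
pause     3247                           65535  rtd       DIR               0,44        51   10910627 /
pause     3247                           65535  txt       REG               0,44    682696     170978 /pause
pause     3247                           65535  mem       REG              253,0               170978 /pause (stat: No such file or directory)
pause     3247                           65535    0u      CHR                1,3       0t0          5 /dev/null
pause     3247                           65535    1u      CHR                1,3       0t0          5 /dev/null
pause     3247                           65535    2u      CHR                1,3       0t0          5 /dev/null
"""

rows, descriptors = count_lsof_entries(sample.splitlines())
print(rows[("pause", 3247)], descriptors[("pause", 3247)])
```

For the sample this reports 7 rows but only 3 actual descriptors (stdin, stdout, stderr) for PID 3247, so on this evidence each pause process holds very few descriptors. The "no pwd entry for UID 65535" lines look like lsof being unable to map that UID to a user name rather than open files, and the helper skips them.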