rke2
  • s

    sparse-fireman-14239

    10/20/2022, 8:09 PM
    ah 1.24.7 has been released, awesome!
  • e

    enough-toddler-31145

    10/21/2022, 3:28 PM
    Okay so I'm trying to wrap my brain around what is going on with this.
  • e

    enough-toddler-31145

    10/21/2022, 3:29 PM
    I'm trying to install small-step-issuer but I'm getting this error:
    Warning  FailedCreatePodSandBox  16m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "488aaaff6ba2bdab7b5c1ce46065b25734329b60b5c2ac40bbf1207e6579a2b7": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  15m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "66cbf3810675b54dbcaf37e1e00879cca698c8b64a970debaf275d02ad202572": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  15m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c13ee0a7667be018d6d89a6e7341957443f9f8f22a30c7443d897318fe10af6a": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  2m25s (x60 over 15m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "30cf6e2b3ddf13280927d19b746c09d5c95cc13a0b0c4129638ba169a8eb1b62": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
  • e

    enough-toddler-31145

    10/21/2022, 3:33 PM
    when i run
    kubectl get pods -n kube-system
    i get
    rke2-canal-2f5fg                                        1/2     CrashLoopBackOff   709 (3m13s ago)    43h
    rke2-canal-2gmx5                                        1/2     Running            1684 (49s ago)     43h
    rke2-canal-5lmmd                                        1/2     CrashLoopBackOff   727 (38s ago)      43h
    rke2-canal-9lb7j                                        1/2     Running            729 (56s ago)      43h
    rke2-canal-qkgts                                        1/2     CrashLoopBackOff   1650 (44s ago)     43h
    rke2-canal-r9hvg                                        1/2     Running            1522 (2m17s ago)   43h
    rke2-canal-xnqnx                                        1/2     CrashLoopBackOff   711 (3m38s ago)    43h
    rke2-canal-zhc42                                        1/2     CrashLoopBackOff   726 (117s ago)     43h
    and this is the output for one of the pods
    Normal   Created      13m (x2 over 14m)       kubelet            Created container calico-node
      Normal   Started      13m (x2 over 14m)       kubelet            Started container calico-node
      Normal   Pulled       13m (x2 over 14m)       kubelet            Container image "rancher/hardened-calico:v3.23.3-build20220810" already present on machine
      Normal   Killing      13m                     kubelet            Container calico-node failed liveness probe, will be restarted
      Warning  Unhealthy    13m (x11 over 14m)      kubelet            Readiness probe failed: Get "http://localhost:9099/readiness": dial tcp: lookup localhost on 8.8.8.8:53: no such host
      Warning  BackOff      9m29s (x6 over 10m)     kubelet            Back-off restarting failed container
      Warning  FailedMount  5m20s                   kubelet            MountVolume.SetUp failed for volume "flannel-cfg" : failed to sync configmap cache: timed out waiting for the condition
      Normal   Pulled       4m19s (x2 over 5m19s)   kubelet            Container image "rancher/hardened-calico:v3.23.3-build20220810" already present on machine
      Normal   Created      4m19s (x2 over 5m19s)   kubelet            Created container calico-node
      Normal   Started      4m19s (x2 over 5m19s)   kubelet            Started container calico-node
      Warning  Unhealthy    4m19s (x6 over 5m9s)    kubelet            Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp: lookup localhost on 8.8.8.8:53: no such host
      Normal   Killing      4m19s                   kubelet            Container calico-node failed liveness probe, will be restarted
      Warning  Unhealthy    4m17s (x11 over 5m11s)  kubelet            Readiness probe failed: Get "http://localhost:9099/readiness": dial tcp: lookup localhost on 8.8.8.8:53: no such host
      Warning  BackOff      15s (x6 over 69s)       kubelet            Back-off restarting failed container
    I'm assuming the certificates for Calico have expired, no? Any help would be great.
  • p

    proud-ram-62490

    10/24/2022, 3:16 PM
    Hi all, I’m new to Rancher & RKE2 (Kubernetes in general, really) and had a few questions about the NGINX ingress that I set up. Within More Resources/Core/Events I noticed that I have several “Scheduled for sync” messages for each ingress that I’ve created, and the count is in the tens of thousands. Is this normal operation, or have I configured something terribly wrong? Things seem to be running well, but I wanted to check with the community to see if I was ‘missing something’.
  • v

    victorious-analyst-3332

    10/25/2022, 3:41 PM
    Howdy all 👋 I was curious whether there is a mechanism for updating RKE2 node arguments for a Rancher-managed cluster after the node has initially joined the cluster. I see the args from the join command under the
    rke2.io/node-args
    annotation, but editing that directly doesn’t appear to have any impact on the
    /etc/rancher/rke2/config.yaml.d/50-rancher.yaml
    file on the node itself. Thanks a lot.
  • s

    stocky-article-82001

    10/25/2022, 8:44 PM
    Hey 👋, just wondering how to give kubelet extra params, trying to recreate what I had in RKE1 which was:
    kubelet:  
      extra_args:
        max-pods: 250
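For comparison, a minimal sketch of how the same setting might be expressed in RKE2, assuming a standalone (not Rancher-provisioned) node whose configuration lives in /etc/rancher/rke2/config.yaml. RKE2 passes kubelet flags through the kubelet-arg list as flag=value strings; the value 250 is taken from the RKE1 snippet above.

```yaml
# /etc/rancher/rke2/config.yaml (assumed location for a standalone node)
# Each entry is handed to the kubelet as --<flag>=<value>.
kubelet-arg:
  - "max-pods=250"
```

The rke2-server or rke2-agent service has to be restarted on the node for the change to take effect.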
  • l

    limited-motherboard-41807

    10/26/2022, 1:47 PM
    Hello folks, I'm confused by the default installation of the nginx ingress with RKE2. I would have thought it should be running as a DaemonSet and exposed as a Service of type NodePort on port 80. It is indeed running as a DaemonSet, but it only has a Service of type ClusterIP. And still I'm able to reach my ingress and the other services configured with an Ingress on ports 80 & 443 on all nodes of my cluster. I think I'm missing something basic here 😞 Does what I'm saying make sense, or not at all?
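One mechanism that would produce this behaviour is hostPort: a DaemonSet pod can bind ports directly on every node it runs on, so no NodePort or LoadBalancer Service is needed to reach it. The fragment below is only an illustrative sketch of that pattern, assuming the bundled ingress controller is deployed this way; the container name and image are hypothetical, not copied from the RKE2 chart.

```yaml
# Illustrative pod template fragment for a DaemonSet (names are hypothetical).
spec:
  template:
    spec:
      containers:
        - name: ingress-controller       # hypothetical name
          image: example/ingress:latest  # hypothetical image
          ports:
            - name: http
              containerPort: 80
              hostPort: 80    # binds port 80 on the node itself
            - name: https
              containerPort: 443
              hostPort: 443   # binds port 443 on the node itself
```

Whether this applies to a given cluster can be checked by looking for hostPort entries in the output of kubectl get daemonset -n kube-system -o yaml.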
  • b

    bored-rain-98291

    10/26/2022, 6:48 PM
    Greetings friends! Is there anything specific to rke2 that i should look at if we have all nodes being synced with ntp but one pod is 23 seconds off?
  • r

    rough-ocean-41843

    10/27/2022, 2:16 PM
    So i've been trying to deploy a few apps to RKE2. Most recently I've been trying to deploy MariaDB/PgSQL. But right now i'm focusing on MariaDB. I've deployed the Persistent Volume, and the PersistentVolumeClaim and then all the subsequent details needed like the service and then the application. However when I deploy it keeps saying Deployment does not have minimum availability. I don't understand why.
  • r

    rough-ocean-41843

    10/27/2022, 2:18 PM
    After doing some research I came across this - https://fabianlee.org/2022/01/12/kubernetes-nfs-mount-using-dynamic-volume-and-storage-class/ and tried to incorporate that.
  • p

    prehistoric-solstice-99854

    10/27/2022, 7:41 PM
    I’ve spent the past two days trying to install MetalLB in an RKE2 cluster. I’ve read blogs, the docs, and Stack Overflow questions, and nothing really explains how to do it with a pretty straightforward setup. I have a cluster of 3 controllers and 3 workers, all on VMs in ESX. I’ve installed Kubernetes 1.21 (max version due to an app requirement), all running on Oracle 8 hosts. I got the cluster up and running and I’ve deployed some test pods without issue (once I fixed the UDP checksum issue), but when I install MetalLB the controller pod is in a crash loop. I feel like I’ve misconfigured it, but I don’t know what I did wrong: none of the things I’ve read about MetalLB walk through installing it on RKE2, so their situations don’t line up with mine. I’ve searched this channel and see MetalLB mentioned several times, so I’m sure it works, but the previous questions weren’t about what I’m seeing. Any help would be appreciated!
  • c

    cool-pillow-1781

    10/28/2022, 1:43 PM
    Morning. I had asked a while back about splitting etcd from the rest of the control plane with RKE2, and while we determined it wasn't required, I was asked to present an example of it. I didn't find Rancher / RKE2 specific instructions for doing so, but I did find this: https://www.suse.com/support/kb/doc/?id=000020771 Everything appears to have worked using those instructions, but I wanted to see whether there is anything wrong with what that article presents, or whether there is Rancher / RKE2 specific documentation I could be pointed towards. I think this was the closest I could find on the subject (https://docs.rke2.io/advanced/#control-plane-component-resource-requestslimits).
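As a point of comparison with the KB article, RKE2 server nodes can be given split roles through options in their config file. The sketch below assumes the disable-apiserver, disable-controller-manager, disable-scheduler and disable-etcd server options, which were flagged as experimental at the time; treat it as a starting point to check against the documentation for the installed release, not as a validated procedure.

```yaml
# config.yaml on an etcd-only server (assumed layout)
disable-apiserver: true
disable-controller-manager: true
disable-scheduler: true
---
# config.yaml on a control-plane-only server (assumed layout)
disable-etcd: true
```

The two documents belong in separate files on separate nodes; every server other than the first would also carry its usual server and token entries.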
  • h

    hundreds-hairdresser-46043

    10/31/2022, 12:53 PM
    Interesting observation on RKE2: if I restore a VM snapshot of a running RKE2 cluster, the masters never recover, not even with a reboot. I have had to rebuild the cluster every time so far.
  • c

    cool-pillow-1781

    10/31/2022, 2:25 PM
    is it doing something where it's comparing current time against last stored time (like in etcd) and crashing on the time delta??? (that's a totally random shot in the dark)
  • h

    hundreds-hairdresser-46043

    10/31/2022, 2:30 PM
    @cool-pillow-1781 I am assuming as much. I just don't know how to fix it. A reboot generally fixes this, but for some reason not this time. What I usually do after a snapshot restore is reboot the servers, or shut them down for a few minutes and then start them up one by one. I must also add that I am using the air-gap method to install (no internet access), so it has to use the images I pre-downloaded. I did not have this kind of issue when I was using an internet proxy server.
  • c

    cool-pillow-1781

    10/31/2022, 2:34 PM
    I might have to play with that as well. I'm also installing with the air-gap bundles, granted I'm currently installing onto some EC2 instances.
  • n

    narrow-noon-75604

    10/31/2022, 4:58 PM
    Hi, I have deployed an angular application on nginx server as a pod on RKE2 cluster. I am trying to upload a file of size 7MB but the request fails with 413 error. So, I have tried to update the app specific nginx properties to support the max size of 50MB by adding a field "client_max_body_size". This did not solve the issue. So I tried adding the annotations to the ingress object I have created for the angular app, it has modified the "client_max_body_size" of this app in rke2-ingress but that too did not solve the issue. Any help here would be appreciated.
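For reference, a minimal sketch of the per-Ingress annotation usually used for this with ingress-nginx, nginx.ingress.kubernetes.io/proxy-body-size, which the controller renders as client_max_body_size for that Ingress. The resource name, host, ingress class, service name and port below are hypothetical placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: angular-app                 # hypothetical name
  annotations:
    # Rendered by ingress-nginx as client_max_body_size for this Ingress.
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
spec:
  ingressClassName: nginx           # assumption: class name used by the cluster
  rules:
    - host: app.example.com         # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: angular-app   # hypothetical Service
                port:
                  number: 80
```

If the 413 persists with this in place, the limit is probably being enforced at another hop, for example the nginx inside the application pod or a proxy in front of the cluster.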
  • b

    bored-rain-98291

    10/31/2022, 7:05 PM
    I have installed grafana/loki and every time I try to do a port-forward it throws an error stating that it lost connection to the pod. What could cause this loss of connection to the pod?
  • b

    bored-rain-98291

    10/31/2022, 7:10 PM
    Here is the error:
    E1031 13:55:21.136427 30239 portforward.go:406] an error occurred forwarding 3000 -> 80: error forwarding port 80 to pod f1a8f35893a7dba757bff26bfff755fa1dec0a2412f9995fb794948f8b0e2477, uid : failed to execute portforward in network namespace "/var/run/netns/cni-84fc3ce6-4a1d-3510-188b-f82d585e992f": failed to connect to localhost:80 inside namespace "f1a8f35893a7dba757bff26bfff755fa1dec0a2412f9995fb794948f8b0e2477", IPv4: dial tcp4 127.0.0.1:80: connect: connection refused IPv6 dial tcp6 [::1]:80: connect: connection refused
    E1031 13:55:21.136858 30239 portforward.go:234] lost connection to pod
  • h

    hundreds-hairdresser-46043

    11/01/2022, 9:32 AM
    I am trying to get past my etcd error by building the cluster from Google Artifact Registry (GAR) instead of the air-gap images in /var/lib/rancher/rke2/agent/images/, but now I am back where I started: I am unable to pull images from GAR. This is my current registries.yaml; can anyone tell me why it would not work? When it tries to pull an image it says "NAME_INVALID: Request contains an invalid argument" and the URL looks like: https://europe-west4-docker.pkg.dev/v2/token?scope=repository....
    mirrors:
      europe-west4-docker.pkg.dev:
        endpoint:
          - "https://europe-west4-docker.pkg.dev"
      docker.io:
        endpoint:
          - "https://europe-west4-docker.pkg.dev"
        rewrite:
          "^rancher/(.*)": "/rke2/rke2-core-cluster/$1"
      index.docker.io:
        endpoint:
          - "https://europe-west4-docker.pkg.dev"
        rewrite:
          "^rancher/(.*)": "/rke2/rke2-core-cluster/$1"
    configs:
      europe-west4-docker.pkg.dev:
        auth:
          username: _json_key_base64
          password: |
            ICasasaAgIblah...
  • r

    red-waitress-37932

    11/01/2022, 11:04 AM
    RKE2 exists. does this mean I should use RKE2 instead of RKE1 if I start out fresh?
  • c

    cool-pillow-1781

    11/03/2022, 4:42 PM
    Afternoon. I was looking for partitioning guidance on RKE2 and came across this issue: https://github.com/rancher/rke2/issues/2479
    Looking at my own cluster, I found the following areas to be larger:
    • /var/lib/kubelet ?
      ◦ Not sure what the growth potential is
    • /var/lib/rancher !
      ◦ containerd uses a subfolder as its root storage, so container layers should be here
    • /var/run/k3s ?
      ◦ Not sure on growth rate
    • /var/run/containerd ?
      ◦ Not sure on growth rate
    • /var/log/pods !
      ◦ pod logs
    • /var/log !
      ◦ bane of my existence, always seems to fill up 😄
    • /run X - seems to be the tmpfs for containerd sockets as well as where the overlayFS is set up
      ◦ doesn't seem like I want to create a separate partition
        ▪︎ as this would take me off a RAM FS and put me onto spinners (sadly 😞)
    ? = should I partition ; ! = planning on partitioning ; X = don't plan on partitioning
  • c

    careful-table-97595

    11/04/2022, 12:07 PM
    Hey there! How do I enable VolumeSnapshot support in RKE2? My RKE2 version is 1.25.3. I know that the CRDs are the distribution's responsibility: https://kubernetes.io/docs/concepts/storage/volume-snapshots/ My CSI driver, synology-csi, supports it, but the VolumeSnapshot CRDs must also be present.
  • c

    cool-pillow-1781

    11/07/2022, 2:52 PM
    Just to wrap up (and leave out there for review). Decided initially:
    /var/lib/rancher > /var/log > / 
    With / still being 500G~1T
    Expecting the vast majority of data to be in /var/lib/rancher
  • g

    gifted-eye-43916

    11/08/2022, 12:54 PM
    message has been deleted
  • h

    hundreds-hairdresser-46043

    11/11/2022, 1:07 PM
    RKE2 upgrade observation: I did a cluster upgrade from v1.24.3+rke2r1 to v1.24.6+rke2r1 and now I see this and it does not go away. I'm not sure why, or how to fix it. There are no logs for this (the jobs) that I can see. Does anyone have a clue about this?
  • p

    proud-salesmen-12221

    11/14/2022, 12:45 AM
    Hi, I've deployed an RKE2 cluster with Cilium as the CNI and installed the Rancher Management app, and after that it works great: I'm able to access the Rancher Management UI via a browser. But once I enable IPsec in Cilium, I'm no longer able to reach the UI. Do you have any tips on how I can debug this?
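For context, a sketch of how IPsec might be switched on for the bundled Cilium chart through a HelmChartConfig, assuming the chart is named rke2-cilium and accepts the upstream Cilium Helm values encryption.enabled and encryption.type. Upstream Cilium also expects a pre-created cilium-ipsec-keys Secret, which is not shown here.

```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    encryption:
      enabled: true   # upstream Cilium Helm value
      type: ipsec     # upstream Cilium Helm value
```

Comparing the Cilium agent logs before and after applying a change like this is one way to narrow down where traffic to the UI stops.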
  • r

    refined-scientist-20236

    11/15/2022, 1:29 PM
    Hello, I'm trying to set up an RKE2 cluster (3M - 2W), but the cluster only starts working after a reboot following setup, and when I later reboot a node, that node fails again. The RKE2 server log shows this line:
    Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error
    The node configuration looks like this:
    "advertise-address": "10.10.255.103"
    "cni": "none"
    "disable":
    - "rke2-ingress-nginx"
    "disable-cloud-controller": true
    "disable-kube-proxy": false
    "disable-network-policy": false
    "flannel-iface": "eth1"
    "kube-controller-manager-arg": "flex-volume-plugin-dir=/var/lib/kubelet/volumeplugins"
    "kubelet-arg":
    - "cloud-provider=external"
    - "volume-plugin-dir=/var/lib/kubelet/volumeplugins"
    "node-ip": "10.10.255.103"
    "node-name": "k8s-test-control-plane-3"
    "node-taint":
    - "node-role.kubernetes.io/control-plane:NoSchedule"
    "server": "https://10.10.255.101:9345"
    "tls-san":
    - "195.201.20.2"
    "token": "xxxx"
    "write-kubeconfig-mode": "0644"
    Info: There is currently no LB between the nodes. "server" points at the first node, and "cni" is "none" because I set up the Calico Typha operator after cluster init (at setup time the first node had the default CNI).
  • p

    proud-ram-62490

    11/16/2022, 4:39 PM
    Hi, is RKE2 supported at all on Alma 8.6? I realize the release docs state that Rocky 8 is supported, and not Alma.
g

gray-lawyer-73831

11/16/2022, 4:40 PM
It has not been validated on Alma, so it is not officially supported. There shouldn’t necessarily be anything technical preventing it, though, so you’re welcome to try it there, but your mileage may vary since we haven’t officially validated it.
p

proud-ram-62490

11/16/2022, 4:44 PM
Thanks Max, I appreciate the feedback.
Are there any plans to officially validate Rancher against Alma?
g

gray-lawyer-73831

11/16/2022, 4:46 PM
Somewhat — it got pushed to the backlog a year ago and hasn’t seen traction since then: https://github.com/rancher/rke2/issues/1177
p

proud-ram-62490

11/16/2022, 4:47 PM
Understood - thanks.
g

great-evening-84612

11/17/2022, 12:26 PM
In the end, did you try with Alma 8.6? I need to read up on how we can add the OS to the GitHub project.
p

proud-ram-62490

11/17/2022, 1:50 PM
I did try with Alma 8.6 and it fails almost immediately.
journalctl -xe
shows it’s failing at getting the certificates
The specific error is as follows:
Nov 16 17:07:33 lcs-alma8-k8-0 rancher-system-agent[5855]: time="2022-11-16T17:07:33-05:00" level=error msg="error while appending ca cert to pool for probe kube-scheduler"
Nov 16 17:07:33 lcs-alma8-k8-0 rancher-system-agent[5855]: time="2022-11-16T17:07:33-05:00" level=error msg="error loading CA cert for probe (kube-controller-manager) /var/lib/rancher/rke2/server/tls/kube-controller-manager/kube-controller-manager.crt: open /var/lib/ra>
g

great-evening-84612

11/17/2022, 2:56 PM
I will try tomorrow on my platform.
I tried with the 1.24 version. It works. How did you deploy RKE2?
p

proud-ram-62490

11/18/2022, 6:51 PM
I just used the register command
Copy/Pasta
How did you deploy RKE2 via Rancher to Alma8?
g

great-evening-84612

11/21/2022, 7:06 PM
Nope, I deployed RKE2 directly on Alma. I will try with Rancher, maybe this week or next.