rke2
  • a

    ambitious-plastic-3551

    11/16/2022, 5:44 PM
    Uhm, I need help: some of my pods are stuck in the Terminating state indefinitely
  • a

    ambitious-plastic-3551

    11/16/2022, 5:44 PM
    And nothing makes them stop
  • s

    sparse-fireman-14239

    11/17/2022, 8:14 PM
    Weird: if I add a taint in /etc/rancher/rke2/config.yaml to an already running node in a cluster and then restart rke2-server, the taint is not applied.
    node-taint:
      - "CriticalAddonsOnly=true:NoExecute"
    Though, kubelet is started with what I assume is the correct argument.
    --register-with-taints=CriticalAddonsOnly=true:NoExecute
    Adding the taint with kubectl works fine.
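The kubectl workaround mentioned in this message could look roughly like the sketch below. Only the taint value comes from the message; the wrapper name `taint_node` and the node name in the example call are hypothetical.

```shell
# Hypothetical wrapper; only the taint value is taken from the message above.
taint_node() {
  node="$1"
  # --overwrite allows an existing taint with the same key to be updated
  # instead of the command being rejected.
  kubectl taint nodes "$node" "CriticalAddonsOnly=true:NoExecute" --overwrite
}

# Example call (node name is a placeholder):
# taint_node my-server-node
```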
  • s

    sparse-dusk-81900

    11/18/2022, 9:29 AM
    Hi
  • s

    sparse-dusk-81900

    11/18/2022, 9:31 AM
    Hi all, I just upgraded to Rancher v2.6.9 and the custom RKE2 clusters to v1.24.7. However, both clusters are now in “Updating” state, as each has a single node/machine with status "waiting for plan to be applied". What’s the best way to troubleshoot this? I’ve already checked rancher-system-agent.service on the affected VMs but didn’t find anything suspicious. Also, cluster operations like a manually triggered cert rotation after the upgrade to v1.24.7 run successfully, even on the affected nodes/machines. Because of that, it looks like a stale status from a previous sync is keeping the whole clusters in “Updating” state.
  • s

    sparse-fireman-14239

    11/18/2022, 11:13 AM
    ’ello 🙂 Is it possible to convert a non-air-gap installed RKE2 cluster to air-gap? In theory it should work, I guess, but I don’t know if there are configs or something in etcd which needs to be changed.
  • e

    early-engineer-43393

    11/29/2022, 11:18 AM
    Hello, we are trying to deploy an RKE2 vSphere cluster. When we click on create cluster, we are immediately hit with the following error:
    waiting: waiting for viable init node
    Has anyone seen this before? We are not even sure how to troubleshoot it, as no VM has been spun up to investigate and there is no other output in the logs. Thanks
  • w

    witty-engineer-12406

    11/30/2022, 11:52 AM
    Hello, I'm currently trying to create a Service Account for a pod which has a restricted set of rules.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: dummy-cr
    rules:
      - nonResourceURLs: ["/healthz", "/readyz", "/livez"]
        verbs: ["get"]
      - apiGroups:
          - ""
        resources: ["pods", "pods/exec"]
        verbs: ["get", "delete", "create", "exec", "list"]
      - apiGroups:
          - ""
        resources: ["configmaps"]
        verbs: ["create", "delete"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: dummy-crb
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: dummy-cr
    subjects:
      - kind: ServiceAccount
        name: dummy-sa
        namespace: dummy-demo
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: dummy-sa
      namespace: dummy-demo
  • s

    square-policeman-85866

    11/30/2022, 3:29 PM
    On an rke2 node how do you view the docker images running?
  • s

    square-policeman-85866

    12/01/2022, 9:17 AM
    On rke2 nodes, do you know where we can find the cluster CA certificate? For reference, in rke we find the certificate in /etc/kubernetes/ssl/kube-ca.pem
  • n

    numerous-nail-55802

    12/02/2022, 11:37 AM
    Hi team! Could you help me? There is an RKE2 cluster with 6 nodes:
    3 nodes - all roles
    3 nodes - worker role
    This cluster was installed as Custom from Rancher 2.6. How can I remove the worker role from the first three nodes (all roles)?
  • g

    gentle-petabyte-40055

    12/03/2022, 5:53 AM
  • g

    gifted-eye-43916

    12/05/2022, 2:32 PM
    Hi, has anyone ever tried to fix criticals or warnings from a CIS scan? I am trying to fix the finding that the PKI certificates and keys have the wrong permissions. As far as I can see, RKE2 sets the correct permissions by default, but the CIS scan still complains about it.
  • w

    worried-plastic-58654

    12/07/2022, 9:07 PM
    To install rke2 rancher on AWS EC2, is it recommended to use open jump 5.4, Ubuntu, CentOS or another?
  • w

    worried-plastic-58654

    12/08/2022, 3:21 PM
    Correction, I meant openSUSE: “To install rke2 rancher on AWS EC2, is it recommended to use openSUSE Leap 5.4, Ubuntu, CentOS or another?”
  • b

    boundless-eye-27124

    12/09/2022, 12:43 AM
    how can we change etcd data dir in rke2?
  • a

    able-engineer-22050

    12/09/2022, 10:49 AM
    Hi
  • a

    able-engineer-22050

    12/09/2022, 10:51 AM
    After upgrading RKE2 from 1.22.9 to 1.23.7, the master node is stuck at "waiting for kubelet to update". Everything seems to work fine and all updated pods restarted normally; it is just that the Rancher GUI complains about an unfinished update. Any hints on how to resolve this?
  • r

    refined-scientist-20236

    12/09/2022, 11:10 AM
    Hello RKE2 friends, I have a freshly set up server with an HAProxy-balanced control plane. The cluster runs fine, except when I try to look at pod logs, or when the Percona operator tries to read logs from pods (same with Argo CD). I get the following message:
    Failed to load logs: Get "https://10.10.0.103:10250/containerLogs/mysql-cluster/mysql-cluster-operator-percona-xtradb-cluster-operator-5586cp5x/percona-xtradb-cluster-operator?sinceTime=1970-01-01T00%3A00%3A01Z&timestamps=true": proxy error from 127.0.0.1:9345 while dialing 10.10.0.103:10250, code 503: 503 Service Unavailable Reason: undefined (500)
    It seems to be a network issue with dialing from inside Kubernetes to the outside host, but I have no idea what the cause is. I use Calico as the network driver.
  • h

    hundreds-evening-84071

    12/12/2022, 9:04 PM
    What do you recommend as the preferred way to deploy RKE2 in HA (on RH8)? RPM based (via yum repo) or the script method? I plan to deploy Rancher on this cluster.
  • b

    best-microphone-20624

    12/13/2022, 9:08 PM
    Does anyone here know the plans for project https://github.com/rancher-sandbox/cluster-api-provider-rke2? How does it relate to CAPI support in Rancher?
  • b

    boundless-eye-27124

    12/14/2022, 2:41 AM
    How do I enable sysctls in RKE2? I am seeing this error:
    forbidden sysctl: "net.ipv4.tcp_rmem" not allowlisted
    I already patched the PSP but am still getting the error.
  • s

    square-policeman-85866

    12/14/2022, 10:05 AM
    Hi, when provisioning an rke2 cluster in bps here via the Rancher UI, the provisioning hangs in one state for 5 hours before completing. Attached is a screenshot of where it will now hang for 5 hours. Any ideas why, or which log I can check?
  • a

    ambitious-plastic-3551

    12/14/2022, 8:23 PM
    How is it impossible to bootup rke2 server
  • a

    ambitious-plastic-3551

    12/14/2022, 8:23 PM
    both 1.24, 1.25
  • a

    ambitious-plastic-3551

    12/14/2022, 9:34 PM
    Is it possible to connect newer 1.25 clusters to a Rancher management cluster that is on 1.24?
  • a

    agreeable-art-61329

    12/14/2022, 11:46 PM
    Has anyone played with Kube-VIP as of late and have that example functioning with rke2? It works fine for a single node and my kubectl traffic, but I cannot join additional nodes following that example due to receiving a "connection refused" on port 9345 of the VIP. Any thoughts?
  • s

    silly-jordan-81965

    12/15/2022, 12:08 PM
    https://rancher-users.slack.com/archives/C3ASABBD1/p1671010078070759
  • l

    lemon-ability-39482

    12/19/2022, 9:57 AM
    Hello everyone, I'm running RKE2 v1.24.8 on RHEL 8.6 and I'm trying to move the data directory from the default location (/var/lib/rancher/rke2) to /mnt/data/rke2, which is on another (virtual) disk. I copied the entire directory using
    cp -ar /var/lib/rancher/rke2 /mnt/data/
    to preserve all attributes, then modified /etc/rancher/rke2/config.yaml and added the line
    data-dir: /mnt/data/rke2
    This seems to work on agent/worker nodes. On server nodes, however, it looks like the necessary Kubernetes containers can't start. In the log of rke2-server, I keep getting the message
    Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error
    while /mnt/data/rke2/agent/containerd/containerd.log looks like this:
    time="2022-12-19T09:32:29.103553429+01:00" level=info msg="CreateContainer within sandbox \"478658988f888b30063a9127fb124abd38385967b796e15016675930bbb6cf88\" for container &ContainerMetadata{Name:cloud-controller-manager,Attempt:23,}"
    time="2022-12-19T09:32:29.154564201+01:00" level=info msg="CreateContainer within sandbox \"478658988f888b30063a9127fb124abd38385967b796e15016675930bbb6cf88\" for &ContainerMetadata{Name:cloud-controller-manager,Attempt:23,} returns container id \"4a3fe66d16a18ac6397dafc147a68b5bbe9bda1d7d4f7f7ce5e7f95e3a49b84b\""
    time="2022-12-19T09:32:29.154953048+01:00" level=info msg="StartContainer for \"4a3fe66d16a18ac6397dafc147a68b5bbe9bda1d7d4f7f7ce5e7f95e3a49b84b\""
    time="2022-12-19T09:32:29.276767961+01:00" level=info msg="StartContainer for \"4a3fe66d16a18ac6397dafc147a68b5bbe9bda1d7d4f7f7ce5e7f95e3a49b84b\" returns successfully"
    time="2022-12-19T09:32:29.691126028+01:00" level=info msg="shim disconnected" id=4a3fe66d16a18ac6397dafc147a68b5bbe9bda1d7d4f7f7ce5e7f95e3a49b84b
    time="2022-12-19T09:32:29.691184071+01:00" level=warning msg="cleaning up after shim disconnected" id=4a3fe66d16a18ac6397dafc147a68b5bbe9bda1d7d4f7f7ce5e7f95e3a49b84b namespace=k8s.io
    time="2022-12-19T09:32:29.691196163+01:00" level=info msg="cleaning up dead shim"
    time="2022-12-19T09:32:29.708945825+01:00" level=warning msg="cleanup warnings time=\"2022-12-19T09:32:29+01:00\" level=info msg=\"starting signal loop\" namespace=k8s.io pid=3497353 runtime=io.containerd.runc.v2\n"
    time="2022-12-19T09:32:30.049467561+01:00" level=info msg="RemoveContainer for \"9e40fb319e648b964c90cc77975c6cf7400aac36e53eb6354f738ad31995ce3c\""
    time="2022-12-19T09:32:30.056618827+01:00" level=info msg="RemoveContainer for \"9e40fb319e648b964c90cc77975c6cf7400aac36e53eb6354f738ad31995ce3c\" returns successfully"
    There are similar messages for kube-apiserver, etcd and kube-controller-manager. If I remove the data-dir line from my config, it all works again. Am I doing something wrong here? Some help would be much appreciated.
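The relocation steps described in this message can be sketched as a small shell function. This only mirrors what the poster did and is not a confirmed fix for the server-node failure. The function name `relocate_data_dir` and its parameters are hypothetical, and the sketch assumes the rke2 service has been stopped before copying.

```shell
# Hypothetical helper mirroring the steps from the message above.
# Assumes the rke2 service has been stopped before copying.
relocate_data_dir() {
  src="$1"          # e.g. /var/lib/rancher/rke2
  dest_parent="$2"  # e.g. /mnt/data
  config="$3"       # e.g. /etc/rancher/rke2/config.yaml

  # -a preserves ownership, modes, timestamps and symlinks; it already
  # implies a recursive copy, so the extra -r in the message is redundant.
  cp -a "$src" "$dest_parent/" || return 1

  # Point RKE2 at the new location via the data-dir config key.
  printf 'data-dir: %s/%s\n' "$dest_parent" "$(basename "$src")" >> "$config"
}

# Example call with the paths from the message:
# relocate_data_dir /var/lib/rancher/rke2 /mnt/data /etc/rancher/rke2/config.yaml
```

Taking the paths as arguments lets the function be tried against scratch directories before touching a real node.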
  • c

    creamy-pencil-82913

    12/19/2022, 11:30 AM
    -