# general
  • faint-soccer-70506 (07/01/2025, 7:31 PM)
    How does one set a custom Grafana dashboard to be added automatically to new Project Monitoring Grafanas? Similarly, is there a mechanism to add said dashboards to existing project monitors other than copy-paste? Just adding it to the cattle-dashboards namespace like you would for the global rancher-monitoring Grafana doesn't appear to accomplish this, and I don't see anything in the docs about it. (Sorry if there's a better channel to ask this; I couldn't find one.)
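    A minimal sketch of the labeled-ConfigMap approach for a project Grafana. This assumes Project Monitoring v2 (Prometheus Federator), where each project gets its own monitoring namespace; the cattle-project-<project-id>-monitoring name below is an assumption to verify in your cluster, not something confirmed here:

    # The cluster-level Grafana watches cattle-dashboards; a project Grafana's
    # dashboard sidecar watches its own project monitoring namespace instead.
    PROJECT_NS="cattle-project-p-abc123-monitoring"   # hypothetical namespace
    kubectl create configmap my-dashboard \
      --namespace "$PROJECT_NS" \
      --from-file=my-dashboard.json
    kubectl label configmap my-dashboard \
      --namespace "$PROJECT_NS" \
      grafana_dashboard="1"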
  • red-intern-36131 (07/02/2025, 7:10 AM)
    Hello! We migrated Rancher from an eks cluster to one of the downstream clusters in vSphere with rke2 so that Rancher could manage itself. We decommissioned the rancher instance from the eks cluster and now only the one we migrated to is up and running. The only problem we have is that the nodes for that downstream cluster appear outdated/unavailable even though they are functioning. And if we provision new nodes for this cluster, they remain in Provisioned status. Do you have any idea what could be causing this or what we are missing? Thank you very much!
  • boundless-scientist-9417 (07/02/2025, 5:04 PM)
    Hi, I am currently using the mongodb-sharded:7.0.6-debian-12-r0 image with my RKE2 deployment, and I want to use the same version as a distroless image. How can I get it?
  • flat-spring-34799 (07/02/2025, 8:52 PM)
    Is there a way to restore a cluster that was scaled to 0 using only the etcd backups (and the various YAMLs that are still on the local cluster, of course)?
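    For reference, a hedged sketch of the documented snapshot-restore path on an RKE2 server node (k3s is analogous with the k3s binary and service; the snapshot path below is the default location, so adjust to wherever your backups live):

    # Stop the server, reset the cluster state from a local etcd snapshot,
    # then bring the service back up.
    systemctl stop rke2-server
    rke2 server \
      --cluster-reset \
      --cluster-reset-restore-path=/var/lib/rancher/rke2/server/db/snapshots/<snapshot-file>
    systemctl start rke2-server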
  • bland-easter-37603 (07/03/2025, 7:19 AM)
    Running k3s in a Docker container, and systemctl restart docker breaks the k3s server. The only way to rescue it is by recreating the container.
    I0703 07:15:09.113085      17 factory.go:221] Registration of the crio container factory failed: Get "http://%2Fvar%2Frun%2Fcrio%2Fcrio.sock/info": dial unix /var/run/crio/crio.sock: connect: no such file or directory
    I0703 07:15:09.124398      17 kubelet_network_linux.go:49] "Initialized iptables rules." protocol="IPv4"
    I0703 07:15:09.127578      17 kubelet_network_linux.go:49] "Initialized iptables rules." protocol="IPv6"
    I0703 07:15:09.127611      17 status_manager.go:230] "Starting to sync pod status with apiserver"
    I0703 07:15:09.127634      17 watchdog_linux.go:127] "Systemd watchdog is not enabled or the interval is invalid, so health checking will not be started."
    I0703 07:15:09.127641      17 kubelet.go:2436] "Starting kubelet main sync loop"
    E0703 07:15:09.127698      17 kubelet.go:2460] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
    time="2025-07-03T07:15:09Z" level=info msg="Applying CRD <http://addons.k3s.cattle.io|addons.k3s.cattle.io>"
    I0703 07:15:09.149273      17 factory.go:223] Registration of the containerd container factory successfully
    time="2025-07-03T07:15:09Z" level=info msg="Flannel found PodCIDR assigned for node 7343353f304f"
    time="2025-07-03T07:15:09Z" level=info msg="The interface eth0 with ipv4 address 172.24.0.4 will be used by flannel"
    I0703 07:15:09.210568      17 kube.go:139] Waiting 10m0s for node controller to sync
    I0703 07:15:09.210689      17 kube.go:469] Starting kube subnet manager
    I0703 07:15:09.216454      17 cpu_manager.go:221] "Starting CPU manager" policy="none"
    I0703 07:15:09.216571      17 cpu_manager.go:222] "Reconciling" reconcilePeriod="10s"
    I0703 07:15:09.216597      17 state_mem.go:36] "Initialized new in-memory state store"
    I0703 07:15:09.216762      17 state_mem.go:88] "Updated default CPUSet" cpuSet=""
    I0703 07:15:09.216967      17 state_mem.go:96] "Updated CPUSet assignments" assignments={}
    I0703 07:15:09.216984      17 policy_none.go:49] "None policy: Start"
    I0703 07:15:09.216996      17 memory_manager.go:186] "Starting memorymanager" policy="None"
    I0703 07:15:09.217005      17 state_mem.go:35] "Initializing new in-memory state store"
    I0703 07:15:09.217099      17 state_mem.go:75] "Updated machine memory state"
    E0703 07:15:09.219087      17 manager.go:517] "Failed to read data from checkpoint" err="checkpoint is not found" checkpoint="kubelet_internal_checkpoint"
    I0703 07:15:09.219438      17 eviction_manager.go:189] "Eviction manager: starting control loop"
    I0703 07:15:09.219559      17 container_log_manager.go:189] "Initializing container log rotate workers" workers=1 monitorPeriod="10s"
    I0703 07:15:09.222575      17 kube.go:490] Creating the node lease for IPv4. This is the n.Spec.PodCIDRs: [10.42.0.0/24]
    I0703 07:15:09.227004      17 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
    I0703 07:15:09.244032      17 server.go:715] "Successfully retrieved node IP(s)" IPs=["172.24.0.2"]
    E0703 07:15:09.244440      17 server.go:245] "Kube-proxy configuration may be incomplete or incorrect" err="nodePortAddresses is unset; NodePort connections will be accepted on all local IPs. Consider using --nodeport-addresses primary"
    I0703 07:15:09.247644      17 server.go:254] "kube-proxy running in dual-stack mode" primary ipFamily="IPv4"
    I0703 07:15:09.247856      17 server_linux.go:145] "Using iptables Proxier"
    I0703 07:15:09.270383      17 proxier.go:243] "Setting route_localnet=1 to allow node-ports on localhost; to change this either disable iptables.localhostNodePorts (--iptables-localhost-nodeports) or set nodePortAddresses (--nodeport-addresses) to filter loopback addresses"
    I0703 07:15:09.270940      17 server.go:516] "Version info" version="v1.33.2+k3s1"
    I0703 07:15:09.271050      17 server.go:518] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
    I0703 07:15:09.277080      17 config.go:199] "Starting service config controller"
    I0703 07:15:09.277265      17 shared_informer.go:350] "Waiting for caches to sync" controller="service config"
    I0703 07:15:09.277368      17 config.go:105] "Starting endpoint slice config controller"
    I0703 07:15:09.277453      17 shared_informer.go:350] "Waiting for caches to sync" controller="endpoint slice config"
    I0703 07:15:09.277525      17 config.go:440] "Starting serviceCIDR config controller"
    I0703 07:15:09.277597      17 shared_informer.go:350] "Waiting for caches to sync" controller="serviceCIDR config"
    I0703 07:15:09.278390      17 config.go:329] "Starting node config controller"
    I0703 07:15:09.278525      17 shared_informer.go:350] "Waiting for caches to sync" controller="node config"
    E0703 07:15:09.292754      17 eviction_manager.go:267] "eviction manager: failed to check if we have separate container filesystem. Ignoring." err="no imagefs label for configured runtime"
    I0703 07:15:09.318748      17 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="6b216c0f025b0efd2879ada393fc77bc808594146a1aa3759b37063c0103031c"
    I0703 07:15:09.353082      17 kubelet_node_status.go:75] "Attempting to register node" node="7343353f304f"
    I0703 07:15:09.391194      17 shared_informer.go:357] "Caches are synced" controller="node config"
    I0703 07:15:09.391310      17 shared_informer.go:357] "Caches are synced" controller="service config"
    I0703 07:15:09.391377      17 shared_informer.go:357] "Caches are synced" controller="endpoint slice config"
    I0703 07:15:09.394486      17 shared_informer.go:357] "Caches are synced" controller="serviceCIDR config"
    I0703 07:15:09.427079      17 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="a3534178e457d4f53f49bd0e60b9b322987b1ac7c3781a198dd067d12fc036a4"
    time="2025-07-03T07:15:09Z" level=info msg="Applying CRD <http://etcdsnapshotfiles.k3s.cattle.io|etcdsnapshotfiles.k3s.cattle.io>"
    time="2025-07-03T07:15:09Z" level=fatal msg="Failed to start networking: unable to initialize network policy controller: error getting node subnet: failed to find interface with specified node ip"
  • most-memory-45748 (07/03/2025, 9:54 AM)
    I get a bunch of 403 errors when trying to deploy something in my cluster via an Azure DevOps pipeline with a Kubernetes service connection. Previously it worked just fine. I have a roletemplate in place; the failing call and error look like this:
    /usr/local/bin/kubectl get pods -n cattle-monitoring-system
    
    
    ##[error]E0703 11:02:37.026548 3323374 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: {\"Code\":{\"Code\":\"Forbidden\",\"Status\":403},\"Message\":\"clusters.management.cattle.io \\\"c-m-7m9crrsz\\\" is forbidden: User \\\"system:unauthenticated\\\" cannot get resource \\\"clusters\\\" in API group \\\"management.cattle.io\\\" at the cluster scope\",\"Cause\":null,\"FieldName\":\"\"}"
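    One observation: the error shows the request arriving as system:unauthenticated, which points at the token in the service connection's kubeconfig being expired or revoked rather than at the roletemplate's RBAC. A couple of hedged checks from a machine using the same kubeconfig (kubectl auth whoami needs a recent kubectl):

    # If authentication is broken, these fail before RBAC is even evaluated.
    kubectl auth whoami
    kubectl auth can-i list pods -n cattle-monitoring-system
    # Recreating the Rancher API token / the Azure DevOps service connection
    # is the usual fix for an expired token (assumption, not verified here).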
  • steep-petabyte-14152 (07/03/2025, 4:45 PM)
    Good morning. I have a question about Rancher since I'm a new user. When I create a project and associate it with a namespace, I see that a namespace with a generic name is created in my cluster. Is this normal in this Rancher process?
  • bright-address-70198 (07/04/2025, 10:18 AM)
    Hi everyone, I would like some help: how can I programmatically install a Helm chart using catalog.cattle.io.app in a Rancher extension? I'm using this.$store.dispatch('cluster/create', ...), but the chart doesn't seem to deploy. Any working example or doc link would help.
  • wide-alligator-26985 (07/08/2025, 12:43 PM)
    Hello, I'm using WSL2 (Windows -> Linux) and I'm working on PHP with PhpStorm as my IDE. Since I installed Rancher Desktop (I previously used Docker Desktop), I couldn't get Xdebug to work. Has anyone else had this issue?
  • billions-kilobyte-26686 (07/08/2025, 3:15 PM)
    I'm trying to install k3s on a single server at home, which is behind a NAT. Do any of you have documentation on how to do this that also shows a hello-world?
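    A minimal single-node sketch, assuming the Traefik ingress controller that k3s bundles by default and a hypothetical hostname hello.example.com pointed at the server (forwarding 80/443 on the NAT router is the only home-network-specific part):

    # Install k3s (single server, with the bundled Traefik and ServiceLB).
    curl -sfL https://get.k3s.io | sh -
    # Deploy a hello-world and expose it through an Ingress.
    kubectl create deployment hello --image=nginxdemos/hello
    kubectl expose deployment hello --port=80
    kubectl create ingress hello --class=traefik \
      --rule="hello.example.com/*=hello:80"
    # Then forward TCP 80/443 on the NAT router to this machine.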
  • billions-kilobyte-26686 (07/08/2025, 3:32 PM)
    I've created an ingress with Traefik. But what is strange is that the command "ss -tlnp" on the server does not show any process bound to port 80.
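    That is expected with k3s: the bundled ServiceLB (klipper-lb) publishes Traefik's LoadBalancer service through iptables DNAT rules rather than a regular listening socket, so nothing shows up in ss -tlnp. A few hedged commands to see where port 80 actually goes (the svclb pod label is my recollection of the k3s convention, so verify it):

    # Traefik owns 80/443 via its LoadBalancer service.
    kubectl get svc -n kube-system traefik
    kubectl get pods -n kube-system -l svccontroller.k3s.cattle.io/svcname=traefik
    # The NAT rules that stand in for a listening process:
    iptables -t nat -L -n | grep -E ':80|:443'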
  • refined-application-74576 (07/08/2025, 6:22 PM)
    Will k3s and rke2 w/sqlite honor SQLITE_TMPDIR env var?
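    For what it's worth: SQLITE_TMPDIR is honored by SQLite itself, so the question reduces to whether the variable reaches the server process that embeds kine; also, as far as I know only k3s uses the SQLite-backed kine by default, while rke2 runs etcd. A hedged sketch of passing it through systemd (whether kine's SQLite build consults it is exactly the thing to verify):

    # systemctl edit k3s, then add to the drop-in:
    #   [Service]
    #   Environment=SQLITE_TMPDIR=/var/lib/rancher/sqlite-tmp
    mkdir -p /var/lib/rancher/sqlite-tmp
    systemctl edit k3s
    systemctl restart k3s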
  • acoustic-alligator-67807 (07/09/2025, 3:13 PM)
    hello all
  • acoustic-alligator-67807 (07/09/2025, 3:15 PM)
    has anyone tried to use longhorn shared volumes (RWX) on EKS? I am definitely missing a step. the error I see from the share-manager-pvc is "NFS server is not running, skip unexporting and unmounting volume". not sure how to get one running on an EKS node?
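    One hedged thing to rule out: Longhorn serves RWX volumes from a share-manager pod that exports NFS inside the cluster, and every node that mounts the volume needs an NFS client installed. Stock EKS AMIs do not always have one, which would fit the symptom. Package names vary by node OS:

    # Amazon Linux / RHEL-family worker nodes:
    sudo yum install -y nfs-utils
    # Debian/Ubuntu-family worker nodes:
    # sudo apt-get install -y nfs-common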
  • billions-kilobyte-26686 (07/09/2025, 3:44 PM)
    I'm reading a lot about k3s. I want a single-node solution that could, if needed, be expanded to many nodes. How should I start with the network? Should I go with Traefik?
  • abundant-hair-58573 (07/09/2025, 8:30 PM)
    Hi all, in an airgapped network, I'm trying to upgrade rke2 1.28.15 to 1.29.15. When I add the plan to upgrade the control-plane, it gets hung on the first control plane. The node itself is up. When I look at rke2-server on the node it says
    Waiting for control-plane node <ip> startup: nodes <ip> not found
    with the ip being the node of that control plane that I'm on. In Rancher I see errors in "Recent Events" for the rke2-canal pod running on that node. It's a lot to transcribe from an air-gapped network so I'm summarizing
    MountVolume.SetUp failed for volume "flannel-cfg" : failed to sync configmap cache: timeout waiting for condition
    
    MountVolume.SetUp failed for volume "kube-api-access-..." : [failed to fetch token: serviceaccounts "canal" is forbidden: User "system:node:<node-hostname>" cannot create resource "serviceacccounts/token" in API group "" in the namespace "kube-system": no relationship found between node '<node-fqdn>' and this object, failed to sync configmap cache: time out waiting for the condition]
  • narrow-twilight-31778 (07/09/2025, 9:07 PM)
    hello, I have two 4-node HA Harvester clusters. I've created a few Linux base templates in one of them and would like to copy/migrate/clone those templates to the other cluster. What's the easiest way to do that? Download the template disk image and import it into the other cluster, or is there some other way?
  • narrow-twilight-31778 (07/09/2025, 9:07 PM)
    one cluster is 1.5.0, the other is 1.4.1 (yes, I plan on getting both to 1.5.1, likely tomorrow or so)
  • narrow-twilight-31778 (07/09/2025, 9:12 PM)
    Another question we have: Is there any way to migrate a VM from one cluster to the other? I suspect they're probably related in a sense
  • agreeable-planet-37457 (07/10/2025, 9:21 AM)
    hi, I want to bypass the default CPI and CNI being deployed with the RKE2 resource rancher2_cluster_v2, as I'd like to use a helm_release resource with my own values file. The issue I'm facing here is that a
    node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
    taint is being applied to the node, which halts the master node from going into a Ready state. How can I resolve this, please?
  • better-garden-71117 (07/10/2025, 3:54 PM)
    Hi guys, the kube-controller-manager and kube-scheduler certificates on my RKE2 server have expired. I attempted to rotate them using rke2 certificate rotate, but the renewal didn't go through as expected. Let me know if anyone has encountered this before or has suggestions to resolve it.
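    There is a known wrinkle here: rke2 certificate rotate does not regenerate the kube-controller-manager and kube-scheduler serving certs, because those components self-sign them at startup. A hedged workaround that has circulated for this (paths are the RKE2 defaults; back the files up first):

    # On the affected server node: drop the stale self-signed certs and
    # restart so the components mint fresh ones.
    systemctl stop rke2-server
    rm -f /var/lib/rancher/rke2/server/tls/kube-controller-manager/kube-controller-manager.{crt,key}
    rm -f /var/lib/rancher/rke2/server/tls/kube-scheduler/kube-scheduler.{crt,key}
    systemctl start rke2-server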
  • colossal-forest-90809 (07/11/2025, 4:57 PM)
    Hey folks, I was wondering if the Rancher CLI supports logging in via the GitHub authentication provider?
  • fierce-dusk-97055 (07/12/2025, 5:21 PM)
    Is this the correct channel to ask about some strange behavior with Rancher? I can get containers to run under Docker but not in my Rancher K3s setup. It complains about support for x86_64-v3 not being present.
  • brash-petabyte-67855 (07/14/2025, 12:09 AM)
    Hi all, hopefully somebody can direct me... Just deployed a fresh v2.11.3 using the Helm chart (all the defaults besides letsEncrypt enablement) and I'm observing a failure in cert-manager: "re-queuing item due to error processing" err="admission webhook \"validate.nginx.ingress.kubernetes.io\" denied the request: ingress contains invalid paths: path /.well-known/acme-challenge/cqzunnfAKyZuoHdaWqbzt0qExb8tLod1BoNS0Dk0z90 cannot be used with pathType Exact" logger="cert-manager.controller". What do I change to fix it?
  • brash-petabyte-67855 (07/14/2025, 12:32 AM)
    looks like there is a way to turn this 'Exact' behavior off by setting ACMEHTTP01IngressPathTypeExact: false; related issue: https://github.com/cert-manager/cert-manager/issues/7791
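    For anyone landing here: ACMEHTTP01IngressPathTypeExact is a cert-manager feature gate, so with the jetstack Helm chart disabling it looks roughly like this (a sketch; double-check the value name against your chart version):

    helm upgrade cert-manager jetstack/cert-manager \
      -n cert-manager \
      --reuse-values \
      --set featureGates="ACMEHTTP01IngressPathTypeExact=false"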
  • nutritious-portugal-97072 (07/14/2025, 7:36 AM)
    Hi, I am currently using Rancher Desktop. When I try to execute docker info I get the error below. How do I fix this issue? Server: error during connect: Get "http://%2F%2F.%2Fpipe%2FdockerDesktopLinuxEngine/v1.49/info": open //./pipe/dockerDesktopLinuxEngine: The system cannot find the file specified.
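    That named pipe (dockerDesktopLinuxEngine) belongs to Docker Desktop, so the docker CLI is almost certainly still pointed at a leftover Docker Desktop context rather than Rancher Desktop's engine. A quick check:

    # List contexts and switch to the one Rancher Desktop creates.
    docker context ls
    docker context use rancher-desktop
    docker info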
  • damp-ice-28701 (07/15/2025, 1:45 PM)
    Hey o/ I was hoping someone could give me some clarity on this: we keep losing contact with the rancher-system-agent on nodes randomly. Sometimes it resolves itself; other times it gets stuck and we need to manually restart the agent. Sometimes we notice machine conditions getting stuck (like PlanApplied=False for months). I'm trying to figure out why, and how to resolve this.
    I think it's related to how we provision the VMs: we host them on Proxmox servers and then register them to Rancher (not importing existing clusters). We have clusters showing driver: imported, provider: rke2. The docs say "Additional Features for Registered RKE2 and K3s Clusters" include version management, but also say imported clusters are limited. We're in this weird middle ground: we can control the rancher-system-agent, but can't use Rancher for VM lifecycle management (obviously).
    Our setup: Rancher creates the cluster via rancher2_cluster_v2, nodes join via registration commands, and the result shows driver: imported but behaves like the documented "Registered RKE2 clusters".
    Question: for stuck nodes on fully managed infrastructure (AWS etc.), how are they handled? Are they destroyed and rebuilt automatically? If so, that's what our setup is lacking, and I guess that would resolve the issues we are seeing. I feel we should move over to HarvesterOS, but the more senior engineer sees this as a big red flag, does not feel comfortable getting closer to the Rancher ecosystem, and would prefer to rip Rancher out and use upstream K8s. Just looking for further insight into this which I can hopefully use to argue the case for keeping Rancher and adding Harvester :)
  • elegant-truck-75829 (07/15/2025, 10:50 PM)
    👋 Quick question on rancher2_cluster_sync usage. I'm using the rancher2_cluster_sync resource to wait for an RKE2 cluster (created using Terraform) to become active. I'm pulling cluster info using data.rancher2_cluster_v2 or rancher2_cluster_v2 and passing .id to cluster_id, which returns something like:
    fleet-default/dev05c
    However, terraform apply fails with a timeout error:
    Error: Timeout waiting for cluster ID fleet-default/dev05c
    I noticed that data.rancher2_cluster_v2.cluster.cluster_v1_id gives the internal Rancher ID (c-m-xxxxxxx), and when I use that instead, the sync works as expected. 💡 So it seems rancher2_cluster_sync.cluster_id expects the internal Rancher ID (c-m-xxxxx) — not the namespaced ID (fleet-default/xxxx). ✅ Can anyone confirm if this is the intended usage?
  • billions-kilobyte-26686 (07/16/2025, 6:08 PM)
    I'm trying to have a single ingress entry that will manage 80 and 443:
    • 80 should be open and redirect traffic to 443
    • 443 should be open and use Let's Encrypt
    I was unsuccessful doing this, so in the end I created two Ingresses.
    http
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: main-http
      namespace: demo-hw
      annotations:
        traefik.ingress.kubernetes.io/router.entrypoints: web
        traefik.ingress.kubernetes.io/router.middlewares: traefik-custom-redirect-to-https-not-permanent@kubernetescrd
    spec:
      ingressClassName: traefik
      rules:
      - host: demo-hw.logilibre-vert.ch
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: main
                port:
                  number: 8080
    https
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: main-https
      namespace: demo-hw
      annotations:
        traefik.ingress.kubernetes.io/router.entrypoints: websecure
        traefik.ingress.kubernetes.io/router.tls.certresolver: default
    spec:
      ingressClassName: traefik
      rules:
      - host: demo-hw.logilibre-vert.ch
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: main
                port:
                  number: 8080
      tls:
      - hosts:
        - demo-hw.logilibre-vert.ch
    Can someone provide me a working example?
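    One way to get the 80 -> 443 redirect without a second Ingress is to move the redirect to the entrypoint level. With the Traefik that k3s packages, that means a HelmChartConfig override, after which your main-https Ingress alone is enough. A sketch (the exact redirectTo shape has changed between Traefik chart versions, so verify it against the chart your k3s shipped):

    kubectl apply -f - <<'EOF'
    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: traefik
      namespace: kube-system
    spec:
      valuesContent: |-
        ports:
          web:
            # Redirect everything on the web (80) entrypoint to websecure (443).
            redirectTo:
              port: websecure
    EOF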
  • gifted-airplane-79361 (07/16/2025, 9:26 PM)
    Hi, I would like to set up a custom Windows cluster. For that, I have installed Rancher v2.11.3 and started setting up a new custom cluster with the RKE2 option. The Rancher documentation says that I can use either Flannel or Calico for container networking on a Windows cluster, but when I select Flannel, I don't see a registration command for registering a Windows node; I only see the Windows node registration command if I select Calico. Does Flannel not support Windows clusters, is there a bug in Rancher related to showing the Windows node registration command, or am I doing something wrong? I am totally lost here... please let me know if you have any information about this.