# general
  • hundreds-oxygen-67526 (09/05/2025, 12:14 AM)

    Hi, I’m running into issues with Rancher Desktop (v1.20.0). On macOS ARM, Rancher Desktop keeps trying to shrink my disk back to 100GiB even after I manually resize the diffdisk and update lima.yaml. Logs show:
    ```
    Resize instance 0's disk from 500GiB to 100GiB diffDisk: Shrinking is currently unavailable
    ```
    It looks like Rancher Desktop is ignoring the disks: config in lima.yaml and enforcing the default size. Has anyone successfully increased the Rancher Desktop disk beyond 100GiB on macOS (ARM) recently? Any guidance or workarounds would be appreciated 🙏

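    (For reference, a minimal sketch of the override being attempted here; the instance path, and whether Rancher Desktop regenerates this file on restart, are assumptions rather than confirmed behavior:)
    ```yaml
    # ~/Library/Application Support/rancher-desktop/lima/0/lima.yaml  (assumed path)
    # Lima's top-level disk key sets the base disk size; if Rancher Desktop
    # rewrites this file on restart, that would explain the shrink attempts.
    disk: "500GiB"
    ```
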
  • quaint-librarian-68606 (09/05/2025, 1:41 PM)

    Hi all, I have a question about downgrading Rancher k3s. I know downgrades are not supported, but I am stuck in a unique situation. My client had a Rancher cluster on v2.6.3, and somehow the underlying k3s was upgraded to 1.32.6, so I no longer have a supported path to upgrade Rancher to the latest version. I was thinking of downgrading k3s and then upgrading Rancher from 2.6 to the latest, one minor version at a time: v2.6.x to v2.7.x, and so on up to the latest release.

  • gentle-dawn-38066 (09/06/2025, 4:31 PM)

    I'm trying to use local-path-provisioner as the underlying StorageClass and provisioner for a Ceph deployment in my homelab single-node "cluster". The issue currently is that the mon PVC is stuck Pending and local-path-provisioner never creates the PV. Any help?

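    (Worth checking: the stock local-path StorageClass binds lazily, so a PVC stays Pending until a pod that consumes it is scheduled. A quick way to confirm; the PVC name is a placeholder:)
    ```
    kubectl get storageclass local-path -o jsonpath='{.volumeBindingMode}'
    # WaitForFirstConsumer => the PV is only created once a consuming pod is scheduled
    kubectl describe pvc <mon-pvc-name>   # the Events section explains why it is Pending
    ```
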
  • brash-zebra-92886 (09/08/2025, 3:35 PM)

    Is anyone else having problems with the imperative-api-extension? It is installed with a bunch of assumptions and is impacting basic k8s function.

  • numerous-agency-66232 (09/08/2025, 3:45 PM)

    I’ve got Rancher (stable version 2.12.1) installed via Helm chart, deployed with ArgoCD. This lives in a lower EKS environment.
    • This is working well and I intend for this to be my management cluster for the time being. We’ll call this environment source.
    • I’m trying to import another EKS cluster (different AWS account + region). We’ll call this environment target.
    • I’ve allowed the NAT Gateway IPs at the EKS level + SG level where relevant (on both the source and target EKS clusters).
    • However, I’m still getting the error:
    ```
    failed to communicate with cluster: Get "https://MY_TARGET_CLUSTER.gr7.us-west-2.eks.amazonaws.com/api/v1/namespaces/cattle-system": dial tcp TARGET_EKS_PUBLIC_IP:443: i/o timeout
    ```
    When I check the pod logs on the target cluster I see the following:
    ```
    INFO: https://rancher.<MY_DOMAIN>.com/ping is accessible
    ...
    time="2025-09-04T20:17:18Z" level=info msg="Listening on /tmp/log.sock"
    time="2025-09-04T20:17:18Z" level=info msg="starting cattle-credential-cleanup goroutine in the background"
    time="2025-09-04T20:17:18Z" level=info msg="Rancher agent version v2.12.1 is starting"
    time="2025-09-04T20:17:18Z" level=error msg="unable to read CA file from /etc/kubernetes/ssl/certs/serverca: open /etc/kubernetes/ssl/certs/serverca: no such file or directory"
    time="2025-09-04T20:17:18Z" level=info msg="Connecting to wss://rancher.<MY_DOMAIN>.com/v3/connect/register with token starting with TOKEN_STRING"
    time="2025-09-06T01:49:06Z" level=info msg="Connecting to proxy" url="wss://rancher.<MY_DOMAIN>.com/v3/connect"
    ```
    I think we can ignore those cert errors, as I’ve already set agentTLSMode: "system-store". When I fixed this, it proceeded beyond the cert errors to the "Connecting to proxy" message. Further, I’ve added the following to NO_PROXY on the source cluster:
    ```
    ,.eks.amazonaws.com,eks.amazonaws.com,TARGET_CLUSTER.gr7.us-west-2.eks.amazonaws.com
    ```
    and the following on the target cluster:
    ```
    rancher.<MY_DOMAIN>.com,.svc,.cluster.local,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
    ```
    I am at a loss now for why this isn’t connecting. A timeout implies an SG / network issue, I would guess. My understanding is that these two AWS accounts do not need network connectivity (i.e. via VPC peering); they should only need API access to the cluster URL, but I could be wrong there. The public IP for the target EKS cluster that it times out on is AWS-owned and not anything I have access to. tl;dr: why is my import of an existing EKS cluster failing with a timeout error?

  • creamy-pencil-82913 (09/08/2025, 4:14 PM)

    The imported cluster needs access to the rancher URL, not the apiserver of the cluster that rancher manager is running on.

  • creamy-pencil-82913 (09/08/2025, 4:16 PM)

    If you are seeing rancher artifacts deployed to the imported cluster, then the rancher -> cluster connectivity is working; it is the cluster -> rancher path that is not.

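    (A quick way to test that cluster -> rancher path from inside the target cluster; the image and URL here are illustrative:)
    ```
    # Run a throwaway pod in the target cluster and hit Rancher's ping endpoint
    kubectl run nettest --rm -it --image=curlimages/curl --restart=Never -- \
      curl -sv https://rancher.<MY_DOMAIN>.com/ping
    # "pong" means the agent can reach Rancher; a timeout points at egress/SG rules
    ```
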
  • hundreds-evening-84071 (09/09/2025, 12:49 PM)

    Has anyone upgraded to Rancher 2.12.1? When I go to look at StatefulSets, the UI does not show any pods...

  • handsome-oil-82912 (09/09/2025, 1:43 PM)

    Is anyone using the GracefulNodeShutdown setting in kubelet? I have tried dropping the following config into /var/lib/rancher/rke2/agent/etc/kubelet.conf.d without luck. The nodes are not drained when I reboot or shut down.
    ```yaml
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    shutdownGracePeriod: 30s
    shutdownGracePeriodCriticalPods: 10s
    ```

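    (One dependency that is easy to miss: graceful node shutdown works through a systemd inhibitor lock, and logind must allow an inhibit delay at least as long as shutdownGracePeriod, or the node powers off before the kubelet finishes. A hedged check, using the standard systemd paths:)
    ```
    # Did the kubelet register its shutdown inhibitor?
    systemd-inhibit --list | grep -i kubelet

    # logind must allow a delay >= shutdownGracePeriod (30s in the config above),
    # e.g. InhibitDelayMaxSec=30, then: systemctl restart systemd-logind
    grep -r InhibitDelayMaxSec /etc/systemd/logind.conf /etc/systemd/logind.conf.d/ 2>/dev/null
    ```
    Also note that GracefulNodeShutdown terminates pods locally; it does not cordon or drain the node in the kubectl drain sense.
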
  • quaint-librarian-68606 (09/09/2025, 5:19 PM)

    I enabled ACE on a downstream cluster from my Rancher, but when I try to connect to the cluster, I get an error indicating that I need to be logged in. When I looked at the cattle-impersonation-system namespace in the downstream cluster, I saw that only the project owner's service account is configured. Is there a way to configure other users from the Rancher cluster to access the downstream cluster?

  • nice-businessperson-14225 (09/09/2025, 6:17 PM)

    Hi everyone, I'm trying to get some node labels populated, specifically topology.kubernetes.io/region and topology.kubernetes.io/zone and such. The cluster is running via the AWS EC2 provider. What's the way to achieve this?

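    (One hedged option, assuming RKE2 nodes where you control kubelet flags: topology.kubernetes.io/region and /zone are on the NodeRestriction allowlist, so the kubelet may set them itself. The values below are illustrative:)
    ```yaml
    # /etc/rancher/rke2/config.yaml on each node
    kubelet-arg:
      - "node-labels=topology.kubernetes.io/region=us-east-1,topology.kubernetes.io/zone=us-east-1a"
    ```
    (With the AWS cloud provider enabled, these labels are normally populated automatically from instance metadata; the manual route only applies when that integration isn't in play.)
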
  • numerous-agency-66232 (09/10/2025, 5:02 PM)

    Can a local admin in Rancher (installed via Helm chart) handle the upgrade for an imported EKS cluster? (It has cluster-admin permission on the imported cluster.)

  • bland-pillow-38313 (09/10/2025, 5:09 PM)

    👋 Hi everyone!

  • bland-pillow-38313 (09/10/2025, 5:12 PM)

    Is there an issue with Rancher Desktop v1.13.1 (Mac arm64)? It appears to be working just fine, with Rancher Desktop running properly as shown in the screenshot. But when I try to run kubectl from the terminal, this is the error I get:
    ```
    I0910 20:11:41.266871  16201 versioner.go:87] Right kubectl missing, downloading version 1.28.15+k3s1
    F0910 20:11:42.022164  16201 main.go:70] error while trying to get contents of https://storage.googleapis.com/kubernetes-release/release/v1.28.15/bin/darwin/amd64/kubectl.sha256: GET https://storage.googleapis.com/kubernetes-release/release/v1.28.15/bin/darwin/amd64/kubectl.sha256 returned http status 404 Not Found
    ```
    It's trying to fetch the amd64 kubectl. I have double-checked that I installed the arm64 installer for Rancher Desktop.

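    (A quick check for which architecture the shell and kubectl shim are actually running as; a terminal under Rosetta reports x86_64 on Apple Silicon, which would explain an amd64 download:)
    ```
    uname -m                   # arm64 natively; x86_64 under a Rosetta shell
    file "$(which kubectl)"    # shows which binary is on PATH and its architecture
    ```
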
  • loud-potato-42814 (09/10/2025, 5:36 PM)

    Could someone please explain the cleanness of the "Rancher Prime" subscription/paywall to me? I know that the images are "hardened" (for example, the base image is some sort of minimized SLE Linux userspace). But if I compare, say, https://apps.rancher.io/applications/alertmanager with https://artifacthub.io/packages/helm/prometheus-community/alertmanager, the charts have changed minimally, yet the LICENSE and original authors/maintainers have been removed (due to the takeover script patches/Chart.yaml.tpl):
    ```
    $ diff -r -u original/ suse/
    diff -r -u original/alertmanager/Chart.yaml suse/alertmanager/Chart.yaml
    --- original/alertmanager/Chart.yaml	2025-09-03 18:23:29.000000000 +0200
    +++ suse/alertmanager/Chart.yaml	2025-09-03 18:38:14.000000000 +0200
    @@ -1,26 +1,20 @@
     annotations:
    -  artifacthub.io/license: Apache-2.0
    -  artifacthub.io/links: |
    -    - name: Chart Source
    -      url: https://github.com/prometheus-community/helm-charts
    +  helm.sh/images: |
    +    - image: dp.apps.rancher.io/containers/alertmanager:0.28.1-10.9
    +      name: alertmanager
    +    - image: dp.apps.rancher.io/containers/prometheus-config-reloader:0.85.0-9.9
    +      name: prometheus-config-reloader
     apiVersion: v2
    -appVersion: v0.28.1
    +appVersion: 0.28.1
     description: The Alertmanager handles alerts sent by client applications such as the
    -  Prometheus server.
    -home: https://prometheus.io/
    -icon: https://raw.githubusercontent.com/prometheus/prometheus.github.io/master/assets/prometheus_logo-cb55bb5c346.png
    -keywords:
    -- monitoring
    -kubeVersion: '>=1.25.0-0'
    +  Prometheus server. It takes care of deduplicating, grouping, and routing them to
    +  the correct receiver integrations such as email, PagerDuty, OpsGenie, or many other
    +  mechanisms thanks to the webhook receiver. It also takes care of silencing and inhibition
    +  of alerts.
    +home: https://apps.rancher.io/applications/alertmanager
    +icon: https://apps.rancher.io/logos/alertmanager.png
     maintainers:
    -- email: monotek23@gmail.com
    -  name: monotek
    -  url: https://github.com/monotek
    -- email: naseem@transit.app
    -  name: naseemkullah
    -  url: https://github.com/naseemkullah
    +- name: SUSE LLC
    +  url: https://www.suse.com/
     name: alertmanager
    -sources:
    -- https://github.com/prometheus/alertmanager
    -type: application
     version: 1.26.0
    ...
    ```

  • steep-petabyte-14152 (09/10/2025, 6:52 PM)

    Hello everyone, I'm new to Rancher and am evaluating how to implement it for managing Kubernetes clusters. I have a few questions and would appreciate your guidance:
    • Viewing pod and node metrics: In the Rancher interface I see basic node and cluster information, but I can't see detailed pod metrics (CPU, memory, disk usage, etc.). The clusters already have Prometheus and Grafana installed via Helm, but not from Rancher. Are there any additional steps required for Rancher to display these metrics?
    • Namespace-level permissions: I've given a developer full namespace-level permissions, but they still can't see pod metrics in their namespace. Are any additional permissions or specific configurations required for them to view these metrics from Rancher?
    I welcome any guidance, configuration examples, or best practices for enabling pod monitoring by namespace in Rancher when using external Prometheus/Grafana.

  • faint-policeman-5206 (09/11/2025, 7:52 AM)

    Does anyone else have an excessively slow Metrics tab since 2.6? (I'm on 2.11.3, and it's on workloads, mostly Deployments in my case.) It takes several minutes to load, if it loads at all. Opening Grafana and finding the workload from the monitoring tab is quicker. (The Grafana link from the Metrics tab is nice, but it only shows once the metrics have loaded.)

  • victorious-action-31342 (09/11/2025, 12:12 PM)

    Hi. While testing, I created a cluster without a network and SSH keys. I tried deleting the cluster, but it seems nothing happens. Any ideas on how to solve this?

  • wide-author-88664 (09/11/2025, 2:52 PM)

    Hi all, I'm trying to import a cluster (K3s) into Rancher that is on a DMZ network that does not allow communication from the cluster to the Rancher server. The Rancher server can, however, reach the cluster. I modified the manifest that installs the Rancher management components on the cluster so that the cattle-cluster-agent can run without being able to contact the Rancher server. However, the cluster import in Rancher still shows as "Provisioning". Is there a way to let Rancher import/manage a cluster where the cluster cannot reach the Rancher server due to a firewall, but Rancher can reach the cluster API?

  • wide-author-88664 (09/11/2025, 2:55 PM)

    I see a lot of these in the cattle-cluster-agent container logs:
    ```
    Thu Sep 11 14:52:40 UTC 2025: Cattle cluster agent running in passive mode for cluster import
    ```

  • silly-animal-7734 (09/11/2025, 6:23 PM)

    Hi everyone, I need some help understanding how to resolve an issue. I have Rancher installed via Docker and had Google SSO enabled, but it was linked to an account that no longer exists. I managed to recover the admin account and log in, but I can’t see almost any settings, and the cluster I had associated doesn’t appear. I can’t even import a new cluster, as the option doesn’t appear. Has anyone experienced this and knows how to fix it?

  • victorious-action-31342 (09/12/2025, 8:02 AM)

    Has anyone faced an issue where cloud_config is not fully executed on Rancher 2.12?

  • little-lighter-65277 (09/12/2025, 2:28 PM)

    Hi all, I have a special case: a Rancher Manager and an RKE2 cluster with 3 nodes. The RKE2 cluster was made with the custom provider, which means I created 3 new Rocky Linux machines and then registered them with the generated curl command. My problem: when I want to shrink the disks of the machines or change the underlying OS, I need to regenerate the nodes. So I drain the node and delete it, then I create a new host and register it with the curl command. This all works until I come to the first control-plane node, which is the etcd bootstrap node. After I remove the bootstrap node and the etcd leader switches, when I add a new worker node I don't get the right bootstrap IP address for the bootstrap service and can't register the new node with Rancher Manager. Is there any suggestion on how to fix this, or a workaround to change or set the IP address when generating the curl command? Thank you.

  • full-shoe-26526 (09/15/2025, 3:29 AM)

    Hello everyone, I want to enable Rancher’s auditing feature, so I set the AUDIT_LEVEL variable. However, the number of API call logs is too high. I only want to audit the actions performed by administrators. Is there a good way to achieve this?

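    (For context, the main verbosity lever in the Rancher Helm chart; as far as I know there is no built-in filter by user, so the level is the knob. The release and repo names are illustrative:)
    ```
    # Level 1 logs request metadata only; 2 adds request bodies; 3 adds response
    # bodies. AUDIT_LEVEL on the pod maps to the chart's auditLog.level value.
    helm upgrade rancher rancher-stable/rancher -n cattle-system \
      --reuse-values --set auditLog.level=1
    ```
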
  • magnificent-france-42174 (09/15/2025, 12:59 PM)

    Hello everyone, I have a downstream cluster with a user who has Cluster Owner permissions. This user needs access to all metrics from the metrics-server, but currently they can only query all nodes and pods, not individual ones. I created a global cluster role that grants watch, get, and list permissions on metrics.k8s.io and on management.cattle.io with ranchermetrics.

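    (For comparison, a plain-RBAC sketch of full read access to metrics-server resources; the names are placeholders, and the management.cattle.io/ranchermetrics side of the question is Rancher-specific and not covered here:)
    ```yaml
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: metrics-reader            # illustrative name
    rules:
    - apiGroups: ["metrics.k8s.io"]
      resources: ["nodes", "pods"]    # covers both listing all and getting individual ones
      verbs: ["get", "list", "watch"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: metrics-reader-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: metrics-reader
    subjects:
    - kind: User
      name: <rancher-user>            # placeholder for the affected user
      apiGroup: rbac.authorization.k8s.io
    ```
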
  • astonishing-stone-85106 (09/15/2025, 3:34 PM)

    Hi all, I am using Rancher to build my downstream clusters. Until now it has been working fine, but for the past few days the Rancher GUI stops responding: v1/management.cattle.io.settings hangs, never returns anything, and the page spins continuously. I am using Rancher v2.10.1.

  • astonishing-stone-85106 (09/15/2025, 3:38 PM)

    This is the exact scenario: https://forums.suse.com/t/rancher-gui-stops-responding-v1-management-cattle-io-setting-hangs/43171

  • astonishing-stone-85106 (09/15/2025, 3:38 PM)

    Could someone help with this?

  • powerful-easter-15334 (09/17/2025, 4:37 AM)

    Hi, this is for the Rancher UI. Is there a way to change how long events are kept?

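    (If these are the Kubernetes events surfaced in the UI, retention is governed by the apiserver's event TTL, 1h by default. A sketch for an RKE2 server node; other distros expose the same kube-apiserver flag differently:)
    ```yaml
    # /etc/rancher/rke2/config.yaml (RKE2 server nodes)
    kube-apiserver-arg:
      - "event-ttl=24h"   # default is 1h
    ```
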
  • ancient-dinner-76338 (09/17/2025, 6:55 AM)

    Hello, I have Rancher and a downstream RKE2 cluster. Rancher is installed on 6 nodes (a k3s cluster), where 3 control planes are recorded in Cloudflare as the ingress-nginx-controller. The issue occurs when I run kubectl commands against the downstream RKE2 cluster. I performed a PostgreSQL restore in the downstream RKE2 cluster using kubectl, with the kubeconfig context endpoint pointing at my Rancher URL. The restore process got stuck, several Rancher nodes became unresponsive or experienced downtime, and I was unable to access the Rancher web UI. Does anyone have suggestions on how to handle this?
    Hello, I have Rancher and a downstream RKE2 cluster. Rancher is installed on 6 nodes (a k3s cluster), where 3 control planes are recorded in Cloudflare as the ingress-nginx-controller. The issue occurs when I run kubectl commands on the downstream RKE2 cluster. I performed a PostgreSQL restore in the downstream RKE2 cluster using kubectl, where the kubeconfig context endpoint is pointing to my Rancher website. The restore process got stuck, and Several Rancher nodes became unresponsive / experienced downtime, and I was unable to access the Rancher web UI. Does anyone have suggestions on how to handle this?