# rke2
c
```
providerID rke2://testjs2-worker-8bkz5-2dwvt is invalid for EC2 instances
```
You need to deploy your cluster with the AWS cloud provider if you want to use the AWS LB controller. Either that, or manually set the provider ID on your nodes via the kubelet arg. By default the RKE2 stub cloud provider will set the providerID to `rke2://NODENAME`, which does not match the format that the AWS LB controller needs to look up the instance.
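For context, the difference shows up in the Node object's `spec.providerID`. A rough sketch of the two formats (the availability zone and instance ID below are made-up placeholders, not values from this thread):
```
# providerID as set by the RKE2 stub cloud provider (rejected by the AWS LB controller)
spec:
  providerID: rke2://testjs2-worker-8bkz5-2dwvt
---
# providerID in the format the AWS cloud provider sets: aws:///<availability-zone>/<instance-id>
spec:
  providerID: aws:///us-east-1a/i-0123456789abcdef0
```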
d
thx, do you have any pointers on how to proceed with your second suggestion? (manually set the provider ID)
c
https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/ search for the `--provider-id` string. You'd need to determine what format AWS wants, and how to set it correctly on a per-node basis. It would probably be easier to build a new cluster with the correct cloud provider deployed; since the providerID cannot be changed on existing nodes, you'll have to rebuild anyway.
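If you do go the manual route on RKE2, the flag can be passed per node through the RKE2 config file instead of editing the kubelet invocation directly. A minimal sketch, assuming the `aws:///<availability-zone>/<instance-id>` format with placeholder values; it has to be in place before the node first registers, since the providerID can't be changed afterwards:
```
# /etc/rancher/rke2/config.yaml on each node
# (substitute that node's own availability zone and EC2 instance ID; these values are placeholders)
kubelet-arg:
  - "provider-id=aws:///us-east-1a/i-0123456789abcdef0"
```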
d
Do I get this correctly? This Amazon cloud provider needs to be installed on the Rancher cluster (parent)? If that's the case, I think this won't work in my setup, since Rancher is deployed in an on-prem env and not in AWS.
c
No, you need to deploy the cloud provider to the cluster that you want to use the LB controller on, not on the Rancher management cluster.
d
ok, will give it a try
p
I desperately need help accomplishing this when deploying an RKE2 cluster on EC2. I've tried setting the cloud provider to default and to AWS when creating the cluster (with the drop-down in the screenshot) and had no luck. I spin up the cluster, install the AWS CNI, and then create the ingress, but I get the same error as above.
Any help HUGELY appreciated! 🙏
d
In my case I use Terraform to deploy the downstream cluster to AWS.
```
resource "rancher2_cluster_v2" "rke2_cluster" {
  rke_config {
    ...
    # Deploy the out-of-tree AWS cloud controller manager via a HelmChart manifest
    additional_manifest   = <<EOF
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: aws-cloud-controller-manager
  namespace: kube-system
spec:
  chart: aws-cloud-controller-manager
  repo: https://kubernetes.github.io/cloud-provider-aws
  targetNamespace: kube-system
  bootstrap: true
  valuesContent: |-
    hostNetworking: true
    nodeSelector:
      node-role.kubernetes.io/control-plane: "true"
    args:
      - --configure-cloud-routes=false
      - --v=5
      - --cloud-provider=aws
EOF
    # Select the AWS cloud provider for the cluster
    machine_global_config = <<-EOF
      cloud-provider-name: aws
    EOF
    # Control-plane/etcd nodes: disable the built-in stub cloud controller and
    # run the components with cloud-provider=external
    machine_selector_config {
      config = <<-EOF
        disable-cloud-controller: true
        kube-apiserver-arg:
          - cloud-provider=external
        kube-controller-manager-arg:
          - cloud-provider=external
        kubelet-arg:
          - cloud-provider=external
      EOF
      machine_label_selector {
        match_expressions {
          key      = "rke.cattle.io/control-plane-role"
          operator = "In"
          values   = ["true"]
        }
      }
    }
    # Worker nodes: only the kubelet needs cloud-provider=external
    machine_selector_config {
      config = <<-EOF
        kubelet-arg:
          - cloud-provider=external
      EOF
      machine_label_selector {
        match_expressions {
          key      = "rke.cattle.io/worker-role"
          operator = "In"
          values   = ["true"]
        }
      }
    }
  }
}
```
Hope this helps. Also, I'm using rke2 v1.31.3+rke2r1; I had issues with older versions.
c
@purple-match-66532 you are using a pretty old version of Kubernetes. Why are you still on 1.26?
p
Hmm... seemed like it was the latest available for RKE2 on EC2?
Am I already off track? 😬
d
My Rancher installation gets updated automatically, so I don't know how you ended up like this. I use the Docker image and keep it updated.
p
Interesting... taking a look at what version I'm using/how to get a newer one. I based mine on the quickstart terraform (https://ranchermanager.docs.rancher.com/getting-started/quick-start-guides/deploy-rancher-manager/aws)
Alright, using the latest image from my computer heh
c
If 1.26 is the latest available version in the UI then you are also on an old version of Rancher. All of the Kubernetes versions you’re seeing listed there are end of life.
p
to deploy a modern k8s version to AWS
c
What version of Rancher are you using and how did you install it?
p
Previously with the AWS quickstart Terraform, but this time with:
```
sudo docker run --privileged -d --restart=unless-stopped -p 80:80 -p 443:443 rancher/rancher
```
Is the guide up to date? There's a bit of ambiguity in the out-of-tree section of the guide: is it saying that I have to select Amazon in the dropdown and set those rkeConfig settings in the cluster deployment YAML before creating? And is it saying I then have to install the aws-cloud-controller-manager, with 3 options to do so (manifest, Helm, UI)? Experimenting with the options here and will report back for posterity/anyone in the future, but I'd appreciate any insights! And huge thanks for your time!
c
Don't use the rancher/rancher image; it is not supported and only designed for very lightweight proof of concept. You should deploy the Rancher Helm chart to a cluster. K3s is a good choice if you must use Docker, as it will run a full Kubernetes cluster in Docker. Also, don't use the latest tag: make sure you specify a version of that image if you are going to use it. We do not maintain the latest tag pointing at anything in particular; it's probably out of date.
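For what it's worth, one way to sketch that on K3s is a HelmChart manifest picked up by K3s's bundled Helm controller. The hostname and password below are placeholders, the cattle-system namespace and cert-manager have to exist first, and the Rancher docs' usual route is the helm CLI, so treat this only as an illustration:
```
# Hypothetical HelmChart manifest for installing Rancher on a K3s cluster.
# Drop it into /var/lib/rancher/k3s/server/manifests/ and the bundled Helm controller applies it.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: rancher
  namespace: kube-system
spec:
  repo: https://releases.rancher.com/server-charts/latest
  chart: rancher
  targetNamespace: cattle-system   # namespace (and cert-manager) must already exist
  valuesContent: |-
    hostname: rancher.example.com  # placeholder hostname
    bootstrapPassword: admin1234   # placeholder password
    replicas: 1
```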
p
Ok, using the Helm chart against my cluster running with Rancher Desktop. For the out-of-tree provider steps to modify the YAML for etcd, Control Plane, and Worker: do I just add all of these if all 3 nodes have all 3 roles (so three `config` entries under `rkeConfig -> machineSelectorConfig`)? Sorry if that is a really dumb question.
Or are those meant to be patched after deployment of the cluster?
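For reference, here is roughly where those entries sit in the provisioning cluster YAML, reconstructed from the Terraform example earlier in the thread (the selector keys and config values come from that example; the cluster name and namespace are illustrative):
```
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: my-rke2-cluster          # placeholder name
  namespace: fleet-default
spec:
  rkeConfig:
    machineGlobalConfig:
      cloud-provider-name: aws
    machineSelectorConfig:
      # control-plane / etcd nodes
      - config:
          disable-cloud-controller: true
          kube-apiserver-arg:
            - cloud-provider=external
          kube-controller-manager-arg:
            - cloud-provider=external
          kubelet-arg:
            - cloud-provider=external
        machineLabelSelector:
          matchExpressions:
            - key: rke.cattle.io/control-plane-role
              operator: In
              values: ["true"]
      # worker nodes
      - config:
          kubelet-arg:
            - cloud-provider=external
        machineLabelSelector:
          matchExpressions:
            - key: rke.cattle.io/worker-role
              operator: In
              values: ["true"]
```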
Ok, added the configs under there and used the Rancher on my Rancher Desktop, which is v2.10.1 (and k8s 1.31). Seems like it might be stuck on "Configuring bootstrap node(s) dev-apps-dev-apps-pool-0-5scq8-scv6d: waiting for agent to check in and apply initial plan". One question regarding the cluster ID step: when/where do we provide that to Rancher? 🤔
c
you don’t
p
Gotcha. Hmm, definitely something wrong, as it's still stuck on "configuring bootstrap node(s) ***the_node_name***: waiting for agent to check in and apply initial plan". Investigating, but I followed https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/set-up-cloud-providers/amazon to a T.
c
Log in to the node and check on the services and cluster status.
Check the rke2-server and rancher-system-agent logs in journald. Do `kubectl get node -o wide` and see what it says.
p
Re: the guide, am I supposed to select External or Amazon in the dropdown for Cloud Provider? (I have been adding all the rkeConfig stuff.)
Given Rancher 2.10.1 and K8s v1.31.3+rke2r1, I assumed External due to the warning at the top re: 1.27 or above needing to be out-of-tree. Should I forgo the rkeConfig changes given this?
c
Set it to AWS and then override to external, and deploy the chart as described in the docs.
p
Ok, going with that. The question was due to this step: 2. "Select Amazon if relying on the above mechanism to set the provider ID. Otherwise, select External (out-of-tree) cloud provider, which sets `--cloud-provider=external` for Kubernetes components." And the message at the top: "In Kubernetes 1.27 and later, you must use an out-of-tree AWS cloud provider. In-tree cloud providers have been deprecated."
Ok, lots of progress. Deployed a new cluster as suggested above with:
```
# Kubernetes version to use for Rancher server cluster
rancher_kubernetes_version = "v1.31.3+k3s1"

# Rancher server version (format: v0.0.0)
rancher_version = "2.10.1"

# Kubernetes version to use for managed workload cluster
workload_kubernetes_version = "v1.31.4+rke2r1"
```
Now I'm seeing: 1. Failures to pull from ECR, seemingly because it's not using the instance profile for some reason and doesn't have credentials. 2. Failures from the AWS cloud controller DaemonSet:
```
Invalidating discovery information
k8s.io/client-go@v0.27.0/tools/cache/reflector.go:231: forcing resync
successfully renewed lease kube-system/cloud-controller-manager
lock is held by pool-2-****** and has not yet expired
failed to acquire lease kube-system/cloud-controller-manager
lock is held by pool-2-****** and has not yet expired
failed to acquire lease kube-system/cloud-controller-manager
successfully renewed lease kube-system/cloud-controller-manager
successfully renewed lease kube-system/cloud-controller-manager
lock is held by pool-2-****** and has not yet expired
failed to acquire lease kube-system/cloud-controller-manager
```