# rke2
If you ever get it working please let me know!
You need to deploy the AWS cloud provider and disable the default rke2 cloud provider. You will also need to delete and re-register all your nodes after doing that, as the provider ID cannot be changed once it is set.
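A quick way to see the provider ID point in practice: each node records it in spec.providerID, and once set it cannot be changed, which is why the nodes have to be re-registered. A minimal check, assuming kubectl access to the downstream cluster (the column labels here are just illustrative):

```
# Show the providerID recorded on each node; for the AWS provider it should
# look like aws:///<availability-zone>/<instance-id>.
kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID
```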
@creamy-rainbow-46562
Well, some time later... almost everything works after recreating the cluster. In the following blocks you can find the relevant code for this case. The Terraform module invocation to create the cluster:
```
resource "rancher2_cluster_v2" "cluster" {
  ...
  rke_config {
    machine_selector_config {
      config = {
        # https://docs.rke2.io/reference/server_config
        cloud-provider-name         = "aws"
        kubelet-arg                 = "cloud-provider=aws" # I tried "external" but the cluster didn't start
        kube-apiserver-arg          = "cloud-provider=aws" # I tried "external" but the cluster didn't start
        kube-controller-manager-arg = "cloud-provider=aws" # I tried "external" but the cluster didn't start
      }
    }
    ...
  }
  ...
}
```
Then the Terraform invocation to install the AWS cloud provider Helm chart:
```
data "template_file" "helm-ccm-values-template-file" {
  template = file("${path.module}/helm/ccm.yaml")
}

resource "helm_release" "aws-cloud-controller-manager" {
  name             = "aws-cloud-controller-manager"
  namespace        = "kube-system"
  repository       = "https://kubernetes.github.io/cloud-provider-aws"
  chart            = "aws-cloud-controller-manager"
  create_namespace = false
  wait             = true
  version          = "0.0.8"

  values = [
    data.template_file.helm-ccm-values-template-file.rendered
  ]
}
```
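One thing elided above: these helm_release resources assume a helm provider that already points at the freshly created downstream cluster. A minimal sketch of that wiring, assuming the kubeconfig comes from the kube_config attribute of the rancher2_cluster_v2 resource (the resource names and file path are illustrative, not taken from the original module):

```
# Sketch only: persist the kubeconfig exposed by rancher2_cluster_v2 and point
# the helm provider at it. How the kubeconfig is obtained may differ in the
# real module; this is just one way to close the gap.
resource "local_sensitive_file" "kubeconfig" {
  content  = rancher2_cluster_v2.cluster.kube_config
  filename = "${path.module}/kubeconfig.yaml"
}

provider "helm" {
  kubernetes {
    config_path = local_sensitive_file.kubeconfig.filename
  }
}
```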
In this ccm.yaml I kept almost everything at the defaults, but I also set the cloud provider:
```
namespace: "kube-system"

args:
  - --v=2
  - --cloud-provider=aws

...
```
And I did the same with the autoscaler:
```
data "template_file" "helm-as-values-template-file" {
  template = file("${path.module}/helm/as.yaml")

  vars = {
    # a lot of substitutions that matter for my setup, including the ASG names to watch and so on
  }
}

resource "helm_release" "autoscaler" {
  name             = "cluster-autoscaler"
  namespace        = "kube-system"
  repository       = "https://kubernetes.github.io/autoscaler"
  chart            = "cluster-autoscaler"
  create_namespace = false
  wait             = true
  version          = "9.32.0"

  values = [
    data.template_file.helm-as-values-template-file.rendered
  ]
}
```
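As a side note, the key values could also be passed inline instead of through the as.yaml template; the cluster-autoscaler chart supports tag-based ASG auto-discovery via autoDiscovery.clusterName. A hedged alternative sketch, not the setup used above (the region and cluster name are placeholders):

```
# Alternative sketch: inline set values instead of a templated as.yaml.
# With autoDiscovery.clusterName the autoscaler watches any ASG tagged with
# k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster-name>.
resource "helm_release" "autoscaler_inline" {
  name       = "cluster-autoscaler"
  namespace  = "kube-system"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
  version    = "9.32.0"

  set {
    name  = "cloudProvider"
    value = "aws"
  }
  set {
    name  = "awsRegion"
    value = "eu-west-1" # placeholder region
  }
  set {
    name  = "autoDiscovery.clusterName"
    value = "my-rke2-cluster" # placeholder cluster name
  }
}
```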
And with that, things work... well, almost everything works. The AWS ASG, with the tags in the right places, can be watched by the k8s autoscaler, which knows when an instance dies and is replaced and removes it from the cluster. HOWEVER, the "ghost" of the machine remains (like in the attached image), which keeps the cluster status stuck at "updating" and locks it against any other changes.
k8s version: v1.26.8+rke2r1
Rancher version: 2.7.6
Do you guys have any idea what I could possibly be missing?
I am interested to know how to get rid of ghost machines in rke2. I have an AWS ASG set up to provision new downstream nodes, and when I scale down worker nodes, EC2 and Rancher clean up the node quickly and cleanly. But then the machine remains as "node not found", and I have to manually delete it with
`kubectl delete machine xyz -n fleet-default`
in the local cluster. What's the reason for this, and what's the best way to clean up ghost machines after the ASG scales down?
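For reference, those orphaned objects live in the local (Rancher management) cluster as cluster-api Machines, so they can be listed before being deleted manually. A minimal sketch, with placeholder names:

```
# List the machines for the downstream cluster; "ghost" machines show an empty
# or stale NODENAME / a stuck phase.
kubectl get machines.cluster.x-k8s.io -n fleet-default

# Delete a specific orphaned machine (placeholder name), same as the command above.
kubectl delete machines.cluster.x-k8s.io <ghost-machine-name> -n fleet-default
```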