adamant-kite-43734 (11/23/2023, 11:24 AM)
cold-france-44111 (11/23/2023, 4:42 PM)
cold-france-44111 (11/23/2023, 4:43 PM)
creamy-pencil-82913 (11/23/2023, 5:46 PM)
alert-king-94787 (11/23/2023, 5:48 PM)
creamy-rainbow-46562 (11/30/2023, 2:13 AM)
resource "rancher2_cluster_v2" "cluster" {
  ...
  rke_config {
    machine_selector_config {
      config = {
        # https://docs.rke2.io/reference/server_config
        cloud-provider-name         = "aws"
        kubelet-arg                 = "cloud-provider=aws" # I tried "external" but the cluster didn't start
        kube-apiserver-arg          = "cloud-provider=aws" # I tried "external" but the cluster didn't start
        kube-controller-manager-arg = "cloud-provider=aws" # I tried "external" but the cluster didn't start
      }
    }
    ...
  }
  ...
}
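For reference, those machine_selector_config values just become normal RKE2 config keys on each provisioned node, roughly like this (the exact file under /etc/rancher/rke2/config.yaml.d/ is Rancher-managed; the path is my assumption):

# Approximate node-side RKE2 config produced by the machine_selector_config above.
# The file location is an assumption; Rancher writes its own file on the nodes.
cloud-provider-name: aws
kubelet-arg:
  - cloud-provider=aws
kube-apiserver-arg:
  - cloud-provider=aws
kube-controller-manager-arg:
  - cloud-provider=aws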
Then the Terraform invocation to install the AWS cloud provider Helm chart:
data "template_file" "helm-ccm-values-template-file" {
template = file("${path.module}/helm/ccm.yaml")
}
resource "helm_release" "aws-cloud-controller-manager" {
name = "aws-cloud-controller-manager"
namespace = "kube-system"
repository = "<https://kubernetes.github.io/cloud-provider-aws>"
chart = "aws-cloud-controller-manager"
create_namespace = false
wait = true
version = "0.0.8"
values = [
data.template_file.helm-ccm-values-template-file.rendered
]
}
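Also worth mentioning: the cloud-controller-manager only does anything useful if the node IAM role allows it to. A rough, illustrative subset (the role reference and policy name are made up, and the action list is incomplete; the full recommended policy is in the cloud-provider-aws docs):

# Illustrative subset only; role reference and policy name are hypothetical.
resource "aws_iam_role_policy" "ccm" {
  name = "aws-cloud-controller-manager"
  role = aws_iam_role.control_plane_nodes.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ec2:DescribeInstances",
          "ec2:DescribeRegions",
          "ec2:DescribeRouteTables",
          "ec2:DescribeSecurityGroups",
          "ec2:DescribeSubnets",
          "ec2:DescribeVolumes",
          "ec2:CreateTags",
          "elasticloadbalancing:*"
        ]
        Resource = "*"
      }
    ]
  })
}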
In this ccm.yaml I keep almost everything at the chart defaults, but I also set the cloud provider:
namespace: "kube-system"
args:
- --v=2
- --cloud-provider=aws
...
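A fuller sketch of what a ccm.yaml like this can look like is below. The nodeSelector and tolerations are assumptions based on the chart's defaults, not something from my actual file; on RKE2 the control-plane nodes are labelled node-role.kubernetes.io/control-plane=true, so the selector value may need adjusting for the chart's pods to actually get scheduled:

# Hypothetical ccm.yaml sketch; nodeSelector/tolerations are assumptions and
# should be checked against the chart's values and your node labels.
namespace: "kube-system"
args:
  - --v=2
  - --cloud-provider=aws
nodeSelector:
  node-role.kubernetes.io/control-plane: "true"
tolerations:
  - key: node.cloudprovider.kubernetes.io/uninitialized
    value: "true"
    effect: NoSchedule
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule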
And I did the same with the autoscaler:
data "template_file" "helm-as-values-template-file" {
template = file("${path.module}/helm/as.yaml")
vars = {
# a lot of replaces that is important for me, including the ASG names to watch and so on
}
}
resource "helm_release" "autoscaler" {
name = "cluster-autoscaler"
namespace = "kube-system"
repository = "<https://kubernetes.github.io/autoscaler>"
chart = "cluster-autoscaler"
create_namespace = false
wait = true
version = "9.32.0"
values = [
data.template_file.helm-as-values-template-file.rendered
]
}
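The as.yaml itself isn't shown here, but roughly it carries values like the following (region, names and sizes are placeholders, not my exact values; whether you list ASGs explicitly or use tag-based auto-discovery depends on your setup):

# Hypothetical as.yaml sketch; names, region and sizes are placeholders.
cloudProvider: aws
awsRegion: eu-central-1

# Option 1: watch explicitly named ASGs (templated in via the vars above).
autoscalingGroups:
  - name: my-cluster-workers-asg
    minSize: 1
    maxSize: 5

# Option 2: tag-based auto-discovery instead of explicit names; the ASGs then need
# the tags k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster-name>.
# autoDiscovery:
#   clusterName: my-cluster

extraArgs:
  balance-similar-node-groups: true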
And with that, things work... well, almost everything works.
The AWS ASG, with the tags in the right place, is watched by the Kubernetes cluster-autoscaler, which knows when an instance dies and is replaced, and removes it from the cluster.
HOWEVER, the "ghost" of the machine remains (as in the attached image), sticking the cluster status at "updating" and locking it against any other changes.
version k8s: v1.26.8+rke2r1
version rancher: 2.7.6

creamy-rainbow-46562 (11/30/2023, 2:15 AM)

limited-belgium-31201 (12/01/2023, 7:56 PM)
kubectl delete machine xyz -n fleet-default
in the local cluster.
What's the reason, and what's the best way to clean up the ghost machines left behind when the ASG scales down?
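Expanding a bit on that kubectl delete machine step, this is roughly the manual cleanup against the Rancher local (management) cluster; the machine name is a placeholder, and clearing finalizers is a last resort:

# List the CAPI Machine objects Rancher keeps for v2-provisioned clusters;
# ghost entries for scaled-down instances show up here.
kubectl get machines.cluster.x-k8s.io -n fleet-default

# Delete the stuck machine (name is a placeholder).
kubectl delete machines.cluster.x-k8s.io <machine-name> -n fleet-default

# If the object hangs in Terminating, a finalizer is usually blocking it;
# clearing the finalizers forces the deletion (use with care).
kubectl patch machines.cluster.x-k8s.io <machine-name> -n fleet-default \
  --type merge -p '{"metadata":{"finalizers":[]}}'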