# general
w
Hello - anyone got a few minutes to help me with Terraform + RKE2 running on EC2 and setting up aws-cloud-controller-manager (specifically so that I can use aws-load-balancer-controller to launch an NLB) ?
b
Check out the https://github.com/rancher/terraform-aws-rke2 Terraform module
It can get you to RKE2 on EC2, then you will only need to focus on getting the cloud-controller-manager working, I assume using the kubernetes provider's manifest resource.
The module will deploy a network load balancer by default.
w
hey @bumpy-tomato-36167 - thanks for responding. So, a bit of background - i've got TF successfully building an RKE2 cluster on EC2. I've got aws-load-balancer-controller installed using rancher2_app_v2 resource
I just can't seem to plumb cloud-controller-manager in properly
can I paste you my TF code for rancher2_cluster_v2 and see if you can spot anything obviously wrong ?
b
sure
you must have Rancher installed as well, right ?
w
yep - that's all done and working beautifully
Untitled
at the moment, it spins up 6 nodes (3 CP, 3 workers), installs calico etc on one CP, but that CP never forms the first node of the cluster and hence a cluster isn't formed
to be honest, the only thing I care about is setting the provider ID - so not sure if I even need to do half of this
b
I am going to start with a few general suggestions and get deeper as I go along, ok?
w
perfect. right now i'll take anything !
b
my first thought is to use the yamlencode() or file() functions rather than the HEREDOC syntax. yamlencode() would allow you to write everything in HCL, whereas file() would allow you to keep the YAML in its own file for YAML validation in your CI
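A rough sketch of both options, reusing the control-plane settings from the pasted snippet. The local names and the YAML file path are hypothetical, made up for illustration only:

```hcl
locals {
  # Option 1: write the settings as an HCL object and let yamlencode()
  # produce the YAML string. Keys containing hyphens must be quoted.
  control_plane_config = yamlencode({
    "disable-cloud-controller"    = true
    "kubelet-arg"                 = ["cloud-provider=external"]
    "kube-apiserver-arg"          = ["cloud-provider=external"]
    "kube-controller-manager-arg" = ["cloud-provider=external"]
  })

  # Option 2: keep the YAML in its own file so CI can lint it,
  # and read it in as a string. The path below is an assumption.
  control_plane_config_from_file = file("${path.module}/rke2/control-plane.yaml")
}
```

Either local could then replace the HEREDOC, for example `config = local.control_plane_config` inside the machine_selector_config block.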
w
seems like a reasonable thing to do - although I can see by inspecting in rancher GUI, my settings are being applied so I think i'll keep them in HEREDOC format for now, and swap to a file once i've got the cluster behaving if that's cool with you ?
I need to drop now - but if you spot anything obvious, can you put it in this slack thread and i'll pick it up tomorrow. Thanks for your help so far @bumpy-tomato-36167
b
I don't see anything obvious, but I haven't set up the cloud controller. I would log into the server and start checking the logs. Here are the files I check when I am having problems:
◦ /var/log/pods/kube-system_kube-apiserver- (tab complete)
◦ /var/log/pods/kube-system_etcd- (tab complete)
◦ /var/lib/rancher/rke2/agent/logs/kubelet.log
◦ /var/lib/rancher/rke2/agent/containerd/containerd.log
There are some potential spacing issues here:
machine_selector_config {
      config = <<EOT
    kubelet-arg:
      - cloud-provider=external
    EOT
and here:
# Control plane
    machine_selector_config {
      config = <<EOT
    disable-cloud-controller: true
    kubelet-arg:
      - cloud-provider=external
    kube-apiserver-arg:
      - cloud-provider=external
    kube-controller-manager-arg:
      - cloud-provider=external
    EOT
When you use <<EOT, the leading spaces on the following lines become part of the string, so those lines need to sit completely to the left. I would use <<-EOT to keep the spacing as you have it.
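For illustration, a minimal sketch of the indented form, using the worker settings quoted above. The local name is hypothetical. With <<-EOT, Terraform finds the line with the least leading whitespace and trims that amount from every line, so the YAML can stay indented alongside the surrounding HCL:

```hcl
locals {
  # <<-EOT strips the indentation shared by all lines of the body,
  # so the resulting string starts with "kubelet-arg:" at column 0
  # and the list item keeps its two-space relative indent.
  worker_config = <<-EOT
    kubelet-arg:
      - cloud-provider=external
  EOT
}
```

With plain <<EOT the same body would keep its four leading spaces on every line.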
it might also help to take a look at the examples in the module I recently created for deploying Rancher: https://github.com/rancher/terraform-rancher2-aws