#general

abundant-hair-58573

01/12/2023, 3:22 PM
I'm running an RKE cluster in an air-gapped environment using ec2, currently we have cloud provider as none. I want to configure the cluster to use AWS as the cloud provider, when I change that config in cluster management I get stuck in provisioning with the kubelet on the control plane failing. It fails trying to describe ec2 instances with a "certificate signed by unknown authority" error. I checked /etc/ssl in the kubelet pod and our ca.crt is in there

creamy-architect-7387

01/12/2023, 11:57 PM
so kubelet upgrades fine if the cloud provider is none and only breaks when it's enabled?? unless you have something specific set for cloud config, RKE should just generate an empty cloud config file and start kubelet with cloud-provider=aws. these are the fields we allow setting...but don't see anything that could mess with CA https://github.com/rancher/rke/blob/release/v1.4/types/rke_types.go#L893

abundant-hair-58573

01/13/2023, 1:03 AM
Yes, it works fine without the cloud provider set; it's been running for over a year like that. Just for some background... we need to configure the cloud provider due to some issues with using the cluster autoscaler. We scale up to about 50 workers during the day and then back down to 5 or 6 at night. The cluster autoscaler kills the EC2 instances, but they hang around in a cordoned state in Rancher until someone manually deletes them. I followed all of the steps here. When I saw the upgrade hanging on one of the control planes I ssh'd to it and confirmed I could run aws ec2 describe-instances from the control plane itself; it's just the kubelet pod that's in a crash loop. I'm not in the office so I don't have the exact error, but it was basically trying to do a health check of itself, grabbing the EC2 instance id from the metadata server on the localhost, and then trying to do a describe-instances to verify itself... I think. It sounds similar to what this person ran into here, except we aren't using a proxy; we're in a completely air-gapped environment. It's basically this error, but the URL is our air-gapped EC2 URL:
I0609 20:43:31.841214    8058 aws.go:1180] Zone not specified in configuration file; querying AWS metadata service
F0609 20:43:33.365708    8058 server.go:273] failed to run Kubelet: could not init cloud provider "aws": error finding instance i-4324dfsdfdfd432a: "error listing AWS instances: \"RequestError: send request failed\\ncaused by: Post <https://ec2.us-east-1.amazonaws.com/>: x509: certificate signed by unknown authority\""
In the kubelet pod our ca-certificates.crt is in /etc/ssl/certs, but I noticed there's a separate kube-ca.pem in /etc/kubernetes/ssl. Maybe it's trying to use the kube-ca.pem?
@creamy-architect-7387 do you know if there's something I could set in a cloud config file or could you point me to an example of one in RKE1?