#rke2

stale-painting-80203

03/14/2023, 12:38 AM
Rancher v2.7.0 with a downstream RKE2 cluster. I'm not able to import an orphaned cluster into a new instance of Rancher (Import Existing -> Import any Kubernetes cluster). After issuing the import command on the cluster, several pods go into CrashLoop and do not recover:
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml apply -f https://rancher75182.senode.dev/v3/import/xhctfcnbbt56xvxh6jptq7lzvpw9svd2drkbj5pvm466t5r7zlplqv_c-m-zqcvzlgn.yaml
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver unchanged
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master unchanged
namespace/cattle-system unchanged
serviceaccount/cattle unchanged
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding unchanged
secret/cattle-credentials-ad9a794 created
clusterrole.rbac.authorization.k8s.io/cattle-admin unchanged
Warning: spec.template.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms[0].matchExpressions[0].key: beta.kubernetes.io/os is deprecated since v1.14; use "kubernetes.io/os" instead
deployment.apps/cattle-cluster-agent configured
service/cattle-cluster-agent unchanged

NAMESPACE             NAME                                                    READY   STATUS             RESTARTS      AGE
calico-system         calico-kube-controllers-f75c97ff6-fvb66                 1/1     Running            0             19m
calico-system         calico-node-6vxmh                                       1/1     Running            0             19m
calico-system         calico-node-d9t8n                                       0/1     Running            0             17m
calico-system         calico-node-khhpr                                       1/1     Running            0             19m
calico-system         calico-node-nmcds                                       0/1     Running            0             17m
calico-system         calico-typha-d65458ffc-97pn9                            1/1     Running            0             17m
calico-system         calico-typha-d65458ffc-p9cj2                            1/1     Running            0             19m
cattle-fleet-system   fleet-agent-6c857b85b5-zff2l                            1/1     Running            0             17m
cattle-system         cattle-cluster-agent-6f588568-dj7ql                     0/1     CrashLoopBackOff   4 (49s ago)   4m9s
cattle-system         cattle-cluster-agent-6f588568-zl55k                     0/1     CrashLoopBackOff   4 (29s ago)   3m53s
kube-system           cloud-controller-manager-sempre1-ctrl                   1/1     Running            0             20m
kube-system           cloud-controller-manager-sempre1-etcd                   1/1     Running            0             20m
kube-system           etcd-sempre1-etcd                                       1/1     Running            0             19m
kube-system           helm-install-rke2-calico-7dxlb                          0/1     Completed          2             20m
kube-system           helm-install-rke2-calico-crd-wzffm                      0/1     Completed          0             20m
kube-system           helm-install-rke2-coredns-zs9rl                         0/1     Completed          0             20m
kube-system           helm-install-rke2-ingress-nginx-gtkv8                   0/1     CrashLoopBackOff   6 (40s ago)   20m
kube-system           helm-install-rke2-metrics-server-blcf4                  0/1     CrashLoopBackOff   6 (51s ago)   20m
kube-system           kube-apiserver-sempre1-ctrl                             1/1     Running            0             20m
kube-system           kube-controller-manager-sempre1-ctrl                    1/1     Running            0             20m
kube-system           kube-proxy-sempre1-ctrl                                 1/1     Running            0             20m
kube-system           kube-proxy-sempre1-etcd                                 1/1     Running            0             20m
kube-system           kube-proxy-sempre1-wrk1                                 1/1     Running            0             17m
kube-system           kube-proxy-sempre1-wrk2                                 1/1     Running            0             17m
kube-system           kube-scheduler-sempre1-ctrl                             1/1     Running            0             20m
kube-system           rke2-coredns-rke2-coredns-58fd75f64b-kfb69              1/1     Running            0             19m
kube-system           rke2-coredns-rke2-coredns-58fd75f64b-rzpsg              1/1     Running            0             20m
kube-system           rke2-coredns-rke2-coredns-autoscaler-768bfc5985-hcf4b   1/1     Running            0             20m
tigera-operator       tigera-operator-586758ccf7-rc9tq                        1/1     Running            0             19m

Looking at the logs, it seems the cluster agent is unable to ping the Rancher server, but if I do a curl on the same URL it responds with a pong.
ERROR: https://rancher75182.senode.dev/ping is not accessible (Could not resolve host: rancher75182.senode.dev)
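(For reference, a sketch of how those agent logs can be pulled; the pod name is taken from the listing above and will differ on another cluster, and --previous shows the output of the last crashed container:)
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml logs -n cattle-system cattle-cluster-agent-6f588568-dj7ql --previous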

The helm pods report errors as well:
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml logs helm-install-rke2-ingress-nginx-gtkv8 -n cattle-system
Error from server (NotFound): pods "helm-install-rke2-ingress-nginx-gtkv8" not found
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml logs helm-install-rke2-metrics-server-blcf4 -n cattle-system
Error from server (NotFound): pods "helm-install-rke2-metrics-server-blcf4" not found

creamy-pencil-82913

03/14/2023, 12:39 AM
This sounds like an issue with Rancher, not with RKE2?
That said…
ERROR: https://rancher75182.senode.dev/ping is not accessible (Could not resolve host: rancher75182.senode.dev)
this hostname suggests that you are using a private DNS zone for your Rancher server. Can you confirm that the resolv.conf file on your nodes is properly configured to point at that? Check the RKE2 logs for a message about using 8.8.8.8 instead of your private DNS server.
Host resolv.conf includes loopback or multicast nameservers - kubelet will use autogenerated resolv.conf with nameserver 8.8.8.8
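(A quick way to check both, assuming RKE2 runs under the usual systemd units so its logs land in journald; adjust the unit names if yours differ:)
cat /etc/resolv.conf
journalctl -u rke2-server -u rke2-agent | grep -i resolv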

stale-painting-80203

03/14/2023, 12:42 AM
yes. I can ping the rancher server from all nodes on the downstream cluster

creamy-pencil-82913

03/14/2023, 12:42 AM
^^ it'll look like that
make sure that your resolv.conf doesn’t point at multicast or loopback resolvers
If you can’t resolve private hostnames from within pods, RKE2 falling back to 8.8.8.8 is the most likely reason why
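(One way to test in-pod resolution directly is a throwaway pod; the busybox image, pod name, and kubeconfig path here are just placeholders:)
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml run dns-test --rm -it --image=busybox:1.36 --restart=Never -- nslookup rancher75182.senode.dev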

stale-painting-80203

03/14/2023, 12:46 AM
mine only has the nameserver pointing to an IP address
everything else is commented out
Even rebooting the cluster VMs on a working cluster causes the same issue, and the cluster never comes back up again. The same pods go into a CrashLoop

creamy-pencil-82913

03/14/2023, 3:22 AM
Is the name server not reachable from within pods? What do the coredns pod logs say?
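(For example, using one of the coredns pod names from the earlier listing; names will differ on a rebuilt cluster:)
/var/lib/rancher/rke2/bin/kubectl --kubeconfig /etc/rancher/rke2/rke2.yaml logs -n kube-system rke2-coredns-rke2-coredns-58fd75f64b-kfb69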

stale-painting-80203

03/14/2023, 4:03 AM
The coredns pods have the right nameserver and access works from there. I'm unable to exec into the pods that are in a crashloop
It seems bizarre that a working cluster is unable to establish a connection with Rancher after the cluster VMs were rebooted

creamy-pencil-82913

03/14/2023, 4:54 AM
So just to confirm, the coredns pods can resolve the rancher server address but the rancher agent pods cannot? How did you test that it can be resolved by coredns?

stale-painting-80203

03/14/2023, 5:04 AM
I recreated a fresh cluster and collected more data. It seems that on cluster creation the cluster-agent pods come up successfully even though they are unable to ping Rancher, and the cluster still connects to Rancher. One pod has an error on the certificate and the other cannot reach it. After rebooting the cluster, the cluster-agent pods go into a crashloop
also from my local machine or VMs, I don't see any issues with rancher ping:
curl https://rancher75182.senode.dev/ping
pong
Also, to answer your question regarding coredns: yes, I exec into coredns and curl google.com or my Rancher instance and it works, but not from within the cluster-agents
The coredns pod IPs are 10.42.247.129 and 10.42.213.131, but the cluster-agent resolv.conf has nameserver 10.43.0.10. Is that correct?

creamy-pencil-82913

03/14/2023, 9:01 PM
yes, that is the dns service address. Can you show me how specifically you are testing resolution from within the coredns pods?
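(A way to test that service address directly from a node, assuming dig is installed there; 10.43.0.10 is the cluster DNS service mentioned above:)
dig @10.43.0.10 rancher75182.senode.dev
dig @10.43.0.10 kubernetes.default.svc.cluster.local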

stale-painting-80203

03/14/2023, 9:02 PM
Just doing a curl as shown above.

creamy-pencil-82913

03/14/2023, 9:03 PM
You said you were doing that from the local machine or VM, not inside the coredns pod

stale-painting-80203

03/14/2023, 9:05 PM
Both. I also created a new cluster on the same server as Rancher and it seems to work fine; I am able to restart that cluster and it comes back up. I suspect it's some networking issue on the Harvester server where the cluster fails to come up after a restart

creamy-pencil-82913

03/14/2023, 9:07 PM
hmm