This message was deleted Rancher Users #rke2

# rke2

adamant-kite-43734

11/18/2024, 7:19 PM

This message was deleted.

creamy-pencil-82913

11/18/2024, 7:24 PM

are you sure that networking between the different nodes is working properly? I suspect that coredns is now running on some of the new nodes, but your networking is broken and requests against coredns pods on the new nodes are failing.

orange-thailand-31683

11/18/2024, 7:26 PM

so I went to two hosts that didn't have any Kubernetes instances running or installed, then ran the process to install the Linux agents on the two hosts.

orange-thailand-31683

11/18/2024, 7:27 PM

in rancher, i see the two new worker nodes as part of the cluster

orange-thailand-31683

11/18/2024, 7:28 PM

and i don't see any coredns / dns related pods running in that node at all

orange-thailand-31683

11/18/2024, 7:29 PM

i do see rke2-canal running on both of the worker nodes, but nothing dns

orange-thailand-31683

11/18/2024, 7:37 PM

ah, now i see some on the other 2 nodes..

orange-thailand-31683

11/18/2024, 7:38 PM

sorry, on one of the new nodes, not both

orange-thailand-31683

11/18/2024, 7:40 PM

and yes, looks like all of the requests are hitting that coredns on the worker node - do i just terminate that pod and the networking should switch back to the master then? like should the worker nodes have coredns on them as well?

creamy-pencil-82913

11/18/2024, 7:41 PM

I would probably fix your networking, rather than playing whack a mole by killing pods and hoping they go back to different nodes.

orange-thailand-31683

11/18/2024, 7:41 PM

i didn't alter networking to start with. everything changed with the installation/introduction of the worker nodes

creamy-pencil-82913

11/18/2024, 7:42 PM

yes but clearly the network between your old nodes and your new nodes isn’t working, because you can’t access coredns when it’s running on the new nodes

orange-thailand-31683

11/18/2024, 7:45 PM

do you think it was an issue hiding up until now? this was a one-node cluster prior to adding the worker nodes. or are you saying at the OS level? also should coredns be on each node? because i only see it on 2/3, so if it needs to be on all 3 that's another thing i'll have to investigate

creamy-pencil-82913

11/18/2024, 7:50 PM

no, it doesn’t need to be on all the nodes. It should work no matter where it’s running. If DNS doesn’t work for pods on a node when there’s not a coredns replica running on that node, then networking between your nodes is broken - something is causing the packets to get dropped when they go between nodes.
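One way to test that claim is to query a specific coredns pod directly from a node that does not host it. The following is a minimal sketch, not something posted in the thread: it hand-builds a DNS A query with the Python standard library and sends it over UDP to a given IP. The function names, the `port` parameter, and the IP and hostname in the usage note are all hypothetical choices made for illustration.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed with its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def dns_reachable(server_ip: str, name: str, timeout: float = 2.0, port: int = 53) -> bool:
    """Return True if `server_ip` answers a DNS query over UDP within `timeout` seconds."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, port))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Timeouts and ICMP errors both land here: no usable answer came back.
            return False
    # A genuine reply echoes the query ID in its first two bytes.
    return reply[:2] == query[:2]
```

For example, `dns_reachable("10.42.1.5", "kubernetes.default.svc.cluster.local")`, where `10.42.1.5` is a placeholder for a coredns pod IP taken from `kubectl get pods -n kube-system -o wide` and the hostname assumes the default cluster domain. If this returns `False` for a replica on another node but `True` for a local one, that is consistent with packets being dropped between nodes.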

orange-thailand-31683

11/18/2024, 7:51 PM

roger that - it just had me curious as the two new nodes were just added today, so this networking issue is new to me. i'm looking into it now. thanks brandon

orange-thailand-31683

11/18/2024, 7:55 PM

random question, just wanting to double-check the steps that were taken. we should be able to install the linux agent by itself, right? or does the server part need to be installed first, THEN the worker part?

creamy-pencil-82913

11/18/2024, 7:57 PM

there is no such thing. There is just RKE2, and you run it as a server, or as an agent.

orange-thailand-31683

11/18/2024, 7:59 PM

okay, thought so, just wanted to make sure i wasn't underthinking the situation

orange-thailand-31683

11/18/2024, 8:40 PM

FWIW: compared basic fundamental networking stuff at the OS level of the hosts to start with.. come to find out, firewalld was running on the two hosts i used as worker nodes.. disabling/turning it off solved the connectivity issues

creamy-pencil-82913

11/18/2024, 8:46 PM

that’d do it. you could also check the port requirements in the docs, and open those.
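Once ports have been opened, a quick TCP reachability check between nodes can confirm the firewall change took effect. This is a minimal sketch under stated assumptions: the port numbers are what the RKE2 networking requirements are believed to list and should be confirmed against the current docs, and the helper names and the `rke2-server.example` hostname are hypothetical.

```python
import socket

# Assumed port list; verify against the RKE2 docs before relying on it.
SERVER_TCP_PORTS = {
    9345: "RKE2 supervisor API",
    6443: "Kubernetes API",
}
KUBELET_TCP_PORT = 10250  # assumed kubelet port, expected on every node


def tcp_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unreachable.
        return False


def report(host: str, ports) -> dict:
    """Map each port in `ports` to whether it accepted a TCP connection."""
    return {port: tcp_open(host, port) for port in ports}
```

Running `report("rke2-server.example", SERVER_TCP_PORTS)` from an agent node would show which server ports are reachable. This only covers TCP: Canal's VXLAN traffic is believed to use UDP 8472, and a UDP port cannot be confirmed with a connect-style check because there is no handshake.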

orange-thailand-31683

11/18/2024, 8:47 PM

yeah, that's the plan. this is the first project i have to be reliant on a separate IT team to do these things.. so, hacky for now, but it works!
