#general

adamant-kite-43734

11/17/2022, 7:34 PM
This message was deleted.

damp-painting-69352

11/17/2022, 8:33 PM
Yes, I've got this working. What exactly are you stuck on?

blue-needle-61113

11/18/2022, 8:49 AM
Hi Zach, let’s assume I want to deploy a basic cluster configuration: 3 pools, 1 for control plane, 1 for etcd, and 1 for workers. Not for production, just for testing. My problem is that only the control plane and etcd nodes come up and connect to Rancher. The worker node has rancher-system-agent running, but its log stops at this:
root@test-node-selectors-pool1-6fd7de46-mbhng:/home/ubuntu# journalctl -u rancher-system-agent
-- Logs begin at Thu 2022-11-17 12:14:12 UTC, end at Thu 2022-11-17 20:19:42 UTC. --
Nov 17 12:15:47 test-node-selectors-pool1-6fd7de46-mbhng systemd[1]: Started Rancher System Agent.
Nov 17 12:15:47 test-node-selectors-pool1-6fd7de46-mbhng rancher-system-agent[1155]: time="2022-11-17T12:15:47Z" level=info msg="Rancher System Agent version v0.2.13 (4fa9427) is startin>
Nov 17 12:15:47 test-node-selectors-pool1-6fd7de46-mbhng rancher-system-agent[1155]: time="2022-11-17T12:15:47Z" level=info msg="Using directory /var/lib/rancher/agent/work for work"
Nov 17 12:15:47 test-node-selectors-pool1-6fd7de46-mbhng rancher-system-agent[1155]: time="2022-11-17T12:15:47Z" level=info msg="Starting remote watch of plans"
Nov 17 12:15:47 test-node-selectors-pool1-6fd7de46-mbhng rancher-system-agent[1155]: time="2022-11-17T12:15:47Z" level=info msg="Starting /v1, Kind=Secret controller"
When I deploy nodes with all roles on every node, I can explore the cluster, but it is stuck in “Updated” status and never becomes Active.

damp-painting-69352

11/21/2022, 2:37 PM
Ensure firewall ports are configured between nodes on your template. Also grep the logs for "image" to check for any image pull failures
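
The log search suggested here could look roughly like the following. This is a minimal sketch, assuming the agent runs as the `rancher-system-agent` systemd unit shown in the logs earlier in the thread; the helper name `find_image_lines` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every log line that mentions "image"
# (case-insensitive). It reads log text from stdin, so it works with
# journalctl output or with a saved log file.
find_image_lines() {
  # grep exits 1 when nothing matches; treat that as "no hits", not an error.
  grep -i 'image' || true
}

# Usage on a node (not executed here):
#   journalctl -u rancher-system-agent --no-pager | find_image_lines
```

Empty output only means the unit's journal has no line containing "image"; it does not prove that a pull succeeded.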

blue-needle-61113

11/23/2022, 9:29 PM
Hi Zach, I totally removed the firewall for testing purposes. I checked the logs and found nothing about images. I can see that there is a problem with the Worker role in general. For example, if I set up a node as a Worker, the only rke2-related process up and running is the rancher-system-agent sentinel. As far as I can see, it should pull images and run the agent, but nothing like that happens.

damp-painting-69352

11/23/2022, 9:55 PM
the rancher-system-agent process does need to pull the installation image for rke2 from a remote repo
that is an always-pull operation 😞
rancher/system-agent-installer-rke2:v1.23.10-rke2r1 is an example of the image it is trying to pull
you can check the logs on the server
journalctl -flu rancher-system-agent
and see if you find anything about the image pulling issue, otherwise you probably won't see any action on rke2 as it is not installed yet
i think the rancher-system-agent log only shows the pull failure once, so it's pretty hard to find
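
Since the installer image comes from a remote repo, one quick thing to rule out is whether the node can reach the registry at all. The sketch below is only an illustration: it assumes the image is pulled from Docker Hub (`registry-1.docker.io`), which the thread does not confirm, and the helper name `check_registry` is hypothetical. Pass a different host if the cluster uses a private registry.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a container registry answers over HTTPS.
# Assumption: the default host is Docker Hub; override it with the first argument.
check_registry() {
  host="${1:-registry-1.docker.io}"
  # -s silent, -o discard the body, -m 10 second timeout,
  # -w print only the HTTP status code (curl prints 000 if no response arrived).
  code="$(curl -s -o /dev/null -m 10 -w '%{http_code}' "https://${host}/v2/" || true)"
  case "$code" in
    200|401) echo "reachable (HTTP ${code})" ;;   # 401 only means auth is required
    000)     echo "unreachable (no HTTP response)" ;;
    *)       echo "unexpected response (HTTP ${code})" ;;
  esac
}

# Usage on a node (not executed here):
#   check_registry
#   check_registry my-registry.example.com   # hypothetical private registry
```

A "reachable" result shows only that the network path works; the pull itself can still fail for other reasons, such as credentials or a missing tag.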

blue-needle-61113

11/23/2022, 10:00 PM
hi Zach
that's my entire log from the agent
root@test2211-3-worker-299cecfb-mf4m4:/home/ubuntu# journalctl -flu rancher-system-agent
-- Logs begin at Tue 2022-11-22 11:15:11 UTC. --
Nov 22 11:15:20 test2211-3-worker-299cecfb-mf4m4 systemd[1]: Started Rancher System Agent.
Nov 22 11:15:20 test2211-3-worker-299cecfb-mf4m4 rancher-system-agent[1234]: time="2022-11-22T11:15:20Z" level=info msg="Rancher System Agent version v0.2.13 (4fa9427) is starting"
Nov 22 11:15:20 test2211-3-worker-299cecfb-mf4m4 rancher-system-agent[1234]: time="2022-11-22T11:15:20Z" level=info msg="Using directory /var/lib/rancher/agent/work for work"
Nov 22 11:15:20 test2211-3-worker-299cecfb-mf4m4 rancher-system-agent[1234]: time="2022-11-22T11:15:20Z" level=info msg="Starting remote watch of plans"
Nov 22 11:15:21 test2211-3-worker-299cecfb-mf4m4 rancher-system-agent[1234]: time="2022-11-22T11:15:21Z" level=info msg="Starting /v1, Kind=Secret controller"

damp-painting-69352

11/23/2022, 10:02 PM
if you check in /var/lib/rancher/rke2/, are there folders?

blue-needle-61113

11/23/2022, 10:02 PM
only on the non-Worker nodes
on the Worker role node I’ve got:

damp-painting-69352

11/23/2022, 10:04 PM
okay that agent folder is the rancher-system-agent
so since you don't see the other folders rke2 hasn't installed and the container hasn't pulled
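
The directory check described here can be scripted so it is easy to run on every node. This is a minimal sketch: the path `/var/lib/rancher/rke2` comes from the thread, while the helper name `rke2_data_present` and the "at least one subdirectory" rule are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the rke2 data directory exists and
# contains at least one subdirectory, as a rough sign that rke2 was installed.
rke2_data_present() {
  dir="${1:-/var/lib/rancher/rke2}"
  [ -d "$dir" ] || return 1
  # List direct child directories only and keep the first one, if any.
  first="$(find "$dir" -mindepth 1 -maxdepth 1 -type d | head -n 1)"
  [ -n "$first" ]
}

# Usage on a node (not executed here):
#   if rke2_data_present; then
#     echo "rke2 folders found"
#   else
#     echo "rke2 folders missing: the installer image has probably not run"
#   fi
```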

blue-needle-61113

11/23/2022, 10:05 PM
do you have any idea why Workers are the only ones affected?
I tried a few versions of Rancher

damp-painting-69352

11/23/2022, 10:05 PM
depending on how you are trying to provision them... yes, there could be many reasons
it's probably not the version of rancher that is the issue
if you have gotten this far i would check cloud-init logs next
to make sure it is getting the user-data from the iso
next is the node-pool. it could be an improper setting somewhere in the cloud-init for that particular node-pool
anyway, catch me next week if you still haven't figured it out, Happy Thanksgiving
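
For the cloud-init check, a rough starting point is to scan the cloud-init log for warnings and errors. This is a sketch under assumptions: it uses the default log location on Ubuntu (`/var/log/cloud-init.log`), and the helper name `cloud_init_problems` and the list of patterns are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print WARNING/ERROR/Traceback lines from a cloud-init log.
# Assumption: the log lives at /var/log/cloud-init.log unless a path is given.
cloud_init_problems() {
  log="${1:-/var/log/cloud-init.log}"
  if [ ! -r "$log" ]; then
    echo "cannot read $log" >&2
    return 1
  fi
  # grep exits 1 when nothing matches; that is a clean log, not a failure.
  grep -E 'WARNING|ERROR|Traceback' "$log" || true
}

# Usage on a node (not executed here):
#   cloud_init_problems
#   cloud-init status --long         # overall result and any recorded errors
#   sudo cloud-init query userdata   # the user-data cloud-init actually received
```

Comparing the `userdata` output of a Worker node with that of a working control plane node may help spot a setting that differs for that node pool.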

blue-needle-61113

11/23/2022, 10:10 PM
ahh ok, Thank you very much for that help, enjoy your time, I’m from Poland so I forgot about Thanksgiving

damp-painting-69352

11/23/2022, 10:10 PM
np, yep silly American holiday. Hope you find that needle in the haystack

blue-needle-61113

11/23/2022, 10:11 PM
I don't think it's silly, tradition is tradition: a good time with family and friends, and heads out of computers 🙂 Enjoy and thanks again

square-policeman-85866

12/02/2022, 5:13 PM
Hi, I’m getting the exact same issue building an rke2 cluster on vSphere. Were you able to figure out the issue?

damp-painting-69352

12/05/2022, 3:59 PM
@square-policeman-85866 Can you provide logs or screenshots of the issue?

square-policeman-85866

12/13/2022, 4:21 PM
Hi Zach. Is there a specific log I can look into?

damp-painting-69352

12/13/2022, 4:22 PM
You should create a new thread and include logs of the issue you are seeing. No idea what this thread is about at this point

square-policeman-85866

12/13/2022, 4:23 PM
Will do thanks