# harvester
g
likely not a master node.. you may want to check from the first node
kubectl get pods -A -o wide
to see what is going on
also
journalctl -fu rancherd
may tell us if something went wrong
followed by
journalctl -fu rke2-agent
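Taken together, a quick triage pass on the node that failed to join might look like this (a sketch; rancherd and rke2-agent are the standard unit names here, and the rke2-agent unit will simply be absent if rke2 was never installed):

```
# run on the node that failed to join
systemctl status rancherd                        # bootstrap service state
journalctl -u rancherd --no-pager | tail -n 50   # recent bootstrap logs
systemctl status rke2-agent                      # absent if rke2 never got installed
journalctl -u rke2-agent --no-pager | tail -n 50
```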
b
@great-bear-19718 So, on this new node, rke2 for some inexplicable reason was not installed. The /etc/rancher/rke2/rke2.yaml file is not present. The installation did not show any errors.
The master node remains accessible, 100% operational.
The difference between the two installations is that on this node, I selected the option to join an existing cluster. The VIP was successfully validated at the network level, and everything seemed OK in the installation console, which was very fast, by the way.
Do I really need to install a 3rd node for the second one to become available? I remember seeing someone on GitHub talking about an old bug in this regard, but the error was different.
g
2nd should become available
check the status of rke2-agent service
b
rke2-agent is not installed 😞
I think the join mode is the problem πŸ˜žπŸ˜“
@great-bear-19718 Can recovery mode repair this installation, or is a workaround necessary?
This is the latest Harvester version. Should I test rc-2?
g
systemctl list-units --type=service
it would be nice to see the output for this to see what is going on there
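If the full listing is noisy, systemctl's state filters can narrow it down to the interesting units, e.g.:

```
systemctl list-units --type=service --state=failed      # anything that died outright
systemctl list-units --type=service --state=activating  # anything stuck starting up
```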
b
```
UNIT                                   LOAD   ACTIVE     SUB     JOB   DESCRIPTION
  apparmor.service                       loaded active     exited        Load AppArmor profiles
  auditd.service                         loaded active     running       Security Auditing Service
  augenrules.service                     loaded active     exited        auditd rules generation
  boot-sysctl.service                    loaded active     exited        Apply Kernel Variables for 5.14.21-150400.24.60-default from /boot
  cos-setup-boot.service                 loaded active     exited        cOS system configuration
  cos-setup-fs.service                   loaded active     exited        cOS system after FS setup
  dbus.service                           loaded active     running       D-Bus System Message Bus
  detect-part-label-duplicates.service   loaded active     exited        Detect if the system suffers from bsc#1089761
  dracut-shutdown.service                loaded active     exited        Restore /run/initramfs on shutdown
  getty@tty1.service                     loaded active     running       Getty on tty1
  haveged.service                        loaded active     running       Entropy Daemon based on the HAVEGE algorithm
  iscsi.service                          loaded active     exited        Login and scanning of iSCSI devices
  kbdsettings.service                    loaded active     exited        Apply settings from /etc/sysconfig/keyboard
  kmod-static-nodes.service              loaded active     exited        Create List of Static Device Nodes
  lvm2-monitor.service                   loaded active     exited        Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling
  rancherd.service                       loaded activating start   start Rancher Bootstrap
  rdma-load-modules@infiniband.service   loaded active     exited        Load RDMA modules from /etc/rdma/modules/infiniband.conf
  rdma-load-modules@rdma.service         loaded active     exited        Load RDMA modules from /etc/rdma/modules/rdma.conf
  rdma-load-modules@roce.service         loaded active     exited        Load RDMA modules from /etc/rdma/modules/roce.conf
  smartd.service                         loaded active     running       Self Monitoring and Reporting Technology (SMART) Daemon
  sshd.service                           loaded active     running       OpenSSH Daemon
  systemd-hwdb-update.service            loaded active     exited        Rebuild Hardware Database
  systemd-journal-catalog-update.service loaded active     exited        Rebuild Journal Catalog
  systemd-journal-flush.service          loaded active     exited        Flush Journal to Persistent Storage
  systemd-journald.service               loaded active     running       Journal Service
  systemd-logind.service                 loaded active     running       User Login Management
  systemd-modules-load.service           loaded active     exited        Load Kernel Modules
  systemd-random-seed.service            loaded active     exited        Load/Save Random Seed
  systemd-remount-fs.service             loaded active     exited        Remount Root and Kernel File Systems
  systemd-sysctl.service                 loaded active     exited        Apply Kernel Variables
  systemd-sysusers.service               loaded active     exited        Create System Users
  systemd-time-wait-sync.service         loaded active     exited        Wait Until Kernel Time Synchronized
  systemd-timesyncd.service              loaded active     running       Network Time Synchronization
  systemd-tmpfiles-setup-dev.service     loaded active     exited        Create Static Device Nodes in /dev
  systemd-tmpfiles-setup.service         loaded active     exited        Create Volatile Files and Directories
  systemd-udev-trigger.service           loaded active     exited        Coldplug All udev Devices
  systemd-udevd.service                  loaded active     running       Rule-based Manager for Device Events and Files
  systemd-update-done.service            loaded active     exited        Update is Completed
  systemd-update-utmp-runlevel.service   loaded inactive   dead    start Record Runlevel Change in UTMP
  systemd-update-utmp.service            loaded active     exited        Record System Boot/Shutdown in UTMP
  systemd-user-sessions.service          loaded active     exited        Permit User Sessions
  user-runtime-dir@0.service             loaded active     exited        User Runtime Directory /run/user/0
  user@0.service                         loaded active     running       User Manager for UID 0
  wicked.service                         loaded active     exited        wicked managed network interfaces
  wickedd-auto4.service                  loaded active     running       wicked AutoIPv4 supplicant service
  wickedd-dhcp4.service                  loaded active     running       wicked DHCPv4 supplicant service
  wickedd-dhcp6.service                  loaded active     running       wicked DHCPv6 supplicant service
  wickedd-nanny.service                  loaded active     running       wicked network nanny service
  wickedd.service                        loaded active     running       wicked network management service daemon
```
g
journalctl -fu rancherd
?
b
```
Jul 06 03:59:21 vmh-cgh-eqn001p rancherd[3134]: time="2023-07-06T03:59:21Z" level=info msg="[stderr]: time=\"2023-07-06T03:59:21Z\" level=info msg=\"Probe [kubelet] is unhealthy\""
Jul 06 03:59:25 vmh-cgh-eqn001p rancherd[3134]: time="2023-07-06T03:59:25Z" level=info msg="[stderr]: time=\"2023-07-06T03:59:25Z\" level=info msg=\"Probe [kubelet] is healthy\""
Jul 06 03:59:25 vmh-cgh-eqn001p rancherd[3134]: time="2023-07-06T03:59:25Z" level=info msg="[stderr]: time=\"2023-07-06T03:59:25Z\" level=info msg=\"All probes are healthy\""
Jul 06 03:59:25 vmh-cgh-eqn001p rancherd[3134]: time="2023-07-06T03:59:25Z" level=info msg="Successfully Bootstrapped Rancher (v2.6.11/v1.24.11+rke2r1)"
Jul 06 03:59:25 vmh-cgh-eqn001p systemd[1]: rancherd.service: Deactivated successfully.
Jul 06 03:59:25 vmh-cgh-eqn001p systemd[1]: Finished Rancher Bootstrap.
Jul 06 03:59:26 vmh-cgh-eqn001p systemd[1]: Starting Rancher Bootstrap...
Jul 06 03:59:26 vmh-cgh-eqn001p rancherd[8092]: time="2023-07-06T03:59:26Z" level=info msg="System is already bootstrapped. To force the system to be bootstrapped again run with the --force flag"
Jul 06 03:59:26 vmh-cgh-eqn001p systemd[1]: rancherd.service: Deactivated successfully.
Jul 06 03:59:26 vmh-cgh-eqn001p systemd[1]: Finished Rancher Bootstrap.
```
I discovered the problem!
g
did you re-use this node?
b
No! For some reason, the /etc/rancher/rancherd/config.yaml file had the DNS name of my VIP instead of the IP (I don't know how to explain it). I changed it, restarted rancherd.service, and the node became available in the cluster.
However, on this second node I still can't run the kubectl command; it still looks like it doesn't have the rke2 binaries.
```
vmh-cgh-eqn001p:/home/rancher # kubectl get pods -n kube-system
W0706 04:02:12.184752   26687 loader.go:223] Config not found: /etc/rancher/rke2/rke2.yaml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```
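For reference, the fix described above amounts to something like this (a hypothetical sketch: the real config.yaml also carries the join token and role, and both addresses below are placeholders):

```
# /etc/rancher/rancherd/config.yaml on the joining node
# before: the server URL used the VIP's DNS name
#   server: https://harvester-vip.example.local:443
# after: the server URL uses the VIP address directly
#   server: https://10.0.0.10:443
systemctl restart rancherd
```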
g
second node will not have the kubeconfig file.. you need to run this from the master node
βœ… 1
kubectl get nodes
on first node and see if it shows up there
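On the master node the kubeconfig sits at /etc/rancher/rke2/rke2.yaml (the same path the error above was looking for), so something like this should work there:

```
# on the master node
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
kubectl get nodes -o wide
```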
b
Both nodes are available!
g
looks like it joined 8m ago
βœ… 1
πŸ™Œ 1
kubectl get pods -A
b
I altered this file
/etc/rancher/rancherd/config.yaml
manually. If the node restarts, is this config lost?
g
yes
πŸ˜“ 1
g
rancherd will not come into play once rke2 is initialised
b
Ok. How do I make this setting permanent?
g
in
/oem/90_custom.yaml
you will find the config for /etc/rancher/rancherd/config.yaml
πŸ™Œ 1
it's a cloud-init.. you need to make changes there and they will persist across reboots
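The relevant fragment of /oem/90_custom.yaml follows the yip cloud-init schema and looks roughly like this (a sketch; the actual file on a node is longer, and the server address is a placeholder):

```
# excerpt of /oem/90_custom.yaml (yip schema)
stages:
  initramfs:
    - files:
        - path: /etc/rancher/rancherd/config.yaml
          content: |
            server: https://10.0.0.10:443   # keep the VIP IP here so the fix survives reboots
```

Edit the content under that files entry, and the OS will re-render /etc/rancher/rancherd/config.yaml from it on every boot.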
b
Ah. Ok!
I will modify it and restart the server to test
Once the modification is made, should I restart some service to make it effective before restarting the server?
g
no service restart is needed
OS will apply changes on reboot
b
Ok!
@great-bear-19718 you are my hero!
Everything worked! Thanks for the strong support!