# harvester
b
What I’ve tried:
• I’ve tried to configure kube-vip by specifying `.spec.values.kube-vip.env.vip_interface = mgmt-br.4000`, but the VIP is deployed on the `mgmt-br` interface anyway (most likely the interface is picked based on the default IP route); see the patch sketch after the output below
• the internal IP of the node is identified incorrectly:
```
kubectl get node -o wide
NAME                 STATUS   ROLES                       AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION                 CONTAINER-RUNTIME
dedicated-server-1   Ready    control-plane,etcd,master   46h   v1.24.11+rke2r1   144.76.X.X   <none>        Harvester v1.1.2   5.14.21-150400.24.60-default   containerd://1.6.15-k3s1
```
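For reference, the kube-vip override I tried looks roughly like this (just a sketch; the ManagedChart name and namespace are my assumptions based on the `.spec.values` path above, not confirmed):
```
# Sketch: setting the kube-vip interface via the harvester ManagedChart.
# "harvester" / "fleet-local" are assumptions, not confirmed values.
kubectl -n fleet-local patch managedchart harvester --type merge \
  -p '{"spec":{"values":{"kube-vip":{"env":{"vip_interface":"mgmt-br.4000"}}}}}'
```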
g
Seems like there is a lot going on.. I assume you specified a VLAN ID when configuring the management interface?
also a support-bundle may be nice to try and figure out what is going on
b
yes, that's right, I've specified a VLAN ID when configuring the mgmt interface
g
are you running this in Hetzner cloud?
b
@great-bear-19718 is there an option to export a support bundle when the harvester interface is down? (for example using kubectl)
g
not sure.. I will need to check..
you would have to collect the bundle manually
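something along these lines as a starting point (just a sketch of what I'd grab by hand, not the exact bundle contents; the harvester deployment name is an assumption):
```
# Rough sketch: collect the basics manually when the UI is unreachable
journalctl --no-pager -u rke2-server.service -b > rke2-server.log
kubectl get nodes -o wide > nodes.txt
kubectl -n harvester-system get pods -o wide > harvester-pods.txt
kubectl -n harvester-system logs deploy/harvester --tail=500 > harvester.log   # deployment name assumed
tar czf manual-bundle.tgz rke2-server.log nodes.txt harvester-pods.txt harvester.log
```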
b
I'm deploying Harvester on dedicated servers in Hetzner, then I have a Mikrotik RouterOS instance in Hetzner Cloud
the connection between RouterOS and Harvester is made using a vSwitch (VLAN 4000) and because of a Hetzner limitation, this interface does not have internet
today I will try to make Calico identify the internal IP using a specific interface (mgmt-br.4000) and not by using the default route (which will resolve to internet)
because I saw an error saying that etcd contains 10.1.1.10 and that node is accessible at 144.76.X.X
g
ok
I am still not sure I completely understand the setup, but I can still try and help
b
https://docs.tigera.io/calico/latest/networking/ipam/ip-autodetection - for some reason, the node IP wasn't changed after I changed Calico's IP_AUTODETECTION_METHOD env variable to interface=mgmt-br.4000
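For the record, I set it roughly like this (a sketch; the daemonset and container names assume RKE2's bundled Canal deployment):
```
# Sketch: point Calico's IP autodetection at the VLAN interface.
# daemonset/container names assume RKE2's default Canal setup.
kubectl -n kube-system set env daemonset/rke2-canal -c calico-node \
  IP_AUTODETECTION_METHOD=interface=mgmt-br.4000
```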
I will do my best to give you all the details
```
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp7s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1400 qdisc mq master mgmt-bo state UP group default qlen 1000
    link/ether 7c:10:c9:21:f3:36 brd ff:ff:ff:ff:ff:ff
3: mgmt-br: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:10:c9:21:XX:XX brd ff:ff:ff:ff:ff:ff
    inet 144.76.XX.XX/27 brd 144.76.XX.XX scope global mgmt-br
       valid_lft forever preferred_lft forever
4: mgmt-bo: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1400 qdisc noqueue master mgmt-br state UP group default qlen 1000
    link/ether 7c:10:c9:21:f3:36 brd ff:ff:ff:ff:ff:ff
5: mgmt-br.4000@mgmt-br: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:10:c9:21:f3:36 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.10/24 brd 10.1.1.255 scope global mgmt-br.4000
       valid_lft forever preferred_lft forever
    inet 10.1.1.100/32 scope global mgmt-br.4000
       valid_lft forever preferred_lft forever
9: cali296fc1af7ea@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-7dbaa863-2bc9-a54c-007d-4794756b6189
....

ip r
default via 144.76.XX.XX dev mgmt-br proto dhcp
10.1.0.0/16 via 10.1.1.1 dev mgmt-br.4000
10.1.1.0/24 dev mgmt-br.4000 proto kernel scope link src 10.1.1.10
10.52.0.172 dev cali296fc1af7ea scope link
...
```
If I remove the default route via mgmt-br and replace it with one via 10.1.1.1 (over mgmt-br.4000), then restart Harvester, everything works.
```
ip route delete default via 144.76.XX.XX dev mgmt-br proto dhcp
ip route add default via 10.1.1.1
```
- of course, we will make the changes persistent by saving them in `/etc/sysconfig/network/`
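Roughly like this, assuming the SUSE-style static routes file is what applies here (a sketch, not verified on Harvester's OS):
```
# /etc/sysconfig/network/routes
# SUSE static-route syntax: DESTINATION GATEWAY NETMASK INTERFACE
default 10.1.1.1 - -
```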
Harvester config:
```
managementinterface:
    interfaces:
    - name: enp7s0
      hwaddr: 7c:10:c9:21:XX:XX
    method: static
    ip: 10.1.1.10
    subnetmask: 255.255.255.0
    gateway: 10.1.1.1
    defaultroute: true
    bondoptions:
      miimon: "100"
      mode: active-backup
    mtu: 1400
    vlanid: 4000
  vip: 10.1.1.100
  viphwaddr: ""
  vipmode: static
```
```
kubectl get node -o wide
NAME                 STATUS   ROLES                       AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION                 CONTAINER-RUNTIME
dedicated-server-1   Ready    control-plane,etcd,master   46h   v1.24.11+rke2r1   144.76.X.X   <none>        Harvester v1.1.2   5.14.21-150400.24.60-default   containerd://1.6.15-k3s1
```
The problem is that we don't have internet access via 10.1.1.1; that's why I've set the default route via the mgmt-br gateway
```
journalctl --no-pager -u rke2-server.service -b -n 100
Jun 22 05:11:59 dedicated-server-1 rke2[31380]: time="2023-06-22T05:11:59Z" level=info msg="Defragmenting etcd database"
Jun 22 05:12:00 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:00Z" level=info msg="Failed to test data store connection: this server is a not a member of the etcd cluster. Found [dedicated-server-1-9a503659=https://10.1.1.10:2380], expect: dedicated-server-1-9a503659=https://144.76.XX.XX:2380"
Jun 22 05:12:03 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:03Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jun 22 05:12:04 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:04Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
Jun 22 05:12:05 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:05Z" level=info msg="Defragmenting etcd database"
Jun 22 05:12:05 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:05Z" level=info msg="Failed to test data store connection: this server is a not a member of the etcd cluster. Found [dedicated-server-1-9a503659=https://10.1.1.10:2380], expect: dedicated-server-1-9a503659=https://144.76.XX.XX:2380"
```
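A quick way to double-check what etcd actually has recorded (a sketch; the pod name and cert paths assume RKE2 defaults on this node):
```
# Sketch: list etcd members via the static etcd pod; cert paths are RKE2 defaults
kubectl -n kube-system exec etcd-dedicated-server-1 -- etcdctl \
  --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt \
  --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key \
  member list
```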
I’ve managed to solve the issue by adding:
```
- "content": |
        node-ip: 10.1.1.10
        node-external-ip: 144.76.XX.XX
      "encoding": ""
      "group": 0
      "owner": 0
      "ownerstring": ""
      "path": "/etc/rancher/rke2/config.yaml.d/90-harvester-network.yaml"
      "permissions": 384