# harvester
b
What I’ve tried:
• I’ve tried to configure kube-vip by specifying `.spec.values.kube-vip.env.vip_interface = mgmt-br.4000`, but the VIP is deployed on the `mgmt-br` interface anyway (most likely the interface is picked based on the default IP route); see the patch sketch after the output below
• the internal IP of the node is identified incorrectly:
```
kubectl get node -o wide
NAME                 STATUS   ROLES                       AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION                 CONTAINER-RUNTIME
dedicated-server-1   Ready    control-plane,etcd,master   46h   v1.24.11+rke2r1   144.76.X.X   <none>        Harvester v1.1.2   5.14.21-150400.24.60-default   containerd://1.6.15-k3s1
```
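For reference, the kube-vip override I tried looks roughly like this (just a sketch; the ManagedChart name and namespace are my assumptions based on the `.spec.values` path above, not confirmed):
```
# Sketch: setting the kube-vip interface via the harvester ManagedChart.
# "harvester" / "fleet-local" are assumptions, not confirmed values.
kubectl -n fleet-local patch managedchart harvester --type merge \
  -p '{"spec":{"values":{"kube-vip":{"env":{"vip_interface":"mgmt-br.4000"}}}}}'
```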
g
Seems like there is a lot going on.. I assume you specified a VLAN ID when configuring the management interface?
also a support-bundle may be nice to try and figure out what is going on
b
yes, that's right, I've specified a VLAN ID when configuring the mgmt interface
g
are you running this in Hetzner cloud?
b
@great-bear-19718 is there an option to export a support bundle when the harvester interface is down? (for example using kubectl)
g
not sure.. I will need to check..
you would have to collect the bundle manually
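something along these lines as a starting point (just a sketch of what I'd grab by hand, not the exact bundle contents; the harvester deployment name is an assumption):
```
# Rough sketch: collect the basics manually when the UI is unreachable
journalctl --no-pager -u rke2-server.service -b > rke2-server.log
kubectl get nodes -o wide > nodes.txt
kubectl -n harvester-system get pods -o wide > harvester-pods.txt
kubectl -n harvester-system logs deploy/harvester --tail=500 > harvester.log   # deployment name assumed
tar czf manual-bundle.tgz rke2-server.log nodes.txt harvester-pods.txt harvester.log
```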
b
I'm deploying Harvester on dedicated servers in Hetzner, then I have a Mikrotik RouterOS instance in Hetzner Cloud
the connection between RouterOS and Harvester is made using a vSwitch (VLAN 4000) and because of a Hetzner limitation, this interface does not have internet
today I will try to make Calico identify the internal IP using a specific interface (mgmt-br.4000) and not by using the default route (which will resolve to internet)
because I saw an error saying that etcd contains 10.1.1.10 and that node is accessible at 144.76.X.X
g
ok
I am still not sure I completely understand the setup, but I can still try and help
b
https://docs.tigera.io/calico/latest/networking/ipam/ip-autodetection - for some reason, the node IP wasn't changed after I changed Calico's IP_AUTODETECTION_METHOD env variable to interface=mgmt-br.4000
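For the record, I set it roughly like this (a sketch; the daemonset and container names assume RKE2's bundled Canal deployment):
```
# Sketch: point Calico's IP autodetection at the VLAN interface.
# daemonset/container names assume RKE2's default Canal setup.
kubectl -n kube-system set env daemonset/rke2-canal -c calico-node \
  IP_AUTODETECTION_METHOD=interface=mgmt-br.4000
```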
I will do my best to give you all the details
```
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp7s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1400 qdisc mq master mgmt-bo state UP group default qlen 1000
    link/ether 7c:10:c9:21:f3:36 brd ff:ff:ff:ff:ff:ff
3: mgmt-br: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:10:c9:21:XX:XX brd ff:ff:ff:ff:ff:ff
    inet 144.76.XX.XX/27 brd 144.76.XX.XX scope global mgmt-br
       valid_lft forever preferred_lft forever
4: mgmt-bo: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1400 qdisc noqueue master mgmt-br state UP group default qlen 1000
    link/ether 7c:10:c9:21:f3:36 brd ff:ff:ff:ff:ff:ff
5: mgmt-br.4000@mgmt-br: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default qlen 1000
    link/ether 7c:10:c9:21:f3:36 brd ff:ff:ff:ff:ff:ff
    inet 10.1.1.10/24 brd 10.1.1.255 scope global mgmt-br.4000
       valid_lft forever preferred_lft forever
    inet 10.1.1.100/32 scope global mgmt-br.4000
       valid_lft forever preferred_lft forever
9: cali296fc1af7ea@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-7dbaa863-2bc9-a54c-007d-4794756b6189
....

ip r
default via 144.76.XX.XX dev mgmt-br proto dhcp
10.1.0.0/16 via 10.1.1.1 dev mgmt-br.4000
10.1.1.0/24 dev mgmt-br.4000 proto kernel scope link src 10.1.1.10
10.52.0.172 dev cali296fc1af7ea scope link
...
```
If I remove the default route via mgmt-br and replace it with one via 10.1.1.1 (over mgmt-br.4000), then restart Harvester, everything works.
```
ip route delete default via 144.76.XX.XX dev mgmt-br proto dhcp
ip route add default via 10.1.1.1
```
- of course, we will make the changes persistent by saving them in `/etc/sysconfig/network/`
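Roughly like this, assuming the SUSE-style static routes file is what applies here (a sketch, not verified on Harvester's OS):
```
# /etc/sysconfig/network/routes
# SUSE static-route syntax: DESTINATION GATEWAY NETMASK INTERFACE
default 10.1.1.1 - -
```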
Harvester config:
```
managementinterface:
    interfaces:
    - name: enp7s0
      hwaddr: 7c:10:c9:21:XX:XX
    method: static
    ip: 10.1.1.10
    subnetmask: 255.255.255.0
    gateway: 10.1.1.1
    defaultroute: true
    bondoptions:
      miimon: "100"
      mode: active-backup
    mtu: 1400
    vlanid: 4000
  vip: 10.1.1.100
  viphwaddr: ""
  vipmode: static
```
```
kubectl get node -o wide
NAME                 STATUS   ROLES                       AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION                 CONTAINER-RUNTIME
dedicated-server-1   Ready    control-plane,etcd,master   46h   v1.24.11+rke2r1   144.76.X.X   <none>        Harvester v1.1.2   5.14.21-150400.24.60-default   containerd://1.6.15-k3s1
```
The problem is that we don't have internet access via 10.1.1.1; that's why I've set the default route via the mgmt-br gateway
```
journalctl --no-pager -u rke2-server.service -b -n 100
Jun 22 05:11:59 dedicated-server-1 rke2[31380]: time="2023-06-22T05:11:59Z" level=info msg="Defragmenting etcd database"
Jun 22 05:12:00 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:00Z" level=info msg="Failed to test data store connection: this server is a not a member of the etcd cluster. Found [dedicated-server-1-9a503659=https://10.1.1.10:2380], expect: dedicated-server-1-9a503659=https://144.76.XX.XX:2380"
Jun 22 05:12:03 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:03Z" level=info msg="Tunnel server egress proxy waiting for runtime core to become available"
Jun 22 05:12:04 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:04Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: https://127.0.0.1:9345/v1-rke2/readyz: 500 Internal Server Error"
Jun 22 05:12:05 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:05Z" level=info msg="Defragmenting etcd database"
Jun 22 05:12:05 dedicated-server-1 rke2[31380]: time="2023-06-22T05:12:05Z" level=info msg="Failed to test data store connection: this server is a not a member of the etcd cluster. Found [dedicated-server-1-9a503659=https://10.1.1.10:2380], expect: dedicated-server-1-9a503659=https://144.76.XX.XX:2380"
```
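A quick way to double-check what etcd actually has recorded (a sketch; the pod name and cert paths assume RKE2 defaults on this node):
```
# Sketch: list etcd members via the static etcd pod; cert paths are RKE2 defaults
kubectl -n kube-system exec etcd-dedicated-server-1 -- etcdctl \
  --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt \
  --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key \
  member list
```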
I’ve managed to solve the issue by adding:
```
- "content": |
        node-ip: 10.1.1.10
        node-external-ip: 144.76.XX.XX
      "encoding": ""
      "group": 0
      "owner": 0
      "ownerstring": ""
      "path": "/etc/rancher/rke2/config.yaml.d/90-harvester-network.yaml"
      "permissions": 384