# harvester
a
A few questions:
1. Is 192.168.0.17 the VIP/management IP of the Harvester cluster?
2. Was this cluster installed from ISO/PXE?
3. Is 192.168.0.17 pingable?
Normally, no specific router config is needed.
l
1. Yes 2. Yes 3. No
After doing some digging, I tried accessing the NodePort and the ingress itself via `kubectl port-forward svc/ingress-expose -n kube-system :443`, but I got the following error:
error: error upgrading connection: unable to upgrade connection: pod not found ("rke2-ingress-nginx-controller-s9kp8_kube-system")
Running `kubectl get pods -A` shows a lot of crashes and errors:
NAMESPACE                   NAME                                                     READY   STATUS                  RESTARTS         AGE
cattle-fleet-local-system   fleet-agent-75f5945649-jtbdr                             0/1     Unknown                 1                6d19h
cattle-fleet-system         fleet-controller-7bcbb6d755-5x2v2                        0/1     Unknown                 1                6d19h
cattle-fleet-system         gitjob-564b9c87dd-lt8kr                                  0/1     Unknown                 1                6d19h
cattle-system               harvester-cluster-repo-748879959-xn66h                   0/1     Unknown                 1                6d19h
cattle-system               rancher-59f9ff7bb7-c2ln8                                 0/1     Unknown                 1                6d19h
cattle-system               rancher-webhook-948d698dc-75w8q                          0/1     Unknown                 1                6d19h
cattle-system               system-upgrade-controller-74db4c9dd-f2j52                0/1     Unknown                 1                6d19h
harvester-system            defaultnet1-cd2080f8-mgnwj                               0/1     Completed               0                6d18h
harvester-system            harvester-5454f556dd-qhcmf                               0/1     Unknown                 2                6d19h
harvester-system            harvester-load-balancer-5489cc4cc9-lkddb                 0/1     Unknown                 1                6d19h
harvester-system            harvester-load-balancer-webhook-6f7c496f74-dpm9t         0/1     Unknown                 1                6d19h
harvester-system            harvester-network-controller-manager-cc74876c8-j4rcp     1/1     Running                 3 (33m ago)      6d19h
harvester-system            harvester-network-controller-rrmcp                       0/1     CrashLoopBackOff        17 (96s ago)     6d19h
harvester-system            harvester-network-webhook-859b796897-kkxdj               0/1     Unknown                 1                6d19h
harvester-system            harvester-node-disk-manager-9vkds                        0/1     CrashLoopBackOff        18 (2m10s ago)   6d19h
harvester-system            harvester-node-manager-ldxbq                             0/1     Unknown                 1                6d19h
harvester-system            harvester-pcidevices-controller-6bf9j                    0/1     Unknown                 1                6d18h
harvester-system            harvester-webhook-64bc6969f8-wmqth                       0/1     Unknown                 1                6d19h
harvester-system            helm-install-pcidevices-controller-d8b74                 0/1     Completed               0                6d18h
harvester-system            kube-vip-cr7vg                                           1/1     Running                 3 (33m ago)      6d19h
harvester-system            virt-api-7d684bf677-sbwx7                                0/1     Unknown                 1                6d19h
harvester-system            virt-controller-56b798bbf8-dw5cg                         0/1     Unknown                 1                6d19h
harvester-system            virt-controller-56b798bbf8-plwxx                         0/1     Unknown                 1                6d19h
harvester-system            virt-handler-8k6n9                                       0/1     Unknown                 1                6d19h
harvester-system            virt-operator-5b4cfdf9dc-t96p8                           0/1     Unknown                 1                6d19h
kube-system                 cloud-controller-manager-hv-1                            1/1     Running                 6 (32m ago)      6d19h
kube-system                 etcd-hv-1                                                1/1     Running                 3 (33m ago)      6d19h
kube-system                 harvester-snapshot-validation-webhook-6bbc68b6db-67nbk   0/1     Unknown                 1                6d19h
kube-system                 harvester-snapshot-validation-webhook-6bbc68b6db-wsj2n   0/1     Unknown                 1                6d19h
kube-system                 harvester-whereabouts-ksnmk                              0/1     CrashLoopBackOff        17 (100s ago)    6d19h
kube-system                 helm-install-rke2-canal-dz4sl                            0/1     Completed               0                6d19h
kube-system                 helm-install-rke2-coredns-kl4zs                          0/1     Completed               0                6d19h
kube-system                 helm-install-rke2-ingress-nginx-m8kj4                    0/1     Completed               0                6d19h
kube-system                 helm-install-rke2-metrics-server-mscsj                   0/1     Completed               0                6d19h
kube-system                 helm-install-rke2-multus-jhfq2                           0/1     Completed               0                6d19h
kube-system                 kube-apiserver-hv-1                                      1/1     Running                 3 (33m ago)      6d19h
kube-system                 kube-controller-manager-hv-1                             1/1     Running                 7 (32m ago)      6d19h
kube-system                 kube-scheduler-hv-1                                      1/1     Running                 4 (33m ago)      6d19h
kube-system                 rke2-canal-mbm2l                                         0/2     Init:CrashLoopBackOff   17 (95s ago)     6d19h
kube-system                 rke2-coredns-rke2-coredns-6b9548f79f-p4whd               0/1     Unknown                 1                6d19h
kube-system                 rke2-coredns-rke2-coredns-autoscaler-57647bc7cf-tc2h2    0/1     Unknown                 1                6d19h
kube-system                 rke2-ingress-nginx-controller-s9kp8                      0/1     Unknown                 1                6d19h
kube-system                 rke2-metrics-server-7d58bbc9c6-qtvk9                     0/1     Unknown                 1                6d19h
kube-system                 rke2-multus-ds-qt48x                                     1/1     Running                 3 (33m ago)      6d19h
kube-system                 snapshot-controller-7b76b577bc-6s6dq                     0/1     Unknown                 1                6d19h
kube-system                 snapshot-controller-7b76b577bc-kf9fj                     0/1     Unknown                 1                6d19h
longhorn-system             backing-image-manager-36c7-9dd8                          0/1     Unknown                 0                6d18h
longhorn-system             csi-attacher-6d46d6df-4fvhz                              0/1     Unknown                 1                6d19h
longhorn-system             csi-attacher-6d46d6df-mzkm7                              0/1     Unknown                 1                6d19h
longhorn-system             csi-attacher-6d46d6df-tnf87                              0/1     Unknown                 1                6d19h
longhorn-system             csi-provisioner-789fb46d84-5cf2l                         0/1     Unknown                 1                6d19h
longhorn-system             csi-provisioner-789fb46d84-dqbjn                         0/1     Unknown                 1                6d19h
longhorn-system             csi-provisioner-789fb46d84-psrtl                         0/1     Unknown                 1                6d19h
longhorn-system             csi-resizer-7b4c499ff9-lw94q                             0/1     Unknown                 1                6d19h
longhorn-system             csi-resizer-7b4c499ff9-pvqpf                             0/1     Unknown                 1                6d19h
longhorn-system             csi-resizer-7b4c499ff9-vz9bd                             0/1     Unknown                 1                6d19h
longhorn-system             csi-snapshotter-865d9b6f88-7wq22                         0/1     Unknown                 1                6d19h
longhorn-system             csi-snapshotter-865d9b6f88-r2vpv                         0/1     Unknown                 1                6d19h
longhorn-system             csi-snapshotter-865d9b6f88-xgh96                         0/1     Unknown                 1                6d19h
longhorn-system             engine-image-ei-1d169b76-trzcc                           0/1     Unknown                 1                6d19h
longhorn-system             instance-manager-e-ac8017729611c9056dff485e0a4a2efc      0/1     Unknown                 0                6d19h
longhorn-system             instance-manager-r-ac8017729611c9056dff485e0a4a2efc      0/1     Unknown                 0                6d19h
longhorn-system             longhorn-admission-webhook-86578b785d-6wslq              0/1     Init:Unknown            0                6d19h
longhorn-system             longhorn-admission-webhook-86578b785d-8bmzb              0/1     Init:Unknown            0                6d19h
longhorn-system             longhorn-conversion-webhook-67b57c79cb-m5mzc             0/1     Unknown                 1                6d19h
longhorn-system             longhorn-conversion-webhook-67b57c79cb-tbmjl             0/1     Unknown                 1                6d19h
longhorn-system             longhorn-csi-plugin-blq27                                0/3     Error                   569              6d19h
longhorn-system             longhorn-driver-deployer-76b8ddc678-cfn2w                0/1     Init:Unknown            0                6d19h
longhorn-system             longhorn-loop-device-cleaner-84z2r                       1/1     Running                 3 (33m ago)      6d19h
longhorn-system             longhorn-manager-mkk8s                                   0/1     Init:Unknown            1                6d19h
longhorn-system             longhorn-recovery-backend-5776964c79-96d8g               0/1     Unknown                 1                6d19h
longhorn-system             longhorn-recovery-backend-5776964c79-nm4rr               0/1     Unknown                 1                6d19h
longhorn-system             longhorn-ui-7464fdc77c-c5wsw                             0/1     Error                   282              6d19h
longhorn-system             longhorn-ui-7464fdc77c-j8pb9                             0/1     Error                   282              6d19h
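A listing this long is easier to read as a count per status. The following is a minimal sketch, assuming the default `kubectl get pods -A` column layout shown above with a header row; the function name `summarize_pod_status` is hypothetical and not part of kubectl or Harvester.

```shell
#!/usr/bin/env bash
# Count pods per STATUS from `kubectl get pods -A` output read on stdin.
# Assumes the default columns: NAMESPACE NAME READY STATUS RESTARTS AGE,
# with a header row on the first line (skipped via NR > 1).
summarize_pod_status() {
  awk 'NR > 1 && NF >= 4 { count[$4]++ }
       END { for (s in count) printf "%s %d\n", s, count[s] }' | LC_ALL=C sort
}

# Usage on a live cluster (not run here):
#   kubectl get pods -A | summarize_pod_status
```

A large `Unknown` count next to only a handful of `Running` control-plane pods, as in the listing above, suggests a node-level problem rather than a single failing workload.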
The system was working a few days ago, but then it got shut down. Here is some additional info from `kubectl describe nodes hv-1`:
a
Most pods are not coming up. If you have a `console`, it will show that the cluster is setting...
l
How long should it take? Yesterday I waited ~8h.
a
I guess something is broken and the cluster can't start up.
Let's check: `journalctl --unit=rke2-server`, `journalctl --unit=rancherd`, `journalctl --unit=rancher-system-agent`
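The three journal checks can be run in one pass. This is a minimal sketch, assuming it is run as root on the Harvester node; the helper names `filter_errors` and `check_units` are hypothetical, and the grep pattern is a loose heuristic rather than an official log format.

```shell
#!/usr/bin/env bash
# Units named in the thread.
UNITS="rke2-server rancherd rancher-system-agent"

# Keep only lines that look like errors; reads log text on stdin.
# The pattern is an assumed heuristic; adjust it to the actual log output.
filter_errors() {
  grep -iE 'error|fatal|failed' || true
}

# Print the last error-like lines of each unit for the current boot.
check_units() {
  for unit in $UNITS; do
    echo "== $unit =="
    journalctl --unit="$unit" --boot --no-pager | filter_errors | tail -n 20
  done
}

# Usage on the node (not run here):
#   check_units
```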
l
Ah, I think I see the issue... I think the hardware ID of the network cards changed once again, which caused the IP of the mgmt interface...
a
Oh...
The node IP must be stable.
l
I had the same issue some time ago. It was caused by plugging in a new NIC: enp0s6 was renamed to enp0s7, and the new NIC got enp0s6. Now it seems the same issue happened after creating a new network in the Harvester interface.
a
That's fundamental to a k8s cluster.
l
Yes, but this time it happened directly after I created a bridged network with the second interface.
a
Could you give the detailed steps this time? Was everything done from the Harvester WebUI?
l
Let me bind the MAC to the IP via the router. This will probably fix the issue for now.
a
When the node IP is assigned via DHCP, an IP/MAC binding on the DHCP server is required.
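One way to confirm this kind of mismatch is to compare the address Kubernetes recorded for the node with the addresses currently on the host. The following is a rough sketch under stated assumptions: `has_ipv4` is a hypothetical helper, `hv-1` is the node name from this thread, and the check parses the one-line output format of `ip -4 -o addr show`.

```shell
#!/usr/bin/env bash
# Succeeds if the given IPv4 address appears in `ip -4 -o addr show`
# output read on stdin. Field 3 is "inet" and field 4 is ADDRESS/PREFIX.
has_ipv4() {
  awk -v want="$1" '$3 == "inet" { split($4, a, "/"); if (a[1] == want) found = 1 }
                    END { exit found ? 0 : 1 }'
}

# Usage on the node (not run here):
#   recorded="$(kubectl get node hv-1 \
#     -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}')"
#   ip -4 -o addr show | has_ipv4 "$recorded" \
#     || echo "node IP $recorded is no longer on any interface"
```

If the recorded address is missing from the host, the DHCP lease most likely changed, which matches the symptom described above.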
l
Works, thanks!
a
Basically, do not change the network configuration inside the Harvester OS directly from the Linux shell.
l
The network broke directly after I created the following Cluster Network:
apiVersion: network.harvesterhci.io/v1beta1
kind: VlanConfig
metadata:
  annotations:
    network.harvesterhci.io/matched-nodes: '["hv-1"]'
  creationTimestamp: '2023-09-29T12:47:28Z'
  finalizers:
    - wrangler.cattle.io/harvester-network-vlanconfig-controller
  generation: 1
  labels:
    network.harvesterhci.io/clusternetwork: net2
  managedFields:
    - apiVersion: network.harvesterhci.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:clusterNetwork: {}
          f:uplink:
            .: {}
            f:bondOptions:
              .: {}
              f:miimon: {}
              f:mode: {}
            f:linkAttributes:
              .: {}
              f:txQLen: {}
            f:nics: {}
      manager: harvester
      operation: Update
      time: '2023-09-29T12:47:28Z'
    - apiVersion: network.harvesterhci.io/v1beta1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"wrangler.cattle.io/harvester-network-vlanconfig-controller": {}
      manager: harvester-network-controller
      operation: Update
      time: '2023-09-29T12:47:28Z'
  name: a1
  resourceVersion: '67035'
  uid: 9c473bbb-243e-4bb9-8fc0-94f139d81cce
spec:
  clusterNetwork: net2
  uplink:
    bondOptions:
      miimon: -1
      mode: active-backup
    linkAttributes:
      txQLen: -1
    nics:
      - enp7s0
a
Do you have multiple NICs in the server?
l
Yes, enp6s0 (mgmt) and enp7s0 (the additional one).
a
As you said:
"I had the same issue some time ago. It was caused by plugging in a new NIC: enp0s6 was renamed to enp0s7, and the new NIC got enp0s6. Now it seems the same issue happened after creating a new network in the Harvester interface."
Did it happen today or yesterday that those NICs swapped names?
The new cluster network is on `enp7s0`; none of its settings should affect `enp6s0`.
l
The NICs were renamed on 1.1.2. I purged the system and installed 1.2.0. The network was created. I installed some VMs. The node rebooted due to a power outage. The NICs were renamed after boot.
a
Could you help log an issue at https://github.com/harvester/harvester/issues and add the `lspci -v` information for your two NICs?
We need to understand clearly how the issue happens and look for a solution to make the NIC naming persistent. Thanks. cc @prehistoric-balloon-31801 @red-king-19196 @great-bear-19718
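For the issue report, the `lspci -v` output can be trimmed to the network cards only. This is a minimal sketch, assuming the NICs are listed by `lspci` as "Ethernet controller" and that device stanzas are separated by blank lines; `ethernet_stanzas` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Keep only the Ethernet controller stanzas from `lspci -v` output on stdin.
# A stanza starts at a line containing "Ethernet controller" and ends at the
# next blank line.
ethernet_stanzas() {
  awk '/Ethernet controller/ { keep = 1 } /^$/ { keep = 0 } keep'
}

# Usage on the node (not run here):
#   lspci -v | ethernet_stanzas
#   ip -br link   # interface names with their MAC addresses
```

Capturing both outputs before and after a reboot would show whether a given PCI address and MAC end up under a different interface name.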