This message was deleted Rancher Users #longhorn-storage

# longhorn-storage

adamant-kite-43734

04/18/2023, 7:08 AM

This message was deleted.

billowy-painting-56466

04/19/2023, 12:36 AM

The script is attempting to find out the host OS of the nodes, what OS are you running the Kubernetes on?

nutritious-oxygen-89191

04/19/2023, 6:26 AM

The OS on all nodes is Ubuntu 22.04. The result of

grep -E "^ID_LIKE=" /etc/os-release | cut -d= -f2

on all nodes is

debian
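A minimal sketch of the same lookup, wrapped in a function so it can be pointed at any os-release style file. The function name `os_family` and the fallback to `ID` are assumptions added for illustration; nothing in this thread says the check script does this. The fallback covers files that carry no `ID_LIKE` line, and `tr` strips the quotes some distributions put around the value.

```shell
# Hypothetical helper: print the distro family from an os-release style file.
os_family() {
  local file="${1:-/etc/os-release}" value
  # Same pipeline as the one-liner above, with surrounding quotes removed.
  value=$(grep -E '^ID_LIKE=' "$file" | cut -d= -f2 | tr -d '"')
  # Assumption: fall back to ID when the file has no ID_LIKE line.
  if [ -z "$value" ]; then
    value=$(grep -E '^ID=' "$file" | cut -d= -f2 | tr -d '"')
  fi
  printf '%s\n' "$value"
}

# Example: os_family /etc/os-release
```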

billowy-painting-56466

04/19/2023, 6:39 AM

Oh wait, I missed the first line:

Error from server: error dialing backend: x509: certificate is valid for 127.0.0.1, not xxx.xxx.xxx.xxx

It looks like the check script failed while executing kubectl. Are you running it locally on one of the K8s nodes?

nutritious-oxygen-89191

04/19/2023, 6:58 AM

I am running the script from my local machine

nutritious-oxygen-89191

04/19/2023, 6:59 AM

K3s 1.25.6 with Rancher 2.7.2

billowy-painting-56466

04/19/2023, 7:02 AM

The certificate appears to be valid for the Kubernetes API server. Could you try running the script on a control plane node?

nutritious-oxygen-89191

04/19/2023, 7:02 AM

will try, give me a moment

nutritious-oxygen-89191

04/19/2023, 7:06 AM

ok, I ran the script on one of the three control plane nodes and this is the result

Error from server: error dialing backend: x509: certificate is valid for 127.0.0.1, not xxx.xxx.xxx.xxx
[ERROR] Unable to detect kernel release on node geo-node1.

geo-node1

is a worker (4th node)

nutritious-oxygen-89191

04/19/2023, 7:07 AM

But xx.xxx.xxx.xxx is not the IP of the worker; it is the IP of another control plane node.

billowy-painting-56466

04/19/2023, 7:19 AM

You need to resolve the certificate issue first. I suggest checking with the Rancher folks.

nutritious-oxygen-89191

04/19/2023, 8:28 AM

Could you help me figure out which command in the script is causing the certificate error? I assume it is not the same one that you posted earlier. Thanks.

billowy-painting-56466

04/19/2023, 8:31 AM

Are you able to run

kubectl get ns

nutritious-oxygen-89191

04/19/2023, 8:32 AM

NAME                                     STATUS   AGE
cattle-fleet-clusters-system             Active   39d
cattle-fleet-local-system                Active   39d
cattle-fleet-system                      Active   39d
cattle-global-data                       Active   39d
cattle-global-nt                         Active   39d
cattle-impersonation-system              Active   39d
cattle-monitoring-system                 Active   28d
cattle-system                            Active   44h
cluster-fleet-local-local-1a3d67d0a899   Active   39d
default                                  Active   39d
fleet-default                            Active   39d
fleet-local                              Active   39d
gi                                       Active   39d
kube-node-lease                          Active   39d
kube-public                              Active   39d
kube-system                              Active   39d
local                                    Active   39d
longhorn-system                          Active   13d
node-feature-discovery                   Active   27d
p-d92bp                                  Active   13d
p-h6p5f                                  Active   13d
p-lwjbx                                  Active   13d
p-xgtff                                  Active   13d
p-xsq75                                  Active   13d
user-6v7ft                               Active   13d
whatwhatwhy                              Active   33d

billowy-painting-56466

04/19/2023, 8:33 AM

How about

kubectl exec -it <any-pod> -- bash

nutritious-oxygen-89191

04/19/2023, 8:35 AM

works for a test pod in whatwhatwhy

kubectl exec -it -n whatwhatwhy box1-798d6dd5d6-mjftc -- bash

billowy-painting-56466

04/19/2023, 8:36 AM

hmm

nutritious-oxygen-89191

04/19/2023, 8:36 AM

I can also do

kubectl exec -it overlaytest-4vfff -- bash

billowy-painting-56466

04/19/2023, 8:38 AM

Try to create this DaemonSet.

billowy-painting-56466

04/19/2023, 8:38 AM

Then execute

kubectl exec -i <daemonset-pod> -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'grep -E "^ID_LIKE=" /etc/os-release | cut -d= -f2'

billowy-painting-56466

04/19/2023, 8:40 AM

kubectl exec -i <daemonset-pod> -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'uname -r'

nutritious-oxygen-89191

04/19/2023, 11:19 AM

$ kubectl get pods -o wide | grep longhorn-environment-check
longhorn-environment-check-6d64z   1/1     Running                 0                7m7s   10.42.2.157      gi-rm1      <none>           <none>
longhorn-environment-check-cwk62   1/1     Running                 0                7m7s   10.42.1.252      geo-node1   <none>           <none>
longhorn-environment-check-rjkhf   1/1     Running                 0                7m7s   10.42.0.159      gi-rm0      <none>           <none>
longhorn-environment-check-xmtqs   1/1     Running                 0                7m7s   10.42.3.21       gi-rm2      <none>           <none>

nutritious-oxygen-89191

04/19/2023, 11:20 AM

$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'grep -E "^ID_LIKE=" /etc/os-release | cut -d= -f2'
debian
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'uname -r'
5.15.0-69-generic
$ DS=longhorn-environment-check-cwk62
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'grep -E "^ID_LIKE=" /etc/os-release | cut -d= -f2'
debian
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'uname -r'
5.15.0-69-generic
$ DS=longhorn-environment-check-rjkhf
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'grep -E "^ID_LIKE=" /etc/os-release | cut -d= -f2'
Error from server: error dialing backend: x509: certificate is valid for 127.0.0.1, not 82.165.18.79
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'uname -r'
Error from server: error dialing backend: x509: certificate is valid for 127.0.0.1, not 82.165.18.79
$ DS=longhorn-environment-check-xmtqs
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'grep -E "^ID_LIKE=" /etc/os-release | cut -d= -f2'
debian
$ kubectl exec -i ${DS} -- nsenter --mount=/proc/1/ns/mnt -- bash -c 'uname -r'
5.15.0-69-generic
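The per-pod checks above can be run in one pass. This is a rough sketch, not the check script itself: the function name `check_nodes` is hypothetical, and it assumes the pods live in the current namespace and share the `longhorn-environment-check` name prefix seen in the listing above. It prints one OK or FAIL line per node so the node with the certificate problem stands out.

```shell
# Hypothetical helper: run a host-level command through every
# longhorn-environment-check pod and report the result per node.
check_nodes() {
  local host_cmd="$1" pod node out
  kubectl get pods --no-headers \
      -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
    grep '^longhorn-environment-check' |
    while read -r pod node; do
      # </dev/null keeps "kubectl exec -i" from consuming the loop's input.
      if out=$(kubectl exec -i "$pod" -- nsenter --mount=/proc/1/ns/mnt -- \
                 bash -c "$host_cmd" 2>&1 </dev/null); then
        printf 'OK    %s (%s): %s\n' "$node" "$pod" "$out"
      else
        printf 'FAIL  %s (%s): %s\n' "$node" "$pod" "$out"
      fi
    done
}

# Example: check_nodes 'uname -r'
```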

billowy-painting-56466

04/19/2023, 12:55 PM

It seems that only one of the nodes is having the certificate issue. This needs to be resolved first since the check script runs those commands on all DaemonSet pods.
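"error dialing backend" is what the API server reports when it cannot reach the kubelet on a node to carry out an exec, so one way to narrow this down is to look at which names and IPs the affected node's serving certificate actually covers. The sketch below is only an illustration: both function names are hypothetical, and port 10250 (the default kubelet port) is assumed rather than confirmed for this cluster. It needs OpenSSL 1.1.1 or newer for the `-ext` option.

```shell
# Hypothetical helper: print the subjectAltName entries of a PEM certificate
# read from stdin.
cert_sans() {
  openssl x509 -noout -ext subjectAltName
}

# Hypothetical helper: fetch the certificate a node presents on a port and
# list its SANs. Port 10250 is an assumption (default kubelet port).
node_cert_sans() {
  local host="$1" port="${2:-10250}"
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null | cert_sans
}

# Example: node_cert_sans 82.165.18.79
```

If the node's own IP is missing from the list and only 127.0.0.1 appears, that would match the x509 error shown above.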

nutritious-oxygen-89191

04/19/2023, 12:56 PM

I replaced the node (it was just a small cloud server) and now the issue has moved to another node

geo-node1

🤔 1

billowy-painting-56466

04/20/2023, 1:04 AM

Maybe this is relevant.

nutritious-oxygen-89191

04/20/2023, 7:53 AM

I destroyed the cluster and recreated everything. The script now comes back without any complaint. Thanks!

🙌 1
