rapid-jelly-9995 10/17/2022, 12:12 PM
Error from server: Get "https://192.168.1.100:10351/containerLogs/kubeedge/edgemesh-agent-q78bp/edgemesh-agent": x509: cannot validate certificate for 192.168.1.100 because it doesn't contain any IP SANs

Now, my question: is it somehow possible to accept nodes blindly, without validating their certificates? One thing to mention: this is a local-only dev cluster.
late-needle-80860 10/17/2022, 12:49 PM
rapid-jelly-9995 10/18/2022, 6:23 AM
$ kubectl get nodes -o wide
NAME                             STATUS   ROLES                  AGE   VERSION                                            INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k3d-edgefarm-core-dev-agent-1    Ready    <none>                 17h   v1.22.15+k3s1                                      192.168.16.3    <none>        K3s dev              5.17.0-1016-oem     containerd://1.5.13-k3s1
k3d-edgefarm-core-dev-server-0   Ready    control-plane,master   17h   v1.22.15+k3s1                                      192.168.16.4    <none>        K3s dev              5.17.0-1016-oem     containerd://1.5.13-k3s1
k3d-edgefarm-core-dev-agent-0    Ready    <none>                 17h   v1.22.15+k3s1                                      192.168.16.2    <none>        K3s dev              5.17.0-1016-oem     containerd://1.5.13-k3s1
clownfish                        Ready    agent,edge             10h   v1.22.6-kubeedge-v1.11.1-12+0d66acb85546eb-dirty   192.168.1.100   <none>        Ubuntu 22.04.1 LTS   5.15.0-1012-raspi   docker://20.10.18

Here are my pods running on my edge node:

$ kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=clownfish
NAMESPACE   NAME                                    READY   STATUS    RESTARTS   AGE     IP              NODE        NOMINATED NODE   READINESS GATES
kubeedge    edgemesh-agent-lcttg                    1/1     Running   0          3m29s   192.168.1.100   clownfish   <none>           <none>
nodegroup   example-app-clownfish-7c4f5654c-x5cqb   2/2     Running   0          60s     172.17.0.2      clownfish   <none>           <none>

kubectl exec and logs calls try to talk directly to the node to execute the request there. This cannot work for edge devices, because they are very likely located in a completely different network, or may be connected via an LTE modem. Thus, these calls from kubectl are forwarded, using iptables, to the cloudcore service running on a cloud node. So what's happening is that the request targets 192.168.1.100 but gets redirected to the control-plane nodes (via the iptables-manager pod).
$ kubectl get pods -o wide -n kubeedge
NAME                                           READY   STATUS    RESTARTS      AGE     IP              NODE                             NOMINATED NODE   READINESS GATES
edgemesh-agent-rhnzj                           1/1     Running   1 (11h ago)   17h     192.168.16.4    k3d-edgefarm-core-dev-server-0   <none>           <none>
kubeedge-controller-manager-85b5d8f6fb-q94x8   1/1     Running   1 (11h ago)   17h     10.42.2.10      k3d-edgefarm-core-dev-agent-1    <none>           <none>
edgemesh-agent-56t9b                           1/1     Running   2 (11h ago)   17h     192.168.16.3    k3d-edgefarm-core-dev-agent-1    <none>           <none>
edgemesh-agent-t5b4v                           1/1     Running   1 (11h ago)   17h     192.168.16.5    k3d-edgefarm-core-dev-agent-0    <none>           <none>
iptables-manager-x6bqt                         1/1     Running   0             10h     192.168.16.4    k3d-edgefarm-core-dev-server-0   <none>           <none>
cloudcore-59bbfcf5c7-lmxtx                     2/2     Running   0             10h     192.168.16.2    k3d-edgefarm-core-dev-agent-0    <none>           <none>
edgemesh-agent-lcttg                           1/1     Running   0             9m43s   192.168.1.100   clownfish                        <none>           <none>

See the iptables rules placed on the control-plane node:

$ kubectl exec -it iptables-manager-x6bqt sh
/ # iptables -t nat -L TUNNEL-PORT
Chain TUNNEL-PORT (2 references)
target   prot   opt   source     destination
DNAT     tcp    --    anywhere   anywhere      tcp dpt:10353 to:192.168.16.2:10003
DNAT     tcp    --    anywhere   anywhere      tcp dpt:10351 to:192.168.16.3:10003
DNAT     tcp    --    anywhere   anywhere      tcp dpt:10352 to:192.168.16.5:10003

So, when I exec into or get logs from a pod that is running on the edge node, kubectl calls it directly with the edge node's address, 192.168.1.100. The request gets redirected by the iptables rule to the node cloudcore is running on, to port 10003, where a cloudcore port is opened. Cloudcore acts as a proxy for API calls to the edge node. Now, the problem is that the kubectl request sent to the API server still has 192.168.1.100 in it, while the cloudcore server certificate the request gets redirected to does not have the edge node's IP address in its IP SANs. Therefore the request gets rejected. I added an IP address to the IP SANs of the cloudcore server certificate just to verify that this is the point of failure. When I retrieve logs from an edge node's pod I get the following error:
$ kubectl logs edgemesh-agent-lcttg
Error from server: Get "https://192.168.1.100:10353/containerLogs/kubeedge/edgemesh-agent-lcttg/edgemesh-agent": x509: certificate is valid for 0.0.0.0, not 192.168.1.100

So, this is definitely the issue.
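The failing SAN check can be reproduced locally with openssl, without the cluster. This is a minimal sketch (the file paths and the `0.0.0.0` SAN mirror the certificate described above; nothing here touches a real cloudcore cert):

```shell
# Generate a self-signed cert whose only IP SAN is 0.0.0.0, like the
# cloudcore serving certificate in the error above.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/cloudcore-test.key -out /tmp/cloudcore-test.crt \
  -subj "/CN=cloudcore" -addext "subjectAltName=IP:0.0.0.0"

# Chain verification succeeds (the cert is used as its own CA), but the
# IP check fails the same way kubectl's TLS verification does:
openssl verify -CAfile /tmp/cloudcore-test.crt \
  -verify_ip 192.168.1.100 /tmp/cloudcore-test.crt
```

The second command reports an IP address mismatch, which is exactly the condition behind the "certificate is valid for 0.0.0.0, not 192.168.1.100" error.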
However, kubectl logs does allow skipping this backend verification; from the help text:

--insecure-skip-tls-verify-backend=false: Skip verifying the identity of the kubelet that logs are requested from. In theory, an attacker could provide invalid log content back. You might want to use this if your kubelet serving certificates have expired.

So the same logs request with --insecure-skip-tls-verify-backend enabled lets me see the pod's logs:

$ kubectl logs edgemesh-agent-lcttg --insecure-skip-tls-verify-backend
I1018 05:56:26.085654       1 server.go:55] Version: v1.12.0-dirty
I1018 05:56:26.085891       1 server.go:89] Prepare agent to run
I1018 05:56:26.088853       1 netif.go:96] bridge device edgemesh0 already exists
I1018 05:56:26.089349       1 server.go:93] edgemesh-agent running on EdgeMode
...

I could live with that fact, but the kubectl exec feature doesn't allow this flag or anything that does the same. I hope you can understand my problem; if not, please let me know and I'll try to elaborate a bit more. The only possible solution I can think of right now: predefine a range of IP addresses my edge node can get in my local dev environment.
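That workaround can be sketched with openssl: issue the cloudcore serving certificate with every address the edge node could possibly get as an IP SAN. This is an illustration only, the address range is made up, and how the SANs actually end up in the cloudcore cert depends on your KubeEdge setup (as far as I know, the advertiseAddress list in cloudcore.yaml feeds the cert SANs, but check that for your version):

```shell
# Illustration: build a SAN list covering a predefined range the edge
# node may get from DHCP (192.168.1.100-103 here, made up).
SANS="IP:0.0.0.0"
for i in $(seq 100 103); do
  SANS="$SANS,IP:192.168.1.$i"
done

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout /tmp/cloudcore-range.key -out /tmp/cloudcore-range.crt \
  -subj "/CN=cloudcore" -addext "subjectAltName=$SANS"

# Any IP from the predefined range now passes the SAN check:
openssl verify -CAfile /tmp/cloudcore-range.crt \
  -verify_ip 192.168.1.102 /tmp/cloudcore-range.crt
```

With that, whichever address in the range the edge node receives, the redirected request would hit a certificate that covers it.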