Anyone around here ever used `virtual-kubelet` in ...
# rke2
b
Anyone around here ever used `virtual-kubelet` in conjunction with rke2? There seem to be some specific considerations around certs and 'sessions' between nodes for rke2 that I may need to sort out.
@creamy-pencil-82913 any chance we could sync up sometime and discuss? I have prototyped this and intend to move forward and would like to clean this up in some relatively sane way: https://github.com/virtual-kubelet/virtual-kubelet/issues/1294
I dug through significant chunks of code and ultimately boiled it down to the use of the `remotedialer` project and how the control plane connects outbound using that. From what I can tell there is an (understandable) expectation that all nodes are running the rke/k3s agent and some sort of custom listener. In the code that uses the remote dialer there do seem to be some switches about using the ‘proxy’ or not. I am wondering if it would be possible to apply a role or label or something of that nature on a per-node basis to force the communication to not use the ‘proxy’ (iirc that’s the term in the code) and to just dial using standard k8s semantics.
c
if there is no remotedialer session for a node it should fall back to just dialing it directly. Do your apiservers have connectivity to the virtual kubelet’s reported node IP?
that’s what I recall at least, it’s been a minute since I looked at it
b
Yes apiservers do have connectivity. I additionally see some logs spewed out on the terminal of virtual kubelet process where it doesn’t like what’s going on in tls land
So something is at least trying to connect.
That’s actually what the issue linked above is about. I am unclear what certs might be getting used by the api nodes that would differ from the certs/ca being used by the kubelet to connect to the api servers
c
where ARE you getting your certs from?
You can see the default CA topology here: https://docs.rke2.io/security/certificates#default-ca-topology There are undocumented APIs you can hit on the servers to generate kubelet client and serving certs, if you have a join token.
b
I currently am just using the admin kubeconfig from cluster creation. I pulled the ca and client key/crt out of that.
Certainly need to lock the kubeconfig down to a service account with limited permissions, but the ca for the client appears to be the same as used by the ‘serving’ aspect of the virtual-kubelet
c
ohhh… yeah that’s not going to work.
b
Pro tips accepted :)
c
that’s a client certificate, it is missing several critical fields that would prevent it from working for the kubelet serving cert.
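To see why a certificate pulled out of an admin kubeconfig is unlikely to work as a kubelet serving cert, it can help to look at its extensions. The thread does not list which fields are missing, so the sketch below assumes the usual suspects: the serverAuth extended key usage and a subjectAltName covering the node. The helper names `cert_text` and `check_serving_fields` are made up for illustration.

```shell
# Print the human-readable form of a PEM certificate.
# $1 = path to the certificate
cert_text() {
  openssl x509 -in "$1" -noout -text
}

# Read `openssl x509 -text` output on stdin and report whether the two
# fields assumed here to matter for a serving cert are present.
# Returns 0 if both are found, 1 otherwise.
check_serving_fields() {
  text=$(cat)
  rc=0
  if printf '%s\n' "$text" | grep -q 'TLS Web Server Authentication'; then
    echo "ok: serverAuth extended key usage present"
  else
    echo "missing: serverAuth extended key usage"
    rc=1
  fi
  if printf '%s\n' "$text" | grep -q 'X509v3 Subject Alternative Name'; then
    echo "ok: subjectAltName present"
  else
    echo "missing: subjectAltName"
    rc=1
  fi
  return "$rc"
}
```

Running `cert_text client.crt | check_serving_fields` against the cert from the admin kubeconfig would be one way to check this, where `client.crt` is whatever file the cert was saved to.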
b
The odd thing is that did all work with a crude kind cluster.
c
Hit the two endpoints I linked up above to get valid client and server certs for your kubelet. you should be able to figure out from the examples and tests what extra headers it wants.
b
Ok I will mess with it. However vk doesn’t really distinguish from what I can tell…you give it 3 paths, client key/cert and ca path. I am super unclear what files are being used in what context essentially.
c
kubelet needs more than a client cert, it also needs a serving cert since the apiserver needs to connect to it for `kubectl logs` and `kubectl exec`
b
Well, you have those 3 and the path to a kubeconfig…which doesn’t make a lot of sense to me as typically the client key/cert are in the kubeconfig
c
are you sure it’s not asking for server cert/key/ca and client kubeconfig?
that would be more in line with what I would expect
b
Right, but which vk options correlate to those is the question really.
I am specifically using this: https://github.com/agoda-com/macOS-vz-kubelet
Maybe I am misreading the args/env vars as documented there? Do you read those differently than I do about their meaning and use?
Maybe this is the serving cert for example? `APISERVER_CERT_LOCATION`
c
yeah… but maybe the docs are just bad lol
or maybe it’s not actually functional and doesn’t listen at all or with a valid cert, idk
b
Possibly. I will see if I can sort out how to manually create a join token and then generate a serving cert/key and find the appropriate ca and give it a go.
c
you might open an issue asking how to configure the kubelet serving cert
b
Well, as I say, it worked ootb with kind
i.e. no logs puking out tls stuff it didn’t like, and exec and logs were functional
If you are aware of any other relevant links about creating such files with rke2 or blog post or whatever please send, I will dig in to what you sent and see if I can sort it either way..thanks for the kind help and pointers!
c
i commented on your issue
b
@creamy-pencil-82913 is this the CA file I should be using?
cat /var/lib/rancher/rke2/server/tls/server-ca.crt
and is there any particular reason I would not be able to use #3 from here with rke2? https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/#kubernetes-signers
c
That should work too, if you prefer to use that instead.
Or you could just use curl....
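For the CSR route, a rough sketch follows. It assumes "#3" in the linked list means the `kubernetes.io/kubelet-serving` signer, which the Kubernetes docs describe as wanting organization `system:nodes`, a common name starting with `system:node:`, at least one DNS or IP subjectAltName, and the usages `digital signature` and `server auth`. The function names, node name, and file prefix are placeholders, and `-addext` needs OpenSSL 1.1.1 or newer.

```shell
# Create a key and CSR with the subject the kubelet-serving signer expects.
# $1 = node name, $2 = node IP, $3 = output file prefix
make_kubelet_serving_csr() {
  openssl req -new -newkey rsa:2048 -nodes \
    -keyout "$3.key" -out "$3.csr" \
    -subj "/O=system:nodes/CN=system:node:$1" \
    -addext "subjectAltName=IP:$2"
}

# Print a CertificateSigningRequest manifest for an existing PEM CSR file.
# $1 = name for the CSR object, $2 = path to the PEM CSR
render_kubelet_serving_csr() {
  request=$(base64 < "$2" | tr -d '\n')
  cat <<EOF
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: $1
spec:
  request: $request
  signerName: kubernetes.io/kubelet-serving
  usages:
    - digital signature
    - server auth
EOF
}

# Possible usage (not run here; names are placeholders):
#   make_kubelet_serving_csr my-node 172.26.64.19 my-node-serving
#   render_kubelet_serving_csr my-node-serving my-node-serving.csr | kubectl apply -f -
#   kubectl certificate approve my-node-serving
#   kubectl get csr my-node-serving -o jsonpath='{.status.certificate}' | base64 -d > my-node-serving.crt
```

Requests for this signer are not auto-approved by kube-controller-manager, so the explicit `kubectl certificate approve` step is needed unless some other approver is running.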
b
well, if you have an example I would be happy to give it a go 😄 I'm only slightly smart enough to dig through golang, so the links didn't spell it out enough for this ignorant
kubectl exec -ti macos -- /bin/zsh
error: Internal error occurred: error sending request: Post "https://172.26.64.19:10250/exec/default/macos/macos?command=%2Fbin%2Fzsh&input=1&output=1&tty=1": proxy error from 127.0.0.1:9345 while dialing 172.26.64.19:10250, code 502: 502 Bad Gateway
the tls errors seemed to have cleaned up with the kubelet logs, but I still get that 😞
c
Check the server logs. Running rke2 with `debug: true` might show some additional info.
You might also try `egress-selector-mode: disabled` although that may break other things
b
it appears all inbound traffic to 10250 is coming from a single node (not an apiserver node oddly), does that make any sense?
this is per tcpdump on the mac machine
c
you’re seeing metrics-server traffic. you’ll only see traffic from the apiserver when you get logs or exec into the pod
b
ok, well the odd thing is when I execute the `exec` command I am getting the error above without even seeing any packets come through
c
have you tried running the servers with `egress-selector-mode: disabled`?
b
this is where the whole proxy thing kicks in iirc, like something is breaking down in there 😞 no but I will try that shortly
that needs to be set on all api server nodes yeah?
c
yes
Another thing you might try is
kube-apiserver-arg:
  - kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP
Right now it’s trying to connect to the kubelet by IP which requires a tunnel session when the egress-selector-mode is not set to disabled.
b
ah, well the name of the node is in no way resolvable as I chose an arbitrary name..will test this shortly
so IP is what I would need in this case
c
Yeah IP requires an agent connection at the moment, unless you disable the egress selector stuff. The idea with all of this is that we keep all the connectivity agent-initiated. We don't really support heterogeneous clusters so you will have to turn off some things if all your nodes are not actually rke2.
b
Yeah, still waiting for some others on my team to join here but will test as soon as I can. I think the ideal option for us would be to allow that behavior to be controllable on a per-node basis via annotation or similar…would that make any sense in the ecosystem/code base?
c
I think your best bet would probably be to just set egress-selector-mode to disabled if you want to support heterogeneous kubelets. Other stuff may break too since we don’t technically support this but that is probably the easiest path forward for you.
We haven’t ported this section of the k3s docs over to the RKE2 docs yet, but you can find info on this setting here: https://docs.k3s.io/networking/basic-network-options#control-plane-egress-selector-configuration
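Since this setting has to go on every server node, a small helper may be handy when scripting it. This is only a sketch: it assumes the servers read the standard `/etc/rancher/rke2/config.yaml` file, and the function name `disable_egress_selector` is invented here. It refuses to touch a file where the key is already set.

```shell
# Append `egress-selector-mode: disabled` to an rke2 server config
# unless the key is already present.
# $1 = config path (defaults to the standard rke2 location)
disable_egress_selector() {
  cfg="${1:-/etc/rancher/rke2/config.yaml}"
  if grep -q '^egress-selector-mode:' "$cfg" 2>/dev/null; then
    echo "egress-selector-mode already set in $cfg, edit it by hand" >&2
    return 1
  fi
  # Make sure the existing file ends with a newline before appending.
  if [ -s "$cfg" ] && [ -n "$(tail -c1 "$cfg")" ]; then
    printf '\n' >> "$cfg"
  fi
  printf 'egress-selector-mode: disabled\n' >> "$cfg"
}
```

After running it on each server, the `rke2-server` service would need a restart (for example `systemctl restart rke2-server`) for the change to take effect.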
b
Ah, thanks for sending, was just wondering if there were some deeper details on that.
ok, that worked, I have `exec` access now 😄
do you have any examples of the `curl` command you mentioned for issuing certs? I'm looking to codify this process and make it a bit more consumable for folks to add a bunch of these mac machines to the cluster.
c
replace ‘token’ with your join token. The node name/password combo are saved on the server and must be reused for the same node name in the future, unless you delete the node password secret for that name out of the kube-system namespace. This will respond with the CA cert, server cert, and server key:
curl -vkS -u 'node:token' -H 'rke2-node-name: my-node' -H 'rke2-node-password: my-password' https://localhost:9345/v1-rke2/serving-kubelet.crt
you can get a client cert+key (usable in a kubeconfig) from `client-kubelet.crt`
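To codify the two requests above, something along these lines might work. It is a sketch under a few assumptions: that `client-kubelet.crt` lives under the same `/v1-rke2/` path as the serving endpoint, that each response is a bundle of concatenated PEM blocks, and that the token, node name, and node password are passed in via the environment variables `RKE2_TOKEN`, `NODE_NAME`, and `NODE_PASSWORD` (names chosen here, along with `RKE2_SERVER` and both function names).

```shell
# Fetch one of the kubelet cert bundles from an rke2 server.
# $1 = endpoint name (serving-kubelet.crt or client-kubelet.crt)
# $2 = output file for the raw response
fetch_kubelet_bundle() {
  curl -fksS -u "node:${RKE2_TOKEN}" \
    -H "rke2-node-name: ${NODE_NAME}" \
    -H "rke2-node-password: ${NODE_PASSWORD}" \
    -o "$2" "${RKE2_SERVER:-https://localhost:9345}/v1-rke2/$1"
}

# Split a file of concatenated PEM blocks into <prefix>-1.pem, <prefix>-2.pem, ...
# and print each output file name with the BEGIN line of the block it holds.
# $1 = bundle file, $2 = output prefix
split_pem_bundle() {
  awk -v prefix="$2" '
    /^-----BEGIN / { n++; out = sprintf("%s-%d.pem", prefix, n); label = $0; inblock = 1 }
    inblock        { print > out }
    /^-----END /   { if (inblock) { close(out); print out ": " label; inblock = 0 } }
  ' "$1"
}

# Possible usage (not run here):
#   fetch_kubelet_bundle serving-kubelet.crt serving-bundle.pem
#   split_pem_bundle serving-bundle.pem kubelet-serving
```

The printed BEGIN lines make it easier to tell which numbered file holds a certificate and which holds the key, without assuming the order the server returns them in. The same node name and password have to be reused on later calls for that node, as noted above.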
b
@creamy-pencil-82913 thanks for the help on this! I have documented our discussion above and have everything fully functional now.