# k3s
r
Aside from your standard file permissions (which I assume you've checked, including which UID the pods run as and whether any are running privileged as root), do you have anything like SELinux running as enforcing? If so, you'll need to pay attention to the file context on the directory, and you might also need to check your SELinux booleans (I know, for example, that there's a boolean, virt_use_nfs, you have to turn on for a CentOS 7 host acting as a KVM hypervisor if you want to do something like mount an ISO file from an NFS share to install a VM).
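For reference, the sort of checks I mean look roughly like this (the directory path here is just a made-up example):
```sh
# Is SELinux actually enforcing?
getenforce

# Look at the SELinux file context on the directory the pods use
# (/data/volumes is a placeholder path)
ls -Zd /data/volumes

# The virt_use_nfs example: check the boolean, then enable it persistently
getsebool virt_use_nfs
setsebool -P virt_use_nfs on

# If the context itself is wrong for container use, relabel the tree
# (container_file_t is the usual type for volumes consumed by containers)
semanage fcontext -a -t container_file_t '/data/volumes(/.*)?'
restorecon -Rv /data/volumes
```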
b
Hey Bill. Thanks for answering. No SELinux in use, just a normal Ubuntu 22.04 install. Pods are running as non-root. I'm pretty certain it's my noob use of k3sup. For some reason, with this last node, the command was different in terms of the `--server-user` and `--user` flags. I had to swap them to get the SSH keys to function. Very strange to me, and I can't get the credentials to work otherwise.
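For context, the join command I'm running is roughly this shape (IPs, usernames, and key path are made up here):
```sh
# New node to be joined as an agent, plus the existing k3s server
AGENT_IP=192.168.1.20
SERVER_IP=192.168.1.10

# --user is the SSH user on the agent being joined;
# --server-user is the SSH user on the existing server node.
# On this last node I had to swap the two to get the SSH keys accepted.
k3sup join \
  --ip "$AGENT_IP" --user ubuntu \
  --server-ip "$SERVER_IP" --server-user ubuntu \
  --ssh-key ~/.ssh/id_rsa
```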
r
I've had to use SELinux at work, but I've never looked at whether AppArmor does similarly weird stuff. I know 22.04 gave me enough fits with snap that I backed off to 20.04 for the places I'm using Ubuntu, so I can't say I've got much 22.04 experience.
b
So I've gone and even downgraded the node to 20.04, and it's still the same issue. Kinda pulling my hair out at this stage. I have a business trip next week, so I'll have to come back to this when I get back next weekend. Thanks anyway, Bill.
r
The next thing I can think of is a full partition; that might be it. Even if it's since been cleared, some apps (ElasticSearch, for one) will remember the lack of space and keep refusing even when space is free, until you run a command telling them it's OK. I don't recall hearing of anything like that for Kubernetes, but it's not impossible, I suppose. Past that, no ideas, except that with it being just the one agent, I'd probably time-box how long I keep trying versus just deleting the node from the cluster, wiping it, and starting over to rejoin it.
Unsatisfying, but some problems just end up that way.
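If you do go the wipe-and-rejoin route, it's roughly this (assuming the agent was installed with the standard k3s install script; worth ruling out the full-partition theory while you're on the box):
```sh
# Check disk usage on the agent first
df -h

# From a machine with kubectl access to the cluster:
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# (on older kubectl the flag is --delete-local-data)
kubectl delete node <node-name>

# On the agent itself, the install script drops an uninstall helper:
/usr/local/bin/k3s-agent-uninstall.sh
```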
b
Hey Bill. I came back a week later with a fresh mind and new energy and, lo and behold, I found the issue. For some reason, the storage folders for the local-path volume mounts (PVCs) were given the wrong permissions: 775 instead of 777. I reported it as a comment on this older issue: https://github.com/k3s-io/k3s/issues/3704. I'm happy everything is working as it should now, but not feeling confident this won't pop up again.
Scott
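For anyone who hits the same symptom: on a default k3s install the local-path provisioner keeps its per-PVC folders under /var/lib/rancher/k3s/storage, so checking (and, as a blunt workaround, resetting) the permissions looks roughly like this:
```sh
# Inspect the per-PVC directories the local-path provisioner created
ls -ld /var/lib/rancher/k3s/storage/pvc-*

# Workaround for the 775-instead-of-777 case described above
sudo chmod 777 /var/lib/rancher/k3s/storage/pvc-*
```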