# k3s
r
Hello, I’ve been having intermittent issues related to selinux, where local volumes cannot be created:
```
type=AVC msg=audit(1756145551.603:1849): avc:  denied  { write } for  pid=31597 comm="mkdir" name="storage" dev="sdb" ino=1031 scontext=system_u:system_r:container_t:s0:c25,c186 tcontext=system_u:object_r:k3s_data_t:s0 tclass=dir permissive=0
```
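For anyone decoding a denial like this: the `scontext`/`tcontext` fields tell you what to ask the loaded policy. A small sketch that just extracts the fields (the `sesearch` query at the end is left commented, since it needs the setools package installed):

```shell
# Pull the source/target types out of the AVC line so we know what to ask
# the policy about. (Field extraction only; no SELinux tooling required.)
avc='avc:  denied  { write } for  pid=31597 comm="mkdir" name="storage" dev="sdb" ino=1031 scontext=system_u:system_r:container_t:s0:c25,c186 tcontext=system_u:object_r:k3s_data_t:s0 tclass=dir permissive=0'

stype=$(echo "$avc" | grep -o 'scontext=[^ ]*' | cut -d: -f3)  # -> container_t
ttype=$(echo "$avc" | grep -o 'tcontext=[^ ]*' | cut -d: -f3)  # -> k3s_data_t
tclass=$(echo "$avc" | grep -o 'tclass=[^ ]*' | cut -d= -f2)   # -> dir
echo "denied: $stype -> $ttype ($tclass)"

# With setools installed, this asks whether any loaded rule allows the access:
#   sesearch --allow -s "$stype" -t "$ttype" -c "$tclass" -p write
```

Here that reads as: a process confined as `container_t` tried to `write` to a directory labeled `k3s_data_t`, which the policy denies.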
It looks like the data dir has the correct labels:
```
dev-k8re-01:/var/lib/rancher/k3s/data # ls -lZ
total 4
-rw-------. 1 root root unconfined_u:object_r:k3s_lock_t:s0   0 Aug 25 12:14 .lock
drwxr-xr-x. 1 root root system_u:object_r:k3s_data_t:s0      12 Aug 25 12:34 80f7f3b67af96d724515f635f4e1625843b453399987e9084ea0f4f67c3c2ebe
drwxr-xr-x. 1 root root system_u:object_r:k3s_data_t:s0     116 Aug 25 12:34 cni
lrwxrwxrwx. 1 root root system_u:object_r:k3s_data_t:s0      90 Aug 25 12:34 current -> /var/lib/rancher/k3s/data/80f7f3b67af96d724515f635f4e1625843b453399987e9084ea0f4f67c3c2ebe
drwxr-xr-x. 1 root root system_u:object_r:k3s_data_t:s0       0 Aug 25 12:51 storage
```
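It may also be worth comparing the on-disk label with what the file-context policy says the path should get. A guarded sketch (no-op without SELinux tooling): the `container_file_t` expectation is my assumption from reading the k3s-selinux file contexts, where the default `/var/lib/rancher/k3s/storage(/.*)?` path gets `container_file_t`, while a custom path under `data/` may instead match the `k3s_data_t` rule:

```shell
path=/var/lib/rancher/k3s/data/storage
expected=container_file_t   # assumption; verify with matchpathcon on the host

if command -v matchpathcon >/dev/null 2>&1; then
  # What the policy says the path should be labeled:
  matchpathcon "$path"
  # What it actually is, and what restorecon would change (dry run only):
  ls -dZ "$path"
  restorecon -nv "$path"
else
  echo "SELinux tools not available; would compare ls -dZ vs matchpathcon for $path"
fi
```

If `matchpathcon` reports `k3s_data_t` for the custom storage path, the labels shown above are "correct" per policy, and it's the policy's file-context pattern that doesn't cover this path.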
and pods are being started with the right context:
```
dev-k8re-01:/var/lib/rancher/k3s/data # ps auxZ | grep pause
system_u:system_r:container_t:s0:c99,c555 65535 3174 0.0  0.0 972   512 ?        Ss   13:53   0:00 /pause
```
and I believe all the required packages are installed:
```
dev-k8re-01:/var/lib/rancher/k3s/data # zypper search -i selinux
Loading repository data...
Reading installed packages...

S  | Name                    | Summary                                         | Type
---+-------------------------+-------------------------------------------------+--------
i  | cockpit-selinux         | Cockpit SELinux package                         | package
i  | container-selinux       | SELinux policies for container runtimes         | package
i+ | k3s-selinux             | SELinux policy module for k3s                   | package
i  | libselinux1             | SELinux runtime library                         | package
i  | passt-selinux           | SELinux support for passt and pasta             | package
i+ | patterns-base-selinux   | SELinux Support                                 | package
i  | python3-selinux         | Python bindings for the SELinux runtime library | package
i+ | selinux                 | SELinux Support                                 | pattern
i  | selinux-policy          | SELinux policy configuration                    | package
i  | selinux-policy-targeted | SELinux targeted base policy                    | package
i  | selinux-tools           | SELinux command-line utilities                  | package
i  | swtpm-selinux           | SELinux module for the Software TPM emulator    | package
```
I can resolve the issue by manually adjusting the `storage` context with `chcon -t container_file_t /var/lib/rancher/k3s/data/storage`; however, I don't think this is correct, based on my minimal knowledge and inspection of the k3s-selinux package.
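One caveat on that workaround: `chcon` changes only last until the next relabel (`restorecon`, autorelabel). If keeping a custom storage path is intentional, the persistent variant would be a local file-context rule, roughly (a sketch, guarded so it's a no-op where `semanage` from policycoreutils isn't installed):

```shell
storage_dir=/var/lib/rancher/k3s/data/storage
fc_pattern="${storage_dir}(/.*)?"

# semanage records the rule in the local file-context database (so it
# survives relabels); restorecon then applies it to the existing tree.
if command -v semanage >/dev/null 2>&1; then
  semanage fcontext -a -t container_file_t "$fc_pattern"
  restorecon -Rv "$storage_dir"
else
  echo "semanage not available; would run: semanage fcontext -a -t container_file_t '$fc_pattern'"
fi
```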
h
what is the label for the `/var/lib/rancher` directory?
r
```
dev-k8re-01:/var/lib # ls -lZ | grep rancher
drwxr-xr-x. 1 root           root           unconfined_u:object_r:var_lib_t:s0                6 Aug 25 12:14 rancher
```
h
that is what I have (running rke2, not k3s) for the `/var/lib/rancher` directory, and as long as the label is correct before installation, the directories under it should get the correct label. Was this label set before you installed k3s?
r
I believe so; I can verify in a moment. There's very little configured on this system before k3s is installed. This system is set up the same way as a bunch of other systems we have, and most of the time it works. Occasionally (such as this one) it breaks, and the only difference I can see is that `storage` has the `container_file_t` context on the systems that run fine, and `k3s_data_t` on the systems that fail.
h
what version is your k3s deployment?
r
1.32.7+k3s1, running on openSUSE Leap Micro 6.1. I'd check on 1.33, but we had a broken dependency with that version and I haven't had time to check whether it's been resolved.
Okay, so I went back through my installation script, and I'm actually manually running `chcon system_u:object_r:container_file_t:s0 /var/lib/rancher/k3s/data/storage/` immediately after k3s is installed, because I guess I ran into this issue before and forgot. I disabled that and let the default stay: `system_u:object_r:k3s_data_t:s0`. After rebooting, the local-provisioner helper pods start failing again with the SELinux denial from the OP.
So it seems that on Leap Micro 6.1, k3s + SELinux is broken OOTB. I'm going to attempt to reproduce this on a completely fresh system without any other modifications.
Alright, here's a small script that reproduces the issue on a fresh Leap Micro 6.1 system:
```shell
#!/usr/bin/env bash

mount /var

curl -fsSL -o /usr/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64
chmod +x /usr/bin/yq

export INSTALL_K3S_BIN_DIR=/usr/bin
export INSTALL_K3S_SKIP_START=true

curl -fsSL https://get.k3s.io |
  sed 's#k3s >/.*#k3s; then#' | \
  sh -s - server \
    --disable=traefik \
    --disable=servicelb \
    --disable=local-storage \
    --node-name=pvctest \
    --selinux

kdir=/var/lib/rancher/k3s
k3sm=${kdir}/server/manifests
lsim=${k3sm}/local-storage-im.yaml

mkdir -p ${k3sm}

curl -fsSL -o ${lsim} https://raw.githubusercontent.com/k3s-io/k3s/refs/heads/master/manifests/local-storage.yaml
yq -i '. |= with(select(.kind == "StorageClass"); .volumeBindingMode = "Immediate" )' ${lsim}
sed -i "s/%{SYSTEM_DEFAULT_REGISTRY}%//" ${lsim}
sed -i "s#%{DEFAULT_LOCAL_STORAGE_PATH}%#${kdir}/data/storage#" ${lsim}

cat <<EOF > ${k3sm}/pvctest.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.kubernetes.io/selected-node: pvctest
  name: pvctest
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-path
  volumeMode: Filesystem
EOF
```
After setting up the system, copy-paste this into `script.sh`, make it executable, then run `transactional-update run ./script.sh` and reboot. After the system comes back up, you can see that the local-provisioner helper pod is failing to create the volume.
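To observe the failure after the reboot, something like this works (the `kube-system` namespace and the `local-path-provisioner` deployment name are assumptions about the bundled manifest; the helper pod name will vary):

```shell
ns=kube-system   # stock k3s deploys the provisioner here

# The failing helper pod and the provisioner's own view of the error:
kubectl -n "$ns" get pods 2>/dev/null | grep -i helper || true
kubectl -n "$ns" logs deploy/local-path-provisioner --tail=20 2>/dev/null || true

# The matching AVC denial on the node itself:
ausearch -m avc -ts boot 2>/dev/null | grep -F 'name="storage"' || true
```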
a
```
# Rule 3: Allow the container runtime to manage its storage directory.
# This grants the necessary permissions for the local-path-provisioner
# to create, modify, and delete directories and files within its storage path.

allow container_runtime_t var_lib_t:dir { add_name create getattr setattr search write remove_name };
allow container_runtime_t var_lib_t:file { create setattr unlink write };
```
Saw one of my clusters running this custom policy to fix local path provisioner errors 🙂
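For reference, a one-off module like that is typically built and loaded along these lines (a sketch; `localpath_fix` is a made-up module name). Note that the OP's denial is against `container_t`, not `container_runtime_t`, so covering that exact case would need rules with `container_t` as the source type; relabeling the storage path is generally a narrower fix than widening access to `var_lib_t`:

```shell
# Write the allow rules above into a type-enforcement (.te) source file.
cat > localpath_fix.te <<'EOF'
module localpath_fix 1.0;

require {
    type container_runtime_t;
    type var_lib_t;
    class dir { add_name create getattr setattr search write remove_name };
    class file { create setattr unlink write };
}

allow container_runtime_t var_lib_t:dir { add_name create getattr setattr search write remove_name };
allow container_runtime_t var_lib_t:file { create setattr unlink write };
EOF

# Compile and load it (tools from the SELinux policy devel packages;
# loading needs root, so this is guarded for hosts without the tooling).
if command -v checkmodule >/dev/null 2>&1; then
  checkmodule -M -m -o localpath_fix.mod localpath_fix.te &&
    semodule_package -o localpath_fix.pp -m localpath_fix.mod &&
    semodule -i localpath_fix.pp || echo "module built but not loaded (need root?)"
else
  echo "checkmodule not available; wrote localpath_fix.te only"
fi
```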