# k3s
r
I'd expect the default spot to be under /var/lib/k3s, and the various /run/k3s mounts that appear to point at /var jibe with that too. Your partition layout suggests you might be using some form of security hardening guide, so maybe a policy you applied is impacting ephemeral storage (such as an SELinux boolean that needs to be set one way or the other). I know, for example, that on a CentOS 7 install with the DISA STIG security policy applied at install time, which I was using as a hypervisor, I had to run
setsebool -P virt_use_nfs=on
(or something to that effect) to be able to mount my ISO images from an NFS share to install the VMs. Maybe you've got something similar?
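For reference, a rough sketch of how I'd go looking for that sort of thing, assuming auditd and the SELinux policy utilities are installed:
# list recent SELinux denials from the audit log (needs auditd running)
ausearch -m avc -ts recent
# show container/virt-related booleans and their current values
getsebool -a | grep -Ei 'container|virt'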
b
Hi Bill, Thank you for your reply! /var/lib/k3s is missing, but /var/lib/rancher/k3s exists.
# df -h /var/lib/rancher
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/systemvg-varlv  105G  5.1G  100G   5% /var
[me@localhost]# df -h /var/lib/rancher/k3s/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/systemvg-varlv  105G  5.1G  100G   5% /var
[me@localhost]# df -h /var/lib/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/systemvg-varlv  105G  5.1G  100G   5% /var
[me@localhost]# du -sh
2.9G	.
I will check the security hardening guide and any policies we applied. Just in case: I'm using embedded etcd (https://docs.k3s.io/datastore/ha-embedded). Do I need to configure storage in the K3S cluster somehow, or will it be ready to go by default? (Asking because on my previous K3S cluster setups in other environments I didn't face any storage issues, even though the installation procedure was the same everywhere.)
r
Yeah, /var/lib/rancher/k3s is right; I forgot the rancher bit. If you've got hardening scripts or profiles applied, you may want to search for known issues with them, though that doesn't always turn anything up. You might also check whether anything shows up in /var/log/audit/audit.log, though those entries can be a pain to catch and aren't always terribly helpful (I have a low-priority one that just tells me the execve syscall failed, which I think is related to fapolicyd but haven't had time to go back and verify; a pretty useless log message either way). No clue past that. I don't recall ever using ephemeral storage in Kubernetes, so I'm not sure what it does or needs. You might check the upstream docs for requirements, and the K3S docs for anything lockdown-related (for example, there's an --selinux flag for K3S; I think we've had SELinux enforcing, forgotten the flag, and still had it mostly work, but maybe not 100% if we didn't exercise certain features). You might be in a halfway state like that, with a missing flag for K3S or an RPM that needs installing to get a policy in place, or similar. Sorry I can't be of more help. I've debugged some locked-down RKE2 installs before, but not done much with storage, so that's where my knowledge ends.
👍 1
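For what it's worth, if the missing flag turns out to matter, here's a minimal sketch of enabling it through the K3S config file instead of the CLI; I believe --selinux maps to the selinux key in /etc/rancher/k3s/config.yaml, but double-check against the K3S docs:
# /etc/rancher/k3s/config.yaml
selinux: true
Then restart the service (k3s on servers, k3s-agent on agents) so it takes effect.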
b
Thanks, Bill! If I remember correctly, K3S with SELinux was installed using this guide: https://docs.k3s.io/advanced#selinux-support:
yum install -y container-selinux selinux-policy-base
yum install -y https://rpm.rancher.io/k3s/latest/common/centos/7/noarch/k3s-selinux-0.2-1.el7_8.noarch.rpm
(These are in place.) However, I don't think I used the
--selinux
flag, but I'm not quite sure whether it impacts ephemeral storage or not.
r
Yeah, not sure either. Not sure if it's still the case, but a year or two ago with RKE2's SELinux support, the RPM install method and the tarball install method put the binaries in different spots, and the rke2-selinux RPM worked for the RPM install method but not the tarball one. The main consequence was that the install shell script defaults to RPM (but that's changeable), while installing through Rancher used the tarball and wasn't changeable, so at that time I couldn't deploy an RKE2 cluster with SELinux enforcing from Rancher without a bunch of backsolving (and my task was to document the process for others, so I was trying to avoid that sort of thing). I don't know whether K3S diverges the same way RKE2 does, but it's a hangup to be aware of if you installed K3S from Rancher instead of the install script (though
rpm -qa | grep -i k3s
will tell you if you used the RPM install).
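As a sketch for double-checking the policy side (package and module names here are the ones from the guide above; adjust for your version):
# confirm the SELinux-related packages made it onto the node
rpm -qa | grep -Ei 'k3s-selinux|container-selinux'
# confirm the k3s policy module is actually loaded
semodule -l | grep -i k3s
# spot-check the SELinux file context on the K3S data dir
ls -dZ /var/lib/rancher/k3s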
c
This isn't an SELinux or permissions problem; it's being rejected at the scheduling stage. If you describe the node, what do you see for the ephemeral-storage capacity?
Note that the path used for measuring available ephemeral storage is /var/lib/kubelet.
🎯 1
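As a quick sanity check, you can compare what the filesystem under /var/lib/kubelet reports with what the node advertises to the scheduler (sketch, using my1.node.net as the node name):
# filesystem the kubelet measures for ephemeral storage
df -h /var/lib/kubelet
# capacity and allocatable as the node reports them
kubectl get node my1.node.net -o jsonpath='{.status.capacity.ephemeral-storage}{"\n"}{.status.allocatable.ephemeral-storage}{"\n"}'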
b
Hi @creamy-pencil-82913, Thank you very much for your reply! Sure, here is the node describe output (where the pod is running):
Name:               my1.node.net
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=my1.node.net
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=k3s
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"d6:d9:55:53:69:3e"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.158.198.181
                    k3s.io/hostname: my1.node.net
                    k3s.io/internal-ip: 10.158.198.181
                    k3s.io/node-args: ["agent","--resolv-conf","/etc/rancher/k3s/resolv.conf"]
                    k3s.io/node-config-hash: EFKFYA4AHWQMSPBU4BRL3JQU2ACEQC32V34A6OXJAXHVPNJN7VIQ====
                    k3s.io/node-env:
                      {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/d1373f31227cf763459011fa1123224f72798511c86a56d61a9d4e3c0fa8a0c9","K3S_TOKEN":"********","K3S_U...
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 22 Sep 2023 14:52:57 +0200
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  my1.node.net
  AcquireTime:     <unset>
  RenewTime:       Wed, 27 Sep 2023 09:55:55 +0200
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 27 Sep 2023 09:51:08 +0200   Fri, 22 Sep 2023 22:32:07 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 27 Sep 2023 09:51:08 +0200   Fri, 22 Sep 2023 22:32:07 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 27 Sep 2023 09:51:08 +0200   Fri, 22 Sep 2023 22:32:07 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Wed, 27 Sep 2023 09:51:08 +0200   Fri, 22 Sep 2023 22:32:07 +0200   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.158.198.181
  Hostname:    my1.node.net
Capacity:
  cpu:                8
  ephemeral-storage:  5110Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32652860Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  5090312189
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32652860Ki
  pods:               110
System Info:
  Machine ID:                   73a9abc07bcc4a75981cff15ebc7d47d
  System UUID:                  531c2542-dd5e-94b6-d21b-f783bcfd494d
  Boot ID:                      fa5f2eaa-0c55-4017-b3dd-89f5883b3dcf
  Kernel Version:               4.18.0-477.21.1.el8_8.x86_64
  OS Image:                     Red Hat Enterprise Linux 8.8 (Ootpa)
  Operating System:             linux
  Architecture:                 amd64
  Container Runtime Version:    containerd://1.6.15-k3s1
  Kubelet Version:              v1.26.2+k3s1
  Kube-Proxy Version:           v1.26.2+k3s1
PodCIDR:                        10.42.17.0/24
PodCIDRs:                       10.42.17.0/24
ProviderID:                     k3s://my1.node.net
Non-terminated Pods:            (2 in total)
  Namespace                     Name                                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                     ----                                        ------------  ----------  ---------------  -------------  ---
  my-project  my-project-h22m6-mmf45    125m (1%)     250m (3%)   512Mi (1%)       512Mi (1%)     18h
  kube-system                   svclb-traefik-e7b72f7e-g2qhk                0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d19h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                125m (1%)   250m (3%)
  memory             512Mi (1%)  512Mi (1%)
  ephemeral-storage  1Gi (21%)   12Gi (253%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:              <none>
resources:
        limits:
          cpu: "0.25"
          memory: "0.5Gi"
          ephemeral-storage: "12Gi"
        requests:
          cpu: "0.125"
          memory: "0.5Gi"
          ephemeral-storage: "1Gi"
# du -h --max-depth=0 /var/lib/kubelet
1.1G	/var/lib/kubelet
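If it helps, the kubelet's own view of that filesystem is visible via the stats summary endpoint (sketch; assumes jq is available):
# node-level filesystem stats (capacityBytes/availableBytes) as the kubelet sees them
kubectl get --raw /api/v1/nodes/my1.node.net/proxy/stats/summary | jq '.node.fs'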
c
Capacity:
  cpu:                8
  ephemeral-storage:  5110Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32652860Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  5090312189
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32652860Ki
  pods:               110
The 5GB of allocatable space for ephemeral-storage seems to line up with your
/tmp
volume:
/dev/mapper/systemvg-tmplv          5.0G   69M  5.0G   2% /tmp
Your pod asked for 20GB of temporary (ephemeral) space, but you only have 5GB for /tmp. If your pods need more temporary space, you need a larger /tmp.
❤️ 1
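For anyone following along, a rough sketch of growing that logical volume in place, assuming the systemvg volume group has free extents and the filesystem is one that lvextend -r knows how to grow (the +50G size is just an example):
# check how much free space the volume group has
vgs systemvg
# grow the tmp LV and resize its filesystem in one step
lvextend -r -L +50G /dev/systemvg/tmplv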
b
Hi @creamy-pencil-82913, Thank you for your reply! I have extended the disk space on that volume:
Filesystem                          Size  Used Avail Use% Mounted on
devtmpfs                             16G     0   16G   0% /dev
tmpfs                                16G     0   16G   0% /dev/shm
tmpfs                                16G  1.3M   16G   1% /run
tmpfs                                16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/systemvg-rootlv         2.0G  148M  1.9G   8% /
/dev/mapper/systemvg-usrlv           10G  2.3G  7.8G  23% /usr
/dev/sda2                          1014M  226M  789M  23% /boot
/dev/sda1                           200M  5.8M  195M   3% /boot/efi
/dev/mapper/systemvg-tmplv          160G  1.2G  159G   1% /tmp
/dev/mapper/systemvg-varlv          105G  5.1G  100G   5% /var
/dev/mapper/systemvg-homelv         2.0G   47M  2.0G   3% /home
/dev/mapper/systemvg-optlv          5.0G  662M  4.4G  13% /opt
/dev/mapper/systemvg-kdumplv        2.0G   47M  2.0G   3% /var/crash
/dev/mapper/systemvg-varloglv       5.0G  640M  4.4G  13% /var/log
/dev/mapper/systemvg-varlogauditlv   10G  230M  9.8G   3% /var/log/audit
tmpfs                               3.2G     0  3.2G   0% /run/user/988
shm                                  64M     0   64M   0% /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/8a0c71e0ac12a8743a273d237e70443b90ee2e695d58fca2809f0d6818e261be/shm
overlay                             105G  5.1G  100G   5% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/8a0c71e0ac12a8743a273d237e70443b90ee2e695d58fca2809f0d6818e261be/rootfs
overlay                             105G  5.1G  100G   5% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/22809445eafd01fdf3f22a09262a6b33f92379f561af9ac47544037f02bffa83/rootfs
overlay                             105G  5.1G  100G   5% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/b6c7b64e4fd92c8ca8f26f1dafebfa93300a219950f949b347990a77cacdbade/rootfs
tmpfs                               980K   12K  968K   2% /var/lib/kubelet/pods/b4e547f7-9422-484a-ad1e-f95a1a30d9cd/volumes/kubernetes.io~empty-dir/var-run
tmpfs                                32G   12K   32G   1% /var/lib/kubelet/pods/b4e547f7-9422-484a-ad1e-f95a1a30d9cd/volumes/kubernetes.io~projected/kube-api-access-4s6f7
shm                                  64M  8.0K   64M   1% /run/k3s/containerd/io.containerd.grpc.v1.cri/sandboxes/f0624c98b5cb8848343bdef7fbbbe4e73000a88dd83fb39c97faf0e0d2172ae9/shm
overlay                             105G  5.1G  100G   5% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/f0624c98b5cb8848343bdef7fbbbe4e73000a88dd83fb39c97faf0e0d2172ae9/rootfs
overlay                             105G  5.1G  100G   5% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/1e97bf1be1587ff10a6fe51579b6cdadcfb918da2d5c0e2de52b85fd044c31ff/rootfs
overlay                             105G  5.1G  100G   5% /run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/9f494d48970ccc781818467365212ed7f1ddf74e4eb49df5c44e76973fb486e3/rootfs
tmpfs                               3.2G     0  3.2G   0% /run/user/170197
However, the pods are still stuck in Pending status:
Limits:
      cpu:                250m
      ephemeral-storage:  20Gi
      memory:             512Mi
    Requests:
      cpu:                125m
      ephemeral-storage:  10Gi
      memory:             512Mi

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  29s   default-scheduler  0/10 nodes are available: 10 Insufficient ephemeral-storage. preemption: 0/10 nodes are available: 10 No preemption victims found for incoming pod..
node describe
output when the request is reduced to 4Gi:
Name:               my1.node.net
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=my1.node.net
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=k3s
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"12:88:20:ab:2c:71"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.158.199.18
                    k3s.io/hostname: my1.node.net
                    k3s.io/internal-ip: 10.158.199.18
                    k3s.io/node-args: ["agent","--resolv-conf","/etc/rancher/k3s/resolv.conf"]
                    k3s.io/node-config-hash: EFKFYA4AHWQMSPBU4BRL3JQU2ACEQC32V34A6OXJAXHVPNJN7VIQ====
                    k3s.io/node-env:
                      {"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/d1373f31227cf763459011fa1123224f72798511c86a56d61a9d4e3c0fa8a0c9","K3S_TOKEN":"********","K3S_U...
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 22 Sep 2023 15:12:46 +0200
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  my1.node.net
  AcquireTime:     <unset>
  RenewTime:       Thu, 28 Sep 2023 10:48:19 +0200
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Thu, 28 Sep 2023 10:47:52 +0200   Fri, 22 Sep 2023 22:32:02 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 28 Sep 2023 10:47:52 +0200   Fri, 22 Sep 2023 22:32:02 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 28 Sep 2023 10:47:52 +0200   Fri, 22 Sep 2023 22:32:02 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Thu, 28 Sep 2023 10:47:52 +0200   Fri, 22 Sep 2023 22:32:02 +0200   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.158.199.18
  Hostname:    my1.node.net
Capacity:
  cpu:                8
  ephemeral-storage:  5110Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32652860Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  5090312189
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32652860Ki
  pods:               110
System Info:
  Machine ID:                   73a9abc07bcc4a75981cff15ebc7d47d
  System UUID:                  99362542-0b22-86d6-56a0-2f37aaa5f78a
  Boot ID:                      8eed2fab-9114-49d3-9822-b2a2bbf6e729
  Kernel Version:               4.18.0-477.21.1.el8_8.x86_64
  OS Image:                     Red Hat Enterprise Linux 8.8 (Ootpa)
  Operating System:             linux
  Architecture:                 amd64
  Container Runtime Version:    containerd://1.6.15-k3s1
  Kubelet Version:              v1.26.2+k3s1
  Kube-Proxy Version:           v1.26.2+k3s1
PodCIDR:                        10.42.23.0/24
PodCIDRs:                       10.42.23.0/24
ProviderID:                     k3s://my1.node.net
Non-terminated Pods:            (2 in total)
  Namespace                     Name                                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                     ----                                        ------------  ----------  ---------------  -------------  ---
  my-project  my-project-6lhjc-6ggck    125m (1%)     250m (3%)   512Mi (1%)       512Mi (1%)     26s
  kube-system                   svclb-traefik-e7b72f7e-dj9m8                0 (0%)        0 (0%)      0 (0%)           0 (0%)         5d19h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                125m (1%)   250m (3%)
  memory             512Mi (1%)  512Mi (1%)
  ephemeral-storage  4Gi (84%)   20Gi (421%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:              <none>
c
The scheduler still says you don't have enough ephemeral storage. Did you restart k3s after resizing the partition? A bunch of that stuff is only checked when the kubelet first starts.
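A minimal sketch of that restart-and-verify step (the service name depends on whether the node runs as a server or an agent):
# restart K3S so the kubelet re-measures ephemeral storage capacity
systemctl restart k3s-agent   # use 'k3s' instead on server nodes
# confirm the node now advertises the larger capacity
kubectl get node my1.node.net -o jsonpath='{.status.capacity.ephemeral-storage}{"\n"}'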
b
Omg, thank you so much, that helped! 😀