08/01/2022, 5:40 AM
Hi folks, I'm running k3s on top of an NFS-based rootfs, which has worked for me for years, but I recently did a pretty large set of upgrades, so now I'm on containerd vs docker and having issues with some containers and snapshotters:
• The overlayfs snapshotter, understandably, seems to not work here.
• Trying the native snapshotter, some containers throw errors like:
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd container: copying of parent failed: failed to copy file info for /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.native/snapshots/new-4170709921: failed to chown /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.native/snapshots/new-4170709921: lchown /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.native/snapshots/new-4170709921: invalid argument
• Other times (and I haven't figured out the pattern here; I thought it was tied to overlay config, but it seems not), I get a similar but different error where the native snapshotter is not mentioned. This one seems to relate to the pull/unpack rather than the start:
Failed to pull image "<|>": rpc error: code = Unknown desc = failed to pull and unpack image "<|>": failed to extract layer sha256:c7412c2a678786efc84fe4bd0b1a79ecca47457b0eb4d4bceb1f79d6b4f75695: mount callback failed on /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount4066054735: failed to Lchown "/var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount4066054735/bin" for UID 0, GID 0: lchown /var/lib/rancher/k3s/agent/containerd/tmpmounts/containerd-mount4066054735/bin: invalid argument: unknown
• For both of the above, my GitHub searches seem to indicate they're caused by not running on an ext4 filesystem, which is true in my situation.
• In some cases I can't start any containers in a pod; in others I can start most of them, except busybox.
• I recall that on docker I believe I was using devmapper, which seemed to work fine and did not have this problem. But as far as I can tell, devmapper is either not included at all in the k3s version of containerd, or just not configured (some places say it's not included, but I see errors in the containerd.log and in ctr plugin ls output that kind of indicate it's there but not working). I'm also not 100% sure it would fix my problem.
So, my questions (and thanks in advance) are:
• How come only the busybox image so far has this particular problem? I've gotten a bunch of others to work.
• Is there some tweak to the native snapshotter that can be done to allow things to work?
• If not, is there a simple way to get devmapper working?
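For reference, k3s does expose the snapshotter choice as agent configuration; a minimal sketch, assuming the default k3s config file location (this only selects the snapshotter, it does not by itself fix the underlying NFS chown issue):

```yaml
# /etc/rancher/k3s/config.yaml (the default k3s config location)
# Equivalent to the k3s agent --snapshotter flag; "overlayfs" is the
# default, "native" is the fallback being discussed in this thread.
snapshotter: native
```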


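The "invalid argument" in both traces is the kernel rejecting the lchown(2) syscall itself, which containerd issues on every entry when copying a native-snapshotter parent or unpacking a pulled layer; on some NFS mounts (depending on export and ID-mapping options) that call returns EINVAL. A minimal sketch of the call in question, with an illustrative path and using our own uid/gid so it runs unprivileged (the real failure is for UID 0/GID 0):

```python
import os
import tempfile

# Illustrate the syscall behind both errors: lchown(2) on a symlink,
# i.e. chown the link itself rather than its target. containerd does
# this for every file/link while copying snapshots or extracting layers.
with tempfile.TemporaryDirectory() as d:
    link = os.path.join(d, "bin")
    os.symlink("/usr/bin", link)
    # This is the call that fails with EINVAL on the OP's NFS mount.
    os.lchown(link, os.getuid(), os.getgid())
    print(os.lstat(link).st_uid == os.getuid())  # True on a local filesystem
```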
08/01/2022, 6:30 AM
I think you're going to be kind of on your own with this. Running containerd or docker on nfs isn't really done. If you need to run on diskless nodes I'd probably use iscsi or something else like that to expose a traditional block device and then put ext4 or xfs on it, since those are the two filesystems that both of those projects are tested against. The native snapshotter is barely more than a "better than nothing" naive implementation and putting that on top of nfs just makes it worse. It's cool that it worked for you for so long but honestly I'm surprised.


08/01/2022, 6:51 AM
Hey, thanks for the reply! Yeah, I was kind of surprised it worked too, but it was as simple as just setting docker back in 1.18 and off it went, no problems. I can look into iSCSI; it's all off a TrueNAS box, so maybe just mounting the volume as the containerd directory would be easier than trying to boot from it, but I'll see. Shame it's not as simple as getting devmapper to work though, but I hear you on it being kind of uncharted!


08/03/2022, 11:23 PM
there is an lvm snapshotter as well, but I’d go the iscsi route like brandond said with a filesystem like ext4 or xfs directly


08/03/2022, 11:25 PM
thx @early-sundown-21192 - in the spirit of speed and getting my cluster back online, I've decided to do something in between: just mount an iSCSI volume at the containerd directory and leave the rest on NFS, rather than try to figure out iSCSI boot at this time. I've had it working on one node since last night and so far it seems to work well; just need to repeat for the others 🙂
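In case it helps anyone finding this later, that arrangement can be sketched as a single fstab entry once the iSCSI login is set up. Everything here (the by-path device name, portal IP, IQN, and the exact mount point) is a hypothetical placeholder, since the actual values weren't posted:

```
# /etc/fstab -- hypothetical entry; device path, portal address, IQN, and
# mount point are placeholders, not the actual values from this thread.
# _netdev delays the mount until networking (and the iSCSI login) is up.
/dev/disk/by-path/ip-192.0.2.10:3260-iscsi-iqn.2022-08.example.nas:k3s-lun-0  /var/lib/rancher/k3s/agent/containerd  ext4  _netdev,noatime  0  0
```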