# rke2
Hello. Using Rancher to manage Helm/K8s (Docker) deployments, I had been using an RKE1 Triton template to deploy a Debian 11 image that mounted a Triton NFS volume over NFSv3 for permanent storage. After switching to RKE2, the underlying image changed to Ubuntu, the runtime moved from Docker to containerd, and at the same time the NFS version changed to NFSv4.

Our deployment has a production system that takes backups and syncs them to a disaster recovery (DR) system in a different data center. This had been working just fine under RKE1/Docker/Debian/NFSv3. After all of the changes above, however, every few weeks Postgres on the DR system reports after a backup that a `stale file handle` existed in the `$PGDATA/data` directory while trying to apply the WAL changes, and shuts itself down. This causes the pod to restart, which also removes the `standby.signal` file that tells Postgres to start in standby mode, and from that point backups fail to be applied. After manually restoring the database files, the DR system runs fine for several weeks before it happens again.

I have tried adding a few NFSv4 mount options to resolve the issue (`actimeo=0,noac,lookupcache=none`), but so far none of these have worked. Also importantly, trying to force NFSv3 with `nfsvers=3` in the PV for the mount has no effect: NFSv3 is used, but the `stale file handle` still occurs eventually. Might anyone have suggestions or ideas of what might be going on here? Thank you.
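For reference, here is a minimal sketch of the kind of PV spec I've been experimenting with. The name, capacity, server address, and export path below are placeholders rather than our real values; the `mountOptions` reflect the options mentioned above.

```yaml
# Minimal PV sketch (placeholder name, size, server, and path).
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-dr-data
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    # NFSv4 options tried so far; none resolved the stale handles:
    - actimeo=0
    - noac
    - lookupcache=none
    # Forcing v3 also made no difference:
    # - nfsvers=3
  nfs:
    server: 10.0.0.10        # Triton NFS volume endpoint (placeholder)
    path: /exports/pgdata    # placeholder export path
```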