cool-jewelry-75805
04/22/2025, 11:19 PM
stale file handle existed in the $PGDATA/data directory while trying to apply the WAL changes, and thus shut itself down. This causes the pod to restart and also remove the standby.signal file that tells Postgres to start in standby mode, and from here backups fail to be applied. After manually restoring the database files, the DR system runs fine for several weeks before it happens again.
I have tried adding a few NFSv4 options to resolve the issue, but so far none of these have worked: actimeo=0,noac,lookupcache=none. Also importantly, forcing NFSv3 with nfsvers=3 in the PV for the mount has no effect (NFSv3 is used, but the stale file handle still occurs eventually).
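In case it helps anyone reproduce or catch this earlier, below is a minimal Python sketch of a check that walks a directory and reports any path whose stat call fails with ESTALE (the errno behind "stale file handle"). It is only an illustration: the function name find_stale_paths is hypothetical, reading the directory from the PGDATA environment variable is an assumption, and it is not part of Postgres or of the DR setup described above.

```python
import errno
import os


def find_stale_paths(root):
    """Walk root and return the paths that fail with ESTALE (stale file handle)."""
    stale = []

    def on_walk_error(err):
        # os.walk passes directory listing failures here instead of raising them.
        if err.errno == errno.ESTALE:
            stale.append(err.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)  # forces a fresh attribute lookup for this entry
            except OSError as err:
                if err.errno == errno.ESTALE:
                    stale.append(path)
    return stale


if __name__ == "__main__":
    # Assumption: PGDATA points at the NFS-backed data directory inside the pod.
    data_dir = os.environ.get("PGDATA", ".")
    for path in find_stale_paths(data_dir):
        print("stale file handle:", path)
```

Running something like this periodically (for example from a sidecar or a cron job) might show which files go stale first and roughly when, which could help correlate the failure with events on the NFS server side.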
Might anyone have suggestions or ideas of what might be going on here? Thank you.