04/24/2023, 1:29 PM
Hey all, I'm running Rancher 2.6.7 and am trying to upgrade a Rancher provisioned RKE1 bare metal cluster from
and it's wrecking my nodes, preventing pretty much anything from running on them, including the cluster agent. Since the cluster agent is unable to start, the upgrade is unable to progress and I can't even tell it to roll back or make any other changes. It looks like whatever is responsible for mounting
is not doing it correctly, and it's being mounted as a directory instead of a file, causing the error below for any pod that tries to start. I'm not exactly sure how to proceed or work around the issue. I've been trying to figure this out for several days now. If anyone has any ideas it would be greatly appreciated.
E0422 08:11:06.957002 4127378 pod_workers.go:965] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"cluster-register\" with RunContainerError: \"failed to start container \\\"afd3874c6173e119096d353ed5fea741ce20050d7a2dd5db2c3d8ca5865f9ef1\\\": Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: rootfs_linux.go:59: mounting \\\"/opt/rke/var/lib/kubelet/pods/03663f0d-4ec2-4655-8eae-c39584074230/etc-hosts\\\" to rootfs at \\\"/var/lib/docker/overlay2/5a8743f9f37feac08f36725cca7ccd6ac39926940244ff924cddf0d41bec73a7/merged/etc/hosts\\\" caused: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type\"" pod="cattle-system/cattle-cluster-agent-dc659f6fc-482gg" podUID=03663f0d-4ec2-4655-8eae-c39584074230
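To confirm the "directory instead of a file" suspicion on a node, a rough sketch like the following could list every pod whose `etc-hosts` entry is a directory. It assumes the kubelet pod root is `/opt/rke/var/lib/kubelet/pods`, taken from the error above; the function name `find_bad_etc_hosts` is made up for illustration. It only reports the problem and does not attempt to fix it.

```python
from pathlib import Path


def find_bad_etc_hosts(pods_root):
    """Return etc-hosts paths under pods_root that are directories, not files.

    pods_root is assumed to hold one subdirectory per pod UID, as in the
    path from the kubelet error message.
    """
    root = Path(pods_root)
    if not root.is_dir():
        return []
    bad = []
    for pod_dir in sorted(root.iterdir()):
        candidate = pod_dir / "etc-hosts"
        # The container runtime expects a regular file here; a directory
        # produces the "not a directory" mount error when bound to /etc/hosts.
        if candidate.is_dir():
            bad.append(candidate)
    return bad


if __name__ == "__main__":
    # Path taken from the log line; adjust if the kubelet root differs.
    for path in find_bad_etc_hosts("/opt/rke/var/lib/kubelet/pods"):
        print(f"directory where a file is expected: {path}")
```

Run it as root on an affected node; any path it prints belongs to a pod that will fail with the mount error shown above.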


04/24/2023, 2:16 PM
I believe 2.6.7 was a bad release. Upgrade to 2.6.8, which was a patch for 2.6.7 released a day or so later.


04/24/2023, 2:38 PM
Thanks for the quick reply, I'll give that a try