# harvester
s
r
That validating webhook is there to ensure the upgrade goes smoothly. It checks the nodes’ local storage, especially where the container images are stored. For how to address it manually, please refer to the doc.
w
@sticky-summer-13450 I had the same issue when upgrading my cluster. I cleared some log files from /root/ on one of my nodes, left over from previous problems. I also found journalctl taking up ~4G on each node - check with
journalctl --disk-usage
and you can reduce with
journalctl --vacuum-size=<size>
(I used 1G)
s
Thanks Simon
> journalctl --disk-usage
Archived and active journals take up 240.0M in the file system.
So not that.
Thanks Zespre. This isn't an air-gapped server, so not that.
w
I'm guessing these are single-disk nodes? What size disks?
s
I'm feeding this back because Harvester is an appliance and shouldn't need manual intervention.
The node is well within the spec. Each node has a single 2TB SSD, well above the minimum of 250GB/500GB for dev/prod.
> df -hlT --exclude-type=overlay --exclude-type=tmpfs
Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  4.0M     0  4.0M   0% /dev
/dev/nvme0n1p3 ext4       15G  2.6G   12G  19% /run/initramfs/cos-state
/dev/loop0     ext2      3.0G  1.3G  1.6G  46% /
/dev/nvme0n1p2 ext4       44M   54K   40M   1% /oem
/dev/nvme0n1p5 ext4       98G   74G   20G  80% /usr/local
/dev/nvme0n1p6 ext4      1.8T  664G 1004G  40% /var/lib/harvester/defaultdisk
Maybe the partitioning from the original beta install is different from the partitioning Harvester expects 894 days later
w
mine are 1TB, so you should be in a better position than me
s
There's 1TB free on
/dev/nvme0n1p6
, if only it could be used...
What does the partition layout look like on your nodes?
r
/dev/nvme0n1p5 ext4       98G   74G   20G  80% /usr/local
This is the one we care about. 20G is not enough. As you mentioned, if it’s not an air-gapped env, we shouldn’t prevent users from upgrading.
There’s an annotation
harvesterhci.io/skipWebhook: true
that might be of interest. But you’ll need to create the Upgrade object with that annotation using
kubectl
🙃
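For reference, a rough sketch of what that could look like - the apiVersion, namespace, object name, and target version below are assumptions, so double-check them against your own cluster before applying anything:
# Sketch only: create an Upgrade object carrying the skip annotation.
# Name and target version are placeholders - adjust for your environment.
cat <<'EOF' | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Upgrade
metadata:
  name: hvst-upgrade-manual
  namespace: harvester-system
  annotations:
    harvesterhci.io/skipWebhook: "true"
spec:
  version: v1.3.1
EOF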
w
dobby:~ # df -hlT --exclude-type=overlay --exclude-type=tmpfs --exclude-type=devtmpfs
Filesystem                                             Type  Size  Used Avail Use% Mounted on
/dev/nvme0n1p3                                         ext4   15G  2.6G   12G  19% /run/initramfs/cos-state
/dev/loop0                                             ext2  3.0G  1.3G  1.6G  46% /
/dev/nvme0n1p2                                         ext4   44M  404K   40M   1% /oem
/dev/nvme0n1p5                                         ext4   98G   74G   20G  80% /usr/local
/dev/nvme0n1p6                                         ext4  795G  217G  538G  29% /var/lib/longhorn
/dev/longhorn/pvc-854b28ea-3684-4e49-ab1a-583d20e6528c ext4  974M   41M  918M   5% /var/lib/kubelet/pods/26771470-9d2e-4513-9627-751167f96850/volumes/kubernetes.io~csi/pvc-854b28ea-3684-4e49-ab1a-583d20e6528c/mount
/dev/longhorn/pvc-a807fda0-0f05-4c77-ba13-3296f6a66d48 ext4  4.9G   28K  4.9G   1% /var/lib/kubelet/pods/5a3f40f4-2f18-4b1a-8336-7e9124f49ea4/volumes/kubernetes.io~csi/pvc-a807fda0-0f05-4c77-ba13-3296f6a66d48/mount
/dev/longhorn/pvc-04bb904a-3e6d-42b1-b863-e1866d991b3d ext4  974M  8.5M  949M   1% /var/lib/kubelet/pods/255beff9-15ad-4351-9f5c-5fb5c45dc369/volumes/kubernetes.io~csi/pvc-04bb904a-3e6d-42b1-b863-e1866d991b3d/mount
/dev/longhorn/pvc-72276e63-4b68-4dd2-b077-8fa892c9b5a1 ext4  5.4M  2.3M  3.0M  43% /var/lib/kubelet/pods/9e11b9d7-9a61-4563-b494-7e254d34283a/volumes/kubernetes.io~csi/pvc-72276e63-4b68-4dd2-b077-8fa892c9b5a1/mount
/dev/longhorn/pvc-b6f666ae-831b-415b-97eb-498368b427be ext4  974M   61M  897M   7% /var/lib/kubelet/pods/8362214d-20a5-4197-8927-1634f01741a8/volumes/kubernetes.io~csi/pvc-b6f666ae-831b-415b-97eb-498368b427be/mount
/dev/longhorn/pvc-143f8d0b-5dda-4600-8648-b024a4b36a48 ext4   49G  2.1G   47G   5% /var/lib/kubelet/pods/b0b69334-b778-4cac-ae76-044fd3d436b3/volumes/kubernetes.io~csi/pvc-143f8d0b-5dda-4600-8648-b024a4b36a48/mount
yes, my /usr/local also has only 20G free, but this is post-upgrade to 1.3.1 - looks like I might have issues (ha!) with the next upgrade
s
I'd already put the annotation
harvesterhci.io/minFreeDiskSpaceGB: '20'
on the Version object. I think I'll wait until I've had a bit more sleep before making an Upgrade object.
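For anyone else hitting this, a rough sketch of the annotate step - the resource name and version name here are assumptions, so list what your cluster actually has first:
# Sketch: list the Version objects, then annotate the one you're upgrading to.
kubectl -n harvester-system get versions.harvesterhci.io
kubectl -n harvester-system annotate versions.harvesterhci.io v1.3.1 harvesterhci.io/minFreeDiskSpaceGB='20'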
Simon - you don't have a
/var/lib/harvester/defaultdisk
but you do have a
/var/lib/longhorn
. That's curious.
w
anyway, /usr/local is checked in the second test; we still need to clear some space for /
Mark - could be a 1.3.1 thing? I don't seem to have noted df output pre-upgrade
👍 1
s
Could be - it would be very interesting to know for sure.
There do seem to be a lot of old versions of images on the node which could be pruned. I wonder why kubelet is not garbage collecting them.
Actually, all I did was run 20 or 30
sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock rmi <image>:<version>
to dump the older duplicates of the largest images, and that's given me, on one node, more than 10GB more space in the
/usr/local
partition. So the upgrade has started. I think purging the old images and keeping down the number of junk images would be a useful task for the installer to do.
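In case it's useful to anyone else, a sketch of how to see what's there before pruning (same RKE2 paths as the rmi command above; check the SIZE column and only remove tags that are no longer referenced):
# Sketch: list the images known to containerd so old duplicate tags of the big ones stand out.
sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock images
# then remove unwanted tags one at a time with the rmi command above.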
r
After v1.2.2 and v1.3.0, further upgrades will have an additional step to purge the old versions' container images.
👍 2