# harvester
s
r
That validating webhook is there to ensure the upgrade goes smoothly. It checks the nodes’ local storage, especially where the container images are stored. For how to address it manually, please refer to the doc.
w
@sticky-summer-13450 I had the same issue when upgrading my cluster. I cleared some log files from /root/ on one of my nodes, left over from previous problems. I also found journalctl taking up ~4G on each node - check with
journalctl --disk-usage
and you can reduce with
journalctl --vacuum-size=<size>
(I used 1G)
s
Thanks Simon
> journalctl --disk-usage
Archived and active journals take up 240.0M in the file system.
So not that.
Thanks Zespre. This isn't an air-gapped server, so not that.
w
I'm guessing these are single-disk nodes? What size disks?
s
I'm feeding this back because Harvester is an appliance and shouldn't need manual intervention.
The node is well within the spec. Each node has a single 2TB SSD, well above the minimum of 250GB/500GB for dev/prod.
> df -hlT --exclude-type=overlay --exclude-type=tmpfs
Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  4.0M     0  4.0M   0% /dev
/dev/nvme0n1p3 ext4       15G  2.6G   12G  19% /run/initramfs/cos-state
/dev/loop0     ext2      3.0G  1.3G  1.6G  46% /
/dev/nvme0n1p2 ext4       44M   54K   40M   1% /oem
/dev/nvme0n1p5 ext4       98G   74G   20G  80% /usr/local
/dev/nvme0n1p6 ext4      1.8T  664G 1004G  40% /var/lib/harvester/defaultdisk
Maybe the partitioning from the original beta install is different from the partitioning Harvester expects 894 days later
w
mine are 1TB, so you should be in a better position than me
s
There's 1TB free on
/dev/nvme0n1p6
, if only it could be used...
What does the partition layout look like on your nodes?
r
/dev/nvme0n1p5 ext4       98G   74G   20G  80% /usr/local
This is the one we care about. 20G is not enough. As you mentioned, if it’s not an air-gapped env, we shouldn’t prevent users from upgrading.
There’s an annotation
harvesterhci.io/skipWebhook: true
that might be of interest. But you’ll need to create the Upgrade object with that annotation using
kubectl
🙃
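For reference, a rough sketch of what that could look like - the apiVersion, namespace, object name, and target version below are assumptions, so double-check them against your own cluster before applying anything:
# Sketch only: create an Upgrade object carrying the skip annotation.
# Name and target version are placeholders - adjust for your environment.
cat <<'EOF' | kubectl apply -f -
apiVersion: harvesterhci.io/v1beta1
kind: Upgrade
metadata:
  name: hvst-upgrade-manual
  namespace: harvester-system
  annotations:
    harvesterhci.io/skipWebhook: "true"
spec:
  version: v1.3.1
EOF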
w
dobby:~ # df -hlT --exclude-type=overlay --exclude-type=tmpfs --exclude-type=devtmpfs
Filesystem                                             Type  Size  Used Avail Use% Mounted on
/dev/nvme0n1p3                                         ext4   15G  2.6G   12G  19% /run/initramfs/cos-state
/dev/loop0                                             ext2  3.0G  1.3G  1.6G  46% /
/dev/nvme0n1p2                                         ext4   44M  404K   40M   1% /oem
/dev/nvme0n1p5                                         ext4   98G   74G   20G  80% /usr/local
/dev/nvme0n1p6                                         ext4  795G  217G  538G  29% /var/lib/longhorn
/dev/longhorn/pvc-854b28ea-3684-4e49-ab1a-583d20e6528c ext4  974M   41M  918M   5% /var/lib/kubelet/pods/26771470-9d2e-4513-9627-751167f96850/volumes/kubernetes.io~csi/pvc-854b28ea-3684-4e49-ab1a-583d20e6528c/mount
/dev/longhorn/pvc-a807fda0-0f05-4c77-ba13-3296f6a66d48 ext4  4.9G   28K  4.9G   1% /var/lib/kubelet/pods/5a3f40f4-2f18-4b1a-8336-7e9124f49ea4/volumes/kubernetes.io~csi/pvc-a807fda0-0f05-4c77-ba13-3296f6a66d48/mount
/dev/longhorn/pvc-04bb904a-3e6d-42b1-b863-e1866d991b3d ext4  974M  8.5M  949M   1% /var/lib/kubelet/pods/255beff9-15ad-4351-9f5c-5fb5c45dc369/volumes/kubernetes.io~csi/pvc-04bb904a-3e6d-42b1-b863-e1866d991b3d/mount
/dev/longhorn/pvc-72276e63-4b68-4dd2-b077-8fa892c9b5a1 ext4  5.4M  2.3M  3.0M  43% /var/lib/kubelet/pods/9e11b9d7-9a61-4563-b494-7e254d34283a/volumes/kubernetes.io~csi/pvc-72276e63-4b68-4dd2-b077-8fa892c9b5a1/mount
/dev/longhorn/pvc-b6f666ae-831b-415b-97eb-498368b427be ext4  974M   61M  897M   7% /var/lib/kubelet/pods/8362214d-20a5-4197-8927-1634f01741a8/volumes/kubernetes.io~csi/pvc-b6f666ae-831b-415b-97eb-498368b427be/mount
/dev/longhorn/pvc-143f8d0b-5dda-4600-8648-b024a4b36a48 ext4   49G  2.1G   47G   5% /var/lib/kubelet/pods/b0b69334-b778-4cac-ae76-044fd3d436b3/volumes/kubernetes.io~csi/pvc-143f8d0b-5dda-4600-8648-b024a4b36a48/mount
yes, my /usr/local also has only 20G free, but this is post-upgrade to 1.3.1 - looks like I might have issues (ha!) with the next upgrade
s
I'd already put the annotation
harvesterhci.io/minFreeDiskSpaceGB: '20'
on the Version object. I think I'll wait until I've had a bit more sleep before making an Upgrade object.
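For anyone else hitting this, a rough sketch of the annotate step - the resource name and version name here are assumptions, so list what your cluster actually has first:
# Sketch: list the Version objects, then annotate the one you're upgrading to.
kubectl -n harvester-system get versions.harvesterhci.io
kubectl -n harvester-system annotate versions.harvesterhci.io v1.3.1 harvesterhci.io/minFreeDiskSpaceGB='20'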
Simon - you don't have a
/var/lib/harvester/defaultdisk
but you do have a
/var/lib/longhorn
. That's curious.
w
anyway, /usr/local is checked in the second test; we still need to clear some space for /
Mark - could be a 1.3.1 thing? I don't seem to have noted df output pre-upgrade
👍 1
s
Could be - it would be very interesting to know for sure.
There do seem to be a lot of old versions of images on the node which could be pruned. I wonder why kubelet is not garbage collecting them.
Actually, all I did was run 20 or 30
sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock rmi <image>:<version>
to dump the older duplicates of the largest images, and that's given me, on one node, more than 10GB more space in the
/usr/local
partition. So the upgrade has started. I think purging the old images and keeping down the number of junk images would be a useful task for the installer to do.
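In case it's useful to anyone else, a sketch of how to see what's there before pruning (same RKE2 paths as the rmi command above; check the SIZE column and only remove tags that are no longer referenced):
# Sketch: list the images known to containerd so old duplicate tags of the big ones stand out.
sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock images
# then remove unwanted tags one at a time with the rmi command above.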
r
After v1.2.2 and v1.3.0, further upgrades will have an additional step to purge the old versions' container images.
👍 2