# harvester
m
I believe that level of “homelab” might be worth considering actual enterprise support. In all seriousness, if you build such a complex home lab and you are not a freelancer, consider raising a request for a proper test lab at your current gig. You should not spend more time fixing your homelab than having a life - if it’s work-related stuff at that scale, I believe you should do it on work infra and then get support for it during your working hours. I know building your own sandcastle is most satisfying, but in the end, is it still your sandcastle or a free test env for your employer? For the actual issue I’m afraid I cannot provide anything useful here :/
j
Thanks Paul, I know it’s not the usual homelab and I’ve definitely already benefited from my employer to get most of the network equipment and some of the drives. I’m also waiting for the next customer to give us back Gen10 or better hardware when they migrate, but I will probably always be 10-15 years behind anyway. This is my hobby: I already spend thousands every year on maintenance for the privilege of working with enterprise-grade stuff, and I leverage it to serve dozens of people around me (hosting many things for family, friends and non-profit associations) as well as thousands of citizens (with projects like the NTP Pool, mirrors, public DNS servers, etc.), but what I cannot afford (and my employer can barely afford for itself) is recent servers with the same specs (terabytes of volatile memory, hundreds of cores, etc.). I’ve been doing the same for about 25 years, and so far I’ve always found a way - if I have to rebuild my k8s cluster onto a more flexible operating system I will, but of course I’ll try to bend this one first.
t
One of the things that really hurts disk performance is running Longhorn in VMs on top of Longhorn (Harvester) - each write is replicated in the guest Longhorn and then again in the host Longhorn, so a single write can turn into 6 physical writes. I would look at consolidating where you can. Also consider setting Longhorn to default to 1 replica. How are you presenting disks to Harvester? Hardware RAID? External iSCSI WILL work with good networking. Can you bond the 10GbE NICs together?
j
I also have a guest Longhorn, but it's just for CD testing and nothing runs on it, so it's not a problem if it's useless. Both the guest and the top Longhorn have 2 replicas, and you are right, they could run a single replica, because the real drives are set up in RAID5 by the hardware controller, so they already have disk-failure tolerance. 🤔 I'm planning to bond 2x10GbE but I'll have to invest in more network gear so it'll have to wait. I'm not sure it's going to help though, I'm always using less than 5% of the current bandwidth. I'll reduce some guest clusters' etcd storage to a single replica and see if it helps a lot. Thanks!
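(For reference, a minimal sketch of what “default to 1 replica” could look like: a dedicated Longhorn StorageClass with `numberOfReplicas: "1"`, created here with the official Python kubernetes client. The class name and reclaim policy are illustrative assumptions, not anything from the thread.)

```python
# Sketch: a single-replica Longhorn StorageClass, assuming the standard
# Longhorn provisioner name and the official kubernetes Python client.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="longhorn-single-replica"),  # hypothetical name
    provisioner="driver.longhorn.io",
    parameters={
        "numberOfReplicas": "1",       # no Longhorn-level redundancy (RAID5 below takes care of disk failures)
        "staleReplicaTimeout": "2880",
    },
    reclaim_policy="Delete",
    volume_binding_mode="Immediate",
    allow_volume_expansion=True,
)

client.StorageV1Api().create_storage_class(sc)
```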
t
1 Longhorn replica will help with performance.
Which iSCSI CSI were you using?
j
I tried democratic-csi because it integrates well with TrueNAS, but my issue is with the separate hardware, which only has 6x1GbE LACP and 7200 rpm SATA disks; even the local performance is not great. I just run MinIO on it for backups and for Fleet GitLab bootstrap purposes.
It's basically my DR starter; the old IBM hardware of both machines doesn't even support low-level virtualisation.
t
AH..
maybe time to buy some new hardware? lol
😅 1
p
yeah we run "homelab" Harvester on ~$300 Ryzen Beelinks with NVMe/SSD + 64GB RAM and it's basically fine. imo old server hw is more trouble than it's worth except in special circumstances
t
I am a big fan of the Minisforum UM890 Pro, or the MS-01 and MS-A2!
1
j
All good choices, thank you - but I’ve paid as much for a full HP cluster with the capacity to go over 100 cores and 1TB of RAM, and I’m planning to use it. My scale is just not worth it with mini-PCs. I also have multiple SFP+ ports and the hundred-gigabit backbone to go with them, and with the next generation I’ll switch to QSFP+ or better for >40GbE.
p
maybe time to pay for SUSE support lol
👍 1
t
Maybe
j
I don’t believe Harvester has support yet, or even Elemental. I also find standard SLES prices to be more expensive than my whole infrastructure maintenance (drives, upgrades…) which makes little sense.
f
From my own testing I can say that the disks matter quite a lot. We saw Longhorn tank performance by 90%+ on 15K SAS and most consumer SATA SSDs. With enterprise SSDs we got much healthier numbers, especially those with a healthy amount of DRAM and PLP (power-loss protection).
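(A rough way to reproduce this kind of comparison: hammer each candidate disk with the small, fsync-heavy random writes that Longhorn generates and compare IOPS. This sketch assumes fio is installed; the mount paths and run parameters are placeholders, not the tester's actual setup.)

```python
# Rough sketch: compare raw disk behaviour under a sync-heavy 4k random-write
# load, the pattern that punishes SSDs without DRAM/PLP. Assumes fio is
# installed; the target directories are placeholder mount points.
import json
import subprocess

def fio_randwrite_iops(target_dir: str) -> float:
    """Return 4k random-write IOPS for a short fsync-per-write run."""
    out = subprocess.run(
        [
            "fio", "--name=lh-test", "--directory=" + target_dir,
            "--rw=randwrite", "--bs=4k", "--size=1G",
            "--ioengine=libaio", "--direct=1", "--fsync=1",
            "--iodepth=1", "--numjobs=1",
            "--time_based", "--runtime=60",
            "--output-format=json",
        ],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["jobs"][0]["write"]["iops"]

for path in ("/mnt/consumer-ssd", "/mnt/enterprise-ssd"):  # placeholder mounts
    print(path, fio_randwrite_iops(path))
```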
👍 1
j
@thousands-advantage-10804, are you sure single-replica volumes are supported? The upgrade pre-check script warns me about them: “Volume pvc-… is a single-replica volume in running state, it will block node drain. Please consider adjusting its replica count before upgrading.” If I just need to stop the related workloads and detach them during the upgrade it’s OK, but in that case I find the warning message a bit misleading.
All good, I see that the message changes to “Volume pvc-… is a single-replica volume in detached state, it will block node drain before v1.4.0. For upgrades starting from v1.4.0, though node draining will not be blocked, the upgrade may still have potential data integrity concerns. Please consider adjusting its replica count before upgrading.” I’ll take the risk.
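(A quick way to find which single-replica volumes are still attached before running the upgrade, so their workloads can be stopped first. This is a sketch and assumes Longhorn's `volumes.longhorn.io` CRD at v1beta2 in the `longhorn-system` namespace, which may differ on older Longhorn versions.)

```python
# Sketch: list single-replica Longhorn volumes that are not yet detached,
# so the matching workloads can be scaled down before the upgrade.
# Assumes the volumes.longhorn.io/v1beta2 CRD in the longhorn-system namespace.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

volumes = api.list_namespaced_custom_object(
    group="longhorn.io", version="v1beta2",
    namespace="longhorn-system", plural="volumes",
)["items"]

for vol in volumes:
    if vol["spec"].get("numberOfReplicas") == 1:
        name = vol["metadata"]["name"]
        state = vol.get("status", {}).get("state", "unknown")
        if state != "detached":
            print(f"{name}: {state} - stop or scale down its workload before upgrading")
```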
t
This is where you have options. What are you optimizing the cluster for? High availability? Simplicity? If this is a lab, you can go with one replica. If this is an enterprise production deployment, I would go with an external SAN.
1
I stop all VMs before an upgrade just to make sure they are clean
👍 1
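(Stopping every VM can also be scripted. The sketch below assumes Harvester's VMs are plain KubeVirt `virtualmachines.kubevirt.io/v1` objects, which is how Harvester models them, and that they use the `spec.running` field rather than an explicit `runStrategy`; with `virtctl` installed, `virtctl stop <vm>` per VM does the same thing.)

```python
# Sketch: request a shutdown of every VM in the cluster before an upgrade by
# setting the KubeVirt VirtualMachine spec to not running. Assumes
# virtualmachines.kubevirt.io/v1 and that the VMs use spec.running.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

vms = api.list_cluster_custom_object(
    group="kubevirt.io", version="v1", plural="virtualmachines"
)["items"]

for vm in vms:
    ns, name = vm["metadata"]["namespace"], vm["metadata"]["name"]
    api.patch_namespaced_custom_object(
        group="kubevirt.io", version="v1", namespace=ns,
        plural="virtualmachines", name=name,
        body={"spec": {"running": False}},  # rough equivalent of `virtctl stop`
    )
    print(f"requested shutdown of {ns}/{name}")
```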