# k3s
r
Yeap, that's all I have so far....
```
$ k get nodes
NAME    STATUS   ROLES                       AGE    VERSION
k3s01   Ready    control-plane,etcd,master   159d   v1.24.4+k3s1
k3s02   Ready    control-plane,etcd,master   132d   v1.24.4+k3s1
k3s03   Ready    control-plane,etcd,master   139d   v1.24.4+k3s1
```
s
Nice, what are the drawbacks? Any stability problems, or issues with k3s and OS lifecycle upgrades?
r
My suggestion is to avoid disk-heavy workloads on the masters, since they might impact etcd performance. And be aware that etcd is a 24x7 constant stream of writes to the drive. My disk LED seems to turn OFF now and then, not ON 😁
👍 1
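If you want to see that write stream for yourself, `iostat` from the sysstat package gives a rough picture on a server node (with the embedded etcd, the data lives under /var/lib/rancher/k3s/server/db):

```
# Extended per-device stats every 5 seconds; on a k3s server with embedded etcd
# the drive holding /var/lib/rancher/k3s/server/db rarely goes idle
iostat -x 5
```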
I use 1TB NVMe drives for the OS/K3s, modified via `hdparm` to be provisioned as 240GB drives, leaving all that extra space as a Host Protected Area for bad sector remapping. Hopefully that will improve write durability over the long term. My nodes are 2.5GbE connected, with a 10GbE uplink to PVC / iSCSI storage for application use.
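For reference, the HPA trick looks roughly like this with `hdparm -N` (it applies to ATA/SATA devices; the device path and sector count below are just example values):

```
# Show current vs. native max sectors and whether an HPA is already set
hdparm -N /dev/sda

# Clamp the visible size to ~240 GB (468846252 x 512-byte sectors);
# the leading "p" makes the new limit persist across power cycles
hdparm -N p468846252 /dev/sda
```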
s
Very good tips Richard. I think the most disk-heavy application in my case will be Loki for log aggregation. Could you also share which OS your cluster is running? At the moment I am running a PoC on FCOS and will be doing the same on SLE Micro and openSUSE Leap Micro OS.
r
Using Ubuntu 22.04.1 with external containerd and ZFS snapshotter.
b
@refined-toddler-64572 would you elaborate on 'external containerd'?
And your mention of hdparm and the Host Protected Area is very interesting, I never knew of such a thing.
r
The K3s binary includes an embedded, "slimmed down" version of containerd, but it does not support everything. I needed to use the ZFS snapshotter, as the default overlay filesystem does not work on ZFS. So I installed the full version of containerd from the Ubuntu repo (not the latest version) and pass a command-line option to k3s telling it where to find it. It works great, no operational issues at all... until you get into cAdvisor / Prometheus / Lens. Something at the container level (metrics collected, missing labels) breaks dashboards; generally speaking, pod-level and higher metrics work as expected. It's enough of a headache that if I were to do it over again, I would try a ZFS zvol formatted as ext4 for the containerd dataset so I could use the K3s embedded containerd. Simpler install, simpler to find support.
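A rough sketch of that wiring, assuming the config key and paths match current containerd/k3s defaults (double-check against the docs; the `tank/containerd` dataset name is made up):

```
# The ZFS snapshotter wants its directory to be a ZFS dataset
# (this is containerd's default location for it)
zfs create -o mountpoint=/var/lib/containerd/io.containerd.snapshotter.v1.zfs \
    tank/containerd

# In /etc/containerd/config.toml (containerd from the Ubuntu repo):
#   [plugins."io.containerd.grpc.v1.cri".containerd]
#     snapshotter = "zfs"

# Tell k3s to use the external containerd instead of its embedded one
k3s server --container-runtime-endpoint unix:///run/containerd/containerd.sock
```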
b
very interesting! Potentially overkill, but have you heard of `democratic-csi`? I use it to 'connect' my k3s Kubernetes cluster to my TrueNAS NFS share(s), and it's been very reliable for me.
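For anyone following along, consuming it from the cluster side is just an ordinary PVC against whatever StorageClass the democratic-csi chart creates; the class name `truenas-nfs` here is a made-up example:

```
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-nfs-pvc
spec:
  accessModes: ["ReadWriteMany"]   # NFS-backed volumes can be RWX
  storageClassName: truenas-nfs    # hypothetical democratic-csi StorageClass
  resources:
    requests:
      storage: 10Gi
EOF
```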
r
As to `hdparm`: my understanding is that enterprise-level SSDs include a sizable Host Protected Area (HPA), which helps enable long-term write endurance. Consumer-grade SSDs have a much smaller HPA due to cost cutting, lower requirements, etc. You can use `hdparm` to, more or less, tell the drive an alternate last sector and assign everything beyond that to the HPA. As an example, I tried a batch of WD Blue M.2 SSDs with 24x7 etcd on them and had a 100% failure rate within 1 week. Switched to Samsung with a very large HPA and it has been fine for many weeks. Still testing, but it seems to be a good combo. Output from smartctl:
```
Device Model:     Samsung SSD 860 EVO M.2 1TB
User Capacity:    240,049,281,024 bytes [240 GB]
```
Yes, I use `democratic-csi` iSCSI for PVs needing heavy disk writes. It does work very well, but it is a single point of failure for cluster storage: patching and rebooting TrueNAS now and then requires me to take my applications down during that window. It's not something that can easily be converted to a Ceph cluster; hopefully TrueNAS SCALE can help with this in the future.
b
@refined-toddler-64572 sorry for the late response. I was unaware of a built-in 'Host Protected Area' out of the box with SSDs. 1 week is pretty flipping short, were you able to warranty them? And yes, I do think about that single point of failure with democratic-csi, but for my purposes, with my hardware, I can't comfortably distribute storage across my nodes anyway. I've got an ATX case stuffed full of hard drives and two ASRock DeskMini boxes. Previously I used Rancher Longhorn for storage, but at the time it felt kinda... finicky... so I also feel a bit more comfortable with my data living outside of my Kubernetes cluster. Rook/Ceph is on my list of things to investigate; I'd like to use Ceph running in Kubernetes to provide storage to Kubernetes. What speed are the links between the nodes in your cluster?
r
I sent the failed SSD units back for refunds, wasn't interested in replacements. My plan was to use Longhorn for small storage needs, but then I learned it is unable to reclaim freed disk space within its volumes, which is not acceptable for any write-heavy application. So I'm using democratic-csi for much more than I originally planned. As to your concern about data living inside the cluster, keep in mind Longhorn supports scheduled backups to NFS volumes, so you can keep a copy of your data safe outside the cluster (e.g. an NFS share on your TrueNAS). My nodes are at 2.5GbE to each other with a 10GbE uplink to the TrueNAS server.
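If it helps, pointing Longhorn at an NFS backup target is a one-setting change. This is a sketch from memory, so treat the resource name, field, and NFS URL format as assumptions to verify against the Longhorn docs (the server and export path are made up):

```
# Set Longhorn's backup target to an NFS export on the TrueNAS box;
# scheduled/recurring backups then land outside the cluster
kubectl -n longhorn-system patch settings.longhorn.io backup-target \
  --type merge -p '{"value": "nfs://truenas.example.lan:/mnt/tank/longhorn-backup"}'
```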
b
i gotcha, that makes sense. I am curious to see if/how TrueNAS SCALE will ultimately scale; they're running k3s, so it should definitely be possible to spread TrueNAS SCALE across multiple nodes.
To my knowledge they're using k3s as a way to manage all the services, but it's not designed to grow to other nodes, right?
r
My understanding is SCALE will use some type of ZFS over the Gluster filesystem to scale to multiple storage nodes. The k3s they ship is just for the apps running on it, nothing to do with the ZFS storage itself.
👍 1