# k3s
l
After further testing on a single, more powerful node (the other difference being that it doesn't have swap), it seems to mostly work fine, apart from occasional cases where it does saturate the disk, though it recovers. Beyond just running a more powerful server, is there a way to solve this?
c
what’s the source of all your extra disk IO?
I would probably try to figure that out first. is it just all your image pulls?
l
Currently (with the larger node) it's seemingly everything: many processes are using >1MBps, with a k3s server process hitting ~10MBps. Before, if I remember correctly, it was just one k3s server process using 500MBps, but I'll recreate that once I get this fully working. This is not during image pulls - during image pulls there's an expected spike from containerd, but it's not significant
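For context, a rough way to get per-process numbers like these (assuming iotop and the sysstat package are installed, which may not match your setup):

```sh
# per-process disk read/write rates (kB_rd/s, kB_wr/s), sampled every 5 seconds
pidstat -d 5
# or interactively: only processes actually doing IO, with accumulated totals
sudo iotop -oPa
```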
c
are you still using k3d to run multiple k3s nodes on a single host?
l
No, k3d was just for local development
now I'm trying to get that to work in a real environment
c
that doesn’t seem right then. There should be only one k3s server process. Are those perhaps thread IDs? What do you see in `ps -auxf`?
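Or, as a sketch (assuming the sysstat package is available), to see threads and per-thread IO directly:

```sh
# list every thread of the k3s server process (LWP column = thread ID)
ps -eLf | grep '[k]3s server'
# per-thread disk IO for that process, sampled every 5 seconds
pidstat -dt -p "$(pgrep -of 'k3s server')" 5
```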
l
Yeah, there's only one k3s server
thread IDs then
c
are you using etcd or kine/sqlite?
l
fairly certain I'm using sqlite - I didn't explicitly configure anything otherwise
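A quick way to double-check which datastore is actually in use (these are the default k3s paths, assuming a standard install):

```sh
# kine over sqlite: a state.db file exists
ls -lh /var/lib/rancher/k3s/server/db/state.db*
# embedded etcd: an etcd directory exists instead
ls -ld /var/lib/rancher/k3s/server/db/etcd
```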
c
that’s an unusually large amount of IO. Do you have something that is hammering on the apiserver with create/modify/delete requests?
l
My suspicion would be fluxcd, but while testing on k3d, when I was able to access grafana, the cluster always stayed under 100 apiserver requests per second
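That number came from a grafana dashboard, but roughly the same thing can be queried directly (assuming a kube-prometheus-style setup scraping apiserver metrics; the prometheus URL here is a placeholder):

```sh
# total apiserver requests per second over the last 5 minutes
curl -sG 'http://prometheus.example:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(apiserver_request_total[5m]))'
```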
c
with k3d you might have been using a temporary datastore that was just on tmpfs, instead of real disk
now you’re actually putting load on it and it can’t keep up
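It's easy to verify what the datastore directory is actually backed by (default k3s data dir assumed):

```sh
# filesystem type backing the k3s datastore (tmpfs vs ext4/xfs/...)
findmnt -T /var/lib/rancher/k3s/server/db
df -hT /var/lib/rancher/k3s/server/db
```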
l
interesting, so is it normal for k3s to generate such high disk throughput when larger volumes of requests hit the apiserver?
(presumably 500MBps is far beyond normal, but I mean the 10MBps I'm currently seeing)
c
it’s really just driven by what you do with it. The more you change on the apiserver, the more will get written to disk.
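If you want to see how much of that traffic is actually mutating (and therefore turning into datastore writes), something like this should work, assuming the same apiserver metrics are scraped:

```sh
# write-type apiserver requests per second; these are the ones that hit the datastore
curl -sG 'http://prometheus.example:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(apiserver_request_total{verb=~"CREATE|UPDATE|PATCH|DELETE"}[5m]))'
```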
l
Ah, I'll try to get what I barely have working now to the point where I can reach a grafana dashboard and see what's going on...
r
As a note, you mentioned your more successful test not having swap. Kubernetes in general recommends against having swap. You might try a machine spec'd like your original but with no swap partition and see what happens.
(or just comment out your swap partition in /etc/fstab on your original machine and reboot)
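Something like this (standard Linux, assuming swap is set up via /etc/fstab; adjust if yours comes from systemd or a swapfile elsewhere):

```sh
# turn swap off immediately
sudo swapoff -a
# comment out swap entries so it stays off after a reboot (keeps a .bak copy)
sudo sed -i.bak '/[[:space:]]swap[[:space:]]/s/^/#/' /etc/fstab
```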
l
Oh, that's good to know, thank you
running the original setup without swap somehow ends up even worse, with 1.5GiBps being read...
but I'm starting to suspect that it's just running out of ram
seems like it, created swap and it suddenly dropped to comparatively reasonable numbers
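For anyone following along, a couple of generic checks for that (nothing k3s-specific assumed):

```sh
# current memory and swap usage
free -h
# any recent kernel OOM-killer activity
sudo dmesg -T | grep -iE 'out of memory|oom'
```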
r
It's odd that it worked with k3d. If it worked in k3d but is going haywire in straight k3s, the only thing I can think of is that maybe k3d has the oom killer set a bit faster and you have runaway pods getting restarted, or it's purging etcd after a certain size (which with default in-memory sqlite would directly hit RAM).
Having swap on and ending up with fewer disk reads probably means that your cluster is just operating slower.
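If you want to rule out restart churn, a couple of plain-kubectl checks (no assumptions about your monitoring stack):

```sh
# pods sorted by restart count
kubectl get pods -A --sort-by='.status.containerStatuses[0].restartCount'
# containers whose last termination reason was OOMKilled
kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' | grep OOMKilled
```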
l
My only theory is that it's the difference in resources - I'll try to run it in a few VMs locally. I'll check the k3d logs, but I don't remember seeing anything like this in them
c
I still think it’s probably related to the k3d etcd datastore being on tmpfs and not real disk
swap may also be related
l
Even if it's on tmpfs I should still be able to see if it's reading large amounts?
Last night I managed to get it to grafana and from what I saw, the request volume was identical to what I see locally
c
no, tmpfs is memory backed. not real disk
I’m talking about the IO from the k3s processes. Grafana’s disk IO will come from different processes.
l
Er, my last message was a bit unrelated, just wanted to point out that I didn't see a massive difference in API request rates. Surprised that there's no way to monitor tmpfs, oh well
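(For what it's worth, tmpfs usage can still be inspected, it just never shows up in block-device IO stats since there's no disk underneath; the mount path here is only an example:)

```sh
# space used by a tmpfs mount
df -h /run
# per-directory usage inside it
sudo du -sh /run/*
```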
r
I thought k3s default etcd was in-memory sqlite, I guess that's wrong? But if k3d etcd is all tmpfs-backed and k3s is disk backed, enormous disk load increase for k3s makes perfect sense.
l
It should be sqlite, it's writing to an sqlite db
Increased memory on the server (4->8G) and everything works fine, now I'm interested to know why it failed the way it did...
r
SQLite can point at a file, but you can also run it in-memory. I thought k3s ran it in-memory and just wrote it out to the disk-backed file when stopped or every few minutes, but I guess it works with the file directly.
c
If you have more than one server, you're using etcd. Sqlite is single server only. They both write to disk, but if that disk is tmpfs and not an actual physical disk then the IO profile will be different regardless of which one you're using.
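For completeness, a rough sketch of how that choice shows up in practice (default paths/flags from a standard k3s install; adjust for your setup):

```sh
# default single server: kine over sqlite at /var/lib/rancher/k3s/server/db/state.db
k3s server
# embedded etcd instead (and what you get automatically with multiple servers)
k3s server --cluster-init
```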