04/17/2023, 10:38 AM
hi mates, I'm trying to start an etcd cluster with 5 nodes. It was working fine last week. All of a sudden, I found that the etcd db size had grown over 2GB and the etcd cluster stopped working. So my first action was to increase the size limit by adding --etcd-arg to /etc/systemd/system/k3s.service.
I added the above parameter to the k3s.service file on the k3s host where the cluster was initialized. However, after systemctl daemon-reload and systemctl restart k3s, etcd was still reporting that there is no space. So my second action was to restore the etcd server from a snapshot of the last known good state. I did the restore following the guidance posted on the website (HA restore). Now what I've got is:
• The etcd cluster is not running, with no active member.
• From the k3s log I can see that the etcd server is trying to start with a new db size of 6GB. But it also tries to start an instance with 2GB and data dir /<k3s-dir>/etcd-tmp. I'm not sure if it's by design or just an error.
• The k3s service failed to start on all 5 nodes. The k3s status is 'activating'.
• After 3 restore attempts, I don't see any errors related to etcd db size, but etcd still doesn't work at all.
Do you guys have any recommendations?