
bored-rain-98291

09/06/2022, 6:23 PM
Greetings friends. I have a 3-node RKE2 cluster for a group of rascals (developers). I would like to set up monitoring and get alerts if a node goes down, e.g. the rke2-server process dies for any reason. In “my days” (yeah, I'm older) we used things like monit to watch these processes. I was wondering if there were more ‘cloud-native’ ways to do this? Thanks in advance!

broad-farmer-70498

09/06/2022, 7:06 PM
There are lots of layers here obviously, but setting up some sort of monitor on the systemd unit might be good.
Having said that, k8s itself keeps pretty good track of things and has properties on the nodes resource that indicate whether the node is healthy
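To make the point about node health properties concrete, here is a minimal sketch that reads the Ready condition from a node list shaped like the output of `kubectl get nodes -o json`. The function name `unhealthy_nodes` and the sample node names are made up for illustration; only the `status.conditions` layout comes from the Kubernetes Node object.

```python
# Sketch only: SAMPLE mimics the shape of `kubectl get nodes -o json`.
# The node names are hypothetical.
SAMPLE = {
    "items": [
        {
            "metadata": {"name": "node-a"},
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        },
        {
            "metadata": {"name": "node-b"},
            # "Unknown" is what the control plane reports once the kubelet
            # has stopped posting status for a while.
            "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]},
        },
    ]
}


def unhealthy_nodes(node_list: dict) -> list:
    """Return the names of nodes whose Ready condition is not "True"."""
    bad = []
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        # A node with no Ready condition at all is treated as unhealthy too.
        if ready is None or ready.get("status") != "True":
            bad.append(name)
    return bad


print(unhealthy_nodes(SAMPLE))
```

In practice the dictionary would come from `json.load` on the real kubectl output rather than from a hard-coded sample.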

bored-rain-98291

09/06/2022, 7:08 PM
that is true. Thanks for the help!

broad-farmer-70498

09/06/2022, 7:09 PM
k8s provides mechanisms to ‘watch’ for those changes in real time, or you can poll every so often
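For the polling variant, a rough sketch of the comparison step might look like the following. It assumes you already collect a `{node name: ready}` snapshot on each poll; the function name `readiness_changes` and the snapshot shape are assumptions for illustration, not part of any Kubernetes client.

```python
def readiness_changes(previous: dict, current: dict) -> list:
    """Compare two {node_name: is_ready} snapshots and describe what changed."""
    changes = []
    for name in sorted(previous.keys() | current.keys()):
        before = previous.get(name)
        after = current.get(name)
        if before == after:
            continue
        if after is None:
            changes.append(f"{name}: no longer listed")
        elif before is None:
            changes.append(f"{name}: new node, ready={after}")
        else:
            changes.append(f"{name}: ready {before} -> {after}")
    return changes


# Hypothetical snapshots taken one poll interval apart.
earlier = {"node-a": True, "node-b": True}
later = {"node-a": True, "node-b": False}
for line in readiness_changes(earlier, later):
    print(line)
```

A watch removes the need for the diff, since the API server pushes each change as an event, but the polling approach is easier to run from cron or a small script.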

bored-rain-98291

09/06/2022, 7:09 PM
Yes, I've watched a few in the past. Cool!

broad-farmer-70498

09/06/2022, 7:10 PM
If you have Prometheus or similar running externally and gathering node stats, then you could alert with something like Grafana/Prometheus as well

bored-rain-98291

09/06/2022, 7:11 PM
Yes, exactly. On this cluster I don't have those tools set up, but I should have them set up soon.

broad-farmer-70498

09/06/2022, 7:11 PM
That would be things like not having received a new scrape in a while, or, even if you have, the load on the node being too low (i.e. k8s isn't running), and the like
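As one way to check for failed scrapes from outside the cluster, the sketch below parses the JSON that Prometheus returns for an instant query of the `up` metric (`GET /api/v1/query?query=up`), where a value of 0 means the last scrape of that target failed. The function name `down_targets` and the sample addresses are illustrative assumptions; fetching the response over HTTP is left out.

```python
def down_targets(query_response: dict) -> list:
    """Return the instance labels of targets whose `up` value is 0."""
    if query_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    down = []
    for series in query_response["data"]["result"]:
        # Instant-vector samples are [unix_timestamp, "value as a string"].
        _timestamp, value = series["value"]
        if float(value) == 0.0:
            down.append(series["metric"].get("instance", "<unknown>"))
    return down


# Hypothetical response for two scraped nodes, one of which is unreachable.
example = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"instance": "10.0.0.1:9100"}, "value": [1662500000, "1"]},
            {"metric": {"instance": "10.0.0.2:9100"}, "value": [1662500000, "0"]},
        ],
    },
}
print(down_targets(example))
```

The "load too low" idea would be a second query against a node load metric with a threshold chosen for the cluster; that threshold is site-specific and not shown here.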

bored-rain-98291

09/06/2022, 7:13 PM
thanks for the great info

broad-farmer-70498

09/06/2022, 7:14 PM
If you want simple stuff running inside to check higher-level things, you could run something like Uptime Kuma

bored-rain-98291

09/06/2022, 7:14 PM
Ah! That's one I haven't seen yet. I must go look 😉

broad-farmer-70498

09/06/2022, 7:14 PM
Maybe put an integration into the cluster that registers ingresses with Uptime Kuma, so monitors are created/deleted automatically
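The first half of such an integration is finding out which hosts the cluster exposes. A small sketch, assuming input shaped like `kubectl get ingress -A -o json`; the function name `ingress_hosts` and the sample hostnames are hypothetical, and nothing here talks to Uptime Kuma itself.

```python
def ingress_hosts(ingress_list: dict) -> set:
    """Collect every non-empty host from spec.rules of each Ingress."""
    hosts = set()
    for ingress in ingress_list.get("items", []):
        # `rules` may be absent or null when only a default backend is set.
        for rule in ingress.get("spec", {}).get("rules") or []:
            host = rule.get("host")
            if host:
                hosts.add(host)
    return hosts


# Hypothetical ingress list with one host-based rule and one catch-all rule.
sample_ingresses = {
    "items": [
        {"spec": {"rules": [{"host": "app.example.internal"}]}},
        {"spec": {"rules": [{"http": {"paths": []}}]}},
        {"spec": {}},
    ]
}
print(sorted(ingress_hosts(sample_ingresses)))
```

Rules without a host match any hostname, so they are skipped here; whether to monitor those some other way is a design choice.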

bored-rain-98291

09/06/2022, 7:15 PM
Thanks for the tip! This looks like a great tool

broad-farmer-70498

09/06/2022, 7:15 PM
It's not enterprise-y, whatever that means, but it's effective as a simple tool
This is relatively popular for monitoring specific services: https://github.com/stakater/IngressMonitorController

bored-rain-98291

09/06/2022, 7:18 PM
fantastic!

broad-farmer-70498

09/06/2022, 7:18 PM
The integrated providers don't work for internal-only services, which is where Kuma would come in

bored-rain-98291

09/06/2022, 7:18 PM
Ah, I see

broad-farmer-70498

09/06/2022, 7:19 PM
I wish they had a configuration to automate with Kuma, but alas

bored-rain-98291

09/06/2022, 7:19 PM
That would be excellent

broad-farmer-70498

09/06/2022, 7:20 PM
Maybe someday I’ll get ambitious and just write it myself, but not today :)
Writing a simple controller to do it on your own would be easy enough frankly.
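The core of a controller like that is a reconcile step: compare what should exist with what does exist and act on the difference. A minimal sketch of just that step, with hypothetical names throughout; how monitors are actually listed, created, or deleted in Uptime Kuma is deliberately left out because it is not covered in this thread.

```python
def plan_monitor_changes(desired_hosts: set, existing_monitors: set) -> tuple:
    """Return (hosts to create monitors for, monitors to delete), both sorted."""
    to_create = sorted(desired_hosts - existing_monitors)
    to_delete = sorted(existing_monitors - desired_hosts)
    return to_create, to_delete


# Hypothetical state: one ingress was added and one was removed since last run.
desired = {"app.example.internal", "new.example.internal"}
existing = {"app.example.internal", "old.example.internal"}
create, delete = plan_monitor_changes(desired, existing)
print("create:", create)
print("delete:", delete)
```

Running this on a timer, or whenever an Ingress change is observed, and then applying the two lists keeps the monitors in step with the cluster.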

bored-rain-98291

09/06/2022, 7:21 PM
Seems like it. I was a developer for 10+ years before I got into ops. I miss writing code.

broad-farmer-70498

09/06/2022, 7:21 PM
Write ops code!

bored-rain-98291

09/06/2022, 7:22 PM
So I plan to remedy that
lol awesome

broad-farmer-70498

09/06/2022, 7:23 PM
I wrote that. If you're familiar with Node-RED then you will understand the implications and how easily a ‘controller’ could be written to automate Kuma entries etc.

bored-rain-98291

09/06/2022, 7:23 PM
that is slick!

broad-farmer-70498

09/06/2022, 7:25 PM
Internally we do all kinds of crazy stuff with it honestly.
Custom schedulers, admission controllers, etc

bored-rain-98291

09/06/2022, 7:25 PM
Sounds like a fun place
I've had my fill of web development. Ops development is the only thing I would be interested in doing anymore.