# rke2
a
This message was deleted.
c
You should figure out why it's taking so long to get scheduled. Likely you lack sufficient resources on your nodes to schedule the pod somewhere, until something else terminates and frees resources.
l
@creamy-pencil-82913 thanks a lot for your help! As I understand it, if there were too few resources to run the job, the pod would show up in `kubectl get po` but stay stuck in the `Pending` state. In my case, the pod is slow to show up at all and doesn't spend much time in `Pending`. Also, are there any metrics I should watch to figure this out, e.g. the pod number limit shown in `kubectl describe no`? Thanks!
I have raised that number to 234 rather than the default 110.
c
If the pod doesn't even show up for a long time, then the issue is on the airflow side. You'd want to see why it is delaying creation of the pod. Once it exists in Kubernetes you can monitor its status to see where the delays are in scheduling and running.
234 pods on one node is a lot. How many nodes are in your cluster, and what CPU/memory resources are available on your nodes?
l
I have many small tasks, each running in its own pod. I agree that's not ideal, but I was worried the pod number limit would stop me from creating new tasks. Actually, even the default limit of 110 pods is not being reached.
@creamy-pencil-82913 as for the airflow side, the airflow kubernetes executor uses the k8s API to submit tasks, each of which later results in a k8s pod. I also think the delay may be caused by the airflow kubernetes executor. I want to figure out where the bottleneck is, that is, either 1. the airflow kubernetes executor submits tasks at a slow rate, or 2. the rke2 API server can only accept requests at a slow rate. I think the latter may be the root cause, but I don't know how to benchmark it.
c
If the pods aren't getting created then it's on the airflow side
If they're not getting scheduled then it's on the Kubernetes side
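One way to tell the two cases apart is to compare timestamps on the pod object itself. The following is a minimal sketch, not an official tool: it assumes you have the pod as a dict (for example parsed from `kubectl get pod <name> -o json`), and that you can look up when airflow queued the task. The `pod_delays` helper and its `submitted_at` argument are hypothetical names used only for illustration.

```python
from datetime import datetime, timezone


def parse_ts(ts):
    # Kubernetes serializes these timestamps as RFC 3339 in UTC,
    # e.g. "2023-01-01T12:00:05Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def pod_delays(pod, submitted_at=None):
    """Return per-phase delays in seconds for one pod dict.

    submitted_at is an assumption: the time airflow queued the task,
    which has to come from airflow, not from Kubernetes.
    """
    created = parse_ts(pod["metadata"]["creationTimestamp"])
    status = pod.get("status", {})
    delays = {}

    # Time before the pod object existed at all (airflow side).
    if submitted_at is not None:
        delays["submit_to_create"] = (created - parse_ts(submitted_at)).total_seconds()

    # Time the scheduler needed to bind the pod to a node (Kubernetes side).
    for cond in status.get("conditions", []):
        if cond.get("type") == "PodScheduled" and cond.get("status") == "True":
            scheduled = parse_ts(cond["lastTransitionTime"])
            delays["create_to_scheduled"] = (scheduled - created).total_seconds()

    # Time until the first container started (image pull, kubelet, runtime).
    starts = []
    for cs in status.get("containerStatuses", []):
        for state in ("running", "terminated"):
            started_at = cs.get("state", {}).get(state, {}).get("startedAt")
            if started_at:
                starts.append(parse_ts(started_at))
    if starts:
        delays["create_to_started"] = (min(starts) - created).total_seconds()

    return delays
```

If `submit_to_create` dominates, the delay happens before the pod exists, which points at the airflow side. If `create_to_scheduled` or `create_to_started` dominates, the delay is on the Kubernetes side.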