
future-account-50371

07/27/2022, 8:10 AM
Hi there, what can be the issue if every pod is scheduled to the same worker every time we run a CI pipeline job? The cluster is based on 4 workers with the same resources, and there is no taint or nodeSelector configured.

tall-school-18125

07/27/2022, 11:27 AM
It sounds like normal behavior, since the Kubernetes scheduler is not breaking any rules. If you want the pods on each node you could set node selectors on the pods or create a DaemonSet.
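For reference, a minimal DaemonSet sketch of that suggestion might look like the following; a DaemonSet runs one copy of the pod on every schedulable node. The name and labels are illustrative placeholders; the image, command, and args are taken from the pod spec shared later in the thread.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: zuul-runner                # placeholder name
spec:
  selector:
    matchLabels:
      app: zuul-runner             # placeholder label
  template:
    metadata:
      labels:
        app: zuul-runner
    spec:
      containers:
      - name: zuul
        image: zuul-image-repo     # image from the pod spec shared later
        command: ["/bin/sh", "-c"]
        args: ["while true; do sleep 30; done;"]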

future-account-50371

07/27/2022, 11:31 AM
Isn't the scheduler supposed to assign pods where there are free resources and do load balancing?
A node selector defines which pod is assigned to which node. I'd expect it to work by default.

tall-school-18125

07/27/2022, 11:33 AM
the scheduler will assign them to wherever there are free resources
but it doesn't do load balancing

future-account-50371

07/27/2022, 11:34 AM
So it's weird, because we see that it always schedules to a specific worker, which reaches a high load,
although there are 3 more idle workers.
When that worker was under high load, new pods still weren't assigned to the others.

tall-school-18125

07/27/2022, 11:37 AM
can you share the pod spec?

future-account-50371

07/27/2022, 11:38 AM
NAME        CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
worker-01   639m         0%     27327Mi         36%
worker-02   2199m        3%     34688Mi         48%
worker-03   1128m        1%     28053Mi         38%
worker-04   987m         1%     27513Mi         38%
Every new pod gets assigned to worker-02.

tall-school-18125

07/27/2022, 11:39 AM
if you cordon worker-02, do the pods get scheduled to other nodes?

future-account-50371

07/27/2022, 11:42 AM
spec:
  containers:
  - args:
    - while true; do sleep 30; done;
    command:
    - /bin/sh
    - -c
    image: zuul-image-repo
    imagePullPolicy: Always
    name: zuul
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-nllmw
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: worker-02
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
I can try to cordon the node; I need to wait for the job to finish.

tall-school-18125

07/27/2022, 11:54 AM
In the spec, I saw that the pod doesn't have a resource quota. You can add one, which should limit how many of them get scheduled onto a single node.
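Concretely, the field that is empty in the shared spec is the per-container resources block (resources: {}); the scheduler looks at the requests values when choosing a node, so once they are set, a node that cannot fit another pod's requested CPU/memory is skipped. A minimal sketch of that container section, with placeholder values that would need tuning to the actual workload:

spec:
  containers:
  - name: zuul
    image: zuul-image-repo
    resources:
      requests:
        cpu: "500m"          # placeholder; what the scheduler reserves on a node
        memory: "512Mi"
      limits:
        cpu: "1"             # placeholder; hard cap enforced at runtime
        memory: "1Gi"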

future-account-50371

07/27/2022, 11:59 AM
Maybe it's because the pods are installed via an Ansible playbook and there isn't any Deployment/DaemonSet in the cluster?
Basically, part of the flow is that the Ansible playbook run creates a namespace with the pods, and when it finishes the namespace is deleted. So in this case we see every new pod going to the same worker and not being assigned to the other worker nodes.
Does that sound like normal behavior?

tall-school-18125

07/27/2022, 12:08 PM
I think it is normal behavior, because I can't find anything in the Kubernetes documentation that guarantees that the scheduler will distribute pods evenly. There is a feature called topology spread constraints that can be used to distribute pods evenly across nodes: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/ But I think the easiest way to do it would be to add resource quotas to the pod spec. Resource quotas are a Kubernetes best practice, while topology spread constraints are a more advanced feature.
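A minimal topology spread constraint sketch based on that documentation; the label and maxSkew value are illustrative, and the pods would need to carry a matching label:

spec:
  topologySpreadConstraints:
  - maxSkew: 1                              # allow at most 1 pod of difference between nodes
    topologyKey: kubernetes.io/hostname     # spread across individual nodes
    whenUnsatisfiable: ScheduleAnyway       # use DoNotSchedule to enforce the spread strictly
    labelSelector:
      matchLabels:
        app: zuul                           # placeholder label
  containers:
  - name: zuul
    image: zuul-image-repo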

future-account-50371

08/02/2022, 9:47 AM
If I understand correctly, a Resource Quota sets resource limits (CPU, memory) for pods?
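For context: per-container requests and limits (the resources block in the pod spec) constrain a single pod and are what the scheduler considers when placing it, while a ResourceQuota is a separate namespace-level object that caps the total CPU/memory all pods in that namespace may request. A minimal ResourceQuota sketch with placeholder names and values:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: ci-quota               # placeholder name
  namespace: ci-pipeline       # placeholder; the namespace the playbook creates
spec:
  hard:
    requests.cpu: "4"          # total CPU all pods in the namespace may request
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi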