# k3s
l
It kinda sounds similar to how AKS works: https://stackoverflow.com/questions/64378694/kubernetes-health-checks-failing-with-network-policies-enabled. The health checks appear to be coming from the ::1 IP, but there is no pod with that IP. Am I really expected to hard-code IP blocks here?
Huh, I tried adding an ingress allow rule:
```yaml
- from:
  - ipBlock:
      cidr: <my cluster IP block>
```
and it still didn't help 🤔
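For reference, the hard-coded allow described in that Stack Overflow answer would look roughly like the sketch below; the ::1/128 entry mirrors the probe source mentioned above and the second CIDR is just a placeholder for the cluster range, so both values are assumptions rather than anything confirmed in this thread.
```yaml
# Rough sketch only, with assumed values: hard-coded ipBlock allows for the
# addresses the probes appear to come from.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-probe-sources
spec:
  podSelector: {}              # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
      - ipBlock:
          cidr: "::1/128"      # the source the probes appeared to come from (assumption)
      - ipBlock:
          cidr: fd42::/56      # placeholder standing in for <my cluster IP block>
```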
c
Pod health checks are internal, within the network namespace of the pod. They are not affected by network policy, as they don’t ever actually cross the pod network boundary.
Are you externally health-checking the pods or something?
or do your health checks depend on something that you’ve blocked?
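As an aside on probe types: an exec probe runs a command inside the container, while an httpGet probe is an HTTP request the kubelet makes to the pod IP, so the two can behave differently when traffic to the pod is filtered. A minimal illustration, with placeholder names and image:
```yaml
# Illustration only (placeholder name/image): exec vs. httpGet probes.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: web
    image: nginx
    livenessProbe:
      exec:
        command: ["cat", "/etc/nginx/nginx.conf"]   # runs inside the container
    readinessProbe:
      httpGet:
        path: /        # HTTP request made by the kubelet to the pod IP
        port: 80
```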
l
Hmm, I don't think so?
As soon as I enable the network policies, I get this:
```
Warning  Unhealthy         2m41s (x3 over 3m1s)   kubelet            Liveness probe failed: Get "http://[fda5:f00d:c:0:2::2c6]:80/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Warning  Unhealthy         2m41s (x3 over 3m1s)   kubelet            Readiness probe failed: Get "http://[fda5:f00d:c:0:2::2c6]:80/health": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
```
It's just the standard probes
There are two network policies in question here: a namespace-wide default-deny on both ingress and egress, and then one that allows traffic from the ingress controllers for these particular pods. I'm able to access the pods via the latter, but the probes fail and the pod is restarted
c
hmm that’s odd. What CNI are you using?
afaik netpols don’t affect health checks but perhaps that is CNI specific
l
@creamy-pencil-82913 I'm just using the default flannel
I'll try to create a minimal reproducer, see if you can duplicate it
Okay, here you go @creamy-pencil-82913:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        livenessProbe: &probe
          httpGet:
            path: /
            port: 80
        readinessProbe:
          <<: *probe
        startupProbe:
          <<: *probe
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: nginx
spec:
  policyTypes:
    - Ingress
    - Egress
  podSelector:
    matchLabels:
      app: nginx
  ingress:
    # Allow access from ingress controller (i.e. from the internet)
    - from:
      - namespaceSelector:
          matchLabels:
            name: nginx-ingress
        podSelector:
          matchLabels:
            app.kubernetes.io/name: nginx-ingress

  egress:
    # Allow access to DNS
    - to:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: kube-system
        podSelector:
          matchLabels:
            k8s-app: kube-dns
      ports:
      - protocol: UDP
        port: 53
```
Interestingly, if I leave only the default-deny network policy, the probes appear to work fine
But it's the addition of the "nginx" policy, which is intended to open things up a little, that makes them start to fail
Am I doing something weird in that policy?
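One workaround that comes up for this kind of probe failure, not verified here, is to also admit ingress from the nodes themselves, on the theory that kubelet httpGet probes originate on the host rather than in another pod; the node CIDR below is a placeholder assumption.
```yaml
# Hedged sketch, not a confirmed fix: allow kubelet-originated probe traffic
# by admitting ingress from the nodes' address range (placeholder CIDR).
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-probes-from-nodes
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
    - Ingress
  ingress:
    - from:
      - ipBlock:
          cidr: fd00:10::/64   # placeholder for the nodes' IPv6 range (assumption)
      ports:
      - protocol: TCP
        port: 80               # the port the probes target
```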
c
do you see the same thing with ipv4, or only ipv6?
I wonder if it could be related to https://github.com/k3s-io/k3s/issues/9925
Are you able to try with this week’s releases?
l
@creamy-pencil-82913 sorry, I'm tethered as I'm on the road, I keep losing signal here and there 😛
IPv6 is the primary family in this cluster, so to the best of my knowledge I can either do dual-stack or IPv6-only, but not IPv4-only, and the probes always use IPv6 either way
Those issues sound awfully related to the MAC-address-changing bug I logged last week: if those packets were NOT being dropped in the past, that would explain why the MAC change was never a problem before
Sadly, this is a prod cluster, so I'm not all that comfortable running a non-stable release on it
Interesting data point: I was wrong about the default-deny policy being fine on its own. But the policy needs to be in place before the pod fires up for the probes to break
If I put it in place after, the pod is fine until it restarts
Then the probes start to fail
@creamy-pencil-82913 I went ahead and logged https://github.com/k3s-io/k3s/issues/10030 to make it easier to follow up asynchronously