# rke2
By default, you can access the ingress via any of your nodes on port 80 or 443
o
interesting. I thought I had everything good to go, but I still can't get ingress + nginx to work
c
note that ingress usually routes requests based on hostnames, so you’d need to point a hostname at the nodes, and match that hostname in the ingress resource, in order for it to work.
You can’t just hit the nodes at their IP and have it route it. It is specifically keyed off hostnames
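If you don’t have DNS set up yet, you can still test the host-based routing by sending the Host header yourself, e.g. (the hostname here is just a placeholder):
```
# point the request at a node's IP, but tell the ingress which host rule it should match
curl -H "Host: myapp.example.com" http://<node-ip>/
```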
o
can I share what I have with you to see if you see something evident?
c
what part of it is not working? are you getting a 404, or can’t connect, or ?
o
no connection to a webgui we have
unless i hit it via the nodePort work around we have in place
c
what do you mean by no connection
o
when visiting the URL in a browser, we get "This site can't be reached"
so its either gotta be in the ingress conf or nginx.conf no?
c
can you test it with curl, and confirm if you can’t connect, or get a 404, or a 5XX, or what?
o
failed to connect, connection refused
curl: (7) Failed to connect to URL port 80: Connection refused
c
What IP or hostname are you curling? Have you confirmed that there is an instance of the nginx ingress running on that node?
o
i didnt curl the IP, but the hostname
want me to try via IP?
c
and that hostname points at the address of a node that is running an nginx pod?
o
would i verify that with `kubectl get services` or another command?
c
`kubectl get pod -A`?
the default deployment is a daemonset that uses port 80/443. It is not a LoadBalancer service.
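You can confirm that with something like:
```
# the controller is a DaemonSet in kube-system; there is no LoadBalancer service to look for
kubectl -n kube-system get daemonset rke2-ingress-nginx-controller
kubectl -n kube-system get pod -l app.kubernetes.io/name=rke2-ingress-nginx -o wide
```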
o
so, it has an nginx pod, and the `rke2-ingress-nginx-controller`
(im new to the program, so going through what others have configured, or tried to configure.. so sorry if its a bit wacky) [program being this project im assigned to]
essentially running a single node cluster with multiple services running.. they want to be able to ingress into the cluster and hit a webgui and eventually an api. nginx pod is there, rke2 ingress controller (nginx) is there.. i dont know if something is misconfigured or not, but helm builds it and upgrades it fine.. but when i try to go to http://hostname/mysticportal nada happens although in nginx.conf there is a reverse proxy set up... but when i go to http://hostname:30001 it loads
c
wait is there another nginx reverse proxy in front of the nginx ingress?
o
so we have an nginx pod that houses the nginx.conf manipulations via configmap that establishes proxy passes for locations on the node, and then we have the rke2 nginx ingress controller
is that first part not supposed to be in place?
c
no, that’s what the ingress itself is supposed to do
You’re supposed to just hit the ingress directly. If you need to map things to services by paths or hostnames, that is specifically what the ingress does.
don’t put another layer in front of that, you’re just making more work for yourself
o
interesting... and that's under the `ingress` call out in the `values.yaml` i would presume?
c
no. the values configures the ingress CONTROLLER. You configure request routing via the ingress resources themselves. https://kubernetes.io/docs/concepts/services-networking/ingress/#the-ingress-resource
o
hmmm so i need to create individual resources per route?
c
The controller reads ingress resources, and configures nginx to route the requests as desired
yes. that is how ingress works.
o
so under `ingress` in the `values.yaml` file, that feeds back into the `ingress` template i presume? right now our values file has this section for ingress:
c
no
o
Copy code
ingress:
  enabled: true
  className: ""
  annotations:
    kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: projectHostName
      paths:
        - path: /mysticportal
          pathType: ImplementationSpecific
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local
c
You don’t manage ingress resources via the ingress controller config. You configure ingress as desired for the cluster via values.yaml, and then you (somewhere else) create ingress resources to map requests for specific hosts or paths, to your service.
You’re confusing ingress controller configuration, with ingress resources
o
okay, so should i convert the path back to just `/` and leave the host there as the hostname... and then create ingress resources?
yeah, im trying to wrap my head around it all.. i just came from a project where everything like that was managed via nginx as a reverse proxy in nginx.conf.. not like this
c
You probably don’t need to touch the ingress controller configuration at all. Just leave it alone, and create ingress resources as described in the Kubernetes docs, to route requests for your hostname, to the service.
The whole point of an ingress controller is to decouple the controller configuration, from the routing configuration.
All the request routing, that you would previously have done in nginx.conf, is done via Ingress resources.
the idea is that you can create ingress resources to describe how you want to route traffic, and those same resources work regardless of whether you are using ingress-nginx, or traefik, or istio. You don’t have to know how to write nginx.conf files or anything else. You just create ingress resources, and the controller figures it out.
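A minimal ingress resource looks something like this (hostname, service name, and port are placeholders for whatever your cluster actually runs):
```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-route
spec:
  rules:
  - host: myapp.example.com      # requests for this hostname...
    http:
      paths:
      - path: /                  # ...and this path...
        pathType: Prefix
        backend:
          service:
            name: my-service     # ...get routed to this service
            port:
              number: 80         # the port defined on the Service
```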
o
alright, ill give it a shot and see what happens! i still don't quite get it all, but maybe diving in will help. i just hope something isn't already configured in a way that's going to screw us up
after creating the resource, is there a way that the ingress controller automatically finds it? or do i need to upgrade the controller for it to see?
ive created the resource, sent a `kubectl create -f nginx/mproute.yaml` (my file i just made for the ingress) got that it was created.. but when i try to hit http://hostname/mysticportal it fails via curl with the same connection refused bit
@creamy-pencil-82913
Copy code
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mysticportal-route
  annotations:
    ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: hostNameOfProject
    http:
      paths:
      - path: /mysticportal
        pathType: ImplementationSpecific
        backend:
          service:
            name: mysticportal-service
            port:
              number: 4200
the frontend aka webgui is served via port 4200 according to what i could find
Copy code
[jlong@vm-jlong helm]$ curl -kL http://hostNameOfProject/mysticportal
curl: (7) Failed to connect to hostNameOfProject port 80: Connection refused
when running a `netstat -plant | grep 80` there are no services or anything listening on port 80.. maybe that's why the curl is failing?
c
after creating the resource, is there a way that the ingress controller automatically finds it?
watching for changes to resources, and reacting to those changes, is specifically what controllers do. You do not need to restart anything
the controller should run with hostPorts for 80 and 443. You won’t see anything listening on that port in the netstat output, that’s not how kubernetes works. Traffic to 80 and 443 on the host is forwarded into the pod via iptables rules.
If there is an nginx pod running on that node, are you sure you don’t have any other firewall on the node blocking access? firewalld, ufw, or so on?
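For reference, the host ports show up on the controller pod spec, and the forwarding shows up as DNAT rules rather than a listening socket, roughly:
```
# the controller pod itself declares hostPorts 80/443
kubectl -n kube-system get pod -l app.kubernetes.io/name=rke2-ingress-nginx \
  -o jsonpath='{.items[0].spec.containers[0].ports}{"\n"}'
# traffic to 80/443 on the node is DNATed into the pod by the CNI portmap plugin
sudo iptables-save | grep -i hostport
```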
o
an nginx pod is running on that node, yes
looks like firewalld is running
c
can you do a simple `curl -vs http://x.x.x.x` to the node’s IP?
if not, you might try disabling the firewalld service, and restarting the node.
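i.e. something along the lines of:
```
# stop firewalld now, keep it off across reboots, then restart the node
sudo systemctl disable --now firewalld
sudo reboot
```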
o
yeah, i got a connection refused again
restart the nginx node?
or you mean the host server that everything is on?
firewalld was set to public.. so i dont think that was it
when doing a `kubectl get services` i dont see the `rke2-ingress-nginx-controller` running, but i see the pod running? is that an issue?
and/or should i clear out the nginx.conf stuff, maybe that's conflicting somehow?
only reason i ask is its showing our nginx pod as a NodePort, running on 8080:30000, and 8443:32705
c
the default config should set up controller.hostPort.enabled=true, with ports 80 and 443 mapped: https://github.com/rancher/rke2-charts/blob/main/charts/rke2-ingress-nginx/rke2-ingress-nginx/4.8.200/values.yaml#L87-L94
Was that changed in your rke2-ingress-nginx chart values?
If you changed that to 8080/8443 that would explain what you’re seeing
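One way to check what was actually applied (release name and namespace here assume the stock rke2 packaging):
```
# any site-specific overrides to the packaged chart would show up here
kubectl -n kube-system get helmchartconfig rke2-ingress-nginx -o yaml
# if the helm CLI is available, this shows the values the release was installed with
helm -n kube-system get values rke2-ingress-nginx --all
```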
o
The 8080/8443 is being called out from nginx.conf from the nginx pod. I will have to see if someone changed the values prior to me coming here, I think I should be able to find that somewhere local on the machine
c
I suspect you’re looking at the port for the admission webhook?
if you describe the ingress-nginx pod, what do you see for ports?
o
I didn’t look at that file, no. I’m looking at the ports returned after running a ‘kubectl show services’ and the result next to the nginx pod. I don’t see anything for the ingress controller
I tried running a describe on the rke2 ingress nginx controller pod and it wouldn’t return anything regardless of calling it a pod or node
c
you’re not using services
you’re using pods with hostport, at no point are you going through a service to access the ingress
o
That’s fine, I was just trying to identify any port usage I could.
c
> I tried running a describe on the rke2 ingress nginx controller pod and it wouldn’t return anything
Are you specifying the namespace? It’s running in the kube-system namespace, you have to do `kubectl describe pod -n kube-system POD-NAME-HERE`
o
Ahh okay I’ll try that (unfortunately don’t have access to my VM at work from home). Essentially it should show me the port usage and if it’s something wonky I should be able to change the values back and update the pod?
c
You might spend some more time in the basic kubernetes docs before going in and trying to deep-dive in the nginx config files. Those are so many layers down from what you need to be looking at.
o
Well it’s my only task to make it work.. but I’ll see what I can dig up. Thanks for the pointers
@creamy-pencil-82913 sorry to ping you again.. i have deleted the nginx pod we had running simultaneously with the ingress-nginx bit from rke2 to eliminate any additional layers/conflicts. i verified the chart in place for the `rke2-ingress-nginx-controller` is in fact how it comes from the github repo. i've created ingress resources for two different services we have running in the cluster (api-service and frontend-service). the frontend-service is running as a type `NodePort` and the api-service is running as a `ClusterIP`. none of the services are exposed to the node's IP (i.e. all `EXTERNAL-IP` spaces are <none>). i did previously expose one service to the IP of the host node and it allowed for connectivity, but that's not what i'm trying to achieve understandably. from my understanding after reading a majority of the day, the service port declaration in the ingress should be pointed at the `targetPort` that's declared by the service. with `firewalld` disabled, when i run a `curl http://localhost` i still get connection refused. an example ingress i created is:
Copy code
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-route
  annotations: {}
spec:
  rules:
   - host: mrtkub.corp.domain.com
     http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
when attempting to visit http://mrtkub.corp.domain.com/api i get connection refused as well
c
It still sounds like you have some problem with your nodes? Here is what you should see
Copy code
brandond@dev01:~$ kubectl get node -o wide
NAME            STATUS   ROLES                       AGE     VERSION          INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
rke2-server-1   Ready    control-plane,etcd,master   3m36s   v1.28.4+rke2r1   172.17.0.7    <none>        Ubuntu 22.04.3 LTS   6.2.0-1014-aws   containerd://1.7.7-k3s1

brandond@dev01:~$ kubectl get pod -A -o wide
NAMESPACE     NAME                                                   READY   STATUS      RESTARTS   AGE     IP           NODE            NOMINATED NODE   READINESS GATES
kube-system   cloud-controller-manager-rke2-server-1                 1/1     Running     0          3m38s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   etcd-rke2-server-1                                     1/1     Running     0          3m12s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-canal-42ldl                          0/1     Completed   0          3m25s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-coredns-hflgs                        0/1     Completed   0          3m25s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-ingress-nginx-q84tw                  0/1     Completed   0          3m25s   10.42.0.10   rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-metrics-server-6747c                 0/1     Completed   0          3m25s   10.42.0.4    rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-snapshot-controller-6lb56            0/1     Completed   2          3m25s   10.42.0.5    rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-snapshot-controller-crd-j59d9        0/1     Completed   0          3m25s   10.42.0.9    rke2-server-1   <none>           <none>
kube-system   helm-install-rke2-snapshot-validation-webhook-2v9bd    0/1     Completed   0          3m25s   10.42.0.2    rke2-server-1   <none>           <none>
kube-system   kube-apiserver-rke2-server-1                           1/1     Running     0          3m37s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   kube-controller-manager-rke2-server-1                  1/1     Running     0          3m39s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   kube-proxy-rke2-server-1                               1/1     Running     0          3m33s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   kube-scheduler-rke2-server-1                           1/1     Running     0          3m39s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   rke2-canal-b77wv                                       2/2     Running     0          3m19s   172.17.0.7   rke2-server-1   <none>           <none>
kube-system   rke2-coredns-rke2-coredns-6b795db654-pjns5             1/1     Running     0          3m19s   10.42.0.7    rke2-server-1   <none>           <none>
kube-system   rke2-coredns-rke2-coredns-autoscaler-945fbd459-d5tfc   1/1     Running     0          3m19s   10.42.0.3    rke2-server-1   <none>           <none>
kube-system   rke2-ingress-nginx-controller-gzwjj                    1/1     Running     0          2m39s   10.42.0.12   rke2-server-1   <none>           <none>
kube-system   rke2-metrics-server-544c8c66fc-jmtqr                   1/1     Running     0          2m50s   10.42.0.8    rke2-server-1   <none>           <none>
kube-system   rke2-snapshot-controller-59cc9cd8f4-czq2h              1/1     Running     0          2m32s   10.42.0.14   rke2-server-1   <none>           <none>
kube-system   rke2-snapshot-validation-webhook-54c5989b65-79sv8      1/1     Running     0          2m52s   10.42.0.6    rke2-server-1   <none>           <none>

brandond@dev01:~$ kubectl describe pod -n kube-system rke2-ingress-nginx-controller-gzwjj
Name:         rke2-ingress-nginx-controller-gzwjj
Namespace:    kube-system
Priority:     0
Node:         rke2-server-1/172.17.0.7
Start Time:   Fri, 15 Dec 2023 19:58:53 +0000
Labels:       app.kubernetes.io/component=controller
              app.kubernetes.io/instance=rke2-ingress-nginx
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=rke2-ingress-nginx
              app.kubernetes.io/part-of=rke2-ingress-nginx
              app.kubernetes.io/version=1.9.3
              controller-revision-hash=775748958d
              helm.sh/chart=rke2-ingress-nginx-4.8.200
              pod-template-generation=1
Annotations:  cni.projectcalico.org/containerID: 1d4e845a6670d754f2f81ab0f5ae6a56a61089e2e3ee928a0d37ea2688b35d33
              cni.projectcalico.org/podIP: 10.42.0.12/32
              cni.projectcalico.org/podIPs: 10.42.0.12/32
Status:       Running
IP:           10.42.0.12
IPs:
  IP:           10.42.0.12
Controlled By:  DaemonSet/rke2-ingress-nginx-controller
Containers:
  rke2-ingress-nginx-controller:
    Container ID:  containerd://f441d1b80847e01f09cff9850e29dd2561107e4bd4e5e612db5a6c13680bd5f5
    Image:         rancher/nginx-ingress-controller:nginx-1.9.3-hardened1
    Image ID:      docker.io/rancher/nginx-ingress-controller@sha256:40b389fcbfc019e1adf2e6aa9b1a75235455a2e78fcec3261f867064afd801cb
    Ports:         80/TCP, 443/TCP, 8443/TCP
    Host Ports:    80/TCP, 443/TCP, 0/TCP
    Args:
      /nginx-ingress-controller
      --election-id=rke2-ingress-nginx-leader
      --controller-class=k8s.io/ingress-nginx
      --ingress-class=nginx
      --configmap=$(POD_NAMESPACE)/rke2-ingress-nginx-controller
      --validating-webhook=:8443
      --validating-webhook-certificate=/usr/local/certificates/cert
      --validating-webhook-key=/usr/local/certificates/key
      --watch-ingress-without-class=true
    State:          Running
      Started:      Fri, 15 Dec 2023 19:59:08 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:      100m
      memory:   90Mi
    Liveness:   http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
    Readiness:  http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAME:       rke2-ingress-nginx-controller-gzwjj (v1:metadata.name)
      POD_NAMESPACE:  kube-system (v1:metadata.namespace)
      LD_PRELOAD:     /usr/local/lib/libmimalloc.so
    Mounts:
      /usr/local/certificates/ from webhook-cert (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8mvk9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  webhook-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  rke2-ingress-nginx-admission
    Optional:    false
  kube-api-access-8mvk9:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type    Reason     Age    From                      Message
  ----    ------     ----   ----                      -------
  Normal  Scheduled  2m43s  default-scheduler         Successfully assigned kube-system/rke2-ingress-nginx-controller-gzwjj to rke2-server-1
  Normal  Pulling    2m43s  kubelet                   Pulling image "rancher/nginx-ingress-controller:nginx-1.9.3-hardened1"
  Normal  Pulled     2m28s  kubelet                   Successfully pulled image "rancher/nginx-ingress-controller:nginx-1.9.3-hardened1" in 14.648s (14.648s including waiting)
  Normal  Created    2m28s  kubelet                   Created container rke2-ingress-nginx-controller
  Normal  Started    2m28s  kubelet                   Started container rke2-ingress-nginx-controller
  Normal  RELOAD     2m26s  nginx-ingress-controller  NGINX reload triggered due to a change in configuration

brandond@dev01:~$ curl -vs http://172.17.0.7
*   Trying 172.17.0.7:80...
* Connected to 172.17.0.7 (172.17.0.7) port 80 (#0)
> GET / HTTP/1.1
> Host: 172.17.0.7
> User-Agent: curl/7.86.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Date: Fri, 15 Dec 2023 20:01:52 GMT
< Content-Type: text/html
< Content-Length: 146
< Connection: keep-alive
<
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
* Connection #0 to host 172.17.0.7 left intact
note that I am making a request to the node on port 80 or 443. You can see that the pod is using these host ports in the describe output:
Copy code
Containers:
  rke2-ingress-nginx-controller:
    Container ID:  containerd://f441d1b80847e01f09cff9850e29dd2561107e4bd4e5e612db5a6c13680bd5f5
    Image:         rancher/nginx-ingress-controller:nginx-1.9.3-hardened1
    Image ID:      docker.io/rancher/nginx-ingress-controller@sha256:40b389fcbfc019e1adf2e6aa9b1a75235455a2e78fcec3261f867064afd801cb
    Ports:         80/TCP, 443/TCP, 8443/TCP
    Host Ports:    80/TCP, 443/TCP, 0/TCP
if you can’t even hit the node on 80 or 443, then I’m not sure what to tell you. Do you have external firewalls in front of your host?
o
Copy code
Containers:
  rke2-ingress-nginx-controller:
    Container ID:  containerd://9c344d6e8847dd78525286e786b6bbdbacd492757fab65449e81089c92bfddef
    Image:         rancher/nginx-ingress-controller:nginx-1.9.3-hardened1
    Image ID:      docker.io/rancher/nginx-ingress-controller@sha256:40b389fcbfc019e1adf2e6aa9b1a75235455a2e78fcec3261f867064afd801cb
    Ports:         80/TCP, 443/TCP, 8443/TCP
    Host Ports:    80/TCP, 443/TCP, 0/TCP
looks like everything is fine / running on the ports it should be
Copy code
[jlong@vm-jlong nginx]$ sudo kubectl get node -o wide
[sudo] password for jlong: 
NAME                       STATUS   ROLES                       AGE   VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                               KERNEL-VERSION                 CONTAINER-RUNTIME
mrtkub.corp.domain.com   Ready    control-plane,etcd,master   16d   v1.26.10+rke2r2   10.1.2.216    <none>        Red Hat Enterprise Linux 8.8 (Ootpa)   4.18.0-477.27.1.el8_8.x86_64   containerd://1.7.7-k3s1
so curling the internal-ip:
Copy code
[jlong@vm-jlong nginx]$ curl -vs http://10.1.2.216
* Rebuilt URL to: http://10.1.2.216/
*   Trying 10.1.2.216...
* TCP_NODELAY set
* connect to 10.1.2.216 port 80 failed: Connection refused
* Failed to connect to 10.1.2.216 port 80: Connection refused
* Closing connection 0
as far as firewalls in front of the host... i'll look into that. it's interesting that if i add the external IP to the service, it connects just fine.. so that makes me think that there isn't anything firewall related blocking it.. but i've seen weirder things
c
can you do that same curl test from the node itself?
o
yes, one sec
Copy code
[jlong@mrtkub charts]$ curl -vs http://10.1.2.216
* Rebuilt URL to: http://10.1.2.216/
*   Trying 10.1.2.216...
* TCP_NODELAY set
* connect to 10.1.2.216 port 80 failed: Connection refused
* Failed to connect to 10.1.2.216 port 80: Connection refused
* Closing connection 0
same result
c
you’re sure there’s no other host endpoint security agent running on that node that’s blocking things?
you’re not just getting a timeout, something is actively rejecting the connection
did you deploy any network policies, perhaps?
kubectl get netpol -A
o
Copy code
[jlong@mrtkub charts]$ kubectl get netpol -A
NAMESPACE             NAME                POD-SELECTOR   AGE
cattle-fleet-system   default-allow-all   <none>         11d
c
hmm. Or maybe just some default-deny rules in your iptables ruleset, managed by a security team?
something else is going on with this host. the RKE2 stuff looks good, you’ll need to figure out what else is installed or configured that’s refusing your connection.
o
so, when i checked for iptables as a service, ufw, and firewalld.. only firewalld returned and its disabled.
`iptables --list` spit out a ton of rules, seemingly cattle and kube based.. but iirc firewalld pulls from iptables, so with it disabled it should be fine i thought.. but i could be wrong
Copy code
Chain KUBE-EXTERNAL-SERVICES (2 references)
target     prot opt source               destination         

Chain KUBE-NODEPORTS (1 references)
target     prot opt source               destination         

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination         
REJECT     tcp  --  anywhere             10.43.76.21          /* cattle-system/cattle-cluster-agent:https-internal has no endpoints */ reject-with icmp-port-unreachable
REJECT     tcp  --  anywhere             10.43.76.21          /* cattle-system/cattle-cluster-agent:http has no endpoints */ reject-with icmp-port-unreachable

Chain KUBE-FORWARD (1 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             ctstate INVALID
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding rules */
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding conntrack rule */ ctstate RELATED,ESTABLISHED

Chain KUBE-PROXY-FIREWALL (3 references)
target     prot opt source               destination         

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT
DROP       all  --  anywhere             anywhere             /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination
^^ kube related
then a ton of cali* rules
Copy code
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
cali-INPUT  all  --  anywhere             anywhere             /* cali:Cz_u1IQiXIMmKD4c */
KUBE-FIREWALL  all  --  anywhere             anywhere            
KUBE-PROXY-FIREWALL  all  --  anywhere             anywhere             ctstate NEW /* kubernetes load balancer firewall */
KUBE-NODEPORTS  all  --  anywhere             anywhere             /* kubernetes health check service ports */
KUBE-EXTERNAL-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
cali-FORWARD  all  --  anywhere             anywhere             /* cali:wUHhoiAYhphO9Mso */
KUBE-PROXY-FIREWALL  all  --  anywhere             anywhere             ctstate NEW /* kubernetes load balancer firewall */
KUBE-FORWARD  all  --  anywhere             anywhere             /* kubernetes forwarding rules */
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
KUBE-EXTERNAL-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes externally-visible service portals */
FLANNEL-FWD  all  --  anywhere             anywhere             /* flanneld forward */
ACCEPT     all  --  anywhere             anywhere             /* cali:S93hcgKJrXEqnTfs */ /* Policy explicitly accepted packet. */ mark match 0x10000/0x10000
MARK       all  --  anywhere             anywhere             /* cali:mp77cMpurHhyjLrM */ MARK or 0x10000

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
cali-OUTPUT  all  --  anywhere             anywhere             /* cali:tVnHkvAo15HuiPy0 */
KUBE-FIREWALL  all  --  anywhere             anywhere            
KUBE-PROXY-FIREWALL  all  --  anywhere             anywhere             ctstate NEW /* kubernetes load balancer firewall */
KUBE-SERVICES  all  --  anywhere             anywhere             ctstate NEW /* kubernetes service portals */
and these are all accepted
i dont see any explicit deny things related to 80/443
c
if you run `iptables-save | grep CNI-HOSTPORT` you should see it there
o
that returned nothing
c
at least with the default CNI
I’m using canal, all of the CNIs name their rules a little differently
o
if i just grep for PORT, i get some things but probably not what you're referring to. i wish i was the original guy on this project
Copy code
[jlong@mrtkub charts]$ sudo iptables-save | grep PORT
:KUBE-NODEPORTS - [0:0]
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/frontend-service:http" -j KUBE-EXT-C33WELXONXR5N5TG
-A KUBE-NODEPORTS -p tcp -m comment --comment "default/frontend-service-testonly" -j KUBE-EXT-SN7IDGDQA3JDNQKQ
:KUBE-NODEPORTS - [0:0]
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
c
I don’t think you mentioned it anywhere, but it sounds like you’re using calico instead of canal for the CNI?
o
i believe that is correct
c
Even with that you should see CNI-DN rules
o
running an `iptables-save | grep CNI` returned nothing
c
Copy code
root@rke2-server-1:/# kubectl get pod -A -o wide
NAMESPACE         NAME                                                   READY   STATUS      RESTARTS   AGE     IP             NODE            NOMINATED NODE   READINESS GATES
calico-system     calico-kube-controllers-854bdcbfb-dc6r6                1/1     Running     0          2m6s    10.42.241.71   rke2-server-1   <none>           <none>
calico-system     calico-node-d9hnj                                      1/1     Running     0          2m6s    172.17.0.7     rke2-server-1   <none>           <none>
calico-system     calico-typha-6b4965c695-qk2f8                          1/1     Running     0          2m6s    172.17.0.7     rke2-server-1   <none>           <none>
kube-system       cloud-controller-manager-rke2-server-1                 1/1     Running     0          2m33s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       etcd-rke2-server-1                                     1/1     Running     0          119s    172.17.0.7     rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-calico-crd-zhlqj                     0/1     Completed   0          2m21s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-calico-whnsn                         0/1     Completed   1          2m21s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-coredns-pmlfn                        0/1     Completed   0          2m21s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-ingress-nginx-wj87s                  0/1     Completed   0          2m21s   10.42.241.70   rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-metrics-server-b7cwb                 0/1     Completed   0          2m21s   10.42.241.66   rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-snapshot-controller-crd-h8cpp        0/1     Completed   0          2m21s   10.42.241.74   rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-snapshot-controller-sw9zr            0/1     Completed   1          2m21s   10.42.241.73   rke2-server-1   <none>           <none>
kube-system       helm-install-rke2-snapshot-validation-webhook-j2m4r    0/1     Completed   0          2m21s   10.42.241.67   rke2-server-1   <none>           <none>
kube-system       kube-apiserver-rke2-server-1                           1/1     Running     0          2m28s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       kube-controller-manager-rke2-server-1                  1/1     Running     0          2m25s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       kube-proxy-rke2-server-1                               1/1     Running     0          2m27s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       kube-scheduler-rke2-server-1                           1/1     Running     0          2m25s   172.17.0.7     rke2-server-1   <none>           <none>
kube-system       rke2-coredns-rke2-coredns-6b795db654-hlkjq             1/1     Running     0          2m15s   10.42.241.65   rke2-server-1   <none>           <none>
kube-system       rke2-coredns-rke2-coredns-autoscaler-945fbd459-49z7w   1/1     Running     0          2m15s   10.42.241.72   rke2-server-1   <none>           <none>
kube-system       rke2-ingress-nginx-controller-pds4q                    1/1     Running     0          90s     10.42.241.77   rke2-server-1   <none>           <none>
kube-system       rke2-metrics-server-544c8c66fc-fn28q                   1/1     Running     0          101s    10.42.241.68   rke2-server-1   <none>           <none>
kube-system       rke2-snapshot-controller-59cc9cd8f4-gs79b              1/1     Running     0          95s     10.42.241.76   rke2-server-1   <none>           <none>
kube-system       rke2-snapshot-validation-webhook-54c5989b65-l2zm7      1/1     Running     0          101s    10.42.241.69   rke2-server-1   <none>           <none>
tigera-operator   tigera-operator-59d6c9b46-smmhc                        1/1     Running     0          2m12s   172.17.0.7     rke2-server-1   <none>           <none>
root@rke2-server-1:/# iptables-save | grep 443
-A CNI-DN-4bd8677afc642d2a5aac4 -s 10.42.241.77/32 -p tcp -m tcp --dport 443 -j CNI-HOSTPORT-SETMARK
-A CNI-DN-4bd8677afc642d2a5aac4 -s 127.0.0.1/32 -p tcp -m tcp --dport 443 -j CNI-HOSTPORT-SETMARK
-A CNI-DN-4bd8677afc642d2a5aac4 -p tcp -m tcp --dport 443 -j DNAT --to-destination 10.42.241.77:443
-A CNI-HOSTPORT-DNAT -p tcp -m comment --comment "dnat name: \"k8s-pod-network\" id: \"870dd0b2f029ad490c0c3ad7f6d6ea45da366ad1b2feb0612df5b29ad1a4f277\"" -m multiport --dports 80,443 -j CNI-DN-4bd8677afc642d2a5aac4
o
Copy code
[jlong@mrtkub charts]$ sudo iptables-save | grep 443
-A KUBE-SVC-46CB5Z6CZH2WFWVP -m comment --comment "kube-system/rke2-ingress-nginx-controller-admission:https-webhook -> 10.42.0.13:8443" -j KUBE-SEP-V76QII4E7SINJISZ
-A KUBE-SEP-V76QII4E7SINJISZ -p tcp -m comment --comment "kube-system/rke2-ingress-nginx-controller-admission:https-webhook" -m tcp -j DNAT --to-destination 10.42.0.13:8443
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https -> 10.1.2.216:6443" -j KUBE-SEP-33WVS54JD4GQ6L6Z
-A KUBE-SEP-33WVS54JD4GQ6L6Z -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 10.1.2.216:6443
-A KUBE-SVC-J3KDMBYV4FOFPYZL -m comment --comment "kube-system/rke2-snapshot-validation-webhook:https -> 10.42.0.11:8443" -j KUBE-SEP-BILYSDUPRM7UV575
-A KUBE-SEP-BILYSDUPRM7UV575 -p tcp -m comment --comment "kube-system/rke2-snapshot-validation-webhook:https" -m tcp -j DNAT --to-destination 10.42.0.11:8443
-A KUBE-SVC-BOJCKSINR77JVVPB -m comment --comment "cattle-system/rancher-webhook:https -> 10.42.0.53:9443" -j KUBE-SEP-7L7UOD6GS66G5KPZ
-A KUBE-SEP-7L7UOD6GS66G5KPZ -p tcp -m comment --comment "cattle-system/rancher-webhook:https" -m tcp -j DNAT --to-destination 10.42.0.53:9443
c
Are all of your CNI pods working properly? Are there other configuration changes you’ve made beyond just switching from the default CNI?
What release of RKE2 are you using?
o
i actually dont see any running 😐
Copy code
[jlong@mrtkub charts]$ kubectl get pods -A
NAMESPACE             NAME                                                    READY   STATUS             RESTARTS         AGE
cattle-fleet-system   fleet-agent-855db48487-dk8gj                            1/1     Running            1 (2d6h ago)     11d
cattle-system         cattle-cluster-agent-f545444ff-ppr8c                    0/1     CrashLoopBackOff   643 (4m6s ago)   11d
cattle-system         rancher-webhook-85b57d6bf8-kzh8n                        1/1     Running            1 (9d ago)       11d
default               api-service-pod                                         1/1     Running            0                15d
default               mysticraven-frontend-pod                                1/1     Running            0                3d23h
default               test-ubuntu                                             1/1     Running            0                16d
gitlab-runners        gitlab-runner-6c6965dc44-86b6t                          1/1     Running            21 (8h ago)      9d
gitlab-runners        gitlab-runner-6c6965dc44-9c796                          1/1     Running            21 (8h ago)      9d
gitlab-runners        gitlab-runner-6c6965dc44-kd65f                          1/1     Running            22 (8h ago)      9d
gitlab-runners        mysticraven-frontend-pod                                1/1     Running            0                138m
kube-system           cloud-controller-manager-mrtkub.corp.icr-team.com      1/1     Running            122 (38m ago)    17d
kube-system           etcd-mrtkub.corp.icr-team.com                           1/1     Running            0                17d
kube-system           helm-install-rke2-canal-m9jcc                           0/1     Completed          0                17d
kube-system           helm-install-rke2-coredns-g8nnn                         0/1     Completed          0                17d
kube-system           helm-install-rke2-ingress-nginx-nnrkb                   0/1     Completed          0                17d
kube-system           helm-install-rke2-metrics-server-jcn6n                  0/1     Completed          0                17d
kube-system           helm-install-rke2-snapshot-controller-crd-f2czs         0/1     Completed          0                17d
kube-system           helm-install-rke2-snapshot-controller-snlws             0/1     Completed          1                17d
kube-system           helm-install-rke2-snapshot-validation-webhook-mvkcc     0/1     Completed          0                17d
kube-system           kube-apiserver-mrtkub.corp.icr-team.com                 1/1     Running            0                17d
kube-system           kube-controller-manager-mrtkub.corp.icr-team.com        1/1     Running            115 (38m ago)    17d
kube-system           kube-proxy-mrtkub.corp.icr-team.com                     1/1     Running            0                17d
kube-system           kube-scheduler-mrtkub.corp.icr-team.com                 1/1     Running            118 (38m ago)    17d
kube-system           rke2-canal-r7vq8                                        2/2     Running            0                17d
kube-system           rke2-coredns-rke2-coredns-565dfc7d75-zltwx              1/1     Running            0                17d
kube-system           rke2-coredns-rke2-coredns-autoscaler-6c48c95bf9-jb276   1/1     Running            0                17d
kube-system           rke2-ingress-nginx-controller-9hq6q                     1/1     Running            0                17d
kube-system           rke2-metrics-server-c9c78bd66-trbf5                     1/1     Running            0                17d
kube-system           rke2-snapshot-controller-6f7bbb497d-wh5l8               1/1     Running            61 (38m ago)     17d
kube-system           rke2-snapshot-validation-webhook-65b5675d5c-hkbwp       1/1     Running            0                17d
local-path-storage    local-path-provisioner-8bc8875b-5php8                   1/1     Running            0                11d
oh i see canal
i blanked that
c
oh ok so you are using canal, not calico
o
rke2 version 1.26.10
c
I would look for errors in the canal pod logs, and in the containerd.log on the node. I can’t really think of any reason why you’d be missing all the hostport rules.
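e.g. something along these lines (the label and container names assume the stock rke2 canal chart):
```
# the canal pod runs a calico-node container and a kube-flannel container
kubectl -n kube-system logs -l k8s-app=canal -c calico-node --tail=200 | grep -iE 'warn|error'
kubectl -n kube-system logs -l k8s-app=canal -c kube-flannel --tail=200 | grep -iE 'warn|error'
```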
o
from the canal pod, I had to grep for WARN since it was a ton of INFO spam
The latest warnings (there are a lot of them, about out-of-sync calico chains):
Copy code
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{""} chainName="FORWARD" expectedRuleIDs=[]string{"wUHhoiAYhphO9Mso", "", "S93hcgKJrXEqnTfs", "mp77cMpurHhyjLrM"} ipVersion=0x4 table="filter"
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-pri-_P_DLYv72G3ond4Ecw8" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-tw-cali9fb1672d406" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-tw-calic6367f3daa3" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-to-wl-dispatch-5" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-from-wl-dispatch-f" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.233 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-to-wl-dispatch" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-fw-cali465b0151bf2" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-pri-ksa.default.default" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-tw-calid3e965123b7" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-pri-_GyLFhtf5u9n-v9Ckd7" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-pro-kns.cattle-system" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-OUTPUT" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-tw-cali59b3cac0933" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-INPUT" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-pri-_XnQ5h_hZf854SLqzqE" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-pro-_kvQu8xaXYEM2wqqPSH" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-to-hep-forward" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-tw-califd8aa92dcb1" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{""} chainName="OUTPUT" expectedRuleIDs=[]string{"tVnHkvAo15HuiPy0", ""} ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-tw-cali18da3c2efe6" ipVersion=0x4 table="filter"
2023-12-15 20:26:48.234 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-from-host-endpoint" ipVersion=0x4 table="filter"
2023-12-15 20:26:52.320 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{"", "", "", "Cz_u1IQiXIMmKD4c", ""} chainName="INPUT" expectedRuleIDs=[]string{"Cz_u1IQiXIMmKD4c", "", "", "", ""} ipVersion=0x4 table="filter"
2023-12-15 20:26:52.321 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{"", "", "", "", "wUHhoiAYhphO9Mso", "", "S93hcgKJrXEqnTfs", "mp77cMpurHhyjLrM"} chainName="FORWARD" expectedRuleIDs=[]string{"wUHhoiAYhphO9Mso", "", "", "", "", "", "S93hcgKJrXEqnTfs", "mp77cMpurHhyjLrM"} ipVersion=0x4 table="filter"
2023-12-15 20:26:52.321 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{"", "", "tVnHkvAo15HuiPy0", ""} chainName="OUTPUT" expectedRuleIDs=[]string{"tVnHkvAo15HuiPy0", "", "", ""} ipVersion=0x4 table="filter"
2023-12-15 20:27:05.558 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{} chainName="PREROUTING" expectedRuleIDs=[]string{"6gwbT8clXdHdC1b1"} ipVersion=0x4 table="raw"
2023-12-15 20:27:05.559 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{} chainName="OUTPUT" expectedRuleIDs=[]string{"tVnHkvAo15HuiPy0"} ipVersion=0x4 table="raw"
2023-12-15 20:27:05.559 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-rpf-skip" ipVersion=0x4 table="raw"
2023-12-15 20:27:05.559 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-from-host-endpoint" ipVersion=0x4 table="raw"
2023-12-15 20:27:05.559 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-to-host-endpoint" ipVersion=0x4 table="raw"
2023-12-15 20:27:05.559 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-PREROUTING" ipVersion=0x4 table="raw"
2023-12-15 20:27:05.559 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-OUTPUT" ipVersion=0x4 table="raw"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-POSTROUTING" ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-OUTPUT" ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-nat-outgoing" ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-fip-dnat" ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{""} chainName="PREROUTING" expectedRuleIDs=[]string{"6gwbT8clXdHdC1b1", ""} ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-PREROUTING" ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{""} chainName="OUTPUT" expectedRuleIDs=[]string{"tVnHkvAo15HuiPy0", ""} ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 660: Detected out-of-sync Calico chain, marking for resync chainName="cali-fip-snat" ipVersion=0x4 table="nat"
2023-12-15 20:27:27.445 [WARNING][53] felix/table.go 654: Detected out-of-sync inserts, marking for resync actualRuleIDs=[]string{"", ""} chainName="POSTROUTING" expectedRuleIDs=[]string{"O3lYWMrLQYEMJtB5", "", ""} ipVersion=0x4 table="nat"
not sure if its from disabling firewalld, but looks like those are names of rules or rulesets
in the kube-flannel logs, this is spammed:
Copy code
2023-12-15T13:25:10.522054508-07:00 stderr F I1215 20:25:10.521878       1 iptables.go:421] Some iptables rules are missing; deleting and recreating rules
2023-12-15T13:25:10.529988791-07:00 stderr F I1215 20:25:10.529843       1 iptables.go:283] bootstrap done
2023-12-15T13:25:11.00278908-07:00 stderr F I1215 20:25:11.002604       1 iptables.go:421] Some iptables rules are missing; deleting and recreating rules
2023-12-15T13:25:11.01859547-07:00 stderr F I1215 20:25:11.018445       1 iptables.go:283] bootstrap done
2023-12-15T13:26:36.306865684-07:00 stderr F I1215 20:26:36.305915       1 iptables.go:421] Some iptables rules are missing; deleting and recreating rules
2023-12-15T13:26:36.367077333-07:00 stderr F I1215 20:26:36.366937       1 iptables.go:283] bootstrap done
2023-12-15T13:26:40.714437277-07:00 stderr F I1215 20:26:40.714228       1 iptables.go:421] Some iptables rules are missing; deleting and recreating rules
2023-12-15T13:26:40.721815679-07:00 stderr F I1215 20:26:40.721525       1 iptables.go:283] bootstrap done
also verified with company security that they don't block anything internally.. so maybe its the fact that a rule is missing? although even with firewalld being disabled, no rules should be in place.. and all traffic should flow. this is making me feel dumber by the minute
i know its not preferred.. but should/could i set a nodePort for the ingress controller so things still come in and hit it?
like set the nodePort to 80 for http, 443 for https.. and use that as a work around? probably not best practice
@creamy-pencil-82913 also another thought.. should there be something (like a service?) running in conjunction with `rke2-ingress-nginx-controller` that allows connections to occur? everything im reading about vanilla nginx-ingress controllers seems to spin up services to allow connections. if it helps, i dont think i ever clarified that im running this in our own infrastructure, not cloud-based or anything
c
no, we run it as a daemonset instead of as a service
did you ever get anything from the containerd log? Something is going on that is preventing the hostport stuff from working right
o
I did not. It seems this system doesn’t have independent logging for it, and the regular messages log only had basic containerd hits in it
Admittedly I wasn’t the one who started up rke2 from the start.. but the guy that did sits next to me, so shouldn’t be too hard to prod him for info
c
there’s a log file at `/var/lib/rancher/rke2/agent/containerd/containerd.log`
o
Okay, I’ll check that and report back tomorrow!
nothing evident in the logs, even when grepping for `warning` and `error`
Copy code
[root@mrtkub admin-jwilliams]# /var/lib/rancher/rke2/bin/crictl inspect -o json 9c344d6e8847d | jq -r '.info.runtimeSpec.linux.namespaces[] | select(.type="network") | .path'
null
/proc/4041102/ns/ipc
/proc/4041102/ns/uts
null
/proc/4041102/ns/net
[root@mrtkub admin-jwilliams]# nsenter --net=/proc/4041102/ns/net ip a s
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0@if174: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
link/ether d2:90:d1:8d:7b:89 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.42.0.13/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::d090:d1ff:fe8d:7b89/64 scope link
valid_lft forever preferred_lft forever
[root@mrtkub admin-jwilliams]# nsenter --net=/proc/4041102/ns/net netstat -anl
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:8181 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:10245 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:10246 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:10247 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:10246 127.0.0.1:51770 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:33096 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:46246 TIME_WAIT
tcp 0 0 10.42.0.13:54354 10.43.0.1:443 ESTABLISHED
tcp 0 0 127.0.0.1:10246 127.0.0.1:33110 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:37620 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:55010 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:41828 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:42954 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:41842 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:55000 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:42964 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:46252 TIME_WAIT
tcp 0 0 127.0.0.1:10246 127.0.0.1:51758 TIME_WAIT
tcp6 0 0 :::80 :::* LISTEN
tcp6 0 0 :::8181 :::* LISTEN
tcp6 0 0 :::8443 :::* LISTEN
tcp6 0 0 :::443 :::* LISTEN
tcp6 0 0 :::10254 :::* LISTEN
tcp6 0 0 10.42.0.13:10254 10.1.2.216:53712 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:56196 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:48602 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:53714 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:59240 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:47644 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:47638 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:59242 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:37596 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:56194 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:37612 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:46736 TIME_WAIT
tcp6 0 0 10.42.0.13:10254 10.1.2.216:48606 TIME_WAIT
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ACC ] STREAM LISTENING 233566202 /tmp/nginx/prometheus-nginx.socket
unix 3 [ ] STREAM CONNECTED 668122688
unix 3 [ ] STREAM CONNECTED 668122674
unix 3 [ ] STREAM CONNECTED 668122680
unix 3 [ ] STREAM CONNECTED 668122686
unix 3 [ ] STREAM CONNECTED 668122689
unix 3 [ ] STREAM CONNECTED 668122684
unix 3 [ ] STREAM CONNECTED 668122673
unix 3 [ ] STREAM CONNECTED 668122677
unix 3 [ ] STREAM CONNECTED 668122685
unix 3 [ ] STREAM CONNECTED 668122678
unix 3 [ ] STREAM CONNECTED 668122687
unix 3 [ ] STREAM CONNECTED 668122679
unix 3 [ ] STREAM CONNECTED 668122676
unix 3 [ ] STREAM CONNECTED 668122675
unix 3 [ ] STREAM CONNECTED 668122690
unix 3 [ ] STREAM CONNECTED 668122682
unix 3 [ ] STREAM CONNECTED 668122681
unix 3 [ ] STREAM CONNECTED 668122683
Active Bluetooth connections (servers and established)
Proto Destination Source State PSM DCID SCID IMTU OMTU Security
Proto Destination Source State Channel
From another worker here: Looking at the ingress container, it looks like it is listening on 443/80, but thats not being mapped to the host? ^log output above
we had to restart the host, so i re-checked the containerd.log, i see this warning a couple of times when grepping for `nginx` ... is there anything in particular i should try to pin point?
Copy code
2023-12-19 07:30:59.452 [WARNING][7753] k8s.go 540: CNI_CONTAINERID does not match WorkloadEndpoint ConainerID, don't delete WEP. ContainerID="5a1853bb865285028d5b8b073457298a304b8ae5c8e33f3bec46539b62103e2f" WorkloadEndpoint=&v3.WorkloadEndpoint{TypeMeta:v1.TypeMeta{Kind:"WorkloadEndpoint", APIVersion:"projectcalico.org/v3"}, ObjectMeta:v1.ObjectMeta{Name:"mrtkub.domain.com-k8s-rke2--ingress--nginx--controller--9hq6q-eth0", GenerateName:"rke2-ingress-nginx-controller-", Namespace:"kube-system", SelfLink:"", UID:"1af4b4b3-574d-448c-a5f3-0d4c7b19b11d", ResourceVersion:"9001880", Generation:0, CreationTimestamp:time.Date(2023, time.November, 28, 13, 12, 37, 0, time.Local), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"app.kubernetes.io/component":"controller", "app.kubernetes.io/instance":"rke2-ingress-nginx", "app.kubernetes.io/managed-by":"Helm", "app.kubernetes.io/name":"rke2-ingress-nginx", "app.kubernetes.io/part-of":"rke2-ingress-nginx", "app.kubernetes.io/version":"1.9.3", "controller-revision-hash":"6858c9cf76", "helm.sh/chart":"rke2-ingress-nginx-4.8.200", "pod-template-generation":"1", "projectcalico.org/namespace":"kube-system", "projectcalico.org/orchestrator":"k8s", "projectcalico.org/serviceaccount":"rke2-ingress-nginx"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:v3.WorkloadEndpointSpec{Orchestrator:"k8s", Workload:"", Node:"mrtkub.corp.icr-team.com", ContainerID:"1033f1abcbe48ec0b6be3fd70aeed9b7bc9db416b9db6c73afd5cf3ccadd12fd", Pod:"rke2-ingress-nginx-controller-9hq6q", Endpoint:"eth0", ServiceAccountName:"rke2-ingress-nginx", IPNetworks:[]string{"10.42.0.248/32"}, IPNATs:[]v3.IPNAT(nil), IPv4Gateway:"", IPv6Gateway:"", Profiles:[]string{"kns.kube-system", "ksa.kube-system.rke2-ingress-nginx"}, InterfaceName:"cali6ef98d8884d", MAC:"", Ports:[]v3.WorkloadEndpointPort{v3.WorkloadEndpointPort{Name:"http", Protocol:numorstring.Protocol{Type:1, NumVal:0x0, StrVal:"TCP"}, Port:0x50, HostPort:0x50, HostIP:""}, v3.WorkloadEndpointPort{Name:"https", Protocol:numorstring.Protocol{Type:1, NumVal:0x0, StrVal:"TCP"}, Port:0x1bb, HostPort:0x1bb, HostIP:""}, v3.WorkloadEndpointPort{Name:"webhook", Protocol:numorstring.Protocol{Type:1, NumVal:0x0, StrVal:"TCP"}, Port:0x20fb, HostPort:0x0, HostIP:""}}, AllowSpoofedSourcePrefixes:[]string(nil)}}
or do I need to enable hostNetwork?
starting up a test box to test a from-scratch install on
alright, also getting connection refused when running a brand new, fresh install of rke2 @creamy-pencil-82913
c
on what Linux distro again? and what specific release of RKE2?
o
rhel 8.7
no cloud provisioners either, this would be considered "bare metal"
(standalone VM)
i installed fresh from the quick install script, so that'd be rke-server-1.26.11~rke2r1
c
Just to be clear, you’re intentionally using 8.7 instead of 8.9?
o
its what's installed on this VM, built by the company's general IT group.. so, it's what I have
i can see what's available, and upgrade to 8.9 if the deviation between 8.7 and 8.9 is enough to change troubleshooting processes
c
I can try to reproduce with a generic rocky 8.7 image, but i still suspect that the root cause is something applied by your IT folks
o
yeah looks like in our current state, i have this test running on rhel 8.7 and rhel 8.8
rhel 8.8 variant is what all of our previous discussions were based on
rhel 8.7 is when i spun up the new rke2 fresh
that was a dumb question i already knew the answer to... just running through cycles of info in my head
c
fresh install of Rocky Linux 8.7 from the minimal ISO. Logged in, updated nothing, installed RKE2
I think you need to have a discussion with your IT security folks
o
for sure.. i can see that now
something's gotta give
appreciate your help
@creamy-pencil-82913 random question.. i spun up an http server on the host using python (`python3 -m http.server 80`) and was able to curl it then.. would that negate the thought of port 80 being blocked?
c
I suspect it’s something with a conflicting iptables rule or something blocking the portmap plugin from working properly.
o
okay, cool.. just figured i'd ask
c
the CNI-HOSTPORT-DNAT and CNI-DN-xxx rules that you’re missing are created by the portmap plugin when the pod has a host port. That seems to be the bit that is getting blocked for some reason. https://www.cni.dev/plugins/current/meta/portmap/
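If you want to rule out the CNI config itself, you can also check that portmap is actually in the plugin chain; the path below assumes the default rke2 layout for canal:
```
# the conflist should include a plugin of type "portmap"; if it's missing, host ports can't work
sudo cat /var/lib/rancher/rke2/agent/etc/cni/net.d/10-canal.conflist | jq '[.plugins[].type]'
```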
o
awesome, on phone with someone who seems semi versed in this, so fingers crossed
apparently during testing, they blew away the rke2-ingress-nginx pod.. is there a way to just re-grab that?
c
that’s managed by the daemonset, it should get recreated automatically
unless they deleted the whole ds?
o
i dont know? it was helm uninstalled
i restarted rke2-server and it didnt repopulate
c
oh, that’s more than just deleting the pod…
o
oh lovely, that doesnt sound good lol
so i basically need to uninstall and reinstall rke2-server then?
or re-apply the rke2-ingress-nginx.yaml?
c
nah, just do this:
Copy code
kubectl apply -f - <<EOF
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: "# placeholder"
EOF
you have to change the chart config to get it to reinstall
that should kick off a new helm job pod to install it
assuming they didn’t disable it entirely
o
yep that did it. thank you
guess what we found out... 😐 they had puppet in place on our servers forcing firewalld to re-activate every 10 seconds
c
hahaha
always something like that
o
of course.. so now they have permanently disabled it for us
and i get the 404
and trying to get my ingress to not 503... but PROGRESS
sorry for all the headaches, i promise im not as dumb as i seem
c
working with IT managed systems is always fun, you never know what they’ve got going on that they forgot to tell you about
o
yeah, and this person i worked with happens to be on the IT team... BUT also has her own rke2 instance running within the infrastructure so she was like "oh yeah its this"
lol