Has anyone here set their rke2 config.yaml to have...
# rke2
c
Has anyone here set their rke2 config.yaml to have "ingress: traefik" and then successfully exposed the traefik dashboard? I'm missing something when trying to do that.
c
… what have you tried?
c
Well I guess to start with I'm a little confused as to how traefik is configured by default when setting it as the default ingress. Do you know if it has the typical "web" and "websecure" entrypoints?
I think I need to enable it first now that I am looking at rke2-traefik charts on rke2-charts.
It is the same as the upstream chart, with a couple different default values to make it work better out-of-the-box on rke2.
c
ok, I'll keep digging some more and follow up if I get stuck but thank you for responding 🙂
If you could sanity check me and tell me if I'm on the right path, I would need to create a yaml file with content like so:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ingressRoute:
      dashboard:
        enabled: true
And then can I kubectl apply that or do I have to place it in /var/lib/rancher/rke2/server/manifests/rke2-traefik-config.yaml and restart the whole rke2-service on each node?
c
you can do either. Neither one requires a restart.
Note that you also need to expose the traefik entrypoint, which is not exposed by default as it does not have any auth on it… https://github.com/traefik/traefik-helm-chart/blob/master/traefik/values.yaml#L202-L205
You’re not really supposed to expose the Traefik dashboard via Traefik itself.
c
yep, just wanted to make sure I was heading down the right path to configuring
right now I just want the dashboard to be able to see if my ingress routes are getting picked up correctly and mapping to services
I will need to use this helmchartconfig stuff to set the default ssl cert for traefik to serve as well so this is helpful all around
c
that’s not a great idea either. Ideally you would configure TLS in the ingress resource, instead of setting it up in Traefik itself via the default cert
use cert-manager, or traefik’s built-in letsencrypt support, or a TLS secret.
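To illustrate the TLS-secret option mentioned above: an Ingress can reference a Secret of type `kubernetes.io/tls` in its own `tls` section, so the certificate is configured per Ingress rather than as a Traefik-wide default. This is only a sketch under assumptions. The names `example-tls`, `example-ingress`, `example-service`, the host `app.example.com`, and the `default` namespace are placeholders that do not come from this thread, and the certificate data has to be supplied separately.

```yaml
# Hypothetical TLS Secret; replace the placeholders with base64-encoded PEM data.
apiVersion: v1
kind: Secret
metadata:
  name: example-tls
  namespace: default
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>
---
# Hypothetical Ingress that terminates TLS for one host using that Secret.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  namespace: default
spec:
  tls:
    - hosts:
        - app.example.com
      secretName: example-tls   # must live in the same namespace as the Ingress
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service
                port:
                  number: 80
```

The Secret and the Ingress need to be in the same namespace, and the host listed under `tls` should match the host in `rules`.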
c
ok I will look at doing it that way. I'm in a completely air-gapped environment so have to deal with all self signed and no let's encrypt support
I am still stuck (ignoring the traefik dashboard for now). I have pgadmin running in a statefulset with a service exposing the pod's port 80. I have an ingress defined that should be pointing my dns request to the service but all I get is 404 not found. If I
k port-forward service/pgadmin-service 5000:80 -n test
I can curl localhost:5000 and get the page to render. However, if on one of the nodes I try to curl the clusterip assigned to the service, I get nothing (Failed to connect, couldn't connect to host). I'm not sure what to look at to troubleshoot at this point.
c
do you have ufw/firewalld enabled on the nodes?
but it sounds like you have multiple issues
404 not found most likely indicates your ingress isn’t configured properly
the failed to connect suggests a problem with firewalls or something else that is blocking connections to the clusterip service
if you describe the service, does it have endpoints?
c
no firewall installed
yes the service has an endpoint
I "think" the port-forward to the service wouldn't work if it didn't have endpoints
what's annoying about this is that I've had everything working on other network(s) where I just left the default nginx-ingress be there and I've installed traefik separately, gave it its own ports to listen on and its own ingressClass. I figured it was dumb to do it that way when rke2 supports having traefik be the default ingress out of the box (though it's not documented as far as I can tell)
I really appreciate your help, btw. If this goes further than you intended to get involved I can post back in the main channel. Did not mean to use up so much of your time.
c
if you can’t hit the service’s ClusterIP from a node within the cluster that suggests something else is wrong.
c
I don't disagree, just don't know what to look for that could be wrong
nor do I understand why a kubectl port-forward would work from a non-node to that service just fine
c
port-forward takes a different path than accessing the clusterip directly.
you are SURE that ufw and firewalld and any other security agents are not active on this node?
if you have endpoints for the service, and the node is a member of the cluster, you should be able to reach that ClusterIP. Unless you are using network policy and have rules that would block connectivity from nodes?
c
I did not create any network policies. I have no firewalls installed. This is pretty much a set of 3 brand new debian 12 vm's with rke2-server setup using the airgap tar files with the install.sh script method. I have deployed the nfs-subdir-provisioner, postgres, and pgadmin4 (the thing I'm trying to access)
my config.yaml was pre-populated on all 3 nodes to define the node name, the "ingress: traefik", a pre-shared token, and a few tls sans
I also had a root ca cert that was placed in the /var/lib/rancher/rke2/server/tls as root-ca.pem and root-ca.key as documented so that certs were generated based off of that
I don't think there's anything else. I did rke2-uninstall and re-install on here a few times because I had something wrong but as long as rke2-uninstall works (and it always appeared to) then that SHOULDN'T be an issue
k get all -n kube-system shows everything running or completed
oh, postgres and pgadmin are in a namespace called "test"
that's the only other thing I can think of
going to try deploying just a basic nginx image as a deployment + service to see if I can hit its test page
same symptoms 😞
c
can you access it from the node where the pod is running?
c
will try
I did just try doing the same nginx deployment/service on a different network and cluster I have and everything worked as expected
c
If you’ve been through a couple different iterations of configurations and installs/uninstalls on these nodes, maybe just give em all a reboot?
c
ok you're onto something.. I CAN access it via curl on the node the pod is running on (via service's cluster ip)
I can reboot them, yes... though I think I've done that since a couple times but willing to try again. I'll do the full cordon/uncordon dance and reboot them
ok, progress.. post reboot, curl from all 3 nodes to the service ip works for pgadmin and the nginx test
ingress still does not work though
c
you sure the ingress config is correct?
like I said, 404 suggests that it’s not matching the requested hostname or path
c
checking now.. at least this time (wasn't before) I am seeing 404 errors in the traefik logs as in it's seeing the requests
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pgadmin-ingress
  namespace: test
spec:
  rules:
    - host: "pgadmin.kubernetes.mydomain.com"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pgadmin-service
                port:
                  number: 80
that matches what I have on the other network minus the ingressClassName which was required because I have the default nginx ingress installed on that cluster but traefik also deployed
my helmchartconfig looks like:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ingressRoute:
      dashboard:
        enabled: true
    logs:
      access:
        enabled: true
      general:
        level: DEBUG
and I know that's successfully applied because I'm getting the debug logs that I wasn't getting before
c
and what do you get when you
curl -v http://pgadmin.kubernetes.mydomain.com
You pointed pgadmin.kubernetes.mydomain.com at one of the nodes, correct? so that it’s hitting the ingress controller on the daemonset nodeport?
c
I have a reverse proxy in front of everything but took that out of the loop by adding an entry to /etc/hosts that points that domain to one of the nodes, yes (traefik log is showing the request as happening). The curl -v says Trying (node ip address here):80, Connected to pgadmin.kubernetes.mydomain.com (ip address) port 80 and then some generic http stuff and 404 Not Found, connection left intact
the Host header is set
which is the important part
I can also try curling the node ip directly and setting the host header
same result, tried all 3 nodes' ip addresses
I know it's going to end up coming down to something stupid but I keep checking over and over for any spelling mistakes or anything like that
c
the service and the ingress resource are in the same namespace, correct? and the service uses port 80?
c
yes
I did just find something interesting...
k describe ingress pgadmin-ingress -n test
shows what would mostly be normal except: "Default backend: default-http-backend:80 (error: endpoints "default-http-backend" not found)"
c
That isn’t something traefik would do
it sounds like the other ingress you deployed is also trying to handle that ingress
c
I don't have any other ingress deployed
is there a resource type I can check for?
c
sorry I thought you said you also have ingress-nginx on here?
c
sorry for the confusion.. that's on the other cluster that is working
this was one where I was trying not to have both since rke2 can have traefik out of box as default
c
what do you get from
kubectl get ingress -o yaml -A --show-managed-fields
c
I may have found the problem though that might go back to early days of my troubleshooting (been struggling with this for days before reaching out) , does traefik out of the box with rke2 default install create an ingressClass called traefik?
if the answer is "no", then I might have deployed an ingressClass resource a couple days ago thinking I was setting it up like the other server where I had to
and that might be the problem
and isDefaultClass is set to true if traefik is the default ingress controller for your cluster
c
ok
let me run your other command, one sec
doesn't like the --show-managed-fields argument
c
how old of a kubectl version are you on!?
kubectl version
c
v1.19.3 (keep in mind that I'm on an airgapped network so keeping things fully up to date is always a challenge)
but it is showing managed fields in the output
without that argument
I can upgrade kubectl though if needed, will just take me a few minutes... gotta see if I may already have that repo mirrored. On a separate note, what are you looking for in that output, it shows the one ingress for pgadmin
c
Oh man that’s ancient. Don’t do that. Why don’t you just use the kubectl binary that comes with RKE2? It’ll be the same version as whatever RKE2 release you’re using.
c
yep, was just going to grab that
c
export PATH=/var/lib/rancher/rke2/bin:$PATH
c
btw, working offline sucks... the number of issues I have to deal with constantly regarding self-signed certs and not being connected to the internet with things reaching out is crazy 😕
c
but anyways, if you have the managed fields info in the output it should show you what’s been touching that ingress. traefik doesn’t use a default-http-backend service as far as I know, so that status suggests that ingress-nginx took ownership of that ingress resource at some point.
c
ok, upgraded kubectl, going to run the command again with that argument and see what that shows
c
can you share the output here?
c
give me a minute to change a bit of the text (domain name mostly). It's showing the last update was from kubectl-clientside-apply
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"Ingress","metadata":{"annotations":{},"name":"pgadmin-ingress","namespace":"test"},"spec":{"rules":[{"host":"pgadmin.kubernetes.mydomain.com","http":{"paths":[{"backend":{"service":{"name":"pgadmin-service","port":{"number":80}}},"path":"/","pathType":"Prefix"}]}}]}}
    creationTimestamp: "2025-05-15T19:13:21Z"
    generation: 3
    managedFields:
    - apiVersion: networking.k8s.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:kubectl.kubernetes.io/last-applied-configuration: {}
        f:spec:
          f:rules: {}
      manager: kubectl-client-side-apply
      operation: Update
      time: "2025-05-19T17:56:23Z"
    name: pgadmin-ingress
    namespace: test
    resourceVersion: "1422005"
    uid: dcd9fa98-b10a-4bd6-a22b-3d723d46b1c4
  spec:
    rules:
    - host: pgadmin.kubernetes.mydomain.com
      http:
        paths:
        - backend:
            service:
              name: pgadmin-service
              port:
                number: 80
          path: /
          pathType: Prefix
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
c
Where are you seeing that default backend error? In an event perhaps? If so from what? Can you do
kubectl get event -n test -o yaml ?
c
it shows up when I do the describe, not the get
it was showing when I did the describe of the ingress but it is not showing now so maybe that was a remnant of old version of kubectl
I'll run the event command now
nothing returned with the get event
c
no, describe also shows events related to the resource
c
right but it was showing up under "Default backend: " field
and now it just says "<default>"
c
huh
it really looks like traefik isn’t picking it up
have you tried just setting ingressclassname on the resource?
c
traefik logs show it though but it's like it's not routing it to the service
no but I will do that, set it to "traefik" correct?
c
traefik will get the request because you’ve pointed the DNS record at your node. Doesn’t mean traefik knows what to do with the request.
yes, just traefik
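For reference, the suggestion amounts to adding a single field to the Ingress manifest shared earlier in the thread. A sketch, assuming the IngressClass created by the rke2-traefik chart is named `traefik` as stated above:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pgadmin-ingress
  namespace: test
spec:
  # Explicitly select the Traefik IngressClass instead of relying on defaulting.
  ingressClassName: traefik
  rules:
    - host: "pgadmin.kubernetes.mydomain.com"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pgadmin-service
                port:
                  number: 80
```

Everything except the `ingressClassName` line is unchanged from the original manifest.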
c
Okkk.... well that "worked". I saw in the traefik logs that it saw and added the route... it added it to websecure instead of web entrypoint but that's fine and probably an annotation or something else I gotta do for that but regardless, https://pgadmin.kubernetes.mydomain.com now loads!
so... this makes me think that the default class isn't set properly?
the ingress class appears to be annotated correctly to be the default
c
it seems like that is just how traefik works. If you tell it what its ingressclass is, it only watches that class. https://github.com/rancher/rke2-charts/blob/main/charts/rke2-traefik/rke2-traefik/34.2.002/values.yaml#L286-L287
c
ah.. and that's the default.. so I didn't screw it up I just didn't realize it was set and required
that makes me feel better
aside from it costing me many hours/days of troubleshooting and not finding that
c
we should be setting the annotation that makes the ingressclass the default though… which means that Kubernetes should set the ingressClassName if it’s not set in the YAML, when you create it https://github.com/rancher/rke2-charts/blob/main/charts/rke2-traefik/rke2-traefik/34.2.002/templates/ingressclass.yaml#L6
Do you see that annotation set on the IngressClass?
c
I do see it BUT, I think maybe the problem is that it's in "true" in quotes
instead of the identifier true
c
annotation values can only be strings, so that’s correct
c
ok
but, just to counter that for a second.. none of the other annotations show the "
only that one
c
that is how yaml works, yes
c
so that one only shows up in " because it would be a keyword otherwise?
c
things will only be quoted if they need to be quoted to distinguish them from a bool/int/float
c
got it
c
then I'm not sure why it didn't work
c
Setting the ingressclass.kubernetes.io/is-default-class annotation to true on an IngressClass resource will ensure that new Ingresses without an ingressClassName specified will be assigned this default IngressClass
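As an illustration of what such a default IngressClass might look like, here is a minimal sketch. The `controller` value is the one upstream Traefik uses for its Kubernetes Ingress provider; it is an assumption that the rke2-traefik chart renders the same value, so check the actual object with `kubectl get ingressclass traefik -o yaml`.

```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: traefik
  annotations:
    # Annotation values are always strings, so "true" is quoted to keep YAML
    # from parsing it as a boolean.
    ingressclass.kubernetes.io/is-default-class: "true"
spec:
  # Assumed value: the controller name used by upstream Traefik.
  controller: traefik.io/ingress-controller
```

With this in place, the API server fills in `ingressClassName: traefik` on newly created Ingresses that leave the field unset.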
c
ok, another theory then
since I moved this yaml over from the other cluster where I had to use a class name, I might have first deployed it with a class
c
Did you perhaps accidentally remove that field by re-applying the ingress manifest?
c
then deleted it or updated it without a class name
c
ah that’d do it
c
when I realized I didn't need it
c
the defaulting only applies if you create it with the field unset
c
yep, tracking
well sir, I've learned a lot today and you have been SUPER helpful and I very much appreciate it! If there's a manager or someone I can reach out to and let them know how helpful you've been if that would help you in some way?
c
thanks, glad we got to the bottom of it! just part of being maintainer of an OSS project ;)