This message was deleted Rancher Users #rke2

# rke2

adamant-kite-43734

06/15/2023, 5:19 AM

This message was deleted.

magnificent-vr-88571

06/15/2023, 5:21 AM

I have observed that rke2-metrics-server is set with hostNetwork as true by default. I tried setting it to false. Although rke2-metrics-server came up, it ended up in a different issue.

magnificent-vr-88571

06/15/2023, 5:22 AM

None of the pod logs are viewable, and I cannot exec into a shell, etc. Everywhere it shows the following or a similar error message.

Error from server: Get "https://<ip addr1>:10250/containerLogs/anamespace/lb987c5bc4-4vlzp/api?follow=true": x509: certificate is valid for 127.0.0.1, not <ip addr1>
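
The error above says that the certificate served on port 10250 lists only 127.0.0.1 as a valid address, so a client that connects using the node IP rejects it. A minimal Python sketch of that comparison, assuming the IP entries have already been read out of the certificate; the helper name ip_covered_by_sans and the sample addresses are hypothetical and only illustrate the check:

```python
import ipaddress
from typing import Sequence


def ip_covered_by_sans(requested_ip: str, san_ips: Sequence[str]) -> bool:
    """Return True if requested_ip equals one of the certificate's IP SANs.

    IP SANs are matched exactly (there are no wildcards for IPs), so the
    addresses are parsed and compared rather than compared as raw strings.
    """
    target = ipaddress.ip_address(requested_ip)
    return any(ipaddress.ip_address(san) == target for san in san_ips)


# Hypothetical values: the certificate lists only the loopback address,
# while the request is made against a node IP.
san_ips = ["127.0.0.1"]
print(ip_covered_by_sans("127.0.0.1", san_ips))  # loopback is listed
print(ip_covered_by_sans("10.0.0.5", san_ips))   # node IP is not listed
```

If the second check fails for a real node, the certificate was most likely issued by something other than the component expected to answer on that address, which fits the later question in this thread about what else is using the port.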

magnificent-vr-88571

06/15/2023, 5:23 AM

I would like to know how to overcome this issue.

magnificent-vr-88571

06/15/2023, 5:26 AM

I have also observed the following behavior and am looking for guidance to resolve it.

>>> k get nodes
E0615 14:24:39.179242   83498 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.188277   83498 memcache.go:287] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.190885   83498 memcache.go:121] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.193944   83498 memcache.go:121] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.195687   83498 memcache.go:121] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME                 STATUS   ROLES                       AGE   VERSION
server-1          Ready    control-plane,etcd,master   35d   v1.24.12+rke2r1
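
The "couldn't get resource list" lines come from kubectl's API discovery: each aggregated API group that cannot be reached logs an error, while the actual request (here, listing nodes) still succeeds. A small Python sketch, assuming the error format shown in this thread, that summarizes which API groups are failing; failing_api_groups is a hypothetical helper written for illustration, not part of kubectl:

```python
import re
from collections import Counter

# Matches the discovery error format seen in the kubectl output in this thread.
_PATTERN = re.compile(r"couldn't get resource list for (\S+): (.+)$")


def failing_api_groups(stderr_text: str) -> Counter:
    """Count discovery errors per API group/version in kubectl error output."""
    counts = Counter()
    for line in stderr_text.splitlines():
        match = _PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
E0615 14:24:39.179242   83498 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.188277   83498 memcache.go:287] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
"""
for group, count in failing_api_groups(sample).items():
    print(group, count)
```

Each group reported this way corresponds to an APIService object, so kubectl get apiservice shows which of them are unavailable and which backing Service they point to.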

creamy-pencil-82913

06/15/2023, 4:45 PM

What else is using that port on your nodes? All the messages you're getting are due to metrics server not being able to run.

magnificent-vr-88571

06/15/2023, 9:16 PM

When I check netstat on the node, it says kube-api-server is on port 10250.

creamy-pencil-82913

06/15/2023, 9:21 PM

did you customize the metrics server deployment or something?

creamy-pencil-82913

06/15/2023, 9:22 PM

by default it doesn’t run with host network. https://github.com/rancher/rke2-charts/blob/main/charts/rke2-metrics-server/rke2-metrics-server/2.11.100-build2023051508/values.yaml#L21-L27

magnificent-vr-88571

06/15/2023, 9:23 PM

No… I didn’t… I did upgrade the environment from v1.22.x to v1.23.x and then to the current environment.

creamy-pencil-82913

06/15/2023, 9:24 PM

was there a problem with the upgrade? can you confirm that the metrics-server deployment was successfully upgraded to the latest release?

magnificent-vr-88571

06/15/2023, 9:25 PM

yes… I came across https://github.com/rancher/rke2-charts/blob/main/charts/rke2-metrics-server/rke2-metrics-server/2.11.100-build2023051508/values.yaml#L21-L27, set host network to false, and ended up with the certificate error :(

creamy-pencil-82913

06/15/2023, 9:25 PM

It’s false by default, you shouldn’t have to change it

creamy-pencil-82913

06/15/2023, 9:27 PM

what do you get from

kubectl get helmchart -n kube-system rke2-metrics-server -o yaml |grep chart-url

and

kubectl get pod -n kube-system -l helmcharts.helm.cattle.io/chart=rke2-metrics-server

creamy-pencil-82913

06/15/2023, 9:27 PM

Did you by some chance deploy a different metrics-server to the cluster?

magnificent-vr-88571

06/15/2023, 9:32 PM

helm.cattle.io/chart-url: https://rke2-charts.rancher.io/assets/rke2-metrics-server/rke2-metrics-server-2.11.100-build2022101107.tgz

creamy-pencil-82913

06/15/2023, 9:33 PM

that’s the current one, and it has it set to false. What about the other command (get pods) and also

kubectl get helmchartconfig -A

magnificent-vr-88571

06/15/2023, 9:33 PM

kubectl get pod -n kube-system -l helmcharts.helm.cattle.io/chart=rke2-metrics-server

this returns empty

magnificent-vr-88571

06/15/2023, 9:33 PM

>> kubectl get pod -n kube-system | grep metric
rke2-metrics-server-8787c959c-z862k                     0/1     CrashLoopBackOff   196 (2m23s ago)   16h

creamy-pencil-82913

06/15/2023, 9:34 PM

no pods for the helm job? What about

kubectl get job -n kube-system -l helmcharts.helm.cattle.io/chart=rke2-metrics-server

magnificent-vr-88571

06/15/2023, 9:35 PM

NAME                               COMPLETIONS   DURATION   AGE
helm-install-rke2-metrics-server   1/1           15s        57d

creamy-pencil-82913

06/15/2023, 9:35 PM

how long ago did you upgrade the cluster? That indicates that the chart deployment was last updated 57 days ago.

creamy-pencil-82913

06/15/2023, 9:36 PM

can you also get the helmchartconfig list?

magnificent-vr-88571

06/15/2023, 9:37 PM

yeah, I did this upgrade quite a while ago

magnificent-vr-88571

06/15/2023, 9:41 PM

helmVersion: 3
  info:
    description: Upgrade complete
    firstDeployed: '2022-09-23T23:32:37Z'
    lastDeployed: '2023-05-17T03:30:10Z'

creamy-pencil-82913

06/15/2023, 9:44 PM

hmm that lines up

creamy-pencil-82913

06/15/2023, 9:45 PM

still not seeing

kubectl get helmchartconfig -A

output to confirm that you didn’t set it to host network true at some point

magnificent-vr-88571

06/15/2023, 9:46 PM

it returns

No resources found

creamy-pencil-82913

06/15/2023, 9:47 PM

ok. So I don’t see why the pod would be running with host network

creamy-pencil-82913

06/15/2023, 9:47 PM

what do you get from

kubectl get deployment -n kube-system rke2-metrics-server -o yaml

magnificent-vr-88571

06/30/2023, 10:11 PM

figured out the issue with

>>> k get nodes
E0615 14:24:39.179242   83498 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.188277   83498 memcache.go:287] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.190885   83498 memcache.go:121] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.193944   83498 memcache.go:121] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0615 14:24:39.195687   83498 memcache.go:121] couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

I had upgraded RKE2 from 1.22 to 1.24. After that I tried to upgrade the Prometheus stack too, but it failed, and somehow 3 selectors got inserted into a few of the Prometheus services, which prevented those services from attaching to the appropriate pods. Fixing the service selectors resolved the issue.
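
The resolution above comes down to how Service selectors work: a pod backs a Service only when its labels satisfy every key/value pair in the selector, so any extra or stale selector entry leaves the Service without endpoints. A minimal Python sketch of that matching rule; the label names and values are hypothetical and not taken from this cluster:

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Return True if a pod with these labels is selected by the Service.

    Every selector key must be present on the pod with the same value.
    A Service with no selector does not select any pods automatically.
    """
    if not selector:
        return False
    return all(labels.get(key) == value for key, value in selector.items())


def unmatched_keys(selector: Dict[str, str], labels: Dict[str, str]) -> List[str]:
    """List the selector keys that the pod's labels do not satisfy."""
    return sorted(k for k, v in selector.items() if labels.get(k) != v)


# Hypothetical labels and selectors, for illustration only.
pod_labels = {"app.kubernetes.io/name": "prometheus-adapter", "release": "monitoring"}
good_selector = {"app.kubernetes.io/name": "prometheus-adapter"}
bad_selector = {**good_selector, "example.com/stale": "old-value"}

print(selector_matches(good_selector, pod_labels))
print(selector_matches(bad_selector, pod_labels))
print(unmatched_keys(bad_selector, pod_labels))
```

On a live cluster the same comparison can be made by hand by putting the selector from kubectl get service -o yaml next to the output of kubectl get pods --show-labels.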
