stale-painting-80203
05/17/2022, 2:33 PM
msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": read tcp 127.0.0.1:48636->127.0.0.1:6444: read: connection reset by peer"
curl gives the following error:
curl -v https://127.0.0.1:6444/cacerts
* Trying 127.0.0.1:6444...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 6444 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:6444
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:6444
Anyone know what this might be caused by?
curl -k https://127.0.0.1:6443 gives a response.
rke2-server.service - Rancher Kubernetes Engine v2 (server)
Loaded: loaded (/etc/systemd/system/rke2-server.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2022-05-16 20:46:15 PDT; 4min 55s ago
Docs: <https://github.com/rancher/rke2#readme>
Process: 48957 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited>
Process: 48960 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 48961 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 48962 (rke2)
Tasks: 205
CGroup: /system.slice/rke2-server.service
├─ 2652 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 2673 /pause
├─ 2707 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 2711 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 2737 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 2758 /pause
├─ 2778 /pause
├─ 2813 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 2836 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 2841 /pause
├─ 2866 /pause
├─ 2882 kube-proxy --cluster-cidr=10.42.0.0/16 --conntrack-max-per-core=0 --conntrack-tcp-timeout-close-wa>
├─ 3069 etcd --config-file=/var/lib/rancher/rke2/server/db/etcd/config
├─ 3761 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 3782 /pause
├─ 3807 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 3835 /pause
├─ 3863 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 3890 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 3920 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─ 3940 /pause
├─ 3941 /pause
├─ 4073 /coredns -conf /etc/coredns/Corefile
├─38373 kube-apiserver --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --allow-privileged>
├─42523 /cluster-proportional-autoscaler --namespace=kube-system --configmap=rke2-coredns-rke2-coredns-aut>
├─43471 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─43493 /pause
├─43738 /opt/bin/flanneld --ip-masq --kube-subnet-mgr
├─45408 /usr/sbin/runsvdir -P /etc/service/enabled
├─45734 runsv felix
├─45735 runsv monitor-addresses
├─45736 runsv allocate-tunnel-addrs
├─45737 runsv node-status-reporter
├─45738 runsv cni
├─45739 calico-node -monitor-addresses
├─45740 calico-node -felix
├─45741 calico-node -allocate-tunnel-addrs
├─45742 calico-node -status-reporter
├─45743 calico-node -monitor-token
├─46226 /var/lib/rancher/rke2/data/v1.22.9-rke2r2-88ecb1441384/bin/containerd-shim-runc-v2 -namespace k8s.>
├─46261 /pause
├─48962 /opt/rke2/bin/rke2 server
├─48995 containerd -c /var/lib/rancher/rke2/agent/etc/containerd/config.toml -a /run/k3s/containerd/contai>
├─49051 kubelet --volume-plugin-dir=/var/lib/kubelet/volumeplugins --file-check-frequency=5s --sync-freque>
└─49452 kube-scheduler --permit-port-sharing=true --authentication-kubeconfig=/var/lib/rancher/rke2/server>
May 16 20:50:57 suse-vm rke2[48962]: E0516 20:50:57.578889 48962 memcache.go:101] couldn't get resource list for metr>
May 16 20:51:05 suse-vm rke2[48962]: I0516 20:51:04.992233 48962 trace.go:205] Trace[411526421]: "Reflector ListAndWa>
May 16 20:51:05 suse-vm rke2[48962]: Trace[411526421]: ---"Objects listed" 61358ms (20:51:03.401)
May 16 20:51:05 suse-vm rke2[48962]: Trace[411526421]: [1m2.51858886s] [1m2.51858886s] END
May 16 20:51:08 suse-vm rke2[48962]: time="2022-05-16T20:51:08-07:00" level=info msg="Event(v1.ObjectReference{Kind:\"A>
May 16 20:51:09 suse-vm rke2[48962]: E0516 20:51:09.360628 48962 memcache.go:196] couldn't get resource list for metr>
May 16 20:51:09 suse-vm rke2[48962]: E0516 20:51:09.931690 48962 memcache.go:101] couldn't get resource list for metr>
May 16 20:51:10 suse-vm rke2[48962]: time="2022-05-16T20:51:10-07:00" level=info msg="Starting /v1, Kind=Secret control>
May 16 20:51:10 suse-vm rke2[48962]: time="2022-05-16T20:51:10-07:00" level=info msg="Starting /v1, Kind=Node controlle>
May 16 20:51:10 suse-vm rke2[48962]: I0516 20:51:10.690858 48962 leaderelection.go:248] attempting to acquire leader
straight-fireman-71417
05/17/2022, 5:49 PM
magnificent-vr-88571
05/18/2022, 10:08 AM
rke2 --version
rke2 version v1.21.5+rke2r2 (9e4acdc6018ae74c36523c99af25ab861f3884da)
go version go1.16.6b7
I have created registries.yaml with AWS ECR as a private registry:
mirrors:
  "2341234234.dkr.ecr.yyyyy.amazonaws.com":
    endpoint:
      - "https://2341234234.dkr.ecr.yyyyy.amazonaws.com"
configs:
  "2341234234.dkr.ecr.yyyyy.amazonaws.com":
    auth:
      username: xxxxx
      password: xxxxx
    tls:
      insecure_skip_verify: true
When I create a pod with the following image name in pod.yaml, it fails with an image pull error.
image: 2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1
Looking forward to some insights.
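For reference, a hedged sketch of how an ECR entry in /etc/rancher/rke2/registries.yaml is usually laid out; the account ID and region are placeholders taken from the snippet above, and the AWS username with a token from aws ecr get-login-password is an assumption about how the credentials are obtained, since ECR does not accept arbitrary static passwords:
# /etc/rancher/rke2/registries.yaml -- illustrative sketch only
mirrors:
  "2341234234.dkr.ecr.yyyyy.amazonaws.com":
    endpoint:
      - "https://2341234234.dkr.ecr.yyyyy.amazonaws.com"
configs:
  "2341234234.dkr.ecr.yyyyy.amazonaws.com":
    auth:
      username: AWS                 # ECR's user is the literal string "AWS"
      password: <output of: aws ecr get-login-password --region yyyyy>
    # ECR presents a publicly trusted certificate, so insecure_skip_verify
    # should not normally be needed here.
The image reference in pod.yaml is then just the bare host plus repository and tag, e.g. 2341234234.dkr.ecr.yyyyy.amazonaws.com/image-name:v1, with no scheme.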
victorious-analyst-3332
05/18/2022, 2:03 PM
kube-controller-manager metrics on port 10252 are not exposed in RKE2 deployments. We are currently testing v1.22.9+rke2r2 deployed via Rancher v2.6.5 and are seeing the following differences from RKE deployments. Thanks a lot.
RKE cluster:
# netstat -tulpn | grep kube-contro
tcp6 0 0 :::10252 :::* LISTEN 17396/kube-controll
tcp6 0 0 :::10257 :::* LISTEN 17396/kube-controll
RKE2 cluster:
# netstat -tulpn | grep kube-contro
tcp 0 0 127.0.0.1:10257 0.0.0.0:* LISTEN 39961/kube-controll
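For context, RKE2 passes extra controller-manager flags through /etc/rancher/rke2/config.yaml rather than a static manifest; a hedged sketch of what overriding the bind address might look like (treat this as an illustration of the mechanism, not a recommendation — the secure port 10257 serves /metrics over TLS with authentication, unlike the old insecure 10252 endpoint):
# /etc/rancher/rke2/config.yaml -- sketch only
kube-controller-manager-arg:
  - "bind-address=0.0.0.0"   # secure port 10257 would then listen on all interfaces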
gifted-cricket-25537
05/19/2022, 10:54 AM
cni: "calico"
disable-kube-proxy: true
and since there's no kube-proxy installed, the tigera-operator can't reach the API:
2022/05/19 07:50:02 [ERROR] Get "https://10.43.0.1:443/api?timeout=32s": dial tcp 10.43.0.1:443: i/o timeout
I know that I could override it using the kubernetes-services-endpoint ConfigMap, although I can't inject that during the installation. Any clues?
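For reference, a sketch of the kubernetes-services-endpoint ConfigMap the Tigera operator reads; the host value is a placeholder for an address the API server is actually reachable on without kube-proxy, and the tigera-operator namespace is an assumption based on an operator-style install:
# kubernetes-services-endpoint ConfigMap -- sketch, values are placeholders
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubernetes-services-endpoint
  namespace: tigera-operator
data:
  KUBERNETES_SERVICE_HOST: "<reachable-apiserver-address>"
  KUBERNETES_SERVICE_PORT: "6443"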
faint-airport-83518
05/19/2022, 11:17 PM
curved-caravan-26314
05/20/2022, 12:22 AM
alert-oxygen-95787
05/20/2022, 10:40 AM
curved-caravan-26314
05/20/2022, 12:05 PM
shy-zebra-53074
05/20/2022, 10:39 PM
Rule Title: The Kubernetes Controller Manager must use TLS 1.2, at a minimum, to protect the confidentiality of sensitive data during electronic dissemination.
Discussion: The Kubernetes Controller Manager will prohibit the use of SSL and unauthorized versions of TLS protocols to properly secure communication.
The use of unsupported protocol exposes vulnerabilities to the Kubernetes by rogue traffic interceptions, man-in-the-middle attacks, and impersonation of users or services from the container platform runtime, registry, and key store. To enable the minimum version of TLS to be used by the Kubernetes Controller Manager, the setting "tls-min-version" must be set.
Check Text: Change to the /etc/kubernetes/manifests/ directory on the Kubernetes Master Node. Run the command:
grep -i tls-min-version *
If the setting "tls-min-version" is not configured in the Kubernetes Controller Manager manifest file or it is set to "VersionTLS10" or "VersionTLS11", this is a finding.
Fix Text: Edit the Kubernetes Controller Manager manifest file in the /etc/kubernetes/manifests directory on the Kubernetes Master Node. Set the value of "--tls-min-version" to "VersionTLS12" or higher.
However, I don't see the tls-min-version flag set for the kube-controller-manager service:
kube-controller-manager --flex-volume-plugin-dir=/var/lib/kubelet/volumeplugins --terminated-pod-gc-threshold=1000 --permit-port-sharing=true --cloud-provider=aws --cloud-config= --allocate-node-cidrs=true --authentication-kubeconfig=/var/lib/rancher/rke2/server/cred/controller.kubeconfig --authorization-kubeconfig=/var/lib/rancher/rke2/server/cred/controller.kubeconfig --bind-address=127.0.0.1 --cluster-cidr=10.0.0.0/12 --cluster-signing-kube-apiserver-client-cert-file=/var/lib/rancher/rke2/server/tls/client-ca.crt --cluster-signing-kube-apiserver-client-key-file=/var/lib/rancher/rke2/server/tls/client-ca.key --cluster-signing-kubelet-client-cert-file=/var/lib/rancher/rke2/server/tls/client-ca.crt --cluster-signing-kubelet-client-key-file=/var/lib/rancher/rke2/server/tls/client-ca.key --cluster-signing-kubelet-serving-cert-file=/var/lib/rancher/rke2/server/tls/server-ca.crt --cluster-signing-kubelet-serving-key-file=/var/lib/rancher/rke2/server/tls/server-ca.key --cluster-signing-legacy-unknown-cert-file=/var/lib/rancher/rke2/server/tls/client-ca.crt --cluster-signing-legacy-unknown-key-file=/var/lib/rancher/rke2/server/tls/client-ca.key --feature-gates=JobTrackingWithFinalizers=true --kubeconfig=/var/lib/rancher/rke2/server/cred/controller.kubeconfig --profiling=false --root-ca-file=/var/lib/rancher/rke2/server/tls/server-ca.crt --secure-port=10257 --service-account-private-key-file=/var/lib/rancher/rke2/server/tls/service.key --use-service-account-credentials=true
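If the flag did need to be set explicitly, RKE2 takes extra controller-manager arguments from config.yaml rather than from a manifest under /etc/kubernetes/manifests; a hedged sketch using the value named in the STIG text above (whether RKE2's compiled-in default already meets the requirement is not confirmed here):
# /etc/rancher/rke2/config.yaml -- sketch only
kube-controller-manager-arg:
  - "tls-min-version=VersionTLS12"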
thankful-plastic-94519
05/22/2022, 1:08 AM
magnificent-vr-88571
05/23/2022, 12:49 PM
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 21s default-scheduler Successfully assigned default/ecr-pod to sv11
Normal BackOff 8s kubelet Back-off pulling image "private-registry/utils-img:latest"
Warning Failed 8s kubelet Error: ImagePullBackOff
Normal Pulling <invalid> (x2 over 8s) kubelet Pulling image "private-registry/utils-img:latest"
Warning Failed <invalid> (x2 over 8s) kubelet Failed to pull image "private-registry/utils-img:latest": rpc error: code = NotFound desc = failed to pull and unpack image "private-registry/utils-img:latest": failed to unpack image on snapshotter overlayfs: unexpected media type text/html for sha256:6066282cb389d5bee17ec9f08335850ece266457a557511d3f5e78eabc34df21: not found
Warning Failed <invalid> (x2 over 8s) kubelet Error: ErrImagePull
I am trying to add a rewrite, as mentioned in https://github.com/k3s-io/k3s/issues/5502#issuecomment-1109107238, to overcome this issue.
After adding the rewrite to /etc/rancher/rke2/registries.yaml, it is not reflected in the environment; I would like to know whether the rewrite needs to go into /var/lib/rancher/rke2/agent/etc/containerd/config.toml to overcome this issue.
Correct me if I am wrong.
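For comparison, a hedged sketch of how a rewrite rule is expressed in registries.yaml; the pattern and replacement are placeholders, and the expectation is that rke2 regenerates the containerd config.toml from this file on service restart, so editing the generated .toml by hand is unlikely to persist:
# /etc/rancher/rke2/registries.yaml -- sketch, pattern/replacement are placeholders
mirrors:
  "2341234234.dkr.ecr.yyyyy.amazonaws.com":
    endpoint:
      - "https://2341234234.dkr.ecr.yyyyy.amazonaws.com"
    rewrite:
      "^utils-img$": "some-team/utils-img"
Restarting rke2-server (or rke2-agent on workers) after editing the file should cause the rewrite to show up in the generated /var/lib/rancher/rke2/agent/etc/containerd/config.toml.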
ambitious-butcher-83703
05/23/2022, 11:39 PM
billions-kite-9416
05/25/2022, 10:34 AM
run.sh is not found
May 25 13:18:53 rancher-system-agent[4497]: time="2022-05-25T13:18:53+03:00" level=info msg="Pulling image index.docker.io/rancher/system-agent-installer-rke2:v1.23.6-rke2r2"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=info msg="[Applyinator] Running command: sh [-c run.sh]"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=info msg="[48304be41fe2284adc2b5937c9d5c9c7b09009b2cae776320036286c2f0c3191_0:stderr]: sh: 1: run.sh: not found"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=info msg="[Applyinator] Command sh [-c run.sh] finished with err: <nil> and exit code: 127"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=error msg="error executing instruction 0: <nil>"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=error msg="error encountered during parsing of last run time: parsing time \"\" as \"Mon Jan _2 15:04:05 MST 2006\": cannot parse \"\" as \"Mon\""
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=info msg="[Applyinator] No image provided, creating empty working directory /var/lib/rancher/agent/work/20220525-131853/48304be41fe2284adc2b5937c9d5c9c7b09009b2cae776320036286c2f0c3191_0"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=info msg="[Applyinator] Running command: sh [-c rke2 etcd-snapshot list --etcd-s3=false 2>/dev/null]"
May 25 13:18:55 rancher-system-agent[4497]: time="2022-05-25T13:18:55+03:00" level=info msg="[Applyinator] Command sh [-c rke2 etcd-snapshot list --etcd-s3=false 2>/dev/null] finished with err: <nil> and exit code: 127"
May 25 13:18:56 rancher-system-agent[4497]: time="2022-05-25T13:18:56+03:00" level=error msg="error loading x509 client cert/key for probe kube-apiserver (/var/lib/rancher/rke2/server/tls/client-kube-apiserver.crt//var/lib/rancher/rke2/server/tls/client-kube-apiserver.key): open /var/lib/rancher/rke2/server/tls/client-kube-apiserver.crt: no such file or directory"
future-monitor-61871
05/25/2022, 7:39 PM
hundreds-evening-84071
05/26/2022, 12:50 PM
shy-zebra-53074
05/26/2022, 4:35 PM
kube-apiserver-arg, kube-controller-manager-arg, and kube-scheduler-arg.
However, I would like to ensure that a certain config is NOT set that is currently being set by default. For example, I would like to unset the --hostname-override flag for the kubelet service and make sure that flag is not passed when starting the kubelet service.
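For context, a hedged sketch of the *-arg mechanism being discussed, which adds flags; whether there is a supported way to remove a default flag such as --hostname-override is not established here, and the flags below are placeholders, purely illustrative:
# /etc/rancher/rke2/config.yaml -- sketch of the add-a-flag mechanism only
kubelet-arg:
  - "max-pods=200"            # placeholder flag
kube-apiserver-arg:
  - "audit-log-maxage=30"     # placeholder flag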
bored-rain-98291
05/27/2022, 6:18 PM
great-photographer-94826
05/30/2022, 8:24 AM
billowy-animal-38808
05/31/2022, 2:39 PM
pending state. Can someone explain to me what to do to enable this option?
sparse-businessperson-74827
06/01/2022, 1:28 PMJun 1 15:20:33 node-1 rke2[35837]: time="2022-06-01T15:20:33+02:00" level=error msg="Failed to connect to proxy" error="unexpected EOF"
Jun 1 15:20:33 node-1 rke2[35837]: time="2022-06-01T15:20:33+02:00" level=error msg="Remotedialer proxy error" error="unexpected EOF"
bored-rain-98291
06/02/2022, 1:14 PM
narrow-noon-75604
06/02/2022, 3:47 PM
$ kubectl logs -f pod/calico-node-r2wj8 -n calico-system
Error from server (InternalError): Internal error occurred: Authorization error (user=kube-apiserver, verb=get, resource=nodes, subresource=proxy)
Created a new ClusterRoleBinding with the "system:kubelet-api-admin" role but that did not help. Please help us with the steps to fix this issue, and let me know if you need any more details.
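For reference, a sketch of the kind of binding described; the binding name is arbitrary, and binding the kube-apiserver user from the error message to system:kubelet-api-admin is an assumption about what was attempted:
# ClusterRoleBinding sketch -- name and subject are assumptions
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-apiserver-kubelet-api-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubelet-api-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kube-apiserver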
salmon-dress-41836
06/07/2022, 5:38 AM
narrow-noon-75604
06/07/2022, 2:28 PM
write-kubeconfig-mode: "0644"
tls-san:
  - "master.167.254.204.58.nip.io"
node-label:
  - "nodetype=master"
cluster-cidr: "10.42.0.0/16"
service-cidr: "10.43.0.0/16"
cluster-dns: "10.43.0.10"
cluster-domain: "master.xxx.xxx.xxx.xxx.nip.io"
cni:
  - calico
disable:
  - rke2-canal
  - rke2-kube-proxy
curved-caravan-26314
06/07/2022, 4:30 PM
hundreds-hairdresser-46043
06/08/2022, 9:22 AM
hundreds-evening-84071
06/08/2022, 5:32 PM
c:\var\lib\rancher as the directory...
Is there an option, or a way I can specify something other than the C drive? Perhaps something like d:\var\lib\rancher?
The reason I want to do this is to keep user data off the C drive, to prevent it from filling up and crashing the OS.
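A hedged sketch of what pointing RKE2 at another drive could look like; data-dir is a documented RKE2 option, but whether the Windows agent moves everything under c:\var\lib\rancher along with it, and the config file location shown, are assumptions:
# c:\etc\rancher\rke2\config.yaml -- sketch only, Windows path handling is an assumption
data-dir: 'd:\var\lib\rancher\rke2'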
narrow-noon-75604
06/08/2022, 7:18 PM
kubectl get nodes
NAME STATUS ROLES AGE VERSION
rke2-master Ready,SchedulingDisabled control-plane,etcd,master 112m v1.23.6+rke2r2
rke2-worker1 Ready <none> 108m v1.23.6+rke2r2
rke2-worker2 Ready <none> 108m v1.23.6+rke2r2
rke2-worker3 Ready <none> 108m v1.23.6+rke2r2
All the pods under "kube-system" & "calico-system" namespace are up & Running properly.
kubectl get po -n calico-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-69cb5d9c7b-xllwx 1/1 Running 0 113m
calico-node-86pcd 1/1 Running 0 110m
calico-node-dxtjk 1/1 Running 0 110m
calico-node-lb8n7 1/1 Running 0 113m
calico-node-ld8mf 1/1 Running 0 110m
calico-typha-86c8b747c4-rqfhk 1/1 Running 0 113m
calico-typha-86c8b747c4-xs7zs 1/1 Running 0 110m
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
cloud-controller-manager-rke2-master 1/1 Running 0 114m
etcd-rke2-master 1/1 Running 0 114m
helm-install-rke2-calico-crd-j44qb 0/1 Completed 0 114m
helm-install-rke2-calico-hv4xg 0/1 Completed 1 114m
helm-install-rke2-coredns-btsgx 0/1 Completed 0 114m
helm-install-rke2-ingress-nginx-c99sw 0/1 Completed 0 114m
helm-install-rke2-metrics-server-j5gjq 0/1 Completed 0 114m
kube-apiserver-rke2-master 1/1 Running 0 113m
kube-controller-manager-rke2-master 1/1 Running 0 114m
kube-proxy-rke2-master 1/1 Running 0 114m
kube-proxy-rke2-worker1 1/1 Running 0 110m
kube-proxy-rke2-worker2 1/1 Running 0 110m
kube-proxy-rke2-worker3 1/1 Running 0 110m
kube-scheduler-rke2-master 1/1 Running 0 114m
rke2-coredns-rke2-coredns-69c8f974c-gvwqg 1/1 Running 0 7m56s
rke2-coredns-rke2-coredns-69c8f974c-qmq8m 1/1 Running 0 7m56s
rke2-coredns-rke2-coredns-autoscaler-65c9bb465d-4g4sw 1/1 Running 0 114m
rke2-ingress-nginx-controller-2bhbc 1/1 Running 0 113m
rke2-ingress-nginx-controller-gnpk5 1/1 Running 0 110m
rke2-ingress-nginx-controller-znh8z 1/1 Running 0 110m
rke2-ingress-nginx-controller-ztpjf 1/1 Running 0 110m
rke2-metrics-server-6564db4569-m5kcc 1/1 Running 0 113m
I have deployed Kafka but the pods are going into "CrashLoopBackOff" with a DNS error:
java.net.UnknownHostException: zookeeper.msgbus.svc: Temporary failure in name resolution
The CoreDNS pod logs are throwing a lot of errors like:
[ERROR] plugin/errors: 2 4686730224998678655.5699833978585684595. HINFO: read udp xx.xx.xx.xx:51070->yy.yy.yy.yy:53: read: no route to host
[ERROR] plugin/errors: 2 4686730224998678655.5699833978585684595. HINFO: read udp xx.xx.xx.xx:42332->yy.yy.yy.yy:53: read: no route to host
[ERROR] plugin/errors: 2 4686730224998678655.5699833978585684595. HINFO: read udp xx.xx.xx.xx:57092->yy.yy.yy.yy:53: read: no route to host
I have opened port 53 to allow both TCP and UDP traffic and am not sure about the cause of this issue. Any suggestions would be appreciated; let me know if you need any more debug logs.
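One common way to narrow down a cluster-DNS failure like this is a throwaway debug pod; a sketch, assuming the dnsutils image from the upstream Kubernetes DNS-debugging docs is reachable from the cluster:
# dnsutils.yaml -- throwaway DNS debug pod, image name per upstream debugging docs
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
    - name: dnsutils
      image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
      command: ["sleep", "infinity"]
  restartPolicy: Never
Running kubectl exec -it dnsutils -- nslookup zookeeper.msgbus.svc.cluster.local then shows whether CoreDNS answers at all before the upstream no-route-to-host errors come into play.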
brief-mouse-13981
06/09/2022, 5:59 AM