cold-boots-35236
06/16/2023, 7:26 AM
VERSION: v1.22.4
ON AWS
OS: Amazon Linux 2, 4.14 kernel
docker://20.10.4
For kind: Service
which is type: LoadBalancer
we wanted to set externalTrafficPolicy: Local
so that we only have one registered target under the AWS Target Group: the node where the actual pod is running.
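For reference, a minimal sketch of the Service shape meant here; the name, selector, and ports are placeholders, not taken from this thread:

apiVersion: v1
kind: Service
metadata:
  name: my-app            # placeholder
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # load-balancer health checks then pass only on nodes hosting a pod
  selector:
    app: my-app           # placeholder
  ports:
    - port: 80
      targetPort: 8080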
We figured that the name of the node should marry up with what kube-proxy thinks the hostname is. In our case, when we run kubectl get nodes we get a NAME like ip-10-103-149-15.eu-west-2.compute.internal, while the actual hostname is different (as per the snapshot below).
In order to update the kube-proxy hostname-override, we are changing the cluster YAML like below:
kube-proxy:
  extra_args:
    hostname-override: "$(curl http://169.254.169.254/latest/meta-data/hostname)"
However, we are not able to get the hostname-override option applied by editing cluster.yaml.
What we have tested is that if we delete the kube-proxy container and run it as below, everything works fine:
docker run --net=host --privileged -v /etc/kubernetes:/etc/kubernetes -v /lib/modules:/lib/modules -v /var/lib/docker/volumes/b9be15df5114090439353e899bbe66bff17e7ebffe979edaaebf8e9055703f/_data:/opt/rke-tools -v /run:/run --name kube-proxy --entrypoint "/opt/rke-tools/entrypoint.sh" rancher/hyperkube:v1.22.4-rancher1 kube-proxy --cluster-cidr=10.70.0.0/16 --kubeconfig=/etc/kubernetes/ssl/kubecfg-kube-proxy.yaml --healthz-bind-address=127.0.0.1 --v=2 --bind-address=10.108.151.7 --hostname-override=ip-10-103-149-15.eu-west-2.compute.internal
Please note the --hostname-override=ip-10-103-149-15.eu-west-2.compute.internal.
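A quick sketch for checking which flags the RKE-managed container actually comes back with, assuming it is still named kube-proxy:

docker inspect kube-proxy --format '{{join .Args " "}}'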
However, RKE updates the kube-proxy container again after some time, and we are not sure where we can update this setting so it persists.
swift-sunset-4572
06/16/2023, 8:30 AM
Cluster API
RKE2 is built on the Cluster API framework, which affects the way cluster configurations are managed and changes are made. Unlike RKE1, RKE2 clusters are more susceptible to node reprovisioning when changes are made to the cluster configuration. This behavior is controlled by the CAPI controllers and not by Rancher itself. Certain changes, such as enabling drain before delete or scaling down the number of nodes, may result in the deletion of existing nodes and the creation of new ones. This is a known issue in the Cluster API that will be fixed in a future release, and Rancher will update the documentation accordingly.
I just want to know whether this issue is resolved in 2.7.4, or is there any workaround to prevent the reprovisioning? Any suggestions and guidance will be appreciated.
bored-painting-68221
06/16/2023, 9:30 PM
I have a question about rancher/system-upgrade-controller; please feel free to redirect me if I'm in the wrong place.
I'm writing a mutating admission controller to dynamically increase the ActiveDeadlineSeconds for specific batchv1.Jobs submitted by the system-upgrade-controller, beyond what is set in the system-upgrade-controller chart's configmap.
What I think I'm experiencing is that after I mutate the CREATE request, the system-upgrade-controller updates the batchv1.Job and resets ActiveDeadlineSeconds to what the configmap specifies. At least, that's my primary suspect for where the change is coming from, since the Kubernetes audit logs show a bunch of PATCH requests from the system-upgrade-controller to the batchv1.Job resource.
I updated my mutator to also act on UPDATE events, and in doing so I'm able to force ActiveDeadlineSeconds to be what I want, but having my mutator wrestle against the system-upgrade-controller like this doesn't seem like a harmonious approach.
I'd appreciate any advice on how I can improve this approach.
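For context, a sketch of the webhook registration such a mutator would use; the webhook name, service reference, and the objectSelector label are assumptions, not taken from this thread:

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: extend-suc-job-deadline          # placeholder
webhooks:
  - name: extend-deadline.example.com    # placeholder
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Ignore
    rules:
      - apiGroups: ["batch"]
        apiVersions: ["v1"]
        resources: ["jobs"]
        operations: ["CREATE", "UPDATE"]   # UPDATE too, since the controller PATCHes the Job back
    objectSelector:
      matchLabels:
        upgrade.cattle.io/controller: system-upgrade-controller   # assumed label; verify on a real Job
    clientConfig:
      service:
        name: deadline-mutator    # placeholder
        namespace: default        # placeholder
        path: /mutate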
cool-vase-61909
06/19/2023, 2:19 PM
I run kubectl get nodes to generate the HAProxy config, so that my DNS entries can point to a static IP. Is there a better way to do this that I'm missing?
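A sketch of that generation step, assuming the nodes' InternalIP addresses and a placeholder backend port:

# Emit one HAProxy "server" line per node InternalIP; 443 is a placeholder port.
for ip in $(kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}'); do
  echo "    server node-${ip} ${ip}:443 check"
done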
limited-spoon-19913
06/19/2023, 7:11 PM
https://rancher.prod.ops.example.net/k8s/clusters/c-kasjdfi/api/v1/namespaces/prometheus/services/http:prometheus-kube-prometheus-prometheus:9090/proxy/#/dashboard
I tried (via the Grafana UI) using a bearer token and also basic auth, without success. How does one authenticate to reach that?
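One thing worth testing (a sketch, not verified against this setup): the Rancher cluster proxy generally accepts a Rancher API token as a bearer token; token-xxxxx:yyyyy below is a placeholder for a token created under the Rancher user's API keys:

curl -H "Authorization: Bearer token-xxxxx:yyyyy" \
  "https://rancher.prod.ops.example.net/k8s/clusters/c-kasjdfi/api/v1/namespaces/prometheus/services/http:prometheus-kube-prometheus-prometheus:9090/proxy/api/v1/status/buildinfo"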
victorious-guitar-37937
06/19/2023, 8:34 PM
docker run --volumes-from goofy_sinoussi -v $PWD:/backup busybox sh -c "rm /var/lib/rancher/* -rf && tar pzxvf /backup/rancher-data-backup-v2.7.4-2023-06-19.tar.gz"
Rancher is starting, I can log in, and the cluster shows up, but it is "unavailable" and "Cluster agent is not connected". I know that there is no agent connected; how can I connect an agent?
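A sketch of one way to reconnect it, assuming this is an imported cluster: re-apply the registration manifest shown under Cluster Management > (cluster) > Registration in the Rancher UI. The URL below is a placeholder, not taken from this thread:

kubectl apply -f https://rancher.example.com/v3/import/<token>_<cluster-id>.yaml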
careful-restaurant-19876
06/20/2023, 1:57 PM
When I run flytectl demo start, I am getting the following error:
Bootstrapping a brand new flyte cluster...
delete existing sandbox cluster [y/n]: y
Going to use Flyte v1.7.0 release with image cr.flyte.org/flyteorg/flyte-sandbox-bundled:sha-1ae254f8683699b68ecddc89d775fc5d39cc3d84
pulling docker image for release cr.flyte.org/flyteorg/flyte-sandbox-bundled:sha-1ae254f8683699b68ecddc89d775fc5d39cc3d84
booting Flyte-sandbox container
Something went wrong: Failed to start Sandbox container, Please check your docker client and try again.
Error: Error response from daemon: driver failed programming external connectivity on endpoint flyte-sandbox (b4e1199dc143e2a67410bf4ea35b1c75dc256cc21c3862833ffb3cb6cf36b179): Error starting userland proxy: listen tcp4 0.0.0.0:6443: bind: address already in use
But when I run the command with option n, I get the following message:
Bootstrapping a brand new flyte cluster...
delete existing sandbox cluster [y/n]: n
Existing details of your sandbox:
Flyte is ready! Flyte UI is available at http://localhost:30080/console
Run the following command to export sandbox environment variables for accessing flytectl:
export FLYTECTL_CONFIG=/Users/vipul.goswami@schibsted.com/.flyte/config-sandbox.yaml
Flyte is ready! Flyte UI is available at http://localhost:30080/console
Run the following command to export demo environment variables for accessing flytectl:
export FLYTECTL_CONFIG=/Users/vipul.goswami@schibsted.com/.flyte/config-sandbox.yaml
Flyte sandbox ships with a Docker registry. Tag and push custom workflow images to localhost:30000
The Minio API is hosted on localhost:30002. Use http://localhost:30080/minio/login for the Minio console
But none of the links are working here.
Also, running the following command gives the following error:
pyflyte run --remote example.py training_workflow --hyperparameters '{"C": 0.1}'
Failed with Exception Code: SYSTEM:Unknown
RPC Failed, with Status: StatusCode.UNAVAILABLE
details: failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:30080: Failed to connect to remote host: Connection refused
Debug string UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:30080: Failed to connect to remote host: Connection refused {created_time:"2023-06-20T15:46:32.394087+02:00", grpc_status:14}
This looks like an issue with Rancher, as when I run these same commands with Docker Desktop they work fine. Could you please help with what the issue is under the hood?
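The bind error on 6443 suggests another Kubernetes API server (for example, the k3s instance that Rancher Desktop runs) already owns that port. A sketch for confirming, assuming macOS or Linux with lsof available:

# Show which process is listening on TCP port 6443.
sudo lsof -iTCP:6443 -sTCP:LISTEN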