# general
a
This message was deleted.
b
This usually means there is an issue starting the eks-operator in the Rancher cluster. The CRD and the operator are created at the same time. There would be logs in Rancher indicating what the problem is.
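(As an aside, a minimal way to look for those operator errors, assuming Rancher runs in the cattle-system namespace with the usual app=rancher label; adjust both for a custom install:)
# Filter the Rancher pod logs for eks-operator related messages
kubectl -n cattle-system logs -l app=rancher --tail=500 | grep -i "eks"
# Check whether the eks-operator deployment itself is present and healthy
kubectl get deploy -A | grep -i "eks-operator"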
f
will check... thanks for the response
leaderelection lost for cattle-controllers
Any clue about this error? This is from the Rancher pod logs.
b
That error shouldn't be an issue. That would mean that a Rancher pod was the leader, lost the leader election, and needed to be restarted. The only issue would be if that error was happening continuously.
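(A quick check for whether the restarts are continuous, assuming the same cattle-system namespace and app=rancher label as above:)
# High RESTARTS counts suggest the leader election error keeps recurring
kubectl -n cattle-system get pods -l app=rancher
# Count repeated "leaderelection lost" messages in recent logs
kubectl -n cattle-system logs -l app=rancher --tail=1000 | grep -c "leaderelection lost"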
f
all my pods are in a crashed state
2023/06/09 15:03:20 [INFO] Handling backend connection request [172.27.20.4]
2023/06/09 15:03:44 [ERROR] error syncing 'p-nw7q8/u-b4qkhsnliz-admin-cluster-owner': handler auth-prov-v2-rb: namespaces "p-nw7q8" not found, requeuing
2023/06/09 15:03:44 [ERROR] error syncing 'p-f68k4/u-b4qkhsnliz-admin-cluster-owner': handler auth-prov-v2-rb: namespaces "p-f68k4" not found, requeuing
2023/06/09 15:03:44 [ERROR] error syncing 'p-f68k4/local-fleet-local-owner-cluster-owner': handler auth-prov-v2-rb: namespaces "p-f68k4" not found, requeuing
2023/06/09 15:03:44 [ERROR] error syncing 'p-f68k4/cluster-owner': handler auth-prov-v2-role: namespaces "p-f68k4" not found, requeuing
2023/06/09 15:03:44 [ERROR] error syncing 'p-nw7q8/cluster-owner': handler auth-prov-v2-role: namespaces "p-nw7q8" not found, requeuing
2023/06/09 15:03:44 [ERROR] error syncing 'p-nw7q8/local-fleet-local-owner-cluster-owner': handler auth-prov-v2-rb: namespaces "p-nw7q8" not found, requeuing
2023/06/09 15:03:48 [ERROR] error syncing 'local': handler global-admin-cluster-sync: failed to get GlobalRoleBinding for 'globaladmin-user-5rt7n': %!!(MISSING)w(<nil>), requeuing
these are logs from another pod
Actually, I just deployed the Rancher deployment using the image directly... I mean I am not running "helm install rancher/rancher-stable"
And it looks like the pod is expecting a few namespaces and resources which were deployed with helm install <release>
b
Those namespaces are "project" namespaces. They aren't created when you install Rancher via Helm.
f
okay... but I don't get the errors... what is the pod looking for, like p-nw7q8/local-fleet-local-owner-cluster-owner etc.?
Does Rancher have a dependency on namespaces like "cattle-fleet-clusters-system", "cattle-fleet-local-system", and a few more?
Right now I just have pods, svc, and ingress in my namespace rancher
b
Yes, but Rancher should create the namespaces it needs when it starts up.
f
I believed those namespaces got created when I executed helm install <release>. So I deleted them when I installed Rancher as a deployment manifest
Did I mess things up?
You are right, I can see those namespaces again
b
I don't believe those namespaces get created when you install with Helm. I believe the Rancher pod creates them. You can restart the Rancher pod and they should come back.
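(A sketch of that restart-and-verify step, assuming the deployment is named rancher and lives in the custom rancher namespace mentioned above:)
# Restart the Rancher pods so they recreate the namespaces they manage
kubectl -n rancher rollout restart deployment/rancher
# Once the pods are Running again, the cattle/fleet namespaces should be back
kubectl get namespaces | grep -E "cattle|fleet"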
f
yeah you are right.....
I see those namespaces again
my pods' status is Running now... don't know how, but the logs have many errors
BTW I have deployed my pods in a custom namespace (not in cattle-system), is that OK? I mean I hope it's not mandatory to launch in the cattle-system ns?
b
Last I investigated this, it was only mandatory for Rancher to be running in the cattle-system namespace in one scenario. SUSE calls it the "hosted Rancher" setup. If you aren't running Rancher on a cluster that is managed by another Rancher instance, then I think you're fine.
f
okay... it's on EKS
2023/06/09 15:02:55 [ERROR] Failed to handle tunnel request from remote address x.x.x.x:37578: response 400: cluster not found
2023/06/09 15:03:00 [INFO] Handling backend connection request [172.27.16.140]
2023/06/09 15:03:00 [ERROR] Failed to handle tunnel request from remote address x.x.x.x:37590: response 400: cluster not found
2023/06/09 15:03:05 [ERROR] Failed to handle tunnel request from remote address x.x.x.x:42438: response 400: cluster not found
2023/06/09 15:03:10 [ERROR] Failed to handle tunnel request from remote address x.x.x.x:42440: response 400: cluster not found
2023/06/09 15:03:15 [ERROR] Failed to handle tunnel request from remote address x.x.x.x:41740: response 400: cluster not found
x.x.x.x is in my VPC range
any idea what these ports are?
Do I need to open some ports in the cluster/worker SG?
b
I believe that means there is a cluster that used to be managed by this Rancher instance that it doesn't know about. There is a pod running in a cluster somewhere trying to connect to this Rancher server. Maybe one you previously imported? If so, then you can remove the cluster-agent on that old cluster and these logs should go away.
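(For reference, a hedged sketch of that cleanup, run against the old imported cluster rather than the Rancher cluster; the agent is typically a Deployment named cattle-cluster-agent in cattle-system, but verify the names first:)
# List what the agent left behind in the old cluster
kubectl -n cattle-system get deploy,daemonset
# Deleting the agent deployment stops it from dialing this Rancher server
kubectl -n cattle-system delete deployment cattle-cluster-agent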
f
yes, I did import one cluster, but I deleted the cluster-agent pod a few hours back
Updating TLS secret for cattle-system/serving-cert (count: 8): map[field.cattle.io/projectId:local:p-nlgkz
listener.cattle.io/cn-127.0.0.1:127.0.0.1
listener.cattle.io/cn-x.x.x.x:x.x.x.x
listener.cattle.io/cn-x.x.x.x:x.x.x.x
listener.cattle.io/cn-x.x.x.x:x.x.x.x
listener.cattle.io/cn-localhost:localhost
listener.cattle.io/cn-rancher-poc.sw.abc.com:rancher-poc.sw.abc.com
listener.cattle.io/cn-rancher.cattle-system:rancher.cattle-system
listener.cattle.io/fingerprint:SHA1=dhdhdhdhdhdhdhdhdhdhdhdhd]
this is another error...
rancher-poc.sw.abc.com is the one I have deployed in another namespace
I am curious how my current pod/Rancher server has logs related to another Rancher server in another namespace
though they are on the same EKS cluster
b
You have two Rancher pods running in two different namespaces in the same cluster?
f
yes
b
Oh, that won't work.
f
oops
b
I don't think anything is actually broken. You should just only run one Rancher deployment. You can have multiple replicas in that deployment, but they have to know about and interact with each other. There is leader election, sharding, etc.
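(A quick way to confirm only one Rancher deployment remains after the cleanup, assuming both copies used the app=rancher label:)
# Should return exactly one deployment across all namespaces
kubectl get deploy -A -l app=rancher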
f
let me clean up old namespace and see
I am totally new to Rancher... sorry for the many questions, including the stupid ones
But you are too good, man
Earlier I was able to create Cloud Credentials, but that option is deactivated now, any idea?
BTW thank you for all the help so far... I am thinking the Rancher pod is taking time to set up all the required namespaces and other resources
b
Not sure why you aren't able to create the credential there. But on the left-hand side you should see the "Cloud Credential" tab. You can try there instead.
f
Cluster: cp-qa-ore (Waiting)
Namespace: fleet-default
Age: 4.8 mins
Provisioner: Amazon EKS
Waiting for API to be available
There are 0 nodes available to run the cluster agent. The cluster will not become active until at least one node is available.
You should not import a cluster which has already been connected to another instance of Rancher as it will lead to data corruption.
Run the kubectl command below on an existing Kubernetes cluster running a supported Kubernetes version to import it into Rancher:
kubectl apply -f https://rancher-poc.sw.abc.com/v3/import/brcrsvr54tk88d4qcfjgtmh6vpqq55lmq7ddpfn24zzczs7xvgjvnr_c-zqq45.yaml
While importing a cluster, I don't get why it is showing the old server (rancher-poc.sw.abc.com) which I had launched in another namespace
I have already removed all resources corresponding to the old server (rancher-poc)
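(One hedged thing to check here, not confirmed above: the hostname in the import URL comes from Rancher's server-url setting, which may still hold the old value from the first install:)
# Show the hostname Rancher embeds in registration/import URLs
kubectl get settings.management.cattle.io server-url -o jsonpath='{.value}'
# If it still points at the old hostname, it can be edited in place
kubectl edit settings.management.cattle.io server-url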
b
Do you still have Rancher pods running in different namespaces?
f
no... I cleaned up everything
and now I have launched Rancher in the cattle-system ns to avoid ambiguity
What if I update the cluster-agent YAML and apply it on the imported cluster? Will it work?
kind: Deployment
apiVersion: apps/v1
metadata:
  name: rancher
  namespace: cattle-system
  labels:
    app.kubernetes.io/name: rancher
    app.kubernetes.io/instance: rancher
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rancher
  template:
    metadata:
      labels:
        app: rancher
        release: rancher
    spec:
      serviceAccountName: rancher
      containers:
      - image: rancher/rancher
        imagePullPolicy: IfNotPresent
        name: rancher
        ports:
        - containerPort: 80
          protocol: TCP
        - containerPort: 443
          protocol: TCP
        args:
        - "--http-listen-port=80"
        - "--https-listen-port=443"
        - "--add-local=true"
        env:
        - name: CATTLE_NAMESPACE
          value: cattle-system
        - name: CATTLE_PEER_SERVICE
          value: rancher
        - name: NO_PROXY
          value: "127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.svc,.cluster.local"
        - name: CATTLE_BOOTSTRAP_PASSWORD
          valueFrom:
            secretKeyRef:
              name: "bootstrap-secret"
              key: "bootstrapPassword"
Hey, if you get a few minutes, can you review my Rancher deployment? I removed many things from the GitHub template.
Is nodeAffinity compulsory? I see it in the fleet-agent pod YAML
Sorry I am asking too much, I am in a tough spot at the moment
b
I’m not sure what is happening with your setup. It seems that having two Rancher deployments has caused some issues. I’m not sure how much help I would be with debugging the problems.
f
Fair enough....thank you for all the help
a
While I'm trying to import an existing EKS cluster into Rancher with cloud credentials, it is showing the error below: "There are 0 nodes available to run the cluster agent. The cluster will not become active until at least one node is available." Even though I have a node group in the EKS cluster.
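(A couple of hedged checks for that message, assuming kubectl and AWS CLI access to the EKS cluster; the cluster and node group names below are placeholders:)
# Confirm the cluster actually reports Ready nodes
kubectl get nodes -o wide
# Confirm the node group is ACTIVE with a non-zero desired size (placeholder names)
aws eks describe-nodegroup --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> --query 'nodegroup.{status:status,desired:scalingConfig.desiredSize}'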