This message was deleted Rancher Users #general


# general

adamant-kite-43734

11/08/2024, 8:39 AM

This message was deleted.

stocky-account-63046

11/08/2024, 8:54 AM

have a look in the cattle-fleet-system namespace for failing fleet deployments and stateful sets

stocky-account-63046

11/08/2024, 8:55 AM

i think though it's generally advised to only run rancher in docker locally / dev world and the upgrade path is not supported.

acoustic-country-78083

11/08/2024, 9:24 AM

Thanks, I've checked on managed cluster and can't see any error.


#k get all -n cattle-fleet-system
NAME                               READY   STATUS    RESTARTS   AGE
pod/fleet-agent-6b6cbb454d-qjzq8   1/1     Running   0          10d

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/fleet-agent   1/1     1            1           133d

NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/fleet-agent-6b6cbb454d   1         1         1       133d
replicaset.apps/fleet-agent-8689d9f67d   0         0         0       133d
#k get sts -n cattle-fleet-system
No resources found in cattle-fleet-system namespace.
#k logs pod/fleet-agent-6b6cbb454d-qjzq8 -n cattle-fleet-system | tail -3
time="2024-11-08T09:01:32Z" level=info msg="Deleting orphan bundle ID rke2, release kube-system/rke2-canal"
time="2024-11-08T09:14:57Z" level=info msg="getting history for release fleet-agent-eic-sb"
time="2024-11-08T09:14:57Z" level=info msg="getting history for release fleet-agent-eic-sb"

acoustic-country-78083

11/08/2024, 9:27 AM

As we only have a few clusters to manage so far, it was decided to use a standalone Docker installation for Rancher, but we didn't know that upgrades are not supported for this type of setup.

stocky-account-63046

11/08/2024, 9:29 AM

https://ranchermanager.docs.rancher.com/v2.9/getting-started/installation-and-upgrade/other-[…]allation-methods/rancher-on-a-single-node-with-docker

stocky-account-63046

11/08/2024, 9:30 AM

are the conditions for those fleet resources ok?

acoustic-country-78083

11/08/2024, 9:37 AM

Yes, we follow this upgrade procedure; these are still non-production managed clusters. However, I'm not sure what you mean by conditions for fleet resources, or how to check that.

acoustic-country-78083

11/08/2024, 9:56 AM

I also have a very similar setup as a sandbox where it looks OK: Rancher 2.9.2 + RKE2 1.29.9 with internet access, compared to the error-reporting setup of Rancher 2.9.2 + RKE2 1.28.10. I'm not sure if that could be the problem. And indeed the resources there look different, with a StatefulSet in cattle-fleet-system:


# k get all -n cattle-fleet-system
NAME                READY   STATUS    RESTARTS   AGE
pod/fleet-agent-0   2/2     Running   0          10d

NAME                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/fleet-agent   ClusterIP   None         <none>        <none>    32d

NAME                           READY   AGE
statefulset.apps/fleet-agent   1/1     32d

stocky-account-63046

11/08/2024, 10:36 AM

conditions can be seen when describing the resource. though the overall state can be checked in the UI (state column). there could be some helm operations that are failing due to that connection issue as well, maybe that's stopping some of the fleet resources coming up?
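For anyone unsure how to read conditions outside the UI, here is a minimal sketch in Python. It assumes the resource has been exported with `kubectl get <kind> <name> -n cattle-fleet-system -o json`. The helper name `failing_conditions` is hypothetical, and treating any status other than "True" as a problem is a simplifying assumption that fits the condition types seen in this thread (Available, Progressing, Ready), not a general Kubernetes rule.

```python
import json


def failing_conditions(resource: dict) -> list:
    """Return a readable line for each condition whose status is not "True".

    `resource` is the parsed output of `kubectl get ... -o json`.
    Assumption: for the condition types seen in this thread, "True" is healthy.
    """
    conditions = resource.get("status", {}).get("conditions", [])
    problems = []
    for cond in conditions:
        if cond.get("status") != "True":
            problems.append(
                f'{cond.get("type")}={cond.get("status")} '
                f'({cond.get("reason", "no reason given")})'
            )
    return problems


# Trimmed-down sample shaped like a Deployment; the values are illustrative only.
sample = json.loads("""
{
  "kind": "Deployment",
  "status": {
    "conditions": [
      {"type": "Progressing", "status": "True", "reason": "NewReplicaSetAvailable"},
      {"type": "Available", "status": "False", "reason": "MinimumReplicasUnavailable"}
    ]
  }
}
""")

for line in failing_conditions(sample):
    print(line)
```

An empty result means every listed condition reports "True"; it does not prove that the resource the UI is looking for actually exists.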

acoustic-country-78083

11/08/2024, 10:48 AM

IMHO those look OK on the managed cluster


#k describe deployment.apps/fleet-agent -n cattle-fleet-system | sed '/Conditions:/,$!d'
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  fleet-agent-8689d9f67d (0/0 replicas created)
NewReplicaSet:   fleet-agent-6b6cbb454d (1/1 replicas created)
Events:          <none>
#k describe pod/fleet-agent-6b6cbb454d-qjzq8 -n cattle-fleet-system | sed '/Conditions/,/Volumes/!d'
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:

acoustic-country-78083

11/08/2024, 10:50 AM

Also deployment & pod state health looks fine in cattle-fleet-system

stocky-account-63046

11/11/2024, 8:50 AM

what are the conditions on the other cattle-fleet-system resources?

stocky-account-63046

11/11/2024, 8:52 AM

depending on whether the cluster is upstream (contains rancher) or downstream (all others)
• upstream - checks stateful set (if available) and deployment
  ◦ stateful set - cattle-fleet-local-system/fleet-agent
  ◦ deployment - cattle-fleet-system/fleet-controller
• downstream - checks stateful set
  ◦ stateful set - cattle-fleet-system/fleet-agent
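The rule described above can be sketched as a small lookup in Python. This mirrors the wording of the message, not the dashboard source itself; the names `Check` and `fleet_checks` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Check(NamedTuple):
    kind: str
    namespace: str
    name: str
    optional: bool  # True when the resource is only checked "if available"


def fleet_checks(is_upstream: bool) -> list:
    """List the resources the Fleet status reportedly depends on.

    Upstream means the cluster that contains Rancher; every other
    cluster is treated as downstream.
    """
    if is_upstream:
        return [
            Check("StatefulSet", "cattle-fleet-local-system", "fleet-agent", True),
            Check("Deployment", "cattle-fleet-system", "fleet-controller", False),
        ]
    return [Check("StatefulSet", "cattle-fleet-system", "fleet-agent", False)]


for check in fleet_checks(is_upstream=False):
    print(f"{check.kind} {check.namespace}/{check.name}")
```

Under this reading, a downstream cluster that only has a `fleet-agent` Deployment gives the check no StatefulSet to find, which would be consistent with the warning seen on the downstream clusters.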

stocky-account-63046

11/11/2024, 8:53 AM

for reference, this is the core block where fleet state is determined https://github.com/rancher/dashboard/blob/v2.9.2/shell/pages/c/_cluster/explorer/index.vue#L312-L340

stocky-account-63046

11/11/2024, 8:53 AM

https://github.com/rancher/dashboard/blob/v2.9.2/shell/pages/c/_cluster/explorer/index.vue#L488-L494

acoustic-country-78083

11/11/2024, 8:58 AM

Rancher (upstream) runs as a Docker deployment, where "Fleet" is marked as OK. In the downstream clusters, there are no resources in cattle-fleet-system other than the deployment, pod and replica set:


#k get all -n cattle-fleet-system
NAME                               READY   STATUS    RESTARTS   AGE
pod/fleet-agent-6b6cbb454d-qjzq8   1/1     Running   0          13d

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/fleet-agent   1/1     1            1           136d

NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/fleet-agent-6b6cbb454d   1         1         1       136d
replicaset.apps/fleet-agent-8689d9f67d   0         0         0       136d

acoustic-country-78083

11/11/2024, 8:59 AM

So, if Rancher is checking for a StatefulSet, could it mean that we have an older version of fleet-agent?

acoustic-country-78083

11/11/2024, 8:59 AM

as these systems do not have access to internet

acoustic-country-78083

11/11/2024, 9:03 AM


(downstream) #k describe deployment.apps/fleet-agent -n cattle-fleet-system | grep Image
    Image:      rancher/fleet-agent:v0.7.1

(sandbox with internet access where it's OK) # k describe statefulset.apps/fleet-agent -n cattle-fleet-system | grep Image
    Image:      rancher/fleet-agent:v0.10.2
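To compare the two image tags above across several clusters, a small Python sketch like the following could help. The function name `fleet_agent_version` is hypothetical, and the tag pattern (`:vMAJOR.MINOR.PATCH`) is an assumption based only on the two tags shown here.

```python
import re


def fleet_agent_version(image: str) -> tuple:
    """Extract (major, minor, patch) from an image reference such as
    'rancher/fleet-agent:v0.7.1'. Raises ValueError if the tag does not match."""
    match = re.search(r":v(\d+)\.(\d+)\.(\d+)$", image.strip())
    if match is None:
        raise ValueError(f"unrecognised image tag: {image!r}")
    return tuple(int(part) for part in match.groups())


downstream = fleet_agent_version("rancher/fleet-agent:v0.7.1")
sandbox = fleet_agent_version("rancher/fleet-agent:v0.10.2")

# Tuples compare numerically, element by element.
print(downstream < sandbox)  # True
```

Parsing to integers matters here: compared as plain strings, "v0.10.2" sorts before "v0.7.1", which would make the newer agent look older.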

stocky-account-63046

11/11/2024, 9:05 AM

to confirm, where do you see the fleet warning box, in the upstream cluster containing rancher or a downstream cluster?

acoustic-country-78083

11/11/2024, 9:10 AM

on downstream clusters, on all of them

stocky-account-63046

11/11/2024, 9:27 AM

thanks. you might be best talking to #C013SSBKB6U, though i'm not sure many people know or have looked at the upgrade path for docker

👍 1

wonderful-family-31680

11/11/2024, 12:24 PM

acoustic-country-78083

11/11/2024, 12:29 PM

Thanks a lot for your time, expertise and advice.

👍 1

acoustic-country-78083

11/18/2024, 7:02 AM

For the record: after enabling a proxy on the Docker container where Rancher is running, the fleet-agent pods were upgraded and Rancher's check for StatefulSet resources is green again. The exception is one cluster, where the conditions for the fleet-agent-register init container now report "[ContainersNotInitialized] containers with incomplete status: [fleet-agent-register]", which I'm still looking into.
