
quaint-alarm-7893

02/18/2023, 2:44 AM
hello everyone. i'm trying to use rancher (in docker) that i've linked to a 3-node harvester cluster to create a new RKE2 cluster. i can get it to create an RKE1 cluster, but K3s and RKE2 keep failing trying to create the VMs in harvester. any pointers? docker log:
rancher-rancher-1  | 2023/02/18 02:42:07 [ERROR] error syncing 'fleet-default/rke-bootstrap-template-8lzmc': handler rke-machine: failed to delete fleet-default/rke-bootstrap-template-8lzmc-machine-bootstrap /v1, Kind=Secret for rke-machine fleet-default/rke-bootstrap-template-8lzmc: secrets "rke-bootstrap-template-8lzmc-machine-bootstrap" not found, failed to delete fleet-default/rke-bootstrap-template-8lzmc-machine-plan /v1, Kind=Secret for rke-machine fleet-default/rke-bootstrap-template-8lzmc: secrets "rke-bootstrap-template-8lzmc-machine-plan" not found, failed to delete fleet-default/rke-bootstrap-template-8lzmc-machine-plan rbac.authorization.k8s.io/v1, Kind=Role for rke-machine fleet-default/rke-bootstrap-template-8lzmc: roles.rbac.authorization.k8s.io "rke-bootstrap-template-8lzmc-machine-plan" not found, requeuing

great-bear-19718

02/20/2023, 4:53 AM
hard to say.. #general may be a better place for this
i suspect it's likely because k3s/rke2 use v2 provisioning, which runs via k8s jobs, and those likely can't be created in a docker-based install
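one rough way to check that suspicion, assuming kubectl access to the cluster where rancher keeps its state (the object names below follow rancher's v2 provisioning conventions but are worth double-checking):

```shell
# Sketch: look for v2-provisioning jobs and Cluster API machine objects.
# Rancher v2 provisioning normally creates machines in fleet-default.
kubectl get jobs -A | grep -i -e provision -e machine
kubectl get machines.cluster.x-k8s.io -n fleet-default
```

if the machine objects exist but the corresponding jobs never appear, that would point at the docker-install limitation rather than harvester.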

quaint-alarm-7893

02/20/2023, 4:54 AM
i swear i got it to work once, but can't figure out how.

great-bear-19718

02/20/2023, 4:55 AM
i am not questioning that.. but i am not across all of the rancher packaging

quaint-alarm-7893

02/20/2023, 4:55 AM
on a side note, i have this odd issue now where a pv is stuck in an attach/detach loop. i dont know why i keep having stability issues, but it's driving me bonkers... feel like taking a peep?

great-bear-19718

02/20/2023, 4:55 AM
may be best to check in general
sure please share the support bundle

quaint-alarm-7893

02/20/2023, 4:59 AM
supportbundle_9042f568-6514-4a9f-a6c3-96a342641671_2023-02-20T04-56-47Z.zip
it's the pvc: pvc-4fe789bf-4d15-46f0-9318-acee7a122592 for the vm: vdi/josephpennella
for some reason, that pvc only had one valid replica too. found it on this node: harvester-03:/var/lib/harvester/extra-disks/533acf928b914b7fc49e5f104c76e131/replicas/pvc-4fe789bf-4d15-46f0-9318-acee7a122592-d3245087

great-bear-19718

02/20/2023, 5:24 AM
those objects are in harvester-longhorn namespace
i dont think that is a default ns for support bundle info collection
can you please trigger another support bundle after updating the setting for support-bundle-kit namespaces
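for reference, the namespace list for harvester support bundles can be edited with kubectl as well as in the UI; a sketch, assuming the `support-bundle-namespaces` setting name from the harvester docs (the value format is an assumption, verify before patching):

```shell
# Add extra namespaces to Harvester's support bundle collection.
kubectl edit settings.harvesterhci.io support-bundle-namespaces
# or non-interactively (comma-separated list; format assumed):
kubectl patch settings.harvesterhci.io support-bundle-namespaces \
  --type=merge -p '{"value":"vdi"}'
```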

quaint-alarm-7893

02/20/2023, 5:24 AM
i have a longhorn bundle. do you want that, or should i pull another harvester one w/ harvester-longhorn added?

great-bear-19718

02/20/2023, 5:25 AM
another harvester one would be nice since i have some tools to easily browse it

quaint-alarm-7893

02/20/2023, 5:26 AM
longhorn-system sound right?

great-bear-19718

02/20/2023, 5:27 AM
the VMs are in namespace harvester-longhorn

quaint-alarm-7893

02/20/2023, 5:27 AM
i dont see one for harvester-longhorn

great-bear-19718

02/20/2023, 5:27 AM
longhorn-system is one of the default ns
sorry its in vdi namespace
and that is included..

quaint-alarm-7893

02/20/2023, 5:27 AM
that vm is in vdi. and vdi is in the list to include

great-bear-19718

02/20/2023, 5:28 AM
my bad
that was the storage-class

quaint-alarm-7893

02/20/2023, 5:28 AM
no worries 🙂

great-bear-19718

02/20/2023, 5:30 AM
you only have 1 replica for this pvc.. which is strange since storage class is the default one
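a sketch for inspecting and fixing that from kubectl, assuming longhorn's usual CRD names and the `longhornvolume` replica label (double-check against your longhorn version):

```shell
# Desired replica count on the Longhorn volume object:
kubectl -n longhorn-system get volumes.longhorn.io \
  pvc-4fe789bf-4d15-46f0-9318-acee7a122592 \
  -o jsonpath='{.spec.numberOfReplicas}'

# Actual replica objects for that volume:
kubectl -n longhorn-system get replicas.longhorn.io \
  -l longhornvolume=pvc-4fe789bf-4d15-46f0-9318-acee7a122592

# If spec.numberOfReplicas got dropped to 1, raising it asks Longhorn
# to rebuild new replicas from the remaining healthy one:
kubectl -n longhorn-system patch volumes.longhorn.io \
  pvc-4fe789bf-4d15-46f0-9318-acee7a122592 \
  --type=merge -p '{"spec":{"numberOfReplicas":3}}'
```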

quaint-alarm-7893

02/20/2023, 5:30 AM
yeah, i know. and i dont recall removing any. so i dont know what happened there.
i've already rebooted the node too, fyi.

great-bear-19718

02/20/2023, 5:31 AM
did you resize this volume?
longhorn-manager-j85tj longhorn-manager 2023-02-20T04:57:47.472220687Z time="2023-02-20T04:57:47Z" level=warning msg="pvc-4fe789bf-4d15-46f0-9318-acee7a122592-e-11eb6755: time=\"2023-02-20T04:57:34Z\" level=warning msg=\"backend tcp://10.52.0.223:10060 size does not match 107374182400 != 536870912000 in the engine initiation phase\""

quaint-alarm-7893

02/20/2023, 5:31 AM
found this in an engine-manager log too. not sure what to do about it.
2023-02-20T04:02:06.983644495Z [pvc-4fe789bf-4d15-46f0-9318-acee7a122592-e-11eb6755] time="2023-02-20T04:02:06Z" level=warning msg="backend tcp://10.52.0.139:10105 size does not match 107374182400 != 536870912000 in the engine initiation phase"
lol jinx?
it was 50gb, expanded to 100gb
if i remember correctly at least
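for what it's worth, the two byte counts in the warning convert to 100 GiB and 500 GiB, not 50 and 100, so the engine appears to expect a 500 GiB volume while the backend replica is 100 GiB:

```shell
# Convert the byte counts from the engine warning to GiB (1 GiB = 1073741824 bytes).
echo $((107374182400 / 1073741824))   # backend/replica size in GiB
echo $((536870912000 / 1073741824))   # engine-expected size in GiB
```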

great-bear-19718

02/20/2023, 5:32 AM
let me ask our longhorn team too
how did you resize the pvc?

quaint-alarm-7893

02/20/2023, 5:35 AM
wait... this volume is 100gb, because i created it last week w/ dd when i cloned it from one vol to another. so it should be 100gb
i basically did a bunch of dd cloning last week because i didnt like the way the backing images were working.
NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS         AGE
josephpennella-disk-2-8r3nb   Bound    pvc-4fe789bf-4d15-46f0-9318-acee7a122592   100Gi      RWX            harvester-longhorn   7d10h

great-bear-19718

02/20/2023, 5:37 AM

quaint-alarm-7893

02/20/2023, 5:37 AM
-rw------- 1 root root         4096 Feb 19 22:32 revision.counter
-rw-r--r-- 1 root root 536870912000 Feb 19 22:33 volume-head-002.img
-rw-r--r-- 1 root root          161 Feb 19 22:33 volume-head-002.img.meta
-rw-r--r-- 1 root root 107374182400 Feb 19 22:32 volume-snap-expand-536870912000.img
-rw-r--r-- 1 root root          210 Feb 19 22:33 volume-snap-expand-536870912000.img.meta
-rw-r--r-- 1 root root 107374182400 Feb 19 22:32 volume-snap-f794fa03-6bea-4365-8e3c-863a013f00d8.img
-rw-r--r-- 1 root root          158 Feb 19 17:53 volume-snap-f794fa03-6bea-4365-8e3c-863a013f00d8.img.meta
-rw-r--r-- 1 root root          179 Feb 20 05:37 volume.meta

great-bear-19718

02/20/2023, 5:37 AM
let me ask the longhorn team

quaint-alarm-7893

02/20/2023, 5:41 AM
good find on the bug. i didnt see anything when i searched. bad google-fu tonight i guess 😞
messing w/ it. i'll let you know.
so it seems in the case of the bug, it was able to recover because of the other two replicas, but with me having only one, i'm not sure what to do.

great-bear-19718

02/21/2023, 6:02 AM
can you please provide me the longhorn bundle
or upload it on the same issue you commented on?

quaint-alarm-7893

02/21/2023, 2:10 PM
sure, here you go. the PVC in question is: pvc-4fe789bf-4d15-46f0-9318-acee7a122592 for the vm: vdi/jp-old.
i'll post on that comment as well.