bitter-tailor-6977

03/22/2023, 5:43 AM
hi all, I have a 10-node cluster and created dynamic storage using Longhorn. To check resilience, I brought two nodes down and expected the replicas to be rebuilt on another node, but replicas are still not being created on the other available nodes?
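A quick way to see where Longhorn has placed the replicas and whether rebuilding has started; a rough sketch, assuming the default longhorn-system namespace:

# list the replica CRs and the node each one is scheduled on
kubectl -n longhorn-system get replicas.longhorn.io -o wide

# check that the remaining Longhorn nodes are schedulable and have capacity
kubectl -n longhorn-system get nodes.longhorn.io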

icy-agency-38675

03/22/2023, 1:35 PM
Can you check the information in the longhorn-manager log?
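One way to pull those logs; a minimal sketch, assuming the manager pods carry the usual app=longhorn-manager label:

# tail recent longhorn-manager logs from all manager pods
kubectl -n longhorn-system logs -l app=longhorn-manager --tail=200 --max-log-requests=10

# or follow a single manager pod
kubectl -n longhorn-system logs -f <longhorn-manager-pod-name>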

bitter-tailor-6977

03/23/2023, 4:24 AM
https://github.com/longhorn/longhorn/issues/5558 can you check on this, with the latest comments?

icy-agency-38675

03/23/2023, 4:24 AM
Are you using RWX volumes?

bitter-tailor-6977

03/23/2023, 4:41 AM
yes, Derek
will the workaround also work for RWX volumes?

icy-agency-38675

03/23/2023, 4:46 AM
Yes, check my update.

bitter-tailor-6977

03/23/2023, 4:48 AM
will do and keep you posted
unable to delete node: could not delete node ip-172-31-9-232.ec2.internal with node ready condition is False, reason is KubernetesNodeNotReady, node schedulable false, and 0 replica, 0 engine running on it
I have deleted the unknown pod and tried to delete the node via the Longhorn UI; the above message appeared. My cluster is on k3s.

icy-agency-38675

03/23/2023, 4:57 AM
No, I mean try deleting the instancemanager resource
Check the second workaround. You don't need to delete the down node
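That workaround amounts to deleting the unknown instancemanager custom resources (not their pods); a minimal sketch, with placeholder names:

# list instance managers and note the ones stuck in the unknown state
kubectl -n longhorn-system get instancemanagers

# delete the unknown instancemanager custom resources, not the pods
kubectl -n longhorn-system delete instancemanager <unknown-engine-im-name>
kubectl -n longhorn-system delete instancemanager <unknown-replica-im-name>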

bitter-tailor-6977

03/23/2023, 4:59 AM
ok fine
will do
it happened again
even though the replica is up, the volume is not picking it up

icy-agency-38675

03/23/2023, 5:08 AM
Can you provide the support bundle?

bitter-tailor-6977

03/23/2023, 5:11 AM
sure
there are no changes in the longhorn.yaml file

icy-agency-38675

03/23/2023, 5:17 AM
Sorry, I mean generate a support bundle via UI (bottom left side)

bitter-tailor-6977

03/23/2023, 5:19 AM
cool doing that..

icy-agency-38675

03/23/2023, 5:22 AM
Sorry, I don’t see the support bundle in the ticket. You can generate a support bundle tar file from the UI page.

bitter-tailor-6977

03/23/2023, 5:23 AM
just curious to know, does the support zip file contain any sensitive information?
@icy-agency-38675 please confirm the above; I have the zip file downloaded locally

icy-agency-38675

03/23/2023, 5:24 AM

bitter-tailor-6977

03/23/2023, 5:25 AM
this: just curious to know, does the support zip file contain any sensitive information?

icy-agency-38675

03/23/2023, 5:26 AM
Well, in general, no. But you can manually redact any information that is sensitive to you.
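If in doubt, the bundle can be reviewed locally before sharing; a sketch, where the file name is whatever the UI downloaded:

# list what is inside the support bundle without extracting it
unzip -l supportbundle_<id>.zip

# extract it and search for anything you consider sensitive before sending
unzip supportbundle_<id>.zip -d bundle/
grep -ri "password" bundle/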

bitter-tailor-6977

03/23/2023, 5:26 AM
okay sure, sending you the zip
sent the mail, please do the needful

icy-agency-38675

03/23/2023, 5:35 AM
Thank you. Will update if I have any findings.

bitter-tailor-6977

03/23/2023, 5:38 AM
please do the needful

icy-agency-38675

03/23/2023, 5:39 AM
Sorry, which volume?
pvc-869f916a-5edc-4547-a62b-679d9cf1dc5b is running and attached

bitter-tailor-6977

03/23/2023, 5:40 AM
yes, I have started the node again
you can check the logs from before it was attached

icy-agency-38675

03/23/2023, 5:42 AM
OK
Do you know the boot time of the down node?

bitter-tailor-6977

03/23/2023, 5:47 AM
I am using Red Hat 8.6 on AWS
not aware of the boot time
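For reference, the boot time can be read directly on the host; a generic Linux sketch, nothing Longhorn-specific:

# print the time the node was last booted
uptime -s

# alternatively
who -b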

icy-agency-38675

03/23/2023, 6:14 AM
Got it. I saw the down node became ready again at 2023-03-23T05:13:54Z, and the engine was running at 2023-03-23T05:13:51.887241937Z.
The volume only went back to running right before you started the down node…
If you encounter the issue again, please execute the workaround and generate a support bundle (SB) before restarting the down node.

bitter-tailor-6977

03/23/2023, 6:29 AM
okay, trying that
the PVC is stuck
creating an SB right now

icy-agency-38675

03/23/2023, 6:45 AM
Did you delete the unknown instancemanager resources?

bitter-tailor-6977

03/23/2023, 6:45 AM
yes

icy-agency-38675

03/23/2023, 6:45 AM
OK

bitter-tailor-6977

03/23/2023, 6:49 AM
check inbox

icy-agency-38675

03/23/2023, 6:49 AM
Received.
instance-manager-e-b9b279de634f2725487a4ceee253c408 is still there.
Can you show the output of
kubectl -n longhorn-system get instancemanagers

bitter-tailor-6977

03/23/2023, 6:52 AM
yes, I deleted them but they keep showing up

icy-agency-38675

03/23/2023, 6:53 AM
🤔

bitter-tailor-6977

03/23/2023, 6:53 AM
xyz ~ % k delete pod instance-manager-e-b9b279de634f2725487a4ceee253c408 instance-manager-r-b9b279de634f2725487a4ceee253c408 --force -n longhorn-system
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "instance-manager-e-b9b279de634f2725487a4ceee253c408" force deleted
pod "instance-manager-r-b9b279de634f2725487a4ceee253c408" force deleted
xyz ~ % kubectl -n longhorn-system get instancemanagers
NAME                                                  STATE     TYPE      NODE                            AGE
instance-manager-e-b9b279de634f2725487a4ceee253c408   unknown   engine    ip-172-31-8-245.ec2.internal    15h
instance-manager-r-b9b279de634f2725487a4ceee253c408   unknown   replica   ip-172-31-8-245.ec2.internal    15h
see, I have deleted them but kubectl still shows them
even though I used --force to delete
so does the delete need some server-side signal, and that is why it is unable to delete?

icy-agency-38675

03/23/2023, 6:56 AM
No, let me check

bitter-tailor-6977

03/23/2023, 6:56 AM
once the node is up, the unknown ones are gone

icy-agency-38675

03/23/2023, 6:58 AM
I see. I need to take some time to study this. Another user and I tried the workaround and it worked. Not sure what is happening now. You can start the node. Will keep you posted.
Thank you. Can you provide your environment information? Platform, plus the k8s distro and version.
Interesting… I can delete it.
root@rancher60-master:~/longhorn# kl get instancemanager
NAME                                                  STATE     TYPE      NODE                AGE
instance-manager-r-89ccb5ff5acd0803dc91e2b6355464d5   running   replica   rancher60-worker3   160m
instance-manager-e-89ccb5ff5acd0803dc91e2b6355464d5   running   engine    rancher60-worker3   160m
instance-manager-e-f4731def1cefda4f51631c82ea14ad12   running   engine    rancher60-worker1   160m
instance-manager-r-f4731def1cefda4f51631c82ea14ad12   running   replica   rancher60-worker1   160m
instance-manager-r-21850b3359743fcfc290fc3abf9cae85   unknown   replica   rancher60-worker2   160m
instance-manager-e-21850b3359743fcfc290fc3abf9cae85   unknown   engine    rancher60-worker2   160m
root@rancher60-master:~/longhorn# kl delete instancemanager instance-manager-e-21850b3359743fcfc290fc3abf9cae85
instancemanager.longhorn.io "instance-manager-e-21850b3359743fcfc290fc3abf9cae85" deleted
root@rancher60-master:~/longhorn# kl get instancemanager
NAME                                                  STATE     TYPE      NODE                AGE
instance-manager-r-89ccb5ff5acd0803dc91e2b6355464d5   running   replica   rancher60-worker3   160m
instance-manager-e-89ccb5ff5acd0803dc91e2b6355464d5   running   engine    rancher60-worker3   160m
instance-manager-e-f4731def1cefda4f51631c82ea14ad12   running   engine    rancher60-worker1   160m
instance-manager-r-f4731def1cefda4f51631c82ea14ad12   running   replica   rancher60-worker1   160m
instance-manager-r-21850b3359743fcfc290fc3abf9cae85   unknown   replica   rancher60-worker2   160m

bitter-tailor-6977

03/23/2023, 7:17 AM
k3s: v1.25.4+k3s1; worker and master: Red Hat 8.5; Longhorn: 1.4.1

icy-agency-38675

03/23/2023, 7:17 AM
Gosh…
I know why
not the pod… you should delete the instancemanager resource:
kubectl -n longhorn-system get instancemanagers ==> list the unknown resources
kubectl -n longhorn-system delete instancemanager <name> ==> delete the unknown resources
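A small sketch that rolls those two steps into one loop, assuming every instance manager reported as unknown should be removed:

# delete every instancemanager CR stuck in the unknown state
for im in $(kubectl -n longhorn-system get instancemanagers --no-headers | awk '$2 == "unknown" {print $1}'); do
  kubectl -n longhorn-system delete instancemanager "$im"
done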

bitter-tailor-6977

03/23/2023, 7:20 AM
ufffff
cool finding, buddy
will confirm with you
the workaround works, man
so will this fix be available from 1.4.2?

icy-agency-38675

03/23/2023, 11:29 AM
Thank you for the update. Yeah, already merged and will be in v1.4.2.

bitter-tailor-6977

03/23/2023, 11:30 AM
great, updating the ticket too; you can also update it