# longhorn-storage
i
Can you check the information in the longhorn-manager log?
b
https://github.com/longhorn/longhorn/issues/5558 can you check on this, with the latest comments
i
Are you using RWX volumes?
b
yes, Derek
will the workaround also work for RWX volumes?
i
Yes, check my update.
b
will do and keep you posted
unable to delete node: could not delete node ip-172-31-9-232.ec2.internal with node ready condition is False, reason is KubernetesNodeNotReady, node schedulable false, and 0 replica, 0 engine running on it
I have deleted the unknown pod, and when I tried to delete the node via the Longhorn UI, the above message appeared. My cluster is on k3s.
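For context, the state that error refers to can also be inspected from kubectl; a rough sketch, using the node name from the message above (Longhorn keeps its own node object alongside the Kubernetes one):

```
# Kubernetes view of the node: Ready condition and schedulability
kubectl get node ip-172-31-9-232.ec2.internal

# Longhorn's own node custom resource, which tracks the replicas/engines scheduled on it
kubectl -n longhorn-system get nodes.longhorn.io ip-172-31-9-232.ec2.internal -o yaml
```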
i
No, I mean try deleting the instancemanager resource
Check the second workaround. You don’t need to delete the down node.
b
ok fine
will do
👍 1
it happened again
even though the replica is up, it is not being picked up
i
Can you provide the support bundle?
b
sure
no change in the longhorn.yaml file
i
Sorry, I mean generate a support bundle via UI (bottom left side)
b
cool doing that..
i
Sorry, I don’t see the support bundle in the ticket. You can generate a support bundle tar file from the UI page.
b
just curious to know, does the support zip file contain any sensitive information?
@icy-agency-38675 please confirm on this; I have the zip file downloaded locally
i
b
this - just curious to know, does the support zip file contain any sensitive information?
i
Well, in general, no. But you can manually redact any information that is sensitive to you.
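If you want to see what is inside before sending it, one simple way is to unpack it and scan for patterns you care about; a minimal sketch, assuming the bundle is a regular zip (the filename below is just a placeholder for whatever the UI downloaded):

```
# Unpack the downloaded bundle and scan it for anything you consider sensitive,
# e.g. internal IP addresses or hostnames, before mailing it out.
unzip supportbundle.zip -d bundle
grep -rnE '([0-9]{1,3}\.){3}[0-9]{1,3}' bundle | less
```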
b
okay sure, sending you the zip
sent the mail, please do the needful
i
Thank you. Will update if I have any findings.
b
pls do the needful
i
Sorry, which volume?
pvc-869f916a-5edc-4547-a62b-679d9cf1dc5b is running and attached
b
yes, I have started the node again
you can check the logs from before it was attached
i
OK
Do you know the boot time of the down node?
b
I am using Red Hat 8.6 on AWS
not aware of the boot time
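For reference, once the node is reachable again the boot time can be read directly on the host, e.g.:

```
# Either command prints the last system boot time on a Linux (incl. RHEL 8) node
who -b
uptime -s
```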
i
Got it. I saw the down node became ready again at
2023-03-23T05:13:54Z
The engine is running at
2023-03-23T05:13:51.887241937Z
The volume only started running right before you brought the down node back up…
If you encounter the issue again, please execute the workaround and generate a support bundle (SB) before restarting the down node.
b
okay, trying that
the PVC is stuck
creating an SB right now
i
Did you delete the unknown instancemanager resources?
b
yes
i
OK
b
check inbox
i
Received
instance-manager-e-b9b279de634f2725487a4ceee253c408 is still there
Can you show
kubectl -n longhorn-system get instancemanagers
b
yes, I deleted it but it keeps showing up
i
🤔
b
xyz ~ % k delete pod instance-manager-e-b9b279de634f2725487a4ceee253c408 instance-manager-r-b9b279de634f2725487a4ceee253c408 --force -n longhorn-system
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "instance-manager-e-b9b279de634f2725487a4ceee253c408" force deleted
pod "instance-manager-r-b9b279de634f2725487a4ceee253c408" force deleted
xyz ~ % kubectl -n longhorn-system get instancemanagers
NAME                                                  STATE     TYPE      NODE                            AGE
instance-manager-e-b9b279de634f2725487a4ceee253c408   unknown   engine    ip-172-31-8-245.ec2.internal    15h
instance-manager-r-b9b279de634f2725487a4ceee253c408   unknown   replica   ip-172-31-8-245.ec2.internal    15h
see, I have deleted them but kubectl still shows them
I even used --force to delete them
so does deleting them need some server-side signal, and that is why they cannot be deleted?
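One thing worth checking here is which object is actually still present, since the pod and the Longhorn instancemanager custom resource are different objects with the same-looking name; a quick sketch using the names from the output above:

```
# The pod, which was force deleted above
kubectl -n longhorn-system get pod instance-manager-e-b9b279de634f2725487a4ceee253c408

# The instancemanager custom resource, which is what the listing above still shows
kubectl -n longhorn-system get instancemanager instance-manager-e-b9b279de634f2725487a4ceee253c408
```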
i
No, let me check
b
once the node is up, the unknown ones are gone
i
I see. I need to take some time to look into this. Another user and I tried the workaround and it worked. Not sure what is happening now. You can start the node. Will keep you posted.
Thank you. Can you provide your environment information? Platform, and the Kubernetes distro and version.
Interesting… I can delete it.
root@rancher60-master:~/longhorn# kl get instancemanager
NAME                                                  STATE     TYPE      NODE                AGE
instance-manager-r-89ccb5ff5acd0803dc91e2b6355464d5   running   replica   rancher60-worker3   160m
instance-manager-e-89ccb5ff5acd0803dc91e2b6355464d5   running   engine    rancher60-worker3   160m
instance-manager-e-f4731def1cefda4f51631c82ea14ad12   running   engine    rancher60-worker1   160m
instance-manager-r-f4731def1cefda4f51631c82ea14ad12   running   replica   rancher60-worker1   160m
instance-manager-r-21850b3359743fcfc290fc3abf9cae85   unknown   replica   rancher60-worker2   160m
instance-manager-e-21850b3359743fcfc290fc3abf9cae85   unknown   engine    rancher60-worker2   160m
root@rancher60-master:~/longhorn# kl delete instancemanager instance-manager-e-21850b3359743fcfc290fc3abf9cae85
instancemanager.longhorn.io "instance-manager-e-21850b3359743fcfc290fc3abf9cae85" deleted
root@rancher60-master:~/longhorn# kl get instancemanager
NAME                                                  STATE     TYPE      NODE                AGE
instance-manager-r-89ccb5ff5acd0803dc91e2b6355464d5   running   replica   rancher60-worker3   160m
instance-manager-e-89ccb5ff5acd0803dc91e2b6355464d5   running   engine    rancher60-worker3   160m
instance-manager-e-f4731def1cefda4f51631c82ea14ad12   running   engine    rancher60-worker1   160m
instance-manager-r-f4731def1cefda4f51631c82ea14ad12   running   replica   rancher60-worker1   160m
instance-manager-r-21850b3359743fcfc290fc3abf9cae85   unknown   replica   rancher60-worker2   160m
b
k3s: v1.25.4+k3s1; Worker and Master: Red Hat 8.5; Longhorn: v1.4.1
i
Gosh…
I know why
Not the pod… you should delete the instancemanager resource: `kubectl -n longhorn-system get instancemanagers` to list the unknown resources, then `kubectl -n longhorn-system delete instancemanager <name>` to delete them.
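Put together, roughly (the names are the ones from this thread; yours will differ):

```
# List instance manager custom resources and note the ones in "unknown" state
kubectl -n longhorn-system get instancemanagers

# Delete the unknown instancemanager custom resources (not the pods)
kubectl -n longhorn-system delete instancemanager instance-manager-e-b9b279de634f2725487a4ceee253c408
kubectl -n longhorn-system delete instancemanager instance-manager-r-b9b279de634f2725487a4ceee253c408
```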
b
ufffff
cool finding buddy
will confirm with you
🙌 1
the workaround works, man
so this fix will be available from v1.4.2?
i
Thank you for the update. Yeah, already merged and will be in v1.4.2.
b
great, updating the ticket too; you can also update it
👍 1