# longhorn-storage
l
Is `kubectl describe …` on the workloads consuming some of these PVCs tattling on anything?
Is `kubectl describe` on the PVCs themselves saying anything?
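(For reference, those checks would look roughly like this; the workload, PVC, and namespace names are placeholders.)

```sh
# Placeholder names -- substitute your own workload, PVC, and namespace
kubectl describe pod my-workload-0 -n my-namespace
kubectl describe pvc my-data-pvc -n my-namespace
```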
And how many vols do you have? Thinking if this could be a case of … wait some more until the `instance-manager` and `engine-manager` are done processing.
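(A quick way to see whether those managers are still coming up after the upgrade; the component label is an assumption and may differ between Longhorn versions.)

```sh
# All longhorn-system pods should settle into Running/Ready after an upgrade
kubectl get pods -n longhorn-system
# Narrow down to the instance managers (label is an assumption; check your version)
kubectl get pods -n longhorn-system -l longhorn.io/component=instance-manager
```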
s
The `kubectl describe ...` on the workloads is saying that it's waiting for the volumes to attach and then `timed out waiting for the condition`. The `describe` on the PVs as well as on the PVCs isn't showing anything under `Events`, but I've been at this for a few hours, so maybe whatever event showed up there is gone now. I have 15 volumes.
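(Since events age out of the API server after a while, one option is to pull whatever is still recorded, sorted by time; the workload namespace below is a placeholder.)

```sh
# Events expire (roughly 1h by default), so older ones may already be gone
kubectl get events -n my-namespace --sort-by=.lastTimestamp
kubectl get events -n longhorn-system --sort-by=.lastTimestamp
```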
l
15 is kinda nothing … so shouldn’t be that. And nothing obvious in the stdout / container logs of whatever workloads?
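(The usual way to check, with placeholder names; `--previous` covers a container that already crashed and restarted.)

```sh
# Placeholder pod and namespace names
kubectl logs my-workload-0 -n my-namespace
kubectl logs my-workload-0 -n my-namespace --previous
```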
s
I'm combing through them but I don't think I'm going to find much because these workloads can't even start without the volumes attached
l
ah yeah okay … fair enough. Makes sense.
And what about if you port-forward to the Longhorn UI … are the nodes Healthy and schedulable?
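(For a stock install, something like this reaches the UI; the service name and namespace are the defaults and may differ in a customized deployment.)

```sh
# Default frontend service for a stock Longhorn install
kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80
# then browse http://localhost:8080 and check the Node page for Ready / Schedulable
```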
s
The Longhorn nodes are healthy and schedulable, but one thing I'm noticing is that all the replicas on all the nodes are either `Failed`, `Stopped`, or `Unknown`...
There's also nothing in the `instance-manager` pods' logs, neither the `-e` nor the `-r` ones, which is suspicious...
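(The same replica and engine state is also visible through Longhorn's CRs; the pod names below are placeholders taken from `kubectl get pods`, and the printer columns vary by Longhorn version.)

```sh
# Replica and engine state as Longhorn's custom resources
kubectl get replicas.longhorn.io -n longhorn-system
kubectl get engines.longhorn.io -n longhorn-system
# Raw logs from both instance-manager flavours (placeholder pod names)
kubectl logs -n longhorn-system instance-manager-e-xxxxxxxx
kubectl logs -n longhorn-system instance-manager-r-xxxxxxxx
```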
Also, if I open a given volume, it shows me that the only available replica is stopped but assigned to an instance manager with a name that doesn't match any of the `instance-manager-r-*` pods I see running.
Is it possible that during the upgrade the connection between replicas and instance-managers got lost or out of sync, and now I have to fix it manually on the custom resource objects directly?
Let me start a new thread with this because I think this looks like a potential root cause...
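(One way to compare what a replica CR records against what actually exists; the exact field names, such as `instanceManagerName`, vary slightly between Longhorn versions.)

```sh
# What instance manager do the replica CRs think they belong to?
kubectl get replicas.longhorn.io -n longhorn-system -o yaml | grep -i instancemanager
# ...compared with the instance-manager CRs and pods that actually exist
kubectl get instancemanagers.longhorn.io -n longhorn-system
kubectl get pods -n longhorn-system | grep instance-manager
```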
l
yeah new thread