sticky-summer-13450
07/11/2022, 1:29 PM
Unable to attach or mount volumes: unmounted volumes=[prometheus-rancher-monitoring-prometheus-db], unattached volumes=[prometheus-rancher-monitoring-prometheus-db nginx-home config-out tls-assets web-config prometheus-rancher-monitoring-prometheus-rulefiles-0 config kube-api-access-hwjgz prometheus-nginx]: timed out waiting for the condition
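A mount timeout like this usually calls for a look at the pod's events and the CSI attachment state. A minimal sketch, assuming kubectl access: the pod name and namespace below are placeholders (not from the thread), and each command is guarded so the snippet just reports when kubectl is unreachable.

```shell
# Placeholder pod/namespace: adjust to the actual stuck Prometheus pod.
STUCK_POD="prometheus-rancher-monitoring-prometheus-0"
NS="cattle-monitoring-system"

# The pod's events normally say which volume is stuck and why.
kubectl -n "$NS" describe pod "$STUCK_POD" 2>/dev/null \
  || echo "kubectl unavailable; run on the cluster: describe pod $STUCK_POD"

# Check whether the CSI attachment ever completed.
kubectl get volumeattachment 2>/dev/null \
  || echo "kubectl unavailable; run on the cluster: get volumeattachment"
```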
great-bear-19718
07/12/2022, 9:58 AM
sticky-summer-13450
07/12/2022, 5:22 PM
great-bear-19718
07/20/2022, 7:11 AM
instance-manager-e-d920c0a2
on harvester001, are you able to delete it? Longhorn should schedule another one and it should kick the volume back into action.
sticky-summer-13450
07/20/2022, 2:35 PM
%Cpu(s): 19.4 us, 16.7 sy, 0.0 ni, 0.0 id, 63.6 wa, 0.0 hi, 0.4 si, 0.0 st
But now you come to mention it, the load average is over 3000; I hadn't spotted that... There are over 3000 instances of blkid doing not-a-lot.
root 32750 9360 0 Jul17 ? 00:00:00 blkid -p -s TYPE -s PTTYPE -o export /dev/longhorn/pvc-a5b5fe4c-eca4-4c97-a3db-f9490980c044
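A quick way to confirm a pile-up like this (my own sketch, not from the thread) is to count the blkid processes; the `[b]` in the pattern keeps grep from matching its own entry in the process list.

```shell
# Count running blkid scanner processes. grep -c exits non-zero when the
# count is 0, so || true keeps the assignment safe under `set -e`.
count=$(ps -ef 2>/dev/null | grep -c '[b]lkid' || true)
echo "blkid processes running: $count"
```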
so I did
~> sudo /var/lib/rancher/rke2/bin/kubectl --kubeconfig=/etc/rancher/rke2/rke2.yaml delete pod instance-manager-e-d920c0a2 --namespace=longhorn-system
pod "instance-manager-e-d920c0a2" deleted
and all those processes have gone and the load average is coming down.
I'm currently doing this remotely, SSH'd in via a bastion with some port forwarding, and I can't remember how to view the Longhorn UI remotely, so I'll check that out when I get back home.
great-bear-19718
07/20/2022, 11:54 PM
Conditions:
  Restore:
    Last Transition Time:  2022-02-13T21:57:33Z
    Status:                False
    Type:                  restore
  Scheduled:
    Last Transition Time:  2022-07-03T15:58:15Z
    Status:                True
    Type:                  scheduled
  Toomanysnapshots:
    Last Transition Time:  2022-07-03T05:26:56Z
    Message:               Snapshots count is 248 over the warning threshold 100
    Reason:                TooManySnapshots
    Status:                True
    Type:                  toomanysnapshots
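The TooManySnapshots message reads like a plain threshold comparison; a sketch reproducing it with the numbers from the condition (248 snapshots against the threshold of 100 quoted in the Message field):

```shell
# Values taken from the condition above; 100 is the warning threshold
# quoted in the Message field.
snapshots=248
threshold=100
if [ "$snapshots" -gt "$threshold" ]; then
  echo "Snapshots count is $snapshots over the warning threshold $threshold"
fi
```

Clearing out old snapshots, for example by lowering the retain count on any recurring snapshot jobs, should bring the count back under the threshold and clear the condition.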
I suspect that is likely to be due to too many snapshots.
sticky-summer-13450
07/21/2022, 7:56 AM
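For the earlier question about viewing the Longhorn UI remotely, a sketch assuming Longhorn's default `longhorn-frontend` service in the `longhorn-system` namespace; the bastion host name and the local port are hypothetical, and the kubectl step is guarded so the snippet just reports where the cluster is unreachable.

```shell
ui_port=8080   # hypothetical local port

# 1. From the local machine, tunnel through the bastion (hypothetical host):
#      ssh -L ${ui_port}:127.0.0.1:${ui_port} user@bastion

# 2. On the host with cluster access, forward the port to the UI service:
kubectl -n longhorn-system port-forward "svc/longhorn-frontend" "${ui_port}:80" 2>/dev/null \
  || echo "kubectl unavailable; run the port-forward where the cluster is reachable"

# 3. Then browse http://localhost:${ui_port} from the local machine.
```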