I have a node that after having rebuilt it 3 times...
# harvester
s
I have a node that I've rebuilt 3 times. Every time, it works for a while and then reboots.
b
first guess is your disk is going bad.
When you rebuilt it, did you wipe all the drives?
s
Yes, used disk shredder.
these are brand new SSDs
b
I've seen some pretty sketchy SSDs. ¯\_(ツ)_/¯
s
I would hope not. It cost us 100k.
b
I don't think it's all of them, just your boot disk.
How is it set up?
Are you doing a RAID?
s
Every time I rebuilt it, I changed the boot SSD to a different disk just because I thought the same thing.
just a single ssd for now.
I'll set it up as a raid once it's stable.
b
Hm. Could be the controller/firmware.
s
Weird, this is node 3 of 3.
The first 2 have no issues.
b
Right!
Same hardware right?
s
Yup.
b
Still smells like hardware issues to me.
Even more so.
I'd double check that all the firmware matches the working nodes: BIOS, storage controllers, and disks.
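If it helps, here's a rough sketch of the comparison I mean. It assumes you've already collected the versions by hand on each node (for example with `dmidecode -s bios-version` for the BIOS and `smartctl -i` for a disk). The function name, component names, and version strings are all made up for illustration, not real values from your hardware.

```python
def firmware_mismatches(reference, candidate):
    """Return {component: (reference_version, candidate_version)} for each difference."""
    # Take the union of keys so a component missing on one node is also reported.
    components = sorted(set(reference) | set(candidate))
    return {
        name: (reference.get(name), candidate.get(name))
        for name in components
        if reference.get(name) != candidate.get(name)
    }


# Hypothetical versions: node1 is a working node, node3 is the one that reboots.
node1 = {"bios": "2.1.0", "storage_controller": "51.16", "boot_ssd": "GXA5"}
node3 = {"bios": "2.1.0", "storage_controller": "50.02", "boot_ssd": "GXA5"}

print(firmware_mismatches(node1, node3))
```

Anything that shows up in the result is worth flashing to match the working nodes before the next rebuild.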
Then get it up again and check dmesg and journalctl for any additional clues.
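For the log check, something like this can cut down the noise. It's only a sketch: the pattern list is my own guess at what hardware trouble tends to look like in kernel logs, so tune it for your setup. It shells out to `journalctl -k -b -1 --no-pager` (kernel messages from the previous boot), which only returns anything if the journal is persisted across reboots.

```python
import re
import subprocess

# Hypothetical pattern list, not exhaustive; adjust for your hardware.
SUSPICIOUS = re.compile(
    r"I/O error|Machine check|mce:|Hardware Error|Kernel panic|"
    r"ata\d+.*(?:error|failed)|nvme.*(?:timeout|reset)|Out of memory",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Keep only the log lines that match one of the patterns above."""
    return [line for line in lines if SUSPICIOUS.search(line)]


def previous_boot_kernel_log():
    """Read kernel messages from the boot before the current one."""
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.splitlines()
```

After the node comes back from an unexpected reboot, run `suspicious_lines(previous_boot_kernel_log())` as root and see what's left.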
s
When it ends up in Linux maintenance mode, I have no issues mounting the drives.