# rancher-desktop
i
FYI, just restarting RD does 'fix' the issue in that docker commands run again, but I immediately get the same error.
w
and the context is set to `rancher-desktop`?
i
Yes, I have it set to rancher-desktop.
👍 1
w
is this a big build? could it be an out of memory issue?
i
In my opinion it isn't a big build. It's the final step of 9, running `yarn run build`. Additionally, I've got the VM in RD set to 24 GB of RAM and 6 vCPUs.
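(For reference, those VM resources can also be set from the host CLI. A minimal sketch, assuming a Rancher Desktop version whose `rdctl set` supports these flags:)

```sh
# configure the Rancher Desktop VM from the host
rdctl set --memory 24 --cpus 6
```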
w
heh i have to get a desktop upgrade! 🤣
🤣 1
i
I don't think it matters, but I am using `DOCKER_BUILDKIT=1`.
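(For context, `DOCKER_BUILDKIT=1` just opts a single invocation into BuildKit. A minimal sketch, with a hypothetical image tag:)

```sh
# enable BuildKit for this one build; "myapp" is a placeholder tag
DOCKER_BUILDKIT=1 docker build -t myapp .
```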
w
if you `rdctl shell` into the VM, is there anything odd inside the VM that you can see during the build?
i
I've not previously done anything like that. Do you have a quick how-to?
w
just type `rdctl shell` and you will be in the Alpine VM where the containers are running. the challenge will be the limited tooling in there by default; `top` is in there though
and you can `tail -f /var/log/messages` to see what flies by as you do the build
it seems like some resource issue leading to instability in the VM, which is why a restart (reboot of the VM) fixes it temporarily
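(Pulling the inspection steps above together as a minimal sketch; the VM ships with limited tooling, so only the basics are assumed:)

```sh
rdctl shell                    # drop into the Alpine VM running the containers
top                            # watch CPU/memory while the build runs
tail -f /var/log/messages      # follow the VM's system log for anything odd
```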
i
Nothing jumping out at me ATM, but I am seeing this scroll through often: `process '/sbin/getty -L 0 ttyS0 vt100' (pid 2047) exited. Scheduling for restart.` I don't think it's related. I need to reboot RD and try a build again.
So I ran the build and noticed a couple of things. First, I am regularly getting the message `alpine auth.err getty[24624]: tcgetattr: I/O error`. This happens regardless of whether the build is running and comes up every second or so. After running the build, I noticed a few different messages. When I first started, I saw `/usr/local/bin/rancher-desktop-guestagent, pid 23857, exited with return code 0`, but after the build ran for a little while it became `/usr/local/bin/rancher-desktop-guestagent, pid 24620, terminated by signal 9`. While the build was running I saw the message `Child command line: /usr/local/bin/cri-dockerd --container-runtime-endpoint unix:///run/cri-dockerd.sock --network-plugin=cni --cni-bin-dir=/usr/libexec/cni --cni-conf-dir=/etc/cni/net.d --cni-cache-dir=/var/lib/cni/cache` come up twice. Finally, right at the same time the docker build failed with the `unexpected EOF`, I got `Error: exit status 255` and was dropped back to my normal shell. I ran `rdctl shell` almost immediately and it went right back in.
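(A process `terminated by signal 9` is often the kernel OOM killer at work. One way to check from inside the VM, assuming `dmesg` is readable there:)

```sh
# look for OOM-killer activity around the time of the build failure
dmesg | grep -iE 'out of memory|killed process'
```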
w
you seeing anything in `top`?
i
I'm working on that view now.
Looks like it may be memory related, which is odd. I went ahead and installed htop directly, then watched it while the build ran. Nothing out of the ordinary happened with the processes that I saw. I did see a spike where all 6 vCPUs went to 100% for 10 seconds or so, but it dropped off and stayed stable; right before it crashed and kicked me out again, it was at 50% on all processors. Memory usage is a bit odd, though, in that it is only showing 2-3 GB of RAM. Right before it died, the memory did spike to basically 100%, if I'm reading it correctly. I need to double-check the color coding on that, but 90% of the bar was green while the last 10% was purple. First time I've seen that.
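(For the record, adding tooling inside the Alpine VM and cross-checking memory looks roughly like this; a minimal sketch:)

```sh
# inside the VM; Alpine uses apk as its package manager
apk add htop    # richer view than the built-in top
free -m         # confirm how much memory the VM actually sees
```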
w
i mean 90% of a 24GB vm is a big heap. is it like an ai model or something with a huge working set?
i
I'm not sure I understand the question. Additionally, I'm not convinced it is using the full 24G, as htop reports 2.19G.
w
just trying to figure out where all the memory was going. some process is trying to allocate more than its fair share, but again hard to know from this side of the chat
i
Fair. I'm going to try and reset K8s. It's odd to me that it's not reporting the full 24 GB.
Kept looking into this and found out that my VM didn't actually have the 24 GB of RAM, even though it was configured for it. Found this: https://rancher-users.slack.com/archives/C0200L1N1MM/p1661873444221709, which basically talks about a limit and needing to upgrade macOS on an M1. I can't do that right now as it's a work laptop, so I'm going to try to add swap space to the VM and go from there.
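(A minimal sketch of adding swap inside the Alpine VM; the size and path are hypothetical, it needs root, and the swap file may not persist across VM restarts:)

```sh
# inside the VM via `rdctl shell`
dd if=/dev/zero of=/swapfile bs=1M count=4096   # 4 GiB swap file; size is a guess
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
free -m                                         # verify the swap shows up
```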
Swap Space Worked